7.9. Utilities
Utils.py: basic helper functions that are reused throughout the other modules.
- class minedatabase.utils.Chunks(it: Iterable, chunk_size: int = 1, return_list: bool = False)
A class to chunk an iterator up into defined sizes.
- next() → Union[List[chain], chain]
Returns the next chunk from the iterable. This method is not thread-safe.
- Returns
- next_slice : Union[List[chain], chain]
Next chunk.
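The chunking behaviour described above can be sketched with the standard library. This is an illustration only, not the library's implementation: `iter_chunks` is a hypothetical helper that mirrors the `chunk_size` and `return_list` arguments using `itertools.islice` and `itertools.chain`.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, List, Union


def iter_chunks(
    it: Iterable, chunk_size: int = 1, return_list: bool = False
) -> Iterator[Union[List, chain]]:
    """Yield successive chunks of at most chunk_size items (hypothetical helper)."""
    iterator = iter(it)
    for first in iterator:
        # Re-attach the item consumed by the for loop to the rest of the chunk.
        chunk = chain([first], islice(iterator, chunk_size - 1))
        yield list(chunk) if return_list else chunk


chunks = list(iter_chunks(range(5), chunk_size=2, return_list=True))
print(chunks)
```

With `return_list=False` each chunk is a lazy `chain` that shares the underlying iterator, so in this sketch a chunk has to be consumed before the next one is requested.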
- class minedatabase.utils.StoichTuple(stoich, c_id)
- property c_id
Alias for field number 1
- property stoich
Alias for field number 0
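The "Alias for field number" descriptions are the text that `collections.namedtuple` generates for its fields, which suggests `StoichTuple` is a named tuple. A minimal sketch under that assumption (the compound ID below is a made-up placeholder):

```python
from collections import namedtuple

# Assumed definition: field 0 is the stoichiometric coefficient,
# field 1 is the compound ID.
StoichTuple = namedtuple("StoichTuple", ["stoich", "c_id"])

# "X_example" is a placeholder, not a real compound hash.
reactant = StoichTuple(stoich=2, c_id="X_example")
print(reactant.stoich, reactant.c_id)
```

Fields can be read by name or by position, so `reactant[0]` and `reactant.stoich` refer to the same value.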
- minedatabase.utils.convert_sets_to_lists(obj: dict) → dict
Recursively converts any sets contained in a dictionary to lists.
- Parameters
- obj : dict
Input object to convert sets from.
- Returns
- dict
Dictionary with no sets.
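One common reason for such a conversion is that sets cannot be serialized to JSON. The recursion can be sketched as follows; `sets_to_lists` is a hypothetical stand-in, and its handling of nested lists and tuples is an assumption rather than documented behaviour.

```python
import json


def sets_to_lists(obj):
    """Return a copy of obj in which every set is replaced by a list."""
    if isinstance(obj, (set, frozenset)):
        return [sets_to_lists(item) for item in obj]
    if isinstance(obj, dict):
        return {key: sets_to_lists(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sets_to_lists(item) for item in obj]
    return obj


converted = sets_to_lists({"a": {1}, "b": {"c": {2}}})
# The converted structure can now be dumped to JSON.
print(json.dumps(converted))
```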
- minedatabase.utils.file_to_dict_list(filepath: str) → list
Accept a path to a CSV, TSV or JSON file and return a dictionary list.
- Parameters
- filepath : str
File to load into a dictionary list.
- Returns
- list
Dictionary list.
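A rough sketch of how such a loader might dispatch on file type, using only the standard library. `load_dict_list` is a hypothetical helper, and choosing the parser from the file extension is an assumption; the real function may detect the format differently.

```python
import csv
import json
import os
import tempfile
from typing import List


def load_dict_list(filepath: str) -> List[dict]:
    """Load a CSV, TSV or JSON file into a list of dicts (hypothetical helper)."""
    ext = os.path.splitext(filepath)[1].lower()
    with open(filepath, newline="", encoding="utf-8") as infile:
        if ext == ".json":
            return json.load(infile)
        if ext in (".csv", ".tsv"):
            delimiter = "\t" if ext == ".tsv" else ","
            return list(csv.DictReader(infile, delimiter=delimiter))
    raise ValueError(f"Unrecognized file type: {filepath}")


# Demo: write a small TSV to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "compounds.tsv")
    with open(path, "w", newline="", encoding="utf-8") as outfile:
        outfile.write("id\tsmiles\ncpd1\tCCO\n")
    rows = load_dict_list(path)

print(rows)
```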
- minedatabase.utils.get_atom_count(mol: rdkit.Chem.rdchem.Mol, radical_check: bool = False) → Counter
Takes a mol object and returns a Counter with the number of atoms of each element in the molecule.
- Parameters
- mol : rdkit.Chem.rdchem.Mol
Mol object to count atoms for.
- radical_check : bool, optional
Check for radical electrons and count if present.
- Returns
- atoms : collections.Counter
Count of each atom type in input molecule.
- minedatabase.utils.get_compound_hash(smi: str, cpd_type: str = 'Predicted', inchi_blocks: int = 1) → Tuple[str, Optional[str]]
Create a hash string for a given compound.
This function generates a unique identifier for a compound from a normalized SMILES. The compound hash is generated by sanitizing and neutralizing the SMILES and then hashing the result with the sha1 function from hashlib.
- The hash is prepended with a character depending on the type. Default value is “C”:
Coreactant: “X”
Target Compound: “T”
Predicted Compound: “C”
- Parameters
- smi : str
The SMILES of the compound.
- cpd_type : str, optional
The Compound Type, by default ‘Predicted’.
- Returns
- Tuple[str, Union[str, None]]
Compound hash, InChI-Key.
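The prefix-plus-digest scheme can be illustrated without RDKit. The sketch below is not the library's implementation: it skips sanitization, neutralization and the InChIKey, assumes the SMILES is already normalized, and the `PREFIXES` keys and the direct concatenation of prefix and digest are assumptions.

```python
import hashlib

# Assumed mapping from compound type to prefix character; "C" is the default.
PREFIXES = {"Coreactant": "X", "Target Compound": "T", "Predicted": "C"}


def sketch_compound_hash(normalized_smiles: str, cpd_type: str = "Predicted") -> str:
    """Prefix the SHA-1 hex digest of an already-normalized SMILES string."""
    prefix = PREFIXES.get(cpd_type, "C")
    digest = hashlib.sha1(normalized_smiles.encode("utf-8")).hexdigest()
    return prefix + digest


ethanol_hash = sketch_compound_hash("CCO")
print(ethanol_hash[0], len(ethanol_hash))
```

Because the digest depends only on the input string, two compounds receive the same hash only if normalization has produced identical SMILES for them, which is why the normalization step matters.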
- minedatabase.utils.get_dotted_field(input_dict: dict, accessor_string: str) → dict
Gets data from a dictionary using a dotted accessor-string.
- Parameters
- input_dict : dict
A nested dictionary.
- accessor_string : str
A dotted path to the value in the nested dict, e.g. “DBLinks.KEGG”.
- Returns
- dict
Data from the dictionary.
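Dotted access amounts to walking the nested dictionary one key at a time. A minimal sketch, with `dotted_get` as a hypothetical stand-in; how the real function treats missing keys is not documented here, so this version simply raises `KeyError`. The KEGG ID is example data only.

```python
def dotted_get(input_dict: dict, accessor_string: str):
    """Follow a dotted path such as "DBLinks.KEGG" through nested dicts."""
    current = input_dict
    for key in accessor_string.split("."):
        current = current[key]  # raises KeyError if a level is missing
    return current


compound = {"DBLinks": {"KEGG": ["C00001"]}}
kegg_ids = dotted_get(compound, "DBLinks.KEGG")
print(kegg_ids)
```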
- minedatabase.utils.get_fp(smi: str) → rdkit.Chem.AllChem.RDKFingerprint
Generate default RDKFingerprint.
- Parameters
- smi : str
SMILES of the molecule.
- Returns
- AllChem.RDKFingerprint
Default fingerprint of the molecule.
- minedatabase.utils.get_reaction_hash(reactants: List[StoichTuple], products: List[StoichTuple]) → Tuple[str, str]
Hashes reactant and product lists.
Generates a unique ID for a given reaction for use in MongoDB.
- Parameters
- reactants : List[StoichTuple]
List of reactants.
- products : List[StoichTuple]
List of products.
- Returns
- Tuple[str, str]
Reaction hash and SMILES.
- minedatabase.utils.get_size(obj_0)
Recursively iterate to sum size of object & members.
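`sys.getsizeof` reports only the shallow size of a container, which is why a recursive sum is needed. The following is a sketch of the general technique, not the library's code; `deep_size` is a hypothetical helper, and the set of container types it descends into is an assumption.

```python
import sys


def deep_size(obj, _seen=None) -> int:
    """Sum sys.getsizeof over obj and its members, counting each object once."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # already counted (shared or self-referencing object)
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k, seen) + deep_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(item, seen) for item in obj)
    return size


nested = {"a": [1, 2, 3], "b": "text"}
print(sys.getsizeof(nested), deep_size(nested))
```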
- minedatabase.utils.mongo_ids_to_mine_ids(mongo_ids: List[str], core_db) → int
Convert mongo ID to a MINE ID for a given compound.
- Parameters
- mongo_ids : List[str]
List of IDs in Mongo (hashes).
- core_db : MINE
Core database connection. Type annotation not present to avoid circular imports.
- Returns
- mine_id : int
MINE ID.
- minedatabase.utils.neutralise_charges(mol: rdkit.Chem.rdchem.Mol, reactions=None) → rdkit.Chem.rdchem.Mol
Neutralize all charges in an rdkit mol.
- Parameters
- mol : rdkit.Chem.rdchem.Mol
Molecule to neutralize.
- reactions : list, optional
Patterns to neutralize, by default None.
- Returns
- mol : rdkit.Chem.rdchem.Mol
Neutralized molecule.
- minedatabase.utils.postsanitize_smiles(smiles_list)
Postsanitize SMILES after running SMARTS.
- Returns
- list
Tautomer list (list of lists of SMILES).
- minedatabase.utils.prevent_overwrite(write_path: str) → str
Prevents overwrite of existing output files by appending “_new” when needed.
- Parameters
- write_path : str
Path to write.
- Returns
- str
Updated path to write.
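The overwrite guard can be sketched with `os.path`. This is an illustration under assumptions: `non_clobbering_path` is a hypothetical helper, and inserting "_new" before the file extension (repeatedly, until the path is free) is a guess at the exact behaviour.

```python
import os
import tempfile


def non_clobbering_path(write_path: str) -> str:
    """Return write_path, adding "_new" before the extension while it exists."""
    while os.path.exists(write_path):
        root, ext = os.path.splitext(write_path)
        write_path = f"{root}_new{ext}"
    return write_path


# Demo in a temporary directory: one path is taken, the other is free.
with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "results.tsv")
    open(existing, "w").close()
    taken = os.path.basename(non_clobbering_path(existing))
    free = os.path.basename(non_clobbering_path(os.path.join(tmp_dir, "other.tsv")))

print(taken, free)
```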
- minedatabase.utils.save_dotted_field(accessor_string: str, data: dict)
Saves data to a dictionary using a dotted accessor-string.
- Parameters
- accessor_string : str
A dotted path description, e.g. “DBLinks.KEGG”.
- data : dict
The value to be stored.
- Returns
- dict
The nested dictionary.
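Going the other way, a dotted path can be expanded into a nested dictionary by wrapping the value from the innermost key outwards. A minimal sketch, with `dotted_wrap` as a hypothetical stand-in for the documented function and a KEGG ID used purely as example data:

```python
def dotted_wrap(accessor_string: str, data):
    """Nest data under the keys named by a dotted path such as "DBLinks.KEGG"."""
    for key in reversed(accessor_string.split(".")):
        data = {key: data}
    return data


nested_links = dotted_wrap("DBLinks.KEGG", ["C00001"])
print(nested_links)
```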