7.9. Utilities

Utils.py: basic functions reused in various contexts by other modules.

class minedatabase.utils.Chunks(it: Iterable, chunk_size: int = 1, return_list: bool = False)

A class to chunk an iterator up into defined sizes.

next() → Union[List[chain], chain]

Returns the next chunk from the iterable. This method is not thread-safe.

Returns
next_slice : Union[List[chain], chain]

Next chunk.
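
The chunking behaviour described above can be illustrated with a minimal sketch built on itertools. ChunksSketch is a hypothetical stand-in written for this example, not the library's code; only the constructor arguments (it, chunk_size, return_list) and the next() method name are taken from the signature above.

```python
from itertools import chain, islice
from typing import Iterable, List, Union


class ChunksSketch:
    """Hypothetical stand-in for Chunks; not the library's implementation."""

    def __init__(self, it: Iterable, chunk_size: int = 1, return_list: bool = False):
        self._it = iter(it)
        self._chunk_size = chunk_size
        self._return_list = return_list

    def __iter__(self):
        return self

    def __next__(self) -> Union[List, chain]:
        # Take the first item eagerly so that an exhausted iterator raises
        # StopIteration here instead of producing an empty chunk.
        first = next(self._it)
        rest = islice(self._it, self._chunk_size - 1)
        chunk = chain([first], rest)
        return list(chunk) if self._return_list else chunk

    # Mirrors the documented next() method name.
    next = __next__


chunks = list(ChunksSketch(range(7), chunk_size=3, return_list=True))
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```

In this sketch, a lazy chunk (return_list=False) shares the underlying iterator, so it should be consumed before the next chunk is requested.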

class minedatabase.utils.StoichTuple(stoich, c_id)

property c_id

Alias for field number 1

property stoich

Alias for field number 0
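
The "Alias for field number" wording is what collections.namedtuple generates for its fields, so a tuple with the same two fields can be sketched as follows. The definition below is an assumption made for illustration, and the compound ID is a made-up placeholder.

```python
from collections import namedtuple

# Assumed equivalent definition; field order follows the entries above
# (stoich is field 0, c_id is field 1).
StoichTuple = namedtuple("StoichTuple", "stoich,c_id")

# "C_placeholder" is a hypothetical compound ID, not a real hash.
term = StoichTuple(stoich=2, c_id="C_placeholder")

print(term.stoich, term[0])  # 2 2
print(term.c_id, term[1])    # C_placeholder C_placeholder
```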

minedatabase.utils.convert_sets_to_lists(obj: dict) → dict

Recursively converts any sets contained in a dictionary to lists.

Parameters
obj : dict

Input object to convert sets from.

Returns
dict

Dictionary with no sets.
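
A conversion like this is typically needed before serializing to JSON, which has no set type. The function below is a hypothetical sketch of the recursion, not the library's code, and the example keys are invented.

```python
import json


def convert_sets_to_lists_sketch(obj):
    """Hypothetical sketch: replace every set inside obj with a list."""
    if isinstance(obj, (set, frozenset)):
        return [convert_sets_to_lists_sketch(item) for item in obj]
    if isinstance(obj, dict):
        return {key: convert_sets_to_lists_sketch(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [convert_sets_to_lists_sketch(item) for item in obj]
    return obj


record = {"Names": {"ethanol"}, "Links": {"ids": {1, 2}}, "Count": 3}
clean = convert_sets_to_lists_sketch(record)

# The result can now be serialized; the original record would raise TypeError.
serialized = json.dumps(clean)
```

Sets are unordered, so the order of items in the resulting lists is not guaranteed.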

minedatabase.utils.file_to_dict_list(filepath: str) → list

Accept a path to a CSV, TSV or JSON file and return a list of dictionaries.

Parameters
filepath : str

File to load into a dictionary list.

Returns
list

Dictionary list.
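
As a rough illustration of loading a delimited or JSON file into a list of dictionaries, the sketch below dispatches on the file extension using the standard csv and json modules. It is an assumption about one reasonable approach, not the library's implementation; the function name and the column names are hypothetical.

```python
import csv
import json
import os
import tempfile


def file_to_dict_list_sketch(filepath: str) -> list:
    """Hypothetical sketch: one dict per row (CSV/TSV) or the parsed JSON."""
    ext = os.path.splitext(filepath)[1].lower()
    if ext not in (".csv", ".tsv", ".json"):
        raise ValueError(f"Unrecognized input file type: {filepath}")
    with open(filepath, newline="", encoding="utf-8") as handle:
        if ext == ".json":
            return json.load(handle)
        delimiter = "\t" if ext == ".tsv" else ","
        return [dict(row) for row in csv.DictReader(handle, delimiter=delimiter)]


# Write a small TSV file to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "compounds.tsv")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("id\tSMILES\ncpd1\tCCO\n")
    rows = file_to_dict_list_sketch(path)

print(rows)  # [{'id': 'cpd1', 'SMILES': 'CCO'}]
```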

minedatabase.utils.get_atom_count(mol: rdkit.Chem.rdchem.Mol, radical_check: bool = False) → Counter

Takes a Mol object and returns a Counter of each element type in the molecule.

Parameters
mol : rdkit.Chem.rdchem.Mol

Mol object to count atoms for.

radical_check : bool, optional

Check for radical electrons and count if present.

Returns
atoms : collections.Counter

Count of each atom type in input molecule.

minedatabase.utils.get_compound_hash(smi: str, cpd_type: str = 'Predicted', inchi_blocks: int = 1) → Tuple[str, Optional[str]]

Create a hash string for a given compound.

This function generates a unique identifier for a compound from a normalized SMILES. The compound hash is generated by sanitizing and neutralizing the SMILES and then hashing the result with the sha1 method from hashlib.

The hash is prepended with a character depending on the compound type. The default is “C”:
  1. Coreactant: “X”
  2. Target Compound: “T”
  3. Predicted Compound: “C”

Parameters
smi : str

The SMILES of the compound.

cpd_type : str, optional

The Compound Type, by default ‘Predicted’.

Returns
Tuple[str, Union[str, None]]

Compound hash, InChI-Key.
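
The prefix-plus-sha1 scheme can be sketched with hashlib alone. This is a simplified illustration under stated assumptions: it hashes the SMILES string as given, skipping the RDKit sanitization and neutralization the real function performs, it does not compute an InChI-Key, and the cpd_type strings other than 'Predicted' are guesses based on the list labels above. Hashes produced here should not be expected to match those stored in a MINE database.

```python
import hashlib

# Assumed mapping from compound type to prefix; only the default
# 'Predicted' value appears in the documented signature.
PREFIXES = {"Coreactant": "X", "Target Compound": "T", "Predicted": "C"}


def compound_hash_sketch(smi: str, cpd_type: str = "Predicted") -> str:
    """Hypothetical sketch: type prefix followed by a sha1 hex digest."""
    prefix = PREFIXES.get(cpd_type, "C")
    # The real function normalizes the SMILES first; this sketch does not.
    digest = hashlib.sha1(smi.encode("utf-8")).hexdigest()
    return prefix + digest


predicted = compound_hash_sketch("CCO")
coreactant = compound_hash_sketch("CCO", cpd_type="Coreactant")
print(predicted[0], coreactant[0])  # C X
```

Because the prefix depends only on the type, the same structure yields the same digest with a different leading character, which is why normalizing the SMILES before hashing matters for deduplication.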

minedatabase.utils.get_dotted_field(input_dict: dict, accessor_string: str) → dict

Gets data from a dictionary using a dotted accessor-string.

Parameters
input_dict : dict

A nested dictionary.

accessor_string : str

Dotted path to the value in the nested dict, e.g. “DBLinks.KEGG”.

Returns
dict

Data from the dictionary.
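
Dotted access amounts to walking the nested dictionary one key at a time. The sketch below is a hypothetical illustration, not the library's code; returning an empty dict for a missing key is an assumption, and the example record is invented.

```python
def get_dotted_field_sketch(input_dict: dict, accessor_string: str):
    """Hypothetical sketch: follow each key in a dotted path."""
    current = input_dict
    for key in accessor_string.split("."):
        # Assumption: a missing key yields an empty dict rather than an error.
        current = current.get(key, {})
    return current


compound = {"DBLinks": {"KEGG": ["C00001"]}}
print(get_dotted_field_sketch(compound, "DBLinks.KEGG"))     # ['C00001']
print(get_dotted_field_sketch(compound, "DBLinks.PubChem"))  # {}
```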

minedatabase.utils.get_fp(smi: str) → rdkit.Chem.AllChem.RDKFingerprint

Generate default RDKFingerprint.

Parameters
smi : str

SMILES of the molecule.

Returns
AllChem.RDKFingerprint

Default fingerprint of the molecule.

minedatabase.utils.get_reaction_hash(reactants: List[StoichTuple], products: List[StoichTuple]) → Tuple[str, str]

Hashes reactant and product lists.

Generates a unique ID for a given reaction for use in MongoDB.

Parameters
reactants : List[StoichTuple]

List of reactants.

products : List[StoichTuple]

List of products.

Returns
Tuple[str, str]

Reaction hash and SMILES.

minedatabase.utils.get_size(obj_0)

Recursively iterate to sum size of object & members.
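
sys.getsizeof reports only the shallow size of a container, so a deep size requires recursion over its members. The sketch below shows one common way to do this; it is an assumption about the general approach, not the library's code, and it handles only built-in containers.

```python
import sys


def get_size_sketch(obj, seen=None) -> int:
    """Hypothetical sketch: deep size in bytes of obj and its members."""
    if seen is None:
        seen = set()
    # Count each object once, even if it is referenced several times.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            get_size_sketch(key, seen) + get_size_sketch(val, seen)
            for key, val in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size_sketch(item, seen) for item in obj)
    return size


nested = {"smiles": ["CCO", "CCN"], "count": 2}
deep_size = get_size_sketch(nested)
shallow_size = sys.getsizeof(nested)
print(deep_size > shallow_size)  # True
```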

minedatabase.utils.mongo_ids_to_mine_ids(mongo_ids: List[str], core_db) → int

Convert Mongo ID to a MINE ID for a given compound.

Parameters
mongo_ids : List[str]

List of IDs in Mongo (hashes).

core_db : MINE

Core database connection. Type annotation not present to avoid circular imports.

Returns
mine_id : int

MINE ID.

minedatabase.utils.neutralise_charges(mol: rdkit.Chem.rdchem.Mol, reactions=None) → rdkit.Chem.rdchem.Mol

Neutralize all charges in an RDKit Mol.

Parameters
mol : rdkit.Chem.rdchem.Mol

Molecule to neutralize.

reactions : list, optional

Patterns to neutralize, by default None.

Returns
mol : rdkit.Chem.rdchem.Mol

Neutralized molecule.

minedatabase.utils.postsanitize_smiles(smiles_list)

Postsanitize SMILES after running SMARTS.

Returns
list

Tautomer list of lists of SMILES.

minedatabase.utils.prevent_overwrite(write_path: str) → str

Prevents overwrite of existing output files by appending “_new” when needed.

Parameters
write_path : str

Path to write.

Returns
str

Updated path to write.
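
The renaming rule can be sketched with os.path. Placing “_new” before the file extension and repeating until the path is free are assumptions made for this illustration; the function name is hypothetical and this is not the library's code.

```python
import os
import tempfile


def prevent_overwrite_sketch(write_path: str) -> str:
    """Hypothetical sketch: add "_new" to the name until the path is unused."""
    while os.path.exists(write_path):
        stem, ext = os.path.splitext(write_path)
        # Assumption: the suffix goes before the extension.
        write_path = f"{stem}_new{ext}"
    return write_path


with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "compounds.csv")
    open(existing, "w").close()  # create an empty file that must not be overwritten
    free_path = prevent_overwrite_sketch(existing)
    untouched = prevent_overwrite_sketch(os.path.join(tmp, "reactions.csv"))

print(os.path.basename(free_path))  # compounds_new.csv
print(os.path.basename(untouched))  # reactions.csv
```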

minedatabase.utils.save_dotted_field(accessor_string: str, data: dict)

Saves data to a dictionary using a dotted accessor-string.

Parameters
accessor_string : str

A dotted path description, e.g. “DBLinks.KEGG”.

data : dict

The value to be stored.

Returns
dict

The nested dictionary.
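
Building the nested dictionary from a dotted path can be done by wrapping the value from the innermost key outward. The sketch below is a hypothetical illustration of that idea, not the library's code, and the stored value is invented.

```python
def save_dotted_field_sketch(accessor_string: str, data):
    """Hypothetical sketch: nest data under each key of a dotted path."""
    # Wrap from the last key to the first so the first key ends up outermost.
    for key in reversed(accessor_string.split(".")):
        data = {key: data}
    return data


nested = save_dotted_field_sketch("DBLinks.KEGG", ["C00001"])
print(nested)  # {'DBLinks': {'KEGG': ['C00001']}}
```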