7.5. Pickaxe
Pickaxe.py: Create network expansions from reaction rules and compounds.
This module generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules.
- class minedatabase.pickaxe.Pickaxe(rule_list: Optional[str] = None, coreactant_list: Optional[str] = None, explicit_h: bool = False, kekulize: bool = True, neutralise: bool = True, errors: bool = True, inchikey_blocks_for_cid: int = 1, database: Optional[str] = None, database_overwrite: bool = False, mongo_uri: str = 'mongodb://localhost:27017', image_dir: Optional[str] = None, quiet: bool = True, react_targets: bool = True, filter_after_final_gen: bool = True, prune_between_gens: bool = False)
Class to generate expansions with compounds and reaction rules.
This class generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules. It may be initialized with text files containing the reaction rules and coreactants, or these may be supplied on an ad hoc basis.
- Parameters
- rule_list: str, optional
Filepath of rules, by default None.
- coreactant_list: str, optional
Filepath of coreactants, by default None.
- explicit_h: bool, optional
Whether rules utilize explicit hydrogens, by default False.
- kekulize: bool, optional
Whether or not to kekulize compounds before reaction, by default True.
- neutralise: bool, optional
Whether or not to neutralise compounds, by default True.
- errors: bool, optional
Whether or not to print errors to stdout, by default True.
- inchikey_blocks_for_cid: int, optional
How many blocks of the InChI key to use for the compound id, by default 1.
- database: str, optional
Name of the database where results are saved, by default None.
- database_overwrite: bool, optional
Whether or not to erase an existing database in the event of a name collision, by default False.
- mongo_uri: str, optional
URI for the mongo client, by default ‘mongodb://localhost:27017’.
- image_dir: str, optional
Filepath where images should be saved, by default None.
- quiet: bool, optional
Whether to silence warnings, by default True.
- react_targets: bool, optional
Whether or not to apply reactions to generated compounds that match targets, by default True.
- filter_after_final_gen: bool, optional
Whether to apply filters after the final expansion, by default True.
- prune_between_gens: bool, optional
Whether to prune the network between generations if using filters, by default False.
- Attributes
- operators: dict
Reaction operators to transform compounds with.
- coreactants: dict
Coreactants required by the operators.
- compounds: dict
Compounds in the pickaxe network.
- reactions: dict
Reactions in the pickaxe network.
- generation: int
The current generation.
- explicit_h: bool
Whether rules utilize explicit hydrogens.
- kekulize: bool
Whether or not to kekulize compounds before reaction.
- neutralise: bool
Whether or not to neutralise compounds.
- fragmented_mols: bool
Whether or not to allow fragmented molecules.
- radical_check: bool
Whether or not to check and remove radicals.
- image_dir: str, optional
Filepath where images should be saved.
- errors: bool
Whether or not to print errors to stdout.
- quiet: bool
Whether or not to silence warnings.
- filters: List[object]
A list of filters to apply during the expansion.
- targets: dict
Molecules to be targeted during expansions.
- target_smiles: List[str]
The SMILES of all the targets.
- react_targets: bool
Whether or not to react targets when generated.
- filter_after_final_gen: bool
Whether or not to filter after the last expansion.
- prune_between_gens: bool, optional
Whether to prune the network between generations if using filters.
- mongo_uri: str
The connection string to the mongo database.
- cid_num_inchi_blocks: int
How many blocks of the InChI key to use to generate the compound id.
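To make the "blocks of the InChI key" wording concrete, here is a minimal sketch of selecting the leading blocks of a standard InChIKey, which consists of three hyphen-separated blocks. The helper `inchikey_prefix` is hypothetical and is not part of minedatabase; how Pickaxe combines the selected blocks into the final compound id is not described in this section and is not shown here.

```python
def inchikey_prefix(inchikey: str, n_blocks: int = 1) -> str:
    """Return the first n_blocks hyphen-separated blocks of an InChIKey.

    Hypothetical helper for illustration only; not a minedatabase function.
    """
    blocks = inchikey.split("-")
    if not 1 <= n_blocks <= len(blocks):
        raise ValueError(f"n_blocks must be between 1 and {len(blocks)}")
    return "-".join(blocks[:n_blocks])


# InChIKey of ethanol; the first block encodes the molecular skeleton.
ethanol_key = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
print(inchikey_prefix(ethanol_key, 1))
print(inchikey_prefix(ethanol_key, 2))
```

Using a single block groups structures that share a skeleton, while more blocks distinguish them further, which is presumably why the number of blocks is configurable.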
- assign_ids() -> None
Assign a numerical ID to compounds (and reactions).
Assign IDs that are unique only to the CURRENT run.
- find_minimal_set(white_list: Set[str]) -> Tuple[set, set]
Find the minimal set of compounds and reactions given a white list.
Given a white list, this function finds the minimal set of compound and reaction ids needed to reach the compounds in that list.
- Parameters
- white_list: Set[str]
Set of compound ids to filter the reaction network to.
- Returns
- Tuple[set, set]
The filtered compounds and reactions.
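The idea of reducing a network to what a white list needs can be illustrated with a backward traversal over a toy network. This is a sketch of one plausible reading, not Pickaxe's implementation: the function `upstream_subnetwork` and the dictionary layout (`"reactants"` and `"products"` keys) are assumptions made for this example.

```python
from collections import deque
from typing import Dict, Set, Tuple


def upstream_subnetwork(
    reactions: Dict[str, Dict[str, Set[str]]], white_list: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Collect compounds and reactions that lead to the white-listed compounds.

    Hypothetical helper: walks backwards from each white-listed compound
    through every reaction that produces it.
    """
    # Index reactions by the compounds they produce.
    producers: Dict[str, Set[str]] = {}
    for rxn_id, rxn in reactions.items():
        for product in rxn["products"]:
            producers.setdefault(product, set()).add(rxn_id)

    kept_compounds = set(white_list)
    kept_reactions: Set[str] = set()
    queue = deque(white_list)
    while queue:
        compound = queue.popleft()
        for rxn_id in producers.get(compound, ()):
            if rxn_id in kept_reactions:
                continue
            kept_reactions.add(rxn_id)
            for reactant in reactions[rxn_id]["reactants"]:
                if reactant not in kept_compounds:
                    kept_compounds.add(reactant)
                    queue.append(reactant)
    return kept_compounds, kept_reactions


toy_network = {
    "R1": {"reactants": {"A"}, "products": {"B"}},
    "R2": {"reactants": {"B"}, "products": {"C"}},
    "R3": {"reactants": {"A"}, "products": {"D"}},
}
compounds, reactions_kept = upstream_subnetwork(toy_network, {"C"})
print(sorted(compounds), sorted(reactions_kept))
```

In this toy network, compound D and reaction R3 are dropped because they do not contribute to producing C.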
- load_compound_set(compound_file: Optional[str] = None, id_field: str = 'id') -> str
Load compounds for expansion into pickaxe.
- Parameters
- compound_file: str, optional
Filepath of compounds, by default None.
- id_field: str, optional
Header value of compound id in input file, by default ‘id’.
- Returns
- str
List of SMILES that were successfully loaded into pickaxe.
- Raises
- ValueError
No file specified for loading.
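The `id_field` parameter implies a delimited input file with a header row. The sketch below shows how such a file could be read with the standard library. The column name `SMILES`, the comma delimiter, and the helper `read_compound_rows` are assumptions for illustration; the exact format that `load_compound_set` accepts is not specified in this section.

```python
import csv
import io
from typing import List, TextIO, Tuple


def read_compound_rows(
    handle: TextIO, id_field: str = "id", smiles_field: str = "SMILES"
) -> List[Tuple[str, str]]:
    """Read (id, SMILES) pairs from a header-bearing CSV stream.

    Hypothetical helper; the "SMILES" column name is an assumption.
    """
    loaded = []
    for row in csv.DictReader(handle):
        # Skip rows missing either the id or the structure.
        if not row.get(id_field) or not row.get(smiles_field):
            continue
        loaded.append((row[id_field], row[smiles_field]))
    return loaded


sample = io.StringIO("id,SMILES\nethanol,CCO\nacetate,CC(=O)[O-]\nbroken,\n")
rows = read_compound_rows(sample)
print(rows)
```

If the id column has a different header, it would be passed through `id_field`, mirroring the documented parameter.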
- load_pickled_pickaxe(fname: str) -> None
Load pickaxe from pickle.
Load pickled pickaxe object.
- Parameters
- fname: str
Filename to read (must be .pk).
- load_targets(target_compound_file: Optional[str], id_field: str = 'id') -> None
Load targets into pickaxe.
- Parameters
- target_compound_file: str
Filepath of target compounds.
- id_field: str, optional
Header value of compound id in input file, by default ‘id’.
- pickle_pickaxe(fname: str) -> None
Pickle key pickaxe items.
Pickle pickaxe object to be loaded in later.
- Parameters
- fname: str
Filename to save (must be .pk).
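The save and load pair can be pictured as a pickle round trip of selected run state. The following is a generic standard-library sketch, not the minedatabase implementation: `save_state`, `load_state`, and the keys of the `state` dictionary are hypothetical, and the `.pk` check simply mirrors the filename requirement stated above.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(state: dict, fname: str) -> None:
    """Pickle a dictionary of run state to fname (hypothetical helper)."""
    if Path(fname).suffix != ".pk":
        raise ValueError("filename must end in .pk")
    with open(fname, "wb") as handle:
        pickle.dump(state, handle)


def load_state(fname: str) -> dict:
    """Load a dictionary previously written by save_state."""
    if Path(fname).suffix != ".pk":
        raise ValueError("filename must end in .pk")
    with open(fname, "rb") as handle:
        return pickle.load(handle)


# Toy state; which items Pickaxe actually pickles is an assumption here.
state = {"compounds": {"C1": {"SMILES": "CCO"}}, "reactions": {}, "generation": 1}
with tempfile.TemporaryDirectory() as tmp_dir:
    target = str(Path(tmp_dir) / "toy_run.pk")
    save_state(state, target)
    restored = load_state(target)
print(restored == state)
```

As with any pickle file, such a file should only be loaded if it comes from a trusted source.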
- prune_network(white_list: list, print_output: bool = True) -> None
Prune the reaction network to a list of targets.
Prune the predicted reaction network to only compounds and reactions that terminate in a specified white list of compounds.
- Parameters
- white_list: list
A list of compound ids to filter the network to.
- print_output: bool, optional
Whether or not to print output, by default True.
- prune_network_to_targets() -> None
Prune the reaction network to the target compounds.
Prune the predicted reaction network to only compounds and reactions that terminate in the target compounds.
- save_to_mine(processes: int = 1, indexing: bool = True, write_core: bool = False) -> None
Save pickaxe run to MINE database.
- Parameters
- processes: int, optional
Number of processes to use, by default 1.
- indexing: bool, optional
Whether or not to add indexes, by default True.
- write_core: bool, optional
Whether or not to write to core database, by default False.
- transform_all(processes: int = 1, generations: int = 1) -> None
Transform compounds with reaction operators.
Apply reaction rules to compounds and generate a specified number of new generations.
- Parameters
- processes: int, optional
Number of processes to run in parallel, by default 1.
- generations: int, optional
Number of generations to create, by default 1.
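The notion of generations can be sketched with a toy expansion loop. This is an illustration only: the rules below are plain Python functions on placeholder strings rather than SMARTS operators, the function `expand` is hypothetical, and the sketch assumes that each generation reacts only the compounds produced by the previous one.

```python
from typing import Callable, Dict, Iterable, List, Set

Rule = Callable[[str], List[str]]


def expand(seeds: Iterable[str], rules: List[Rule], generations: int = 1) -> Dict[str, int]:
    """Apply every rule to the newest compounds for a number of generations.

    Returns a mapping from compound to the generation in which it first appeared.
    """
    generation_of = {seed: 0 for seed in seeds}
    frontier: Set[str] = set(generation_of)
    for generation in range(1, generations + 1):
        new_compounds: Set[str] = set()
        for compound in frontier:
            for rule in rules:
                for product in rule(compound):
                    # Record only compounds not seen in an earlier generation.
                    if product not in generation_of:
                        generation_of[product] = generation
                        new_compounds.add(product)
        frontier = new_compounds
    return generation_of


def append_x(compound: str) -> List[str]:
    """Toy rule that always applies."""
    return [compound + "x"]


def append_y(compound: str) -> List[str]:
    """Toy rule that applies only if the compound does not already end in 'y'."""
    return [] if compound.endswith("y") else [compound + "y"]


network = expand({"A"}, [append_x, append_y], generations=2)
print(sorted(network.items(), key=lambda item: (item[1], item[0])))
```

Because the number of products can grow quickly with each generation, the `generations` argument is the main lever on the size of the resulting network.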
- write_compound_output_file(path: str, dialect: str = 'excel-tab') -> None
Write compounds to an output file.
- Parameters
- path: str
Path to write data.
- dialect: str, optional
Dialect of the output, by default ‘excel-tab’.
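The default value ‘excel-tab’ matches the name of a dialect in Python's standard `csv` module, which produces tab-separated output. Assuming the `dialect` argument is interpreted that way, the sketch below shows what such a dialect does; the column names `ID` and `SMILES` and the helper `rows_to_text` are placeholders, not the columns Pickaxe writes.

```python
import csv
import io
from typing import Dict, List


def rows_to_text(rows: List[Dict[str, str]], fieldnames: List[str], dialect: str = "excel-tab") -> str:
    """Serialize rows with a named csv dialect (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, dialect=dialect)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


toy_compounds = [
    {"ID": "C1", "SMILES": "CCO"},
    {"ID": "C2", "SMILES": "CC=O"},
]
output = rows_to_text(toy_compounds, ["ID", "SMILES"])
print(output)
```

Passing `dialect="excel"` to the same helper would yield comma-separated output instead.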
- write_reaction_output_file(path: str, delimiter: str = '\t') -> None
Write all reaction data to the specified path.
- Parameters
- path: str
Path to write data.
- delimiter: str, optional
Delimiter for the output file, by default ‘\t’.
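Putting the documented methods together, a typical run might look like the sketch below. It uses only the constructor arguments and method signatures listed in this section. The wrapper `run_expansion`, the output file names, and the order of the calls (in particular calling `assign_ids` before writing output) are assumptions, and the `minedatabase` package must be installed for the function to do anything when called.

```python
def run_expansion(
    rule_file: str,
    coreactant_file: str,
    compound_file: str,
    out_prefix: str,
    generations: int = 1,
    processes: int = 1,
):
    """Run a small expansion and write tab-separated output (hypothetical wrapper)."""
    # Imported here so that defining the function does not require minedatabase.
    from minedatabase.pickaxe import Pickaxe

    pk = Pickaxe(rule_list=rule_file, coreactant_list=coreactant_file)
    pk.load_compound_set(compound_file=compound_file)
    pk.transform_all(processes=processes, generations=generations)
    # Assumption: ids are assigned before results are written out.
    pk.assign_ids()
    pk.write_compound_output_file(f"{out_prefix}_compounds.tsv")
    pk.write_reaction_output_file(f"{out_prefix}_reactions.tsv")
    return pk
```

A call such as `run_expansion("rules.tsv", "coreactants.tsv", "compounds.csv", "run1", generations=2)` would then expand the starting compounds for two generations; all four file names are placeholders.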