7.5. Pickaxe
Pickaxe.py: Create network expansions from reaction rules and compounds.
This module generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules.
- class minedatabase.pickaxe.Pickaxe(rule_list: Optional[str] = None, coreactant_list: Optional[str] = None, explicit_h: bool = False, kekulize: bool = False, neutralise: bool = True, errors: bool = True, inchikey_blocks_for_cid: int = 1, database: Optional[str] = None, database_overwrite: bool = False, mongo_uri: bool = 'mongodb://localhost:27017', image_dir: Optional[str] = None, quiet: bool = True, react_targets: bool = True, filter_after_final_gen: bool = True, prune_between_gens: bool = False)
Class to generate expansions with compounds and reaction rules.
This class generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules. It may be initialized with text files containing the reaction rules and coreactants, or these may be supplied on an ad hoc basis.
- Parameters
- rule_list: str, optional
Filepath of rules, by default None.
- coreactant_list: str, optional
Filepath of coreactants, by default None.
- explicit_h: bool, optional
Whether rules utilize explicit hydrogens, by default False.
- kekulize: bool, optional
Whether or not to kekulize compounds before reaction, by default False.
- neutralise: bool, optional
Whether or not to neutralise compounds, by default True.
- errors: bool, optional
Whether or not to print errors to stdout, by default True.
- inchikey_blocks_for_cid: int, optional
How many blocks of the InChI key to use for the compound id, by default 1.
- database: str, optional
Name of the database where results are saved, by default None.
- database_overwrite: bool, optional
Whether or not to erase an existing database in the event of a name collision, by default False.
- mongo_uri: str, optional
URI for the mongo client, by default 'mongodb://localhost:27017'.
- image_dir: str, optional
Filepath where images should be saved, by default None.
- quiet: bool, optional
Whether to silence warnings, by default True.
- react_targets: bool, optional
Whether or not to apply reactions to generated compounds that match targets, by default True.
- filter_after_final_gen: bool, optional
Whether to apply filters after the final expansion, by default True.
- prune_between_gens: bool, optional
Whether to prune the network between generations if using filters, by default False.
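To make the `inchikey_blocks_for_cid` parameter more concrete, the sketch below splits an InChIKey into its hyphen-separated blocks and keeps the first few. The helper name `inchikey_prefix` is hypothetical and not part of minedatabase, and how Pickaxe turns the kept blocks into the final compound id is not described here, so this only illustrates what a "block" is.

```python
def inchikey_prefix(inchikey: str, n_blocks: int = 1) -> str:
    """Return the first n_blocks hyphen-separated blocks of an InChIKey.

    Hypothetical helper for illustration only; not part of minedatabase.
    """
    blocks = inchikey.split("-")
    if not 1 <= n_blocks <= len(blocks):
        raise ValueError(f"n_blocks must be between 1 and {len(blocks)}")
    return "-".join(blocks[:n_blocks])


# InChIKey of ethanol: connectivity block, stereo/isotope block, protonation flag.
ethanol = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
print(inchikey_prefix(ethanol, 1))  # first block only
print(inchikey_prefix(ethanol, 2))  # first two blocks
```

Because the first block encodes connectivity only, using a single block means that stereoisomers of the same skeleton share the same prefix.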
- Attributes
- operators: dict
Reaction operators to transform compounds with.
- coreactants: dict
Coreactants required by the operators.
- compounds: dict
Compounds in the pickaxe network.
- reactions: dict
Reactions in the pickaxe network.
- generation: int
The current generation.
- explicit_h: bool
Whether rules utilize explicit hydrogens.
- kekulize: bool
Whether or not to kekulize compounds before reaction.
- neutralise: bool
Whether or not to neutralise compounds.
- fragmented_mols: bool
Whether or not to allow fragmented molecules.
- radical_check: bool
Whether or not to check and remove radicals.
- image_dir: str, optional
Filepath where images should be saved.
- errors: bool
Whether or not to print errors to stdout.
- quiet: bool
Whether or not to silence warnings.
- filters: List[object]
A list of filters to apply during the expansion.
- targets: dict
Molecules to be targeted during expansions.
- target_smiles: List[str]
The SMILES of all the targets.
- react_targets: bool
Whether or not to react targets when generated.
- filter_after_final_gen: bool
Whether or not to filter after the last expansion.
- prune_between_gens: bool, optional
Whether to prune the network between generations if using filters.
- mongo_uri: str
The connection string to the mongo database.
- cid_num_inchi_blocks: int
How many blocks of the InChI key to use to generate the compound id.
- assign_ids() -> None
Assign a numerical ID to compounds (and reactions).
Assign IDs that are unique only to the CURRENT run.
- find_minimal_set(white_list: Set[str]) -> Tuple[set, set]
Find the minimal set of compounds and reactions given a white list.
Given a white list, this function finds the minimal set of compound and reaction ids that lead to the white-listed compounds.
- Parameters
- white_list: Set[str]
Set of compound ids to filter the reaction network to.
- Returns
- Tuple[set, set]
The filtered compounds and reactions.
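The idea behind a minimal set can be sketched as a backward traversal from the white-listed compounds through the reactions that produce them. The toy network layout (`reactants` and `products` keys) and the function name `minimal_set` are assumptions made for illustration; Pickaxe's internal data structures and exact traversal may differ.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Hypothetical toy layout: reaction id -> {"reactants": [...], "products": [...]}.
Reaction = Dict[str, List[str]]


def minimal_set(
    reactions: Dict[str, Reaction], white_list: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Collect the compounds and reactions that lead to white-listed compounds."""
    # Index reactions by the compounds they produce for backward lookup.
    produced_by: Dict[str, List[str]] = {}
    for rxn_id, rxn in reactions.items():
        for product in rxn["products"]:
            produced_by.setdefault(product, []).append(rxn_id)

    keep_cpds: Set[str] = set(white_list)
    keep_rxns: Set[str] = set()
    queue = deque(white_list)
    while queue:
        cpd = queue.popleft()
        for rxn_id in produced_by.get(cpd, []):
            if rxn_id in keep_rxns:
                continue  # reaction already visited
            keep_rxns.add(rxn_id)
            # Walk one step further back through this reaction's reactants.
            for reactant in reactions[rxn_id]["reactants"]:
                if reactant not in keep_cpds:
                    keep_cpds.add(reactant)
                    queue.append(reactant)
    return keep_cpds, keep_rxns


toy = {
    "R1": {"reactants": ["A"], "products": ["B"]},
    "R2": {"reactants": ["B"], "products": ["C"]},
    "R3": {"reactants": ["A"], "products": ["D"]},  # does not lead to C
}
cpds, rxns = minimal_set(toy, {"C"})
print(sorted(cpds), sorted(rxns))
```

In this toy network the branch through R3 is dropped because D is neither white-listed nor an ancestor of a white-listed compound.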
- load_compound_set(compound_file: Optional[str] = None, id_field: str = 'id') -> str
Load compounds for expansion into pickaxe.
- Parameters
- compound_file: str, optional
Filepath of compounds, by default None.
- id_field: str, optional
Header value of compound id in input file, by default 'id'.
- Returns
- str
List of SMILES that were successfully loaded into pickaxe.
- Raises
- ValueError
No file specified for loading.
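The role of `id_field` can be illustrated with a small reader for a delimited compound file. This is a sketch under stated assumptions: the comma-separated layout, the `SMILES` column name, and the `read_compounds` helper are all hypothetical, and the parsing that `load_compound_set` actually performs may differ.

```python
import csv
import io
from typing import Dict, TextIO


def read_compounds(
    handle: TextIO, id_field: str = "id", smiles_field: str = "SMILES"
) -> Dict[str, str]:
    """Map compound ids to SMILES from a CSV with a header row.

    Hypothetical helper; the column name "SMILES" is an assumption.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or id_field not in reader.fieldnames:
        raise ValueError(f"Missing id column {id_field!r}")
    # Skip rows that have no SMILES value.
    return {row[id_field]: row[smiles_field] for row in reader if row.get(smiles_field)}


sample = io.StringIO("id,SMILES\ncpd_1,CCO\ncpd_2,CC(=O)O\n")
compounds = read_compounds(sample)
print(compounds)
```

If the file labels its identifier column differently, for example `name`, passing `id_field="name"` selects that column instead.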
- load_pickled_pickaxe(fname: str) -> None
Load pickaxe from pickle.
Load pickled pickaxe object.
- Parameters
- fname: str
Filename to read (must be .pk).
- load_targets(target_compound_file: Optional[str], id_field: str = 'id') -> None
Load targets into pickaxe.
- Parameters
- target_compound_file: str
Filepath of target compounds.
- id_field: str, optional
Header value of compound id in input file, by default 'id'.
- pickle_pickaxe(fname: str) -> None
Pickle key pickaxe items.
Pickle pickaxe object to be loaded in later.
- Parameters
- fname: str
Filename to save (must be .pk).
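The save-and-reload cycle behind `pickle_pickaxe` and `load_pickled_pickaxe` can be sketched with the standard `pickle` module. The `state` dictionary below is a hypothetical stand-in; which attributes Pickaxe actually serializes is not specified here.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the items a run might persist.
state = {
    "compounds": {"C1": {"SMILES": "CCO"}},
    "reactions": {},
    "generation": 1,
}

with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "run.pk")  # the documented methods expect a .pk file
    with open(fname, "wb") as fh:
        pickle.dump(state, fh)
    with open(fname, "rb") as fh:
        restored = pickle.load(fh)

print(restored == state)
```

As with any pickle file, a .pk file should only be loaded when it comes from a trusted source, because unpickling can execute arbitrary code.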
- prune_network(white_list: list, print_output: str = True) -> None
Prune the reaction network to a list of targets.
Prune the predicted reaction network to only compounds and reactions that terminate in a specified white list of compounds.
- Parameters
- white_list: list
A list of compound ids to filter the network to.
- print_output: bool, optional
Whether or not to print output, by default True.
- prune_network_to_targets() -> None
Prune the reaction network to the target compounds.
Prune the predicted reaction network to only compounds and reactions that terminate in the target compounds.
- save_to_SBML(file_name: str, save_reactions_uniprot: bool = False, uniprot_save_style: str = 'grouped') -> None
Save pickaxe run to an SBML file.
This function saves the species and reactions to an SBML file with annotations. Specifically, the species will be annotated with their SMILES and reactions annotated with their operator and uniprot ids (if available and desired).
- Parameters
- file_name: str
The file name to save the SBML at.
- save_reactions_uniprot: bool, optional
Whether or not to save the uniprot ids for a reaction operator (if available), by default False.
- uniprot_save_style: str, optional
The style to save uniprot ids in, by default 'grouped'. There are two options:
- grouped: all uniprot information is stored in a semicolon-delimited list.
- individual: uniprot information is saved individually as a link to the uniprot website.
- save_to_mine(processes: int = 1, indexing: bool = True, write_core: bool = False) -> None
Save pickaxe run to MINE database.
- Parameters
- processes: int, optional
Number of processes to use, by default 1.
- indexing: bool, optional
Whether or not to add indexes, by default True.
- write_core: bool, optional
Whether or not to write to core database, by default False.
- transform_all(processes: int = 1, generations: int = 1) -> None
Transform compounds with reaction operators.
Apply reaction rules to compounds and generate a specified number of new generations.
- Parameters
- processes: int, optional
Number of processes to run in parallel, by default 1.
- generations: int, optional
Number of generations to create, by default 1.
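A typical run might chain the methods documented in this section as sketched below. The wrapper `run_expansion`, its argument names, and the output file names are hypothetical; only the `Pickaxe` calls themselves come from the signatures above, and the sketch assumes minedatabase is installed and that the input files exist.

```python
def run_expansion(
    rule_path: str,
    coreactant_path: str,
    compound_path: str,
    out_prefix: str,
    generations: int = 1,
    processes: int = 1,
) -> None:
    """Expand a compound set and write the results (hypothetical wrapper)."""
    # Imported here so this sketch can be loaded without minedatabase installed.
    from minedatabase.pickaxe import Pickaxe

    pk = Pickaxe(rule_list=rule_path, coreactant_list=coreactant_path)
    pk.load_compound_set(compound_file=compound_path)
    pk.transform_all(processes=processes, generations=generations)
    pk.assign_ids()  # run-specific numerical ids for compounds and reactions
    pk.write_compound_output_file(f"{out_prefix}_compounds.tsv")
    pk.write_reaction_output_file(f"{out_prefix}_reactions.tsv")


# Example call (file names are placeholders):
# run_expansion("rules.tsv", "coreactants.tsv", "compounds.csv", "run1", generations=2)
```

Keeping `processes` and `generations` as wrapper arguments makes it easy to start with a single small generation before scaling up, since the network can grow quickly with each generation.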
- write_compound_output_file(path: str, dialect: str = 'excel-tab') -> None
Write compounds to an output file.
- Parameters
- path: str
Path to write data.
- dialect: str, optional
Dialect of the output, by default 'excel-tab'.
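The default `'excel-tab'` value is the name of a dialect in Python's standard `csv` module, which produces tab-separated output. The sketch below shows what that dialect writes; the column names and rows are made up for illustration and are not necessarily the columns Pickaxe emits.

```python
import csv
import io

# Hypothetical rows; the real output columns may differ.
rows = [
    {"ID": "C1", "SMILES": "CCO", "Generation": 0},
    {"ID": "C2", "SMILES": "CC=O", "Generation": 1},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["ID", "SMILES", "Generation"], dialect="excel-tab"
)
writer.writeheader()
writer.writerows(rows)

print(buffer.getvalue())
```

Other registered dialect names, such as `'excel'` for comma-separated output, can be listed with `csv.list_dialects()`.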
- write_reaction_output_file(path: str, delimiter: str = '\t') -> None
Write all reaction data to the specified path.
- Parameters
- path: str
Path to write data.
- delimiter: str, optional
Delimiter for the output file, by default '\t'.