7.5. Pickaxe

Pickaxe.py: Create network expansions from reaction rules and compounds.

This module generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules.

class minedatabase.pickaxe.Pickaxe(rule_list: Optional[str] = None, coreactant_list: Optional[str] = None, explicit_h: bool = False, kekulize: bool = True, neutralise: bool = True, errors: bool = True, inchikey_blocks_for_cid: int = 1, database: Optional[str] = None, database_overwrite: bool = False, mongo_uri: str = 'mongodb://localhost:27017', image_dir: Optional[str] = None, quiet: bool = True, react_targets: bool = True, filter_after_final_gen: bool = True, prune_between_gens: bool = False)

Class to generate expansions with compounds and reaction rules.

This class generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules. It may be initialized with text files containing the reaction rules and coreactants, or these may be supplied later on an ad hoc basis.

Parameters

rule_list: str, optional: Filepath of rules, by default None.
coreactant_list: str, optional: Filepath of coreactants, by default None.
explicit_h: bool, optional: Whether rules utilize explicit hydrogens, by default False.
kekulize: bool, optional: Whether or not to kekulize compounds before reaction, by default True.
neutralise: bool, optional: Whether or not to neutralise compounds, by default True.
errors: bool, optional: Whether or not to print errors to stdout, by default True.
inchikey_blocks_for_cid: int, optional: How many blocks of the InChIKey to use for the compound ID, by default 1.
database: str, optional: Name of the database in which to save results, by default None.
database_overwrite: bool, optional: Whether or not to erase an existing database in the event of a collision, by default False.
mongo_uri: str, optional: URI for the mongo client, by default 'mongodb://localhost:27017'.
image_dir: str, optional: Filepath where images should be saved, by default None.
quiet: bool, optional: Whether to silence warnings, by default True.
react_targets: bool, optional: Whether or not to apply reactions to generated compounds that match targets, by default True.
filter_after_final_gen: bool, optional: Whether to apply filters after the final expansion, by default True.
prune_between_gens: bool, optional: Whether to prune the network between generations if using filters, by default False.
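
A minimal sketch of how these constructor arguments might be assembled and used for a one-generation expansion. The file names are hypothetical placeholders, and the helper functions `pickaxe_kwargs` and `run_expansion` are illustrative names, not part of minedatabase. The import is deferred into the function body so the argument handling can be read and checked without minedatabase installed; only keyword names and methods documented in this section are used.

```python
from typing import Any, Dict, Optional


def pickaxe_kwargs(
    rule_list: str,
    coreactant_list: str,
    database: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect constructor arguments; the keys mirror the documented signature."""
    return {
        "rule_list": rule_list,
        "coreactant_list": coreactant_list,
        "explicit_h": False,  # documented default
        "kekulize": True,     # documented default
        "neutralise": True,   # documented default
        "database": database,
        "mongo_uri": "mongodb://localhost:27017",
        "quiet": True,
    }


def run_expansion(compound_file: str, output_path: str, **kwargs: Any) -> None:
    """Load compounds, expand one generation, and write the compounds out."""
    # Deferred import: minedatabase and its dependencies are needed only here.
    from minedatabase.pickaxe import Pickaxe

    pk = Pickaxe(**kwargs)
    pk.load_compound_set(compound_file=compound_file)
    pk.transform_all(processes=1, generations=1)
    pk.write_compound_output_file(output_path)


# Hypothetical file names, for illustration only.
kwargs = pickaxe_kwargs("rules.tsv", "coreactants.tsv")
```

Calling `run_expansion("compounds.csv", "compounds_out.tsv", **kwargs)` would then perform the run, assuming those files exist and minedatabase is installed.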

Attributes

operators: dict: Reaction operators to transform compounds with.
coreactants: dict: Coreactants required by the operators.
compounds: dict: Compounds in the pickaxe network.
reactions: dict: Reactions in the pickaxe network.
generation: int: The current generation.
explicit_h: bool: Whether rules utilize explicit hydrogens.
kekulize: bool: Whether or not to kekulize compounds before reaction.
neutralise: bool: Whether or not to neutralise compounds.
fragmented_mols: bool: Whether or not to allow fragmented molecules.
radical_check: bool: Whether or not to check and remove radicals.
image_dir: str, optional: Filepath where images should be saved.
errors: bool: Whether or not to print errors to stdout.
quiet: bool: Whether or not to silence warnings.
filters: List[object]: A list of filters to apply during the expansion.
targets: dict: Molecules to be targeted during expansions.
target_smiles: List[str]: The SMILES of all the targets.
react_targets: bool: Whether or not to react targets when generated.
filter_after_final_gen: bool: Whether or not to filter after the last expansion.
prune_between_gens: bool, optional: Whether to prune the network between generations if using filters.
mongo_uri: str: The connection string to the mongo database.
cid_num_inchi_blocks: int: How many blocks of the InChIKey to use to generate the compound ID.
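
To make the "blocks" in `inchikey_blocks_for_cid` and `cid_num_inchi_blocks` concrete: a standard InChIKey consists of three hyphen-separated blocks (14, 10, and 1 characters). The helper below, `inchikey_prefix`, is a hypothetical illustration of selecting the first n blocks. How Pickaxe turns the selected blocks into a compound ID is not described in this section, so this should not be read as the library's ID scheme.

```python
def inchikey_prefix(inchikey: str, n_blocks: int = 1) -> str:
    """Return the first n_blocks hyphen-separated blocks of an InChIKey."""
    blocks = inchikey.split("-")
    if not 1 <= n_blocks <= len(blocks):
        raise ValueError(f"n_blocks must be between 1 and {len(blocks)}")
    return "-".join(blocks[:n_blocks])


# Standard InChIKey of ethanol.
ethanol_key = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"

one_block = inchikey_prefix(ethanol_key, 1)   # connectivity block only
two_blocks = inchikey_prefix(ethanol_key, 2)  # connectivity + second block
```

Using only the first block groups together structures that share the same connectivity, whereas more blocks distinguish them further.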

assign_ids() → None

Assign a numerical ID to compounds (and reactions).

Assign IDs that are unique only to the CURRENT run.

find_minimal_set(white_list: Set[str]) → Tuple[set, set]

Find the minimal set of compounds and reactions given a white list.

Given a white list, this function finds the minimal set of compound and reaction IDs that lead to the white-listed compounds.

Parameters

white_list: Set[str]: Set of compound IDs to filter the reaction network to.

Returns

Tuple[set, set]: The filtered compounds and reactions.
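
The idea of a minimal set can be sketched on a toy network as a backward traversal from the white-listed compounds. This is an illustration under assumed data structures (a dict mapping reaction IDs to `reactants` and `products` lists), not the implementation used by Pickaxe; the function name `minimal_set` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Reaction = Dict[str, List[str]]


def minimal_set(
    reactions: Dict[str, Reaction], white_list: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Walk backwards from white-listed compounds to everything upstream."""
    # Index: compound ID -> IDs of the reactions that produce it.
    producers: Dict[str, List[str]] = {}
    for rid, rxn in reactions.items():
        for cid in rxn["products"]:
            producers.setdefault(cid, []).append(rid)

    kept_compounds: Set[str] = set()
    kept_reactions: Set[str] = set()
    stack = list(white_list)
    while stack:
        cid = stack.pop()
        if cid in kept_compounds:
            continue
        kept_compounds.add(cid)
        for rid in producers.get(cid, []):
            if rid not in kept_reactions:
                kept_reactions.add(rid)
                # Reactants of a kept reaction must be kept as well.
                stack.extend(reactions[rid]["reactants"])
    return kept_compounds, kept_reactions


toy_network = {
    "R1": {"reactants": ["A"], "products": ["B"]},
    "R2": {"reactants": ["B"], "products": ["C"]},
    "R3": {"reactants": ["A"], "products": ["D"]},  # does not lead to C
}
compounds, rxns = minimal_set(toy_network, {"C"})
```

Here `R3` and compound `D` are dropped because they do not lie on any route ending in `C`.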

load_compound_set(compound_file: Optional[str] = None, id_field: str = 'id') → str

Load compounds for expansion into pickaxe.

Parameters

compound_file: str, optional: Filepath of compounds, by default None.
id_field: str, optional: Header value of compound ID in input file, by default 'id'.

Returns

str: List of SMILES that were successfully loaded into pickaxe.

Raises

ValueError: No file specified for loading.
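
`id_field` names the column header that holds each compound's identifier. The sketch below writes a small input file with the standard-library `csv` module to show what such a header looks like. The structure column name `SMILES`, the comma delimiter, and the compound IDs are assumptions for illustration and are not confirmed by this section; check them against the input format your installed version expects.

```python
import csv
import os
import tempfile

# Hypothetical compounds; "id" matches the default id_field.
rows = [
    {"id": "cpd_1", "SMILES": "CCO"},      # ethanol
    {"id": "cpd_2", "SMILES": "CC(=O)O"},  # acetic acid
]

path = os.path.join(tempfile.mkdtemp(), "compounds.csv")
with open(path, "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["id", "SMILES"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm the header and contents.
with open(path, newline="") as handle:
    loaded = list(csv.DictReader(handle))
```

A file like this could then be passed as `compound_file`, with `id_field='id'` selecting the identifier column.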

load_pickled_pickaxe(fname: str) → None

Load pickaxe from pickle.

Load pickled pickaxe object.

Parameters

fname: str: Filename to read (must be .pk).

load_targets(target_compound_file: Optional[str], id_field: str = 'id') → None

Load targets into pickaxe.

Parameters

target_compound_file: str: Filepath of target compounds.
id_field: str, optional: Header value of compound ID in input file, by default 'id'.

pickle_pickaxe(fname: str) → None

Pickle key pickaxe items.

Pickle pickaxe object to be loaded in later.

Parameters

fname: str: Filename to save (must be .pk).
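
The save-and-reload cycle behind `pickle_pickaxe` and `load_pickled_pickaxe` can be illustrated with the standard-library `pickle` module. Which Pickaxe attributes are actually written is not listed in this section, so the `state` dict below is a stand-in with assumed contents rather than the real file layout.

```python
import os
import pickle
import tempfile

# Stand-in for the "key pickaxe items"; the real contents are an assumption.
state = {
    "compounds": {"C1": {"SMILES": "CCO"}},
    "reactions": {},
    "generation": 1,
}

# The documented methods require a .pk extension.
fname = os.path.join(tempfile.mkdtemp(), "pickaxe_state.pk")

with open(fname, "wb") as handle:
    pickle.dump(state, handle)

with open(fname, "rb") as handle:
    restored = pickle.load(handle)
```

As with any pickle file, only load files from sources you trust, since unpickling can execute arbitrary code.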

prune_network(white_list: list, print_output: bool = True) → None

Prune the reaction network to a list of targets.

Prune the predicted reaction network to only compounds and reactions that terminate in a specified white list of compounds.

Parameters

white_list: list: A list of compound IDs to filter the network to.
print_output: bool, optional: Whether or not to print output, by default True.

prune_network_to_targets() → None

Prune the reaction network to the target compounds.

Prune the predicted reaction network to only compounds and reactions that terminate in the target compounds.

save_to_mine(processes: int = 1, indexing: bool = True, write_core: bool = False) → None

Save pickaxe run to MINE database.

Parameters

processes: int, optional: Number of processes to use, by default 1.
indexing: bool, optional: Whether or not to add indexes, by default True.
write_core: bool, optional: Whether or not to write to core database, by default False.

transform_all(processes: int = 1, generations: int = 1) → None

Transform compounds with reaction operators.

Apply reaction rules to compounds and generate a specified number of new generations.

Parameters

processes: int, optional: Number of processes to run in parallel, by default 1.
generations: int, optional: Number of generations to create, by default 1.
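
The notion of "generations" can be sketched with a toy expansion loop. This is not how Pickaxe applies SMARTS rules: here a rule is a plain Python callable on strings, the names `expand` and `add_carbon` are hypothetical, and the sketch assumes that only compounds created in the previous generation are reacted in the next one.

```python
from typing import Callable, Dict, Iterable, List, Set

Rule = Callable[[str], Iterable[str]]


def expand(seeds: Set[str], rules: List[Rule], generations: int = 1) -> Dict[str, int]:
    """Map each compound to the generation in which it first appeared."""
    seen: Dict[str, int] = {seed: 0 for seed in seeds}
    frontier = set(seeds)
    for gen in range(1, generations + 1):
        new: Set[str] = set()
        for compound in frontier:
            for rule in rules:
                for product in rule(compound):
                    if product not in seen:
                        seen[product] = gen
                        new.add(product)
        # Only newly created compounds are reacted in the next generation.
        frontier = new
    return seen


def add_carbon(smiles: str) -> List[str]:
    """Toy rule: prepend one carbon atom to the SMILES string."""
    return ["C" + smiles]


network = expand({"CO"}, [add_carbon], generations=2)
```

Starting from methanol (`CO`), two generations yield `CCO` in generation 1 and `CCCO` in generation 2.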

write_compound_output_file(path: str, dialect: str = 'excel-tab') → None

Write compounds to an output file.

Parameters

path: str: Path to write data.
dialect: str, optional: Dialect of the output, by default 'excel-tab'.
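
The default `'excel-tab'` value is the name of a dialect registered by Python's standard `csv` module: tab-delimited fields with `\r\n` line endings. The sketch below shows what that dialect produces; the column names and row are hypothetical and do not reflect the columns Pickaxe actually writes.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, dialect="excel-tab")

# Hypothetical header and row, for illustration only.
writer.writerow(["id", "SMILES"])
writer.writerow(["C1", "CCO"])

text = buffer.getvalue()
```

Files written this way can be read back with `csv.reader(handle, dialect="excel-tab")`.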

write_reaction_output_file(path: str, delimiter: str = '\t') → None

Write all reaction data to the specified path.

Parameters

path: str: Path to write data.
delimiter: str, optional: Delimiter for the output file, by default '\t'.