7.5. Pickaxe

Pickaxe.py: Create network expansions from reaction rules and compounds.

This module generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules.

class minedatabase.pickaxe.Pickaxe(rule_list: Optional[str] = None, coreactant_list: Optional[str] = None, explicit_h: bool = False, kekulize: bool = False, neutralise: bool = True, errors: bool = True, inchikey_blocks_for_cid: int = 1, database: Optional[str] = None, database_overwrite: bool = False, mongo_uri: str = 'mongodb://localhost:27017', image_dir: Optional[str] = None, quiet: bool = True, react_targets: bool = True, filter_after_final_gen: bool = True, prune_between_gens: bool = False)

Class to generate expansions with compounds and reaction rules.

This class generates new compounds from user-specified starting compounds using a set of SMARTS-based reaction rules. It may be initialized with text files containing the reaction rules and coreactants, or these may be supplied on an ad hoc basis.
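A typical run can be sketched using only the constructor and methods documented in this section. This is a minimal sketch, not a definitive recipe: the function name `run_expansion`, the file paths, and the output file names are hypothetical, and calling `assign_ids()` before writing output is an assumption about ordering. The import is placed inside the function so the sketch can be loaded without the package installed.

```python
from pathlib import Path


def run_expansion(rules_path, coreactants_path, compounds_path, out_dir, generations=1):
    """Hypothetical helper: expand starting compounds and write the results."""
    # Imported lazily so this sketch can be loaded without minedatabase installed.
    from minedatabase.pickaxe import Pickaxe

    # Initialize with rule and coreactant files (both documented as file paths).
    pk = Pickaxe(rule_list=rules_path, coreactant_list=coreactants_path)

    # Load the starting compounds, then apply the rules for N generations.
    pk.load_compound_set(compound_file=compounds_path)
    pk.transform_all(processes=1, generations=generations)

    # Assumption: ids are assigned before output files are written.
    pk.assign_ids()

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pk.write_compound_output_file(str(out / "compounds.tsv"))
    pk.write_reaction_output_file(str(out / "reactions.tsv"))
    return pk
```

Returning the `Pickaxe` object lets the caller inspect `compounds` and `reactions` or prune the network afterwards.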

Parameters
rule_list: str, optional

Filepath of rules, by default None.

coreactant_list: str, optional

Filepath of coreactants, by default None.

explicit_h: bool, optional

Whether rules utilize explicit hydrogens, by default False.

kekulize: bool, optional

Whether or not to kekulize compounds before reaction, by default False.

neutralise: bool, optional

Whether or not to neutralise compounds, by default True.

errors: bool, optional

Whether or not to print errors to stdout, by default True.

inchikey_blocks_for_cid: int, optional

How many blocks of the InChI key to use for the compound id, by default 1.

database: str, optional

Name of the database in which to save results, by default None.

database_overwrite: bool, optional

Whether or not to erase an existing database in the event of a collision, by default False.

mongo_uri: str, optional

URI for the Mongo client, by default ‘mongodb://localhost:27017’.

image_dir: str, optional

Filepath where images should be saved, by default None.

quiet: bool, optional

Whether to silence warnings, by default True.

react_targets: bool, optional

Whether or not to apply reactions to generated compounds that match targets, by default True.

filter_after_final_gen: bool, optional

Whether to apply filters after final expansion, by default True.

prune_between_gens: bool, optional

Whether to prune the network between generations if using filters, by default False.

Attributes
operators: dict

Reaction operators to transform compounds with.

coreactants: dict

Coreactants required by the operators.

compounds: dict

Compounds in the pickaxe network.

reactions: dict

Reactions in the pickaxe network.

generation: int

The current generation.

explicit_h: bool

Whether rules utilize explicit hydrogens.

kekulize: bool

Whether or not to kekulize compounds before reaction.

neutralise: bool

Whether or not to neutralise compounds.

fragmented_mols: bool

Whether or not to allow fragmented molecules.

radical_check: bool

Whether or not to check and remove radicals.

image_dir: str, optional

Filepath where images should be saved.

errors: bool

Whether or not to print errors to stdout.

quiet: bool

Whether or not to silence warnings.

filters: List[object]

A list of filters to apply during the expansion.

targets: dict

Molecules to be targeted during expansions.

target_smiles: List[str]

The SMILES of all the targets.

react_targets: bool

Whether or not to react targets when generated.

filter_after_final_gen: bool

Whether or not to filter after the last expansion.

prune_between_gens: bool, optional

Whether to prune the network between generations if using filters.

mongo_uri: str

The connection string to the mongo database.

cid_num_inchi_blocks: int

How many blocks of the InChI key to use to generate the compound id.
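To illustrate what "blocks" means for `inchikey_blocks_for_cid` and `cid_num_inchi_blocks`: a standard InChIKey consists of three hyphen-separated blocks. The sketch below only shows how the leading blocks can be selected; the helper name `first_inchikey_blocks` is hypothetical, and how Pickaxe turns those blocks into a final compound id is not described in this section.

```python
def first_inchikey_blocks(inchikey: str, n_blocks: int = 1) -> str:
    """Return the first `n_blocks` hyphen-separated blocks of an InChIKey."""
    blocks = inchikey.split("-")
    if not 1 <= n_blocks <= len(blocks):
        raise ValueError(f"n_blocks must be between 1 and {len(blocks)}")
    return "-".join(blocks[:n_blocks])


# InChIKey of ethanol, used purely as an example input.
ethanol_key = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
connectivity_only = first_inchikey_blocks(ethanol_key, 1)
with_second_block = first_inchikey_blocks(ethanol_key, 2)
```

Using a single block means that compounds sharing the same first block receive the same key, while more blocks distinguish them further.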

assign_ids() -> None

Assign a numerical ID to compounds (and reactions).

Assign IDs that are unique only to the CURRENT run.

find_minimal_set(white_list: Set[str]) -> Tuple[set, set]

Find the minimal set of compounds and reactions given a white list.

Given a white list, this function finds the minimal set of compound and reaction ids that comprise the set.

Parameters
white_list: Set[str]

Set of compound ids to filter the reaction network to.

Returns
Tuple[set, set]

The filtered compounds and reactions.
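The idea of reducing a network to a white list can be sketched on a toy network by walking backwards from the white-listed compounds through the reactions that produce them. This is an illustrative sketch only, not the library's implementation: the function `minimal_set`, the dictionary layout with `reactants` and `products` keys, and the ids are all hypothetical.

```python
from collections import deque


def minimal_set(reactions, white_list):
    """Collect the compounds and reactions that lead to the white-listed compounds."""
    # Index each product id to the ids of the reactions that make it.
    producers = {}
    for rid, rxn in reactions.items():
        for product in rxn["products"]:
            producers.setdefault(product, set()).add(rid)

    keep_compounds = set(white_list)
    keep_reactions = set()
    queue = deque(white_list)
    while queue:
        cid = queue.popleft()
        for rid in producers.get(cid, ()):
            if rid in keep_reactions:
                continue
            keep_reactions.add(rid)
            # Every reactant of a kept reaction must itself be traced back.
            for reactant in reactions[rid]["reactants"]:
                if reactant not in keep_compounds:
                    keep_compounds.add(reactant)
                    queue.append(reactant)
    return keep_compounds, keep_reactions


# Toy network: A -> B -> C, plus a side branch A -> D.
toy_reactions = {
    "R1": {"reactants": ["A"], "products": ["B"]},
    "R2": {"reactants": ["B"], "products": ["C"]},
    "R3": {"reactants": ["A"], "products": ["D"]},
}
kept_compounds, kept_reactions = minimal_set(toy_reactions, {"C"})
```

In the toy network, the side branch to D is dropped because it does not lead to the white-listed compound C.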

load_compound_set(compound_file: Optional[str] = None, id_field: str = 'id') -> str

Load compounds for expansion into pickaxe.

Parameters
compound_file: str, optional

Filepath of compounds, by default None.

id_field: str, optional

Header value of compound id in input file, by default ‘id’.

Returns
str

List of SMILES that were successfully loaded into pickaxe.

Raises
ValueError

No file specified for loading.
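A starting compound file can be prepared with the standard library. The sketch below assumes a comma-separated file with a structure column named `SMILES`; neither the delimiter nor that column name is stated in this section, so check the format your installation expects. The helper `write_compound_file` and the example compounds are hypothetical; only the `id` header corresponds to the documented `id_field` default.

```python
import csv
import os
import tempfile

# Hypothetical starting compounds; the "SMILES" column name is an assumption.
starting_compounds = [
    {"id": "ethanol", "SMILES": "CCO"},
    {"id": "acetaldehyde", "SMILES": "CC=O"},
]


def write_compound_file(path, rows, id_field="id"):
    """Write rows to a CSV file whose id column header is `id_field`."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=[id_field, "SMILES"])
        writer.writeheader()
        for row in rows:
            writer.writerow({id_field: row["id"], "SMILES": row["SMILES"]})
    return path


compound_path = write_compound_file(
    os.path.join(tempfile.mkdtemp(), "compounds.csv"), starting_compounds
)
```

The resulting path would then be passed as `compound_file`, with `id_field` set to match the header if it differs from the default.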

load_pickled_pickaxe(fname: str) -> None

Load pickaxe from pickle.

Load pickled pickaxe object.

Parameters
fname: str

Filename to read (must be .pk).

load_targets(target_compound_file: Optional[str], id_field: str = 'id') -> None

Load targets into pickaxe.

Parameters
target_compound_file: str

Filepath of target compounds.

id_field: str, optional

Header value of compound id in input file, by default ‘id’.

pickle_pickaxe(fname: str) -> None

Pickle key pickaxe items.

Pickle pickaxe object to be loaded in later.

Parameters
fname: str

Filename to save (must be .pk).
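Saving and restoring a run might look like the sketch below. The helper names `checkpoint` and `restore` are hypothetical, the suffix check simply enforces the documented ".pk" requirement before calling the library, and restoring into a default-constructed `Pickaxe()` is an assumption about how `load_pickled_pickaxe` is meant to be used.

```python
def checkpoint(pk, fname):
    """Pickle a pickaxe object after checking the documented .pk suffix."""
    if not fname.endswith(".pk"):
        raise ValueError("pickle file name must end in .pk")
    pk.pickle_pickaxe(fname)
    return fname


def restore(fname):
    """Load a previously pickled run into a fresh Pickaxe object."""
    # Imported lazily so this sketch can be loaded without minedatabase installed.
    from minedatabase.pickaxe import Pickaxe

    pk = Pickaxe()  # assumption: all constructor defaults are acceptable here
    pk.load_pickled_pickaxe(fname)
    return pk
```

Checking the suffix up front reports a wrong file name before any work is done on the pickle itself.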

prune_network(white_list: list, print_output: bool = True) -> None

Prune the reaction network to a list of targets.

Prune the predicted reaction network to only compounds and reactions that terminate in a specified white list of compounds.

Parameters
white_list: list

A list of compound ids to filter the network to.

print_output: bool

Whether or not to print output.

prune_network_to_targets() -> None

Prune the reaction network to the target compounds.

Prune the predicted reaction network to only compounds and reactions that terminate in the target compounds.
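A target-directed run could combine `load_targets`, `transform_all`, and `prune_network_to_targets` as sketched below. The helper name `expand_toward_targets` is hypothetical, and the sketch assumes `pk` is a `Pickaxe` object that already has rules and starting compounds loaded.

```python
def expand_toward_targets(pk, target_file, generations=2):
    """Load targets, expand, then keep only what terminates in the targets."""
    # Targets are loaded before expansion so they are known while compounds are generated.
    pk.load_targets(target_file)
    pk.transform_all(generations=generations)
    # Drop compounds and reactions that do not terminate in a target compound.
    pk.prune_network_to_targets()
    return pk
```

Pruning after the expansion keeps the saved network small when only routes to the targets are of interest.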

save_to_SBML(file_name: str, save_reactions_uniprot: bool = False, uniprot_save_style: str = 'grouped') -> None

Save pickaxe run to an SBML file.

This function saves the species and reactions to an SBML file with annotations. Specifically, the species will be annotated with their SMILES and reactions annotated with their operator and uniprot ids (if available and desired).

Parameters
file_name: str

The file name to save the SBML at.

save_reactions_uniprot: bool

Whether or not to save the uniprot ids for a reaction operator (if available).

uniprot_save_style: str

The style to save uniprot ids in. There are two options:

grouped: all uniprot information is stored in a semicolon-delimited list
individual: uniprot information is saved individually as a link to the uniprot website

save_to_mine(processes: int = 1, indexing: bool = True, write_core: bool = False) -> None

Save pickaxe run to MINE database.

Parameters
processes: int, optional

Number of processes to use, by default 1.

indexing: bool, optional

Whether or not to add indexes, by default True.

write_core: bool, optional

Whether or not to write to core database, by default False.

transform_all(processes: int = 1, generations: int = 1) -> None

Transform compounds with reaction operators.

Apply reaction rules to compounds and generate a specified number of new generations.

Parameters
processes: int, optional

Number of processes to run in parallel, by default 1.

generations: int, optional

Number of generations to create, by default 1.

write_compound_output_file(path: str, dialect: str = 'excel-tab') -> None

Write compounds to an output file.

Parameters
path: str

Path to write data.

dialect: str, optional

Dialect of the output, by default ‘excel-tab’.

write_reaction_output_file(path: str, delimiter: str = '\t') -> None

Write all reaction data to the specified path.

Parameters
path: str

Path to write data.

delimiter: str, optional

Delimiter for the output file, by default ‘\t’.
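Both output writers default to tab-separated text, which Python's `csv` module can read back using its built-in `excel-tab` dialect. The sketch below round-trips a small in-memory table to show the format; the column names `ID` and `SMILES` and the rows are hypothetical, since the actual columns written by Pickaxe are not listed in this section.

```python
import csv
import io

# Write a tiny tab-separated table to an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect="excel-tab")
writer.writerow(["ID", "SMILES"])  # hypothetical header
writer.writerow(["cpd_1", "CCO"])
writer.writerow(["cpd_2", "CC=O"])

# Read it back with the same dialect; each row becomes a dict keyed by header.
buffer.seek(0)
rows = list(csv.DictReader(buffer, dialect="excel-tab"))
```

The same `DictReader` call should work on a file handle opened with `newline=""` for a file written with the default dialect or delimiter.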