7.2. Databases

Databases.py: This file contains the MINE database classes, along with functions for loading and writing databases.

class minedatabase.databases.MINE(name: str, uri: str = 'mongodb://localhost:27017/')

This class provides an interface to a MINE database stored in MongoDB, along with helper functions for working with it.

Parameters
name : str

Name of the database to work with.

uri : str, optional

URI of the MongoDB server, by default “mongodb://localhost:27017/”.

Attributes
client : pymongo.MongoClient

Client connection to the MongoDB server.

compounds : Collection

Compounds collection.

core_compounds : Collection

Core compounds collection.

meta_data : Collection

Metadata collection.

models : Collection

Models collection.

name : str

Name of the database.

operators : Collection

Operators collection.

reactions : Collection

Reactions collection.

target_compounds : Collection

Target compounds collection.

uri : str

MongoDB connection string.

add_reaction_mass_change(reaction: Optional[str] = None) → Optional[float]

Calculate the change in mass between reactant and product compounds.

This is useful for discovering compounds in molecular networking. If no reaction is specified, the mass change of every reaction in the database is calculated.

Parameters
reaction : str, optional

Reaction ID to calculate the mass change for, by default None.

Returns
float, optional

Mass change of the specified reaction, or None if not all masses were found.
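The mass-change calculation can be illustrated with a small in-memory sketch. This is not the library's implementation. The (coefficient, compound ID) pairs, the mass lookup table, and the sign convention (products minus reactants, summed over all compounds) are assumptions made for illustration; the library may compare only selected compounds. The sketch does mirror the documented behavior of returning None when a mass is missing.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for masses stored in the compounds collection.
# The IDs and values are illustrative, not taken from a real MINE.
compound_masses: Dict[str, Optional[float]] = {
    "C_glucose": 180.063,
    "C_water": 18.011,
    "C_unknown": None,  # mass was never calculated for this compound
}

# One side of a reaction: (stoichiometric coefficient, compound ID) pairs.
Side = List[Tuple[int, str]]


def side_mass(side: Side) -> Optional[float]:
    """Sum coefficient * mass over one side; None if any mass is missing."""
    total = 0.0
    for coefficient, compound_id in side:
        mass = compound_masses.get(compound_id)
        if mass is None:
            return None
        total += coefficient * mass
    return total


def reaction_mass_change(reactants: Side, products: Side) -> Optional[float]:
    """Return product mass minus reactant mass, or None if a mass is missing."""
    reactant_mass = side_mass(reactants)
    product_mass = side_mass(products)
    if reactant_mass is None or product_mass is None:
        return None
    return product_mass - reactant_mass
```

Returning None rather than raising lets a caller skip reactions with incomplete mass data while iterating over a whole collection.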

build_indexes() → None

Build indexes for efficient querying of the database.

generate_image_files(path: str, query: Optional[dict] = None, dir_depth: int = 0, img_type: str = 'svg:-a,nosource,w500,h500', convert_r: bool = False) → None

Generate image files for compounds in the database using ChemAxon’s MolConvert.

Parameters
path : str

Target directory for the image files.

query : dict, optional

Query to limit the number of files generated, by default None.

dir_depth : int, optional

The number of directory levels to split the compounds into for file system efficiency. Ranges from 0 (all in the top-level directory) to the length of the file name (40 for MINE hashes), by default 0.

img_type : str, optional

Type of image file to be generated. See the MolConvert documentation for valid options, by default ‘svg:-a,nosource,w500,h500’.

convert_r : bool, optional

Convert R in the SMILES to *, by default False.
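The effect of dir_depth can be sketched as follows. This assumes each directory level is named after one leading character of the file name, which is consistent with the stated upper bound (the length of the file name) but is not confirmed by this reference. The function image_path and its ext parameter are hypothetical helpers, not part of minedatabase.

```python
import os


def image_path(root: str, compound_id: str, dir_depth: int = 0, ext: str = "svg") -> str:
    """Build a nested file path from the first dir_depth characters of the ID."""
    if not 0 <= dir_depth <= len(compound_id):
        raise ValueError("dir_depth must be between 0 and len(compound_id)")
    # Assumption: one directory level per leading character of the file name.
    levels = list(compound_id[:dir_depth])
    return os.path.join(root, *levels, f"{compound_id}.{ext}")
```

With dir_depth=0 every file lands directly in root. With dir_depth=2, an ID beginning with "ab" would be placed under root/a/b/, which keeps any single directory from holding too many files.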

minedatabase.databases.establish_db_client(uri: Optional[str] = None) → MongoClient

Establish a connection to a mongo database given a URI.

Uses the provided URI to connect to a MongoDB instance. If no URI is given, pymongo's default URI is used.

Parameters
uri : str, optional

URI to connect to MongoDB, by default None.

Returns
pymongo.MongoClient

Connection to the specified mongo instance.

Raises
IOError

Attempt to connect to database timed out.

minedatabase.databases.write_compounds_to_mine(compounds: List[dict], db: MINE, chunk_size: int = 10000, processes: int = 1) → None

Write compounds to the compounds collection of MINE.

Parameters
compounds : List[dict]

List of compound dictionaries to write.

db : MINE

MINE object to write compounds with.

chunk_size : int, optional

Size of chunks to break compounds into when writing, by default 10000.

processes : int, optional

Number of processors to use, by default 1.
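The chunk_size parameter of the write functions can be understood through a minimal chunking sketch. The helper chunked is hypothetical and only shows how a list of documents could be split into batches before each batch is written; how the library forms its batches internally is not stated here.

```python
from typing import Iterator, List


def chunked(items: List[dict], chunk_size: int = 10000) -> Iterator[List[dict]]:
    """Yield consecutive slices of items, each at most chunk_size long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Usage: five placeholder documents written in batches of two.
documents = [{"_id": f"C{i}"} for i in range(5)]
batch_sizes = [len(batch) for batch in chunked(documents, chunk_size=2)]
```

Smaller chunks reduce the memory held per write at the cost of more round trips to the database.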

minedatabase.databases.write_core_compounds(compounds: List[dict], db: MINE, mine: str, chunk_size: int = 10000, processes=1) → None

Write core compounds to the core compound database.

Calculates and formats compounds into the appropriate form for insertion into the core compound database in the mongo instance. An insert is attempted for each core compound, and collisions with existing entries are detected on the database. The list of MINEs in which a given compound is found is updated as well.

Parameters
compounds : List[dict]

List of compound dictionaries to write.

db : MINE

MINE object to write core compounds with.

mine : str

Name of the MINE.

chunk_size : int, optional

Size of chunks to break compounds into when writing, by default 10000.

processes : int, optional

The number of processors to use, by default 1.
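The insert-or-update behavior described for core compounds can be sketched with a plain dictionary standing in for the collection. The field name "MINES", the helper upsert_core_compound, and the example IDs are assumptions for illustration only. In MongoDB this pattern is commonly expressed as an upsert combining $setOnInsert and $addToSet; whether the library does so is not stated in this reference.

```python
from typing import Dict

# Hypothetical in-memory stand-in for the core compounds collection, keyed by _id.
core_collection: Dict[str, dict] = {}


def upsert_core_compound(compound: dict, mine: str) -> bool:
    """Insert the compound if it is new and record the MINE it was found in.

    Returns True when a new document was inserted and False when the _id
    already existed (a collision), in which case only the MINE list changes.
    """
    existing = core_collection.get(compound["_id"])
    if existing is None:
        core_collection[compound["_id"]] = {**compound, "MINES": [mine]}
        return True
    if mine not in existing["MINES"]:
        existing["MINES"].append(mine)
    return False


# Usage: the same compound written from two different MINEs.
first = upsert_core_compound({"_id": "C123", "SMILES": "O"}, "mine_a")
second = upsert_core_compound({"_id": "C123", "SMILES": "O"}, "mine_b")
```

After both calls the collection holds a single document for C123 whose MINE list names both databases.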

minedatabase.databases.write_reactions_to_mine(reactions: List[dict], db: MINE, chunk_size: int = 10000) → None

Write reactions to the reactions collection of MINE.

Parameters
reactions : List[dict]

List of reaction dictionaries to write.

db : MINE

MINE object to write reactions with.

chunk_size : int, optional

Size of chunks to break reactions into when writing, by default 10000.

minedatabase.databases.write_targets_to_mine(targets: List[dict], db: MINE, chunk_size: int = 10000) → None

Write target compounds to the target compounds collection of MINE.

Parameters
targets : List[dict]

List of target dictionaries to write.

db : MINE

MINE object to write targets with.

chunk_size : int, optional

Size of chunks to break targets into when writing, by default 10000.