MolecularDiffusion.runmodes.data.ase_ops

Operations on ASE databases: merging, inspecting, splitting, and sampling.

Attributes

Chem

logger

Functions

inspect_db(db_path[, output_dir, keys_to_plot, ...])

Inspects an ASE DB, printing stats and optionally plotting distributions.

is_clean(row)

Verifies that the atom order in ASE atoms and RDKit mol from mol_block are identical.

merge_dbs(input_dir, output_db[, recursive, pattern])

Merges multiple ASE databases into one.

sample_db(input_db, output[, output_type, fraction, ...])

Samples a random fraction or number of entries from an ASE database.

split_db(db_path, output_dir[, n_splits])

Splits an ASE database into n_splits smaller databases.

verify_datapoint(atoms, mol_block)

Verifies that an ASE Atoms object matches an RDKit mol block.

Module Contents

MolecularDiffusion.runmodes.data.ase_ops.inspect_db(db_path: pathlib.Path, output_dir: pathlib.Path = None, keys_to_plot: List[str] = None, limit_print: int = 10)

Inspects an ASE DB, printing stats and optionally plotting distributions.

MolecularDiffusion.runmodes.data.ase_ops.is_clean(row)

Verifies that the atom order in ASE atoms and RDKit mol from mol_block are identical.

MolecularDiffusion.runmodes.data.ase_ops.merge_dbs(input_dir: pathlib.Path, output_db: pathlib.Path, recursive: bool = False, pattern: str = '*.db')

Merges multiple ASE databases into one.
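The source does not show how merge_dbs discovers its input files. The following is a minimal sketch, assuming that recursive switches between a top-level search and a search through subdirectories, and that pattern is an ordinary glob. The helper find_db_files is hypothetical and not part of this module.

```python
import tempfile
from pathlib import Path
from typing import List


def find_db_files(input_dir: Path, pattern: str = "*.db", recursive: bool = False) -> List[Path]:
    """Hypothetical helper: collect database files the way merge_dbs might."""
    # Assumption: recursive=True descends into subdirectories (rglob),
    # recursive=False only looks at the top level (glob).
    matches = input_dir.rglob(pattern) if recursive else input_dir.glob(pattern)
    # Sort so that the merge order is deterministic.
    return sorted(p for p in matches if p.is_file())


# Build a throwaway directory tree to show the difference between the two modes.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.db").touch()
    (root / "notes.txt").touch()
    (root / "nested").mkdir()
    (root / "nested" / "b.db").touch()

    top_level = [p.name for p in find_db_files(root)]
    everything = [p.name for p in find_db_files(root, recursive=True)]
```

Here top_level contains only a.db, while everything also picks up nested/b.db. Files that do not match the pattern are ignored in both cases.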

MolecularDiffusion.runmodes.data.ase_ops.sample_db(input_db: pathlib.Path, output: pathlib.Path, output_type: str = None, fraction: float = None, number: int = None, seed: int = None, verify_clean: bool = False)

Samples a random fraction or number of entries from an ASE database.
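The signature accepts either fraction or number, plus an optional seed. How the function resolves these is not documented here, so the following is only a sketch under stated assumptions: exactly one of fraction and number is given, a fraction is rounded to the nearest row count, and the seed makes the draw reproducible. The helper choose_row_ids is hypothetical.

```python
import random
from typing import List, Optional


def choose_row_ids(
    row_ids: List[int],
    fraction: Optional[float] = None,
    number: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: pick which row ids to keep in the sample."""
    # Assumption: exactly one of `fraction` or `number` is supplied.
    if (fraction is None) == (number is None):
        raise ValueError("pass exactly one of fraction or number")
    if fraction is not None:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        # Assumption: round to the nearest row count, keep at least one row.
        number = max(1, round(fraction * len(row_ids)))
    # Never ask for more rows than exist.
    number = min(number, len(row_ids))
    # A private Random instance leaves the global RNG state untouched.
    rng = random.Random(seed)
    return sorted(rng.sample(row_ids, number))


ids = list(range(1, 101))  # ASE row ids start at 1
subset = choose_row_ids(ids, fraction=0.1, seed=0)
same_subset = choose_row_ids(ids, fraction=0.1, seed=0)
```

With a fixed seed, subset and same_subset are identical, which is what makes a sampled dataset reproducible across runs.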

MolecularDiffusion.runmodes.data.ase_ops.split_db(db_path: pathlib.Path, output_dir: pathlib.Path, n_splits: int = 2)

Splits an ASE database into n_splits smaller databases.
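How rows are assigned to each output database is not stated. One plausible scheme, shown here purely as an assumption, is to cut the rows into contiguous chunks of near-equal size. The helper split_ranges is hypothetical.

```python
from typing import List, Tuple


def split_ranges(n_rows: int, n_splits: int = 2) -> List[Tuple[int, int]]:
    """Hypothetical helper: half-open [start, stop) index ranges, one per split."""
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    base, extra = divmod(n_rows, n_splits)
    ranges = []
    start = 0
    for i in range(n_splits):
        # The first `extra` splits take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


ranges = split_ranges(10, n_splits=3)
# ranges == [(0, 4), (4, 7), (7, 10)]
```

Every row lands in exactly one range, and the sizes differ by at most one row.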

MolecularDiffusion.runmodes.data.ase_ops.verify_datapoint(atoms, mol_block)

Verifies that an ASE Atoms object matches an RDKit mol block.
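The exact comparison performed is not documented. The sketch below illustrates one check that such a verification could include: comparing element symbols, in order, against the atom lines of a V2000 mol block. It parses the mol block by hand so that it runs without RDKit or ASE; the module itself presumably uses RDKit. The helpers mol_block_symbols and same_atom_order are hypothetical.

```python
from typing import List, Sequence

# A V2000 mol block for water; the atom order is O, H, H.
WATER_MOL_BLOCK = """\
water
  sketch

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.1173 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000   -0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1  3  1  0
M  END
"""


def mol_block_symbols(mol_block: str) -> List[str]:
    """Hypothetical helper: element symbols in the order the mol block lists them."""
    lines = mol_block.splitlines()
    # The fourth line is the counts line; its first three characters hold the atom count.
    n_atoms = int(lines[3][:3])
    # Atom lines follow; the symbol is the fourth whitespace-separated field.
    return [line.split()[3] for line in lines[4:4 + n_atoms]]


def same_atom_order(symbols: Sequence[str], mol_block: str) -> bool:
    """Hypothetical helper: True if both sources list the same elements in the same order."""
    return list(symbols) == mol_block_symbols(mol_block)


matches = same_atom_order(["O", "H", "H"], WATER_MOL_BLOCK)
reordered = same_atom_order(["H", "O", "H"], WATER_MOL_BLOCK)
```

With ASE available, the symbols argument could come from atoms.get_chemical_symbols(). A reordered atom list fails the check even though the composition is the same.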

MolecularDiffusion.runmodes.data.ase_ops.Chem = None
MolecularDiffusion.runmodes.data.ase_ops.logger