MolecularDiffusion.cli.data

Data CLI module for MolCraft. Exposes data preparation, data augmentation, and ASE database operations as CLI commands.

Attributes

logger

Classes

VariadicOption

A Click Option that consumes all remaining arguments until the next flag.

Functions

annotate_cmd(db, tag, value)

Annotate an ASE database with a tag.

ase_ops_group()

ASE database operations.

augment()

Data augmentation commands.

charge_cmd(input, output, max_h, fraction, db)

Augment data by random charge modification.

compile_cmd(source, db, natoms, csv, sdf, fraction, seed)

Compile molecular data into an ASE database.

data()

Data processing utilities.

distortion_cmd(input, output, sigma, fraction, freeze)

Augment data by random coordinate distortion.

featurize_cmd(method, input, output, format, readout, ...)

Featurize 3D molecules into vectors.

generate_blocks_cmd(source, sdf, natoms, csv, ...)

Generate Mol Blocks, SMILES, and properties (SA, SC).

inspect_cmd(db, output, keys, limit)

Inspect an ASE database and optionally plot statistics.

merge_cmd(input, output, recursive)

Merge multiple ASE databases.

prepare()

Data preparation commands.

sample_cmd(input, output, fraction, number, seed, verify)

Sample entries from an ASE database.

size_cmd(input, output, s_start, t_start, s_end, ...)

Augment data to balance molecule sizes.

split_cmd(db, output, n)

Split an ASE database.

Module Contents

class MolecularDiffusion.cli.data.VariadicOption(*args, **kwargs)

Bases: click.Option

A Click Option that consumes all remaining arguments until the next flag.
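
The "consume until the next flag" behaviour can be illustrated without Click. The sketch below is not the implementation of VariadicOption; it is a standalone, standard-library illustration of the semantics described above. The helper name consume_until_flag, the rule that a flag is any token starting with "-", and the example option and file names are all assumptions made for this example.

```python
from typing import List, Tuple


def consume_until_flag(args: List[str], start: int) -> Tuple[List[str], int]:
    """Collect values from args[start:] up to (not including) the next flag.

    Assumption: a "flag" is any token that begins with "-".
    Returns the collected values and the index where parsing should resume.
    """
    values: List[str] = []
    i = start
    while i < len(args) and not args[i].startswith("-"):
        values.append(args[i])
        i += 1
    return values, i


# Hypothetical command line; the option and file names are illustrative only.
argv = ["--inputs", "a.db", "b.db", "c.db", "--output", "merged.db"]

# Start right after the option name at index 0.
values, resume_at = consume_until_flag(argv, 1)
print(values)           # ['a.db', 'b.db', 'c.db']
print(argv[resume_at])  # --output
```

One consequence of this rule is that a value beginning with "-" (a negative number, for instance) would end the run early, so a real implementation may need a stricter test for what counts as a flag.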

add_to_parser(parser, ctx)

MolecularDiffusion.cli.data.annotate_cmd(db, tag, value)

Annotate an ASE database with a tag.

MolecularDiffusion.cli.data.ase_ops_group()

ASE database operations.

MolecularDiffusion.cli.data.augment()

Data augmentation commands.

MolecularDiffusion.cli.data.charge_cmd(input, output, max_h, fraction, db)

Augment data by random charge modification.

MolecularDiffusion.cli.data.compile_cmd(source, db, natoms, csv, sdf, fraction, seed)

Compile molecular data into an ASE database.

MolecularDiffusion.cli.data.data()

Data processing utilities.

MolecularDiffusion.cli.data.distortion_cmd(input, output, sigma, fraction, freeze)

Augment data by random coordinate distortion.
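
As a rough illustration of what random coordinate distortion can look like, the sketch below adds Gaussian noise to atomic positions. It is not the code behind distortion_cmd. It assumes that sigma is the standard deviation of the noise (in the same units as the coordinates) and that freeze names atom indices to leave untouched; neither meaning is confirmed by this page. The function distort and its seed argument are hypothetical, and the fraction parameter is not modelled.

```python
import random
from typing import Iterable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def distort(
    coords: Sequence[Coord],
    sigma: float,
    frozen: Iterable[int] = (),
    seed: int = 0,
) -> List[Coord]:
    """Return a copy of coords with Gaussian noise added to each component.

    Assumptions: sigma is the noise standard deviation; indices listed in
    frozen are copied unchanged; seed only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    frozen_set = set(frozen)
    distorted: List[Coord] = []
    for idx, (x, y, z) in enumerate(coords):
        if idx in frozen_set:
            # Frozen atoms keep their original position.
            distorted.append((x, y, z))
        else:
            distorted.append(
                (
                    x + rng.gauss(0.0, sigma),
                    y + rng.gauss(0.0, sigma),
                    z + rng.gauss(0.0, sigma),
                )
            )
    return distorted


# Approximate water geometry in angstroms, used purely as example input.
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
noisy = distort(water, sigma=0.05, frozen=[0], seed=1)
print(noisy[0])  # the frozen oxygen is unchanged: (0.0, 0.0, 0.0)
```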

MolecularDiffusion.cli.data.featurize_cmd(method, input, output, format, readout, smilify_method, radius, nbits, rcut, nmax, lmax)

Featurize 3D molecules into vectors.

MolecularDiffusion.cli.data.generate_blocks_cmd(source, sdf, natoms, csv, fraction, indices, method)

Generate Mol Blocks, SMILES, and properties (SA, SC).

MolecularDiffusion.cli.data.inspect_cmd(db, output, keys, limit)

Inspect an ASE database and optionally plot statistics.

MolecularDiffusion.cli.data.merge_cmd(input, output, recursive)

Merge multiple ASE databases.

MolecularDiffusion.cli.data.prepare()

Data preparation commands.

MolecularDiffusion.cli.data.sample_cmd(input, output, fraction, number, seed, verify)

Sample entries from an ASE database.
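
The fraction, number, and seed parameters suggest seeded random sampling by either a relative or an absolute count. The sketch below shows one way such a selection could work on a list of row ids. It is an assumption-laden illustration, not the logic of sample_cmd: the function sample_ids is hypothetical, treating fraction and number as mutually exclusive is a guess, and the verify parameter is not modelled.

```python
import random
from typing import Iterable, List, Optional


def sample_ids(
    ids: Iterable[int],
    fraction: Optional[float] = None,
    number: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Pick a reproducible random subset of ids.

    Assumption: exactly one of fraction (0..1) or number must be given.
    """
    pool = list(ids)
    if (fraction is None) == (number is None):
        raise ValueError("give exactly one of fraction or number")
    if fraction is not None:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fraction must lie in [0, 1]")
        k = round(fraction * len(pool))
    else:
        # Never ask for more entries than exist.
        k = min(max(number, 0), len(pool))
    # A private Random instance keeps the global RNG state untouched.
    rng = random.Random(seed)
    return sorted(rng.sample(pool, k))


# Hypothetical row ids 1..100.
row_ids = range(1, 101)
picked = sample_ids(row_ids, fraction=0.1, seed=42)
print(len(picked))  # 10
```

Using the same seed with the same input reproduces the same subset, which is the usual reason for exposing a seed on a sampling command.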

MolecularDiffusion.cli.data.size_cmd(input, output, s_start, t_start, s_end, t_end, strength, decay, invert, plot_prefix)

Augment data to balance molecule sizes.

MolecularDiffusion.cli.data.split_cmd(db, output, n)

Split an ASE database.

MolecularDiffusion.cli.data.logger