MolecularDiffusion.runmodes.data.featurization
Featurization module for MolCraft. Generates vector representations (Morgan fingerprints, SOAP descriptors) from 3D molecular data.
Attributes
- Chem
- logger
- save_safetensors
Functions
- featurize: Main featurization driver.
- generate_morgan: Generates Morgan fingerprints for a list of entries.
- generate_soap: Generates SOAP descriptors.
- save_features: Saves features and indices.
Module Contents
- MolecularDiffusion.runmodes.data.featurization.featurize(input_source: str, output_dir: str, method: str = 'morgan', format: str = 'npy', readout: str = 'mean', smilify_method: str = 'hybrid', radius: int = 2, nbits: int = 2048, rcut: float = 5.0, nmax: int = 8, lmax: int = 6)
Main featurization driver.
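As a usage illustration, the sketch below assembles keyword arguments for two runs using only the parameter names and defaults from the signature above. The paths are placeholders, `"soap"` as a `method` value is an assumption (only the default `'morgan'` is documented), and `run_featurization` is a hypothetical wrapper that is not part of the module. The split of parameters between the two runs follows the signatures of `generate_morgan` and `generate_soap`.

```python
# Arguments for a Morgan-fingerprint run. radius, nbits and smilify_method
# also appear in the generate_morgan signature. Paths are placeholders.
morgan_args = dict(
    input_source="molecules/",
    output_dir="features/morgan",
    method="morgan",
    format="npy",
    smilify_method="hybrid",
    radius=2,
    nbits=2048,
)

# Arguments for a SOAP run. rcut, nmax, lmax and readout also appear in the
# generate_soap signature. ASSUMPTION: "soap" is the accepted method string.
soap_args = dict(
    input_source="molecules/",
    output_dir="features/soap",
    method="soap",
    format="npy",
    readout="mean",
    rcut=5.0,
    nmax=8,
    lmax=6,
)


def run_featurization(args):
    """Hypothetical wrapper: import lazily, then call the documented driver."""
    from MolecularDiffusion.runmodes.data.featurization import featurize

    return featurize(**args)
```

Calling `run_featurization(morgan_args)` requires the package and its dependencies to be installed; building the dictionaries does not.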
- MolecularDiffusion.runmodes.data.featurization.generate_morgan(entries: List[Dict], radius: int = 2, nbits: int = 2048, smilify_method: str = 'hybrid', n_jobs: int = 1) -> Tuple[numpy.ndarray, List[str]]
Generates Morgan fingerprints for a list of entries. `entries` is a list of dicts, each with an ‘id’ and either a ‘file’ (path) or a ‘data’ (atoms/coords) key.
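A minimal sketch of what such an `entries` list might look like. `build_file_entries` is a hypothetical helper, using the file stem as the ‘id’ is an assumption, and the inner layout of the ‘data’ value (keys `atoms` and `coords`) is a guess based on the "(atoms/coords)" hint, not a documented schema.

```python
from pathlib import Path
from typing import Dict, List


def build_file_entries(paths) -> List[Dict]:
    """Hypothetical helper: one entry per structure file, id taken from the file stem."""
    return [{"file": str(p), "id": Path(p).stem} for p in paths]


# File-based entries: each dict carries a path and an identifier.
file_entries = build_file_entries(["a/mol_001.xyz", "b/mol_002.xyz"])

# In-memory entry. ASSUMPTION: 'data' holds element symbols and Cartesian
# coordinates under the keys 'atoms' and 'coords'.
inline_entry = {
    "id": "water",
    "data": {
        "atoms": ["O", "H", "H"],
        "coords": [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
    },
}
```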
- MolecularDiffusion.runmodes.data.featurization.generate_soap(entries: List[Dict], rcut: float = 5.0, nmax: int = 8, lmax: int = 6, readout: str = 'mean', species: List[str] = None) -> Tuple[numpy.ndarray, List[str]]
Generates SOAP descriptors.
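The `readout` parameter is not described further. One plausible reading is that it pools per-atom descriptors into a single vector per molecule, so that molecules of different sizes yield rows of equal length. The sketch below illustrates that idea only: `pool_atomic_descriptors` is a hypothetical function, and `"sum"` as an alternative readout is an assumption.

```python
import numpy as np


def pool_atomic_descriptors(per_atom, readout="mean"):
    """Collapse an (n_atoms, n_features) array into one (n_features,) vector."""
    per_atom = np.asarray(per_atom, dtype=float)
    if readout == "mean":
        return per_atom.mean(axis=0)
    if readout == "sum":  # ASSUMPTION: not confirmed as a supported value
        return per_atom.sum(axis=0)
    raise ValueError(f"unknown readout: {readout!r}")


# Two atoms with two features each, pooled into one molecule-level vector.
pooled = pool_atomic_descriptors([[1.0, 2.0], [3.0, 4.0]])
```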
- MolecularDiffusion.runmodes.data.featurization.save_features(features: numpy.ndarray, ids: List[str], output_dir: pathlib.Path, format: str = 'npy')
Saves features and indices.
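For orientation, a sketch of how a feature matrix and its identifiers could be written side by side in NumPy's `.npy` format. This is not the module's implementation: `save_npy_sketch` and the file names `features.npy` and `ids.npy` are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_npy_sketch(features, ids, output_dir):
    """Write features and ids as two .npy files; file names are assumptions."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    features_path = output_dir / "features.npy"
    ids_path = output_dir / "ids.npy"
    np.save(features_path, features)
    # A fixed-width string array loads back without pickling.
    np.save(ids_path, np.asarray(ids, dtype=str))
    return features_path, ids_path


# Round trip in a temporary directory: row i of the matrix belongs to ids[i].
with tempfile.TemporaryDirectory() as tmp:
    f_path, i_path = save_npy_sketch(np.zeros((2, 4)), ["mol_001", "mol_002"], tmp)
    loaded_features = np.load(f_path)
    loaded_ids = np.load(i_path).tolist()
```

Keeping the identifiers in the same order as the feature rows is what lets a row be traced back to its molecule after loading.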
- MolecularDiffusion.runmodes.data.featurization.Chem = None
- MolecularDiffusion.runmodes.data.featurization.logger
- MolecularDiffusion.runmodes.data.featurization.save_safetensors = None
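The `None` values shown for `Chem` and `save_safetensors` are consistent with a guarded optional import, where the name falls back to `None` when the dependency is missing. The sketch below shows that common pattern. Whether the module is written this way is an assumption, as is the choice of `rdkit.Chem` and `safetensors.numpy.save_file` as import targets; `require_rdkit` is a hypothetical helper.

```python
import logging

logger = logging.getLogger(__name__)

# Optional dependency: fall back to None when RDKit is not installed.
try:
    from rdkit import Chem
except ImportError:
    Chem = None

# Optional dependency: fall back to None when safetensors is not installed.
try:
    from safetensors.numpy import save_file as save_safetensors
except ImportError:
    save_safetensors = None


def require_rdkit():
    """Hypothetical guard: fail with a clear message when RDKit is missing."""
    if Chem is None:
        raise ImportError("RDKit is required for Morgan fingerprints")
    return Chem
```

With this pattern, code paths that do not need a given dependency keep working, and those that do can fail early with an explicit message.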