MolecularDiffusion.runmodes.analyze.featurize
Molecular featurization backends for 3D XYZ structures.
Backends
- soap: SOAP descriptor via dscribe. Runs on CPU; no external model needed.
- uma: UMA backbone embeddings. Requires vendored fairchem/src and a checkpoint.
Attributes
DEFAULT_SOAP_SPECIES
Classes
BaseFeaturizer | Helper class that provides a standard way to create an ABC using inheritance.
SOAPFeaturizer | Helper class that provides a standard way to create an ABC using inheritance.
SSL3DFeaturizer | Helper class that provides a standard way to create an ABC using inheritance.
UMAFeaturizer | Helper class that provides a standard way to create an ABC using inheritance.
Functions
run_featurize | Discover XYZ files, featurize, save outputs, return (N, D) array.
Module Contents
- class MolecularDiffusion.runmodes.analyze.featurize.BaseFeaturizer
Bases: abc.ABC
Helper class that provides a standard way to create an ABC using inheritance.
- abstractmethod featurize(atoms_list: list[ase.Atoms]) → numpy.ndarray
Return float32 array of shape (N, D).
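The contract above (an abstract `featurize` that maps N structures to N rows of D features) can be illustrated with a stdlib-only stand-in. This is a minimal sketch, not the package's code: `SketchBaseFeaturizer` and `AtomCountFeaturizer` are hypothetical names, structures are represented as plain lists of element symbols instead of `ase.Atoms`, and rows come back as nested lists rather than the float32 `numpy.ndarray` the documented method returns.

```python
from abc import ABC, abstractmethod


class SketchBaseFeaturizer(ABC):
    """Hypothetical stand-in mirroring the documented featurize() contract."""

    @abstractmethod
    def featurize(self, atoms_list):
        """Return one fixed-length feature row per input structure."""


class AtomCountFeaturizer(SketchBaseFeaturizer):
    """Toy backend that counts atoms of each species (illustration only)."""

    def __init__(self, species):
        self.species = list(species)

    def featurize(self, atoms_list):
        # N rows (one per structure), D columns (one per species).
        return [
            [float(symbols.count(s)) for s in self.species]
            for symbols in atoms_list
        ]


water = ["O", "H", "H"]
methane = ["C", "H", "H", "H", "H"]
features = AtomCountFeaturizer(["H", "C", "O"]).featurize([water, methane])
```

Because `featurize` is declared abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement it can be created.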
- class MolecularDiffusion.runmodes.analyze.featurize.SOAPFeaturizer(species: list[str], r_cut: float = 6.0, n_max: int = 8, l_max: int = 6, sigma: float = 0.1, pooling: str = 'mean', n_jobs: int = 1)
Bases: BaseFeaturizer
Helper class that provides a standard way to create an ABC using inheritance.
- featurize(atoms_list: list[ase.Atoms]) → numpy.ndarray
Return float32 array of shape (N, D).
- desc
- n_jobs = 1
- pooling = 'mean'
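The `pooling = 'mean'` default suggests that per-atom descriptor vectors are averaged into a single vector per structure, so molecules with different atom counts still map to rows of equal length D. The sketch below shows that idea in plain Python. It is an assumption about what the option means, not the package's implementation, and `mean_pool` is a hypothetical helper.

```python
def mean_pool(per_atom_vectors):
    """Average equal-length per-atom vectors into one structure vector."""
    if not per_atom_vectors:
        raise ValueError("need at least one atom vector")
    n_atoms = len(per_atom_vectors)
    dim = len(per_atom_vectors[0])
    # Average each descriptor component over all atoms.
    return [
        sum(vec[i] for vec in per_atom_vectors) / n_atoms
        for i in range(dim)
    ]


# Three atoms, each with a 2-dimensional descriptor (made-up numbers).
pooled = mean_pool([[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]])
```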
- class MolecularDiffusion.runmodes.analyze.featurize.SSL3DFeaturizer(checkpoint: str | pathlib.Path, device: str | None = None, batch_size: int = 16, pooling: str = 'mean', edge_radius: float = 5.0, atom_vocab: list[str] | None = None)
Bases: BaseFeaturizer
Helper class that provides a standard way to create an ABC using inheritance.
- featurize(atoms_list: list[ase.Atoms]) → numpy.ndarray
Return float32 array of shape (N, D).
- atom_vocab
- batch_size = 16
- device = 'cuda'
- edge_radius = 5.0
- pooling = 'mean'
- class MolecularDiffusion.runmodes.analyze.featurize.UMAFeaturizer(checkpoint: str | pathlib.Path, task_name: str = 'omol', device: str | None = None, batch_size: int = 8, pooling: str = 'mean', scalar_only: bool = True, charge: int = 0, spin: int = 1)
Bases: BaseFeaturizer
Helper class that provides a standard way to create an ABC using inheritance.
- featurize(atoms_list: list[ase.Atoms]) → numpy.ndarray
Return float32 array of shape (N, D).
- kwargs
- MolecularDiffusion.runmodes.analyze.featurize.run_featurize(input_dir: str | pathlib.Path, backend: str = 'soap', output_path: str | pathlib.Path | None = None, extensions: collections.abc.Sequence[str] = DEFAULT_STRUCTURE_EXTENSIONS, recursive: bool = False, species: list[str] | None = None, autodetect_species: bool = False, r_cut: float = 6.0, n_max: int = 8, l_max: int = 6, sigma: float = 0.1, pooling: str = 'mean', n_jobs: int = 1, checkpoint: str | pathlib.Path = 'training_outputs/uma-s-1p2.pt', task_name: str = 'omol', device: str | None = None, batch_size: int = 8, scalar_only: bool = True, charge: int = 0, spin: int = 1, ssl3d_checkpoint: str | pathlib.Path | None = None, ssl3d_edge_radius: float = 5.0, ssl3d_atom_vocab: list[str] | None = None) → numpy.ndarray
Discover XYZ files, featurize, save outputs, return (N, D) array.
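The discovery step driven by `input_dir`, `extensions`, and `recursive` might look roughly like the sketch below. It is an illustration under stated assumptions, not the function's actual code: `discover_structures` is a hypothetical helper, and `SKETCH_EXTENSIONS` is a placeholder because the value of `DEFAULT_STRUCTURE_EXTENSIONS` is not shown on this page.

```python
from pathlib import Path

# Assumed placeholder; the real DEFAULT_STRUCTURE_EXTENSIONS is not shown here.
SKETCH_EXTENSIONS = (".xyz",)


def discover_structures(input_dir, extensions=SKETCH_EXTENSIONS, recursive=False):
    """Return sorted file paths under input_dir whose suffix is in extensions."""
    root = Path(input_dir)
    # rglob descends into subdirectories; glob stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in wanted
    )
```

Sorting the paths keeps the row order of the resulting (N, D) array reproducible across runs, which matters when rows are later matched back to file names.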
- MolecularDiffusion.runmodes.analyze.featurize.DEFAULT_SOAP_SPECIES = ['H', 'B', 'C', 'N', 'O', 'F', 'Al', 'Si', 'P', 'S', 'Cl', 'As', 'Se', 'Br', 'I', 'Hg', 'Bi']
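When a dataset contains elements outside this default list, the species have to be supplied explicitly or detected from the inputs (the `autodetect_species` flag of `run_featurize` points at the latter). How the package detects them is not shown here; the following is only a sketch of one way to collect element symbols from XYZ-format text, with `species_from_xyz` as a hypothetical helper.

```python
def species_from_xyz(text):
    """Collect sorted unique element symbols from a single-frame XYZ string."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    # Line 2 is the comment line; the atom records follow it.
    atom_lines = lines[2:2 + n_atoms]
    return sorted({line.split()[0] for line in atom_lines})


WATER_XYZ = """3
water
O 0.000 0.000 0.117
H 0.000 0.757 -0.471
H 0.000 -0.757 -0.471
"""

species = species_from_xyz(WATER_XYZ)
```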