MolecularDiffusion.modules.models.tabasco.chem.utils

Functions

attempt_sanitize(mol)

Run Chem.SanitizeMol and return None on failure.

largest_component(→ List[rdkit.Chem.Mol])

Return the largest connected component for each molecule.

reorder_molecule_by_smiles(→ rdkit.Chem.Mol | None)

Renumber atoms to follow canonical SMILES indexing.

write_xyz_file(→ None)

Write an XYZ file.

Module Contents

MolecularDiffusion.modules.models.tabasco.chem.utils.attempt_sanitize(mol: rdkit.Chem.Mol)

Run Chem.SanitizeMol and return None on failure.

Parameters:

mol – RDKit Mol to sanitize.

Returns:

Sanitised molecule, or None if RDKit raises an exception during sanitisation.

Return type:

Chem.Mol | None

Credits: Charles Harris
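The behaviour described above can be sketched as follows. This is a minimal illustration and not the module's actual source: the name attempt_sanitize_sketch is hypothetical, and returning the same (in-place sanitised) object rather than a copy is an assumption.

```python
from rdkit import Chem


def attempt_sanitize_sketch(mol):
    """Hypothetical stand-in: return the sanitised molecule, or None on failure."""
    try:
        # Chem.SanitizeMol modifies the molecule in place and raises on problems
        # such as impossible valences or failed kekulisation.
        Chem.SanitizeMol(mol)
        return mol  # assumption: the same object is handed back
    except Exception:
        return None


# Ethanol sanitises cleanly; a carbon with five bonds does not.
ok = attempt_sanitize_sketch(Chem.MolFromSmiles("CCO", sanitize=False))
bad = attempt_sanitize_sketch(Chem.MolFromSmiles("C(C)(C)(C)(C)C", sanitize=False))
```

Returning None instead of raising lets a caller filter a batch of generated molecules with a simple "is not None" check.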

MolecularDiffusion.modules.models.tabasco.chem.utils.largest_component(molecules: List[rdkit.Chem.Mol]) → List[rdkit.Chem.Mol]

Return the largest connected component for each molecule.

Parameters:

molecules – Iterable of RDKit Mol objects with 3-D coordinates.

Returns:

Each entry is the fragment with the most atoms for the corresponding input molecule.

Return type:

list[Chem.Mol]

From: https://github.com/jostorge/diffusion-hopping/blob/main/diffusion_hopping/analysis/util.py
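One way to obtain the largest fragment per molecule is sketched below. It assumes RDKit's Chem.GetMolFrags with asMols=True is an acceptable way to split a molecule; the name largest_component_sketch is hypothetical, and the linked source may differ in detail.

```python
from rdkit import Chem


def largest_component_sketch(molecules):
    """Hypothetical stand-in: keep the fragment with the most atoms per molecule."""
    result = []
    for mol in molecules:
        # Split into connected components, each returned as its own Mol.
        fragments = Chem.GetMolFrags(mol, asMols=True)
        result.append(max(fragments, key=lambda frag: frag.GetNumAtoms()))
    return result


# "CCO.O" is ethanol plus a separate water molecule; benzene is already connected.
mols = [Chem.MolFromSmiles("CCO.O"), Chem.MolFromSmiles("c1ccccc1")]
sizes = [m.GetNumAtoms() for m in largest_component_sketch(mols)]
```

Note that this sketch would raise on a molecule with no atoms, because max receives an empty sequence.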

MolecularDiffusion.modules.models.tabasco.chem.utils.reorder_molecule_by_smiles(mol: rdkit.Chem.Mol) → rdkit.Chem.Mol | None

Renumber atoms to follow canonical SMILES indexing.

Parameters:

mol – RDKit Mol. If None, the function returns None.

Returns:

Renumbered copy of the molecule, or None if mol is None.

Return type:

Chem.Mol | None

Raises:

ValueError – If canonicalisation or substructure matching fails.
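The renumbering could plausibly be done as in the sketch below, which parses the canonical SMILES into a template, matches it against the input, and reorders the atoms accordingly. This is an assumption based on the documented errors, not the module's confirmed implementation; the name reorder_by_smiles_sketch is hypothetical, and the input is assumed to carry no explicit hydrogen atoms.

```python
from rdkit import Chem


def reorder_by_smiles_sketch(mol):
    """Hypothetical stand-in: renumber atoms to match canonical SMILES order."""
    if mol is None:
        return None
    template = Chem.MolFromSmiles(Chem.MolToSmiles(mol))
    if template is None:
        raise ValueError("canonicalisation failed")
    # match[i] is the index in `mol` of the atom matching template atom i.
    match = mol.GetSubstructMatch(template)
    if len(match) != mol.GetNumAtoms():
        raise ValueError("substructure matching failed")
    # RenumberAtoms expects newOrder[i] = old index of the atom that becomes atom i.
    return Chem.RenumberAtoms(mol, list(match))


original = Chem.MolFromSmiles("OCC")  # atom order: O, C, C
reordered = reorder_by_smiles_sketch(original)
symbols = [atom.GetSymbol() for atom in reordered.GetAtoms()]
```

Here the canonical SMILES of the input is CCO, so the oxygen moves from the first position to the last.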

MolecularDiffusion.modules.models.tabasco.chem.utils.write_xyz_file(coords: torch.Tensor, atom_types: List[str], filename: str) → None

Write an XYZ file.

Parameters:
  • coords – Array-like of shape (N, 3) with Cartesian coordinates.

  • atom_types – Sequence of length N with element symbols.

  • filename – Path to the output .xyz file.
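For orientation, the conventional XYZ layout is an atom count, a comment line, and then one "symbol x y z" line per atom. The sketch below writes that layout using only the standard library. The name write_xyz_sketch, the blank comment line, and the six-decimal formatting are assumptions; the module's function may format its output differently.

```python
import os
import tempfile


def write_xyz_sketch(coords, atom_types, filename):
    """Hypothetical stand-in: write N atoms in the conventional XYZ layout."""
    if len(coords) != len(atom_types):
        raise ValueError("coords and atom_types must have the same length")
    with open(filename, "w") as handle:
        handle.write(f"{len(atom_types)}\n")  # line 1: number of atoms
        handle.write("\n")                    # line 2: comment (left blank here)
        for symbol, (x, y, z) in zip(atom_types, coords):
            handle.write(f"{symbol} {float(x):.6f} {float(y):.6f} {float(z):.6f}\n")


# Nested lists are used here; a torch.Tensor of shape (N, 3) could be
# passed as coords.tolist().
water_coords = [[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]]
path = os.path.join(tempfile.mkdtemp(), "water.xyz")
write_xyz_sketch(water_coords, ["O", "H", "H"], path)

with open(path) as handle:
    xyz_lines = handle.read().splitlines()
```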