MolecularDiffusion.runmodes.analyze.xtb_optimization

Attributes

Functions

check_neutrality(→ bool)

Checks if a molecule described in an XYZ file is neutral using xTB.

check_xyz(→ tuple[bool, int, bool])

Performs a series of checks on an XYZ file to validate its molecular structure.

get_xtb_optimized_xyz(→ list[str])

Optimizes all XYZ files in a given input directory using xTB or OpenBabel and saves them to an output directory.

optimize_molecule(→ str | None)

Optimizes the geometry of a molecule from an XYZ file using xTB or OpenBabel.

Module Contents

MolecularDiffusion.runmodes.analyze.xtb_optimization.check_neutrality(filename: str, charge: int = -1, timeout: int = 180) bool

Checks if a molecule described in an XYZ file is neutral using xTB.

This function executes the xtb command with the --ptb (print properties) flag and parses its log output to detect whether xTB reports a mismatch between the number of electrons and the spin multiplicity, which indicates a non-neutral molecule. Temporary xTB output files are cleaned up afterwards.

Note: This function assumes xtb is installed and accessible in the system’s PATH. This functionality could potentially be integrated with other molecular property calculation modules if available.

Parameters:
  • filename (str) – The path to the XYZ file of the molecule to check.

  • charge (int) – The molecular charge to use for the xTB calculation. Defaults to -1.

  • timeout (int) – The maximum time in seconds to wait for the xTB process to complete. Defaults to 180.

Returns:

True if the molecule is inferred to be neutral based on xTB’s output, False otherwise.

Return type:

bool
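The run-then-parse pattern described for check_neutrality can be sketched as follows. This is a minimal illustration, not the module's implementation: the helper names run_and_capture and log_reports_mismatch are hypothetical, and the marker string passed to the second helper is a placeholder, since the exact message that xTB prints is not given in this reference.

```python
import subprocess


def run_and_capture(cmd, timeout):
    """Run an external command and return its combined stdout/stderr.

    Returns None if the process does not finish within `timeout` seconds
    (subprocess.run kills the child before raising TimeoutExpired).
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return proc.stdout + proc.stderr


def log_reports_mismatch(log_text, marker):
    """Case-insensitive search for a marker phrase in the captured log.

    `marker` is an assumption supplied by the caller; this sketch does not
    know the wording xTB uses for an electron/multiplicity mismatch.
    """
    return marker.lower() in log_text.lower()
```

A caller would pass the xtb command line as a list (the file path followed by the flags) and treat a None result as a timeout rather than as evidence about the molecule's charge.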

MolecularDiffusion.runmodes.analyze.xtb_optimization.check_xyz(filename: str, connector_dicts: dict = None, scale_factor: float = 1.3) tuple[bool, int, bool]

Performs a series of checks on an XYZ file to validate its molecular structure.

This includes checking for zero coordinates, graph connectivity, and optionally, the degree of specific ‘connector’ nodes.

Parameters:
  • filename (str) – The path to the XYZ file to be checked.

  • connector_dicts (dict, optional) – A dictionary where keys are node indices and values are lists of expected degrees for those nodes. Used to validate connectivity at specific points in the molecule. Defaults to None.

  • scale_factor (float, optional) – The scaling factor for covalent radii in edge correction. Defaults to 1.3.

Returns:

A tuple containing:
  • is_connected (bool): True if the molecule’s graph is fully connected.

  • num_components (int): The number of connected components in the graph.

  • match_n_degree (bool): True if all specified connector nodes have degrees matching their expected values in connector_dicts, False otherwise.

Return type:

tuple[bool, int, bool]
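To make the three return values concrete, here is a self-contained sketch of a connectivity check of this kind. It assumes that two atoms are bonded when their distance is at most scale_factor times the sum of their covalent radii; the actual edge-correction logic in check_xyz is not shown in this reference, and the names parse_xyz, check_structure, and COVALENT_RADII (with its small, approximate radius table) are illustrative only.

```python
import math

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def parse_xyz(text):
    """Parse XYZ-format text into a list of (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for line in lines[2:2 + n_atoms]:  # skip the count and comment lines
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def check_structure(atoms, connector_dicts=None, scale_factor=1.3):
    """Return (is_connected, num_components, match_n_degree)."""
    n = len(atoms)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            (sym_i, pos_i), (sym_j, pos_j) = atoms[i], atoms[j]
            cutoff = scale_factor * (COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j])
            if math.dist(pos_i, pos_j) <= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)

    # Count connected components with an iterative depth-first search.
    seen = set()
    num_components = 0
    for start in range(n):
        if start in seen:
            continue
        num_components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbors[node] - seen)

    # Each connector node must have a degree listed among its expected values.
    match_n_degree = all(
        len(neighbors[idx]) in degrees
        for idx, degrees in (connector_dicts or {}).items()
    )
    return num_components == 1, num_components, match_n_degree
```

With no connector_dicts, the degree check is vacuously True, so only the connectivity results carry information.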

MolecularDiffusion.runmodes.analyze.xtb_optimization.get_xtb_optimized_xyz(input_directory: str, output_directory: str = None, charge: int = -1, level: str = 'gfn1', timeout: int = 240, scale_factor: float = 1.3, optimize_all: bool = True, csv_path: str = None, filter_column: str = None) list[str]

Optimizes all XYZ files in a given input directory using xTB or OpenBabel and saves them to an output directory.

This function iterates through all .xyz files, performs initial structural checks (connectivity, zero coordinates, and optional degree checks), and then attempts to optimize valid structures using optimize_molecule. Unless optimize_all is True, it skips files that already have an optimized counterpart in the output directory.

Parameters:
  • input_directory (str) – The path to the directory containing the input XYZ files.

  • output_directory (str, optional) – The path to the directory where optimized XYZ files will be saved. If None, optimized files are saved in the input_directory. Defaults to None.

  • charge (int, optional) – The molecular charge to use for xTB optimizations. Defaults to -1.

  • level (str, optional) – The calculation level (e.g., “gfn1”, “gfn2”, “gfn-ff”, “mmff94”). Defaults to “gfn1”.

  • timeout (int, optional) – The maximum time in seconds to wait for each xTB process. Defaults to 240.

  • scale_factor (float, optional) – The scaling factor for covalent radii in edge correction. Defaults to 1.3.

  • optimize_all (bool, optional) – If True, optimizes all files regardless of existing optimized versions. Defaults to True.

  • csv_path (str, optional) – Path to a CSV file to filter which XYZ files to optimize. Defaults to None.

  • filter_column (str, optional) – The column name in the CSV to filter by; only rows whose value in this column is 1 are selected. Defaults to None.

Returns:

A list of paths to the successfully optimized XYZ files.

Return type:

list[str]
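The file-selection behaviour implied by optimize_all, csv_path, and filter_column can be sketched as below. Several details are assumptions made for illustration and are not documented here: the CSV column that holds file names (name_column, defaulting to "filename"), the "_opt" suffix used for optimized outputs, and the helper name select_xyz_files.

```python
import csv
from pathlib import Path


def _is_one(value):
    """True if a CSV cell parses to the number 1."""
    try:
        return float(value) == 1.0
    except (TypeError, ValueError):
        return False


def select_xyz_files(input_directory, output_directory=None, optimize_all=True,
                     csv_path=None, filter_column=None,
                     name_column="filename", suffix="_opt"):
    """Return the sorted list of .xyz paths that would be optimized.

    `name_column` and `suffix` are hypothetical conventions for this sketch.
    """
    in_dir = Path(input_directory)
    out_dir = Path(output_directory) if output_directory else in_dir

    # Start from every .xyz file that is not itself an optimized output.
    candidates = sorted(
        p for p in in_dir.glob("*.xyz") if not p.stem.endswith(suffix)
    )

    # Optional CSV filter: keep only files whose row has filter_column == 1.
    if csv_path and filter_column:
        with open(csv_path, newline="") as fh:
            keep = {
                Path(row[name_column]).stem
                for row in csv.DictReader(fh)
                if _is_one(row[filter_column])
            }
        candidates = [p for p in candidates if p.stem in keep]

    # Unless optimize_all is set, skip files with an existing optimized copy.
    if not optimize_all:
        candidates = [
            p for p in candidates
            if not (out_dir / f"{p.stem}{suffix}.xyz").exists()
        ]
    return candidates
```

Separating selection from optimization in this way makes it possible to preview which files a run would touch before starting any xTB process.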

MolecularDiffusion.runmodes.analyze.xtb_optimization.optimize_molecule(filename: str, charge: int, level: str, timeout: int) str | None

Optimizes the geometry of a molecule from an XYZ file using xTB or OpenBabel.

This function attempts to optimize the molecule with a specified charge and calculation level. If the optimization is successful, it moves the output file to a new name based on the input filename.

Parameters:
  • filename (str) – The path to the input XYZ file.

  • charge (int) – The molecular charge to use for the calculation (xTB only).

  • level (str) – The calculation level (e.g., “gfn1”, “gfn2”, “gfn-ff”, “mmff94”).

  • timeout (int) – The maximum time in seconds to wait for the process to complete.

Returns:

The path to the optimized XYZ file if successful, otherwise None if it times out or fails to produce output.

Return type:

str | None
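The dispatch between xTB and OpenBabel by level can be sketched as a command builder. This is an assumption-laden illustration: the reference does not show which flags optimize_molecule actually passes, the "_opt" output name is hypothetical, and build_command and XTB_LEVEL_FLAGS are made-up names. The flags used are the standard xtb options --opt, --chrg, --gfn, and --gfnff, and the obabel options -O, --minimize, and --ff.

```python
from pathlib import Path

# Mapping from level names (as listed above) to xtb method flags.
XTB_LEVEL_FLAGS = {
    "gfn1": ["--gfn", "1"],
    "gfn2": ["--gfn", "2"],
    "gfn-ff": ["--gfnff"],
}


def build_command(filename, charge, level):
    """Return (command_list, target_path) for one optimization.

    The target name `<stem>_opt.xyz` is a hypothetical convention.
    """
    src = Path(filename)
    target = src.with_name(f"{src.stem}_opt.xyz")
    if level in XTB_LEVEL_FLAGS:
        # xtb writes its optimized geometry to xtbopt.xyz in the working
        # directory; the caller would move that file to `target` afterwards.
        cmd = ["xtb", str(src), "--opt", "--chrg", str(charge),
               *XTB_LEVEL_FLAGS[level]]
        return cmd, target
    if level == "mmff94":
        # The charge is not passed here, matching the "xTB only" note above.
        cmd = ["obabel", str(src), "-O", str(target),
               "--minimize", "--ff", "MMFF94"]
        return cmd, target
    raise ValueError(f"unsupported level: {level!r}")
```

The returned list can be handed to subprocess.run with a timeout, returning None to the caller when the process times out or the expected output file is missing.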

MolecularDiffusion.runmodes.analyze.xtb_optimization.parser