Installation
Prerequisites
Python 3.11
A CUDA-capable GPU is recommended for training
Step-by-step
# 1. Create and activate a new environment
conda create -n molcraft python=3.11 -y
conda activate molcraft
# 2. Install MolCraftDiffusion with a compute backend
# GPU/CUDA:
pip install 'molcraftdiffusion[gpu]' \
--find-links https://data.pyg.org/whl/torch-2.6.0+cu124.html
# CPU-only:
pip install 'molcraftdiffusion[cpu]' \
--extra-index-url https://download.pytorch.org/whl/cpu \
--find-links https://data.pyg.org/whl/torch-2.6.0+cpu.html
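The choice between the GPU and CPU commands above can be scripted. The following is a minimal sketch, not part of MolCraftDiffusion: detect_backend and pyg_wheel_url are hypothetical helper names, and treating a working nvidia-smi as evidence of a usable CUDA GPU is an assumption.

```shell
# Hypothetical helpers (not shipped with MolCraftDiffusion).

# Print "gpu" if nvidia-smi is on PATH and runs successfully, else "cpu".
# Assumption: a working nvidia-smi implies a usable CUDA GPU.
detect_backend() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo gpu
  else
    echo cpu
  fi
}

# Build the PyG wheel index URL in the form used by the install commands.
# $1 = torch version (e.g. 2.6.0), $2 = compute tag (cu124 or cpu)
pyg_wheel_url() {
  printf 'https://data.pyg.org/whl/torch-%s+%s.html\n' "$1" "$2"
}

backend=$(detect_backend)
if [ "$backend" = gpu ]; then tag=cu124; else tag=cpu; fi
echo "backend:     $backend"
echo "wheel index: $(pyg_wheel_url 2.6.0 "$tag")"
```

The printed URL is the value to pass to --find-links. For the CPU backend, the --extra-index-url option shown above is still needed.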
The base package does not install every data-processing or analysis dependency. Add the feature groups you need:
# Data preparation, augmentation, and featurization commands
pip install 'molcraftdiffusion[data]'
# Analysis and post-processing commands (metrics, compare, xyz2mol, xtb-electronic, featurize SOAP)
pip install 'molcraftdiffusion[analyze]'
# xTB is used by optimize, compare, and xtb-electronic; it is best installed from conda-forge:
conda install -c conda-forge xtb==6.7.1 -y
If an optional command is called without its dependencies, MolCraftDiffusion exits with a warning and an install hint such as pip install 'molcraftdiffusion[analyze]'.
Development / editable install
git clone https://github.com/pregHosh/MolCraftDiffusion
cd MolCraftDiffusion
pip install -e '.[gpu]' \
--find-links https://data.pyg.org/whl/torch-2.6.0+cu124.html
# Add optional groups for editable development when needed:
pip install -e '.[data]'
pip install -e '.[analyze]'
Optional dependencies
# Data utilities (includes dscribe for SOAP featurization)
pip install 'molcraftdiffusion[data]'
# Analyze utilities (PoseBusters/RDKit/OpenBabel Python bindings)
pip install 'molcraftdiffusion[analyze]'
# Optional: needed for geometric-shape metrics in
# `MolCraftDiff analyze metrics --metrics {core,geom_revised,all}`
pip install cosymlib
# xTB executable for xTB-backed analysis
conda install -c conda-forge xtb==6.7.1 -y
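To see which optional pieces are already present in the active environment, a check along these lines may help. It is a sketch only: have_module is a hypothetical helper, and the import names (dscribe, posebusters, rdkit, openbabel, cosymlib) are assumed from the packages named above rather than taken from MolCraftDiffusion's own dependency list.

```shell
# Hypothetical helper: succeed if a top-level Python module can be located
# in the current environment, without actually importing it.
have_module() {
  python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# Import names are assumptions based on the packages mentioned above.
for mod in dscribe posebusters rdkit openbabel cosymlib; do
  if have_module "$mod"; then
    echo "found:   $mod"
  else
    echo "missing: $mod"
  fi
done

# xTB is an executable rather than a Python module, so look for it on PATH.
if command -v xtb >/dev/null 2>&1; then
  echo "found:   xtb"
else
  echo "missing: xtb"
fi
```

Anything reported as missing can be installed with the matching command above.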
UMA featurization backend
The featurize --backend uma command uses a pretrained UMA model from fairchem.
fairchem is not installed as a pip package; its source tree must be present in
the repository root, where it is loaded at runtime.
Clone it into the repo root before using the UMA backend:
# from the MolCraftDiffusion repo root
git clone https://github.com/pregHosh/fairchem fairchem
A pretrained UMA checkpoint is also required. Download uma-s-1p2.pt from
Hugging Face and place it at:
training_outputs/uma-s-1p2.pt
or pass a custom path with --checkpoint /path/to/checkpoint.pt.
If the fairchem source tree is not found at runtime, MolCraftDiffusion will print an explicit error with the clone instruction above. You can also set:
export MOLCRAFT_REPO_ROOT=/path/to/MolCraftDiffusion
to point to the repo root when running from a different working directory.
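Both UMA requirements can be checked before running the featurize command. This is a hedged sketch: check_uma_prereqs is a hypothetical helper that mirrors the layout described above, not MolCraftDiffusion's actual lookup code, and it assumes the default checkpoint path is relative to the repo root.

```shell
# Hypothetical pre-flight check for the UMA backend.
# $1 = repo root (defaults to $MOLCRAFT_REPO_ROOT, then the current directory)
check_uma_prereqs() {
  root=${1:-${MOLCRAFT_REPO_ROOT:-$PWD}}
  rc=0
  if [ ! -d "$root/fairchem" ]; then
    echo "missing: $root/fairchem (clone the fairchem source tree here)"
    rc=1
  fi
  # Assumption: the default checkpoint location is relative to the repo root.
  if [ ! -f "$root/training_outputs/uma-s-1p2.pt" ]; then
    echo "missing: $root/training_outputs/uma-s-1p2.pt (or pass --checkpoint)"
    rc=1
  fi
  return "$rc"
}

if check_uma_prereqs; then
  echo "UMA prerequisites look complete"
fi
```

If a custom checkpoint is passed with --checkpoint, the second message can be ignored.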
Verifying the installation
MolCraftDiff --help
You should see a list of all available commands: train, generate, predict, eval-predict, analyze, data.
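The same check can be automated. The sketch below is a heuristic, not an official test: missing_commands is a hypothetical helper that only looks for each command name as a whole word in the help text, and the exact layout of the --help output is not assumed.

```shell
# Hypothetical helper: print each expected top-level command that does not
# appear as a whole word in the given help text.
# Caveat: "predict" also matches inside "eval-predict", so this is a rough check.
# $1 = help text
missing_commands() {
  for cmd in train generate predict eval-predict analyze data; do
    printf '%s\n' "$1" | grep -qw -- "$cmd" || echo "$cmd"
  done
}

if command -v MolCraftDiff >/dev/null 2>&1; then
  missing=$(missing_commands "$(MolCraftDiff --help 2>&1)")
  if [ -n "$missing" ]; then
    echo "not listed in --help:" $missing
  else
    echo "all expected commands are listed"
  fi
else
  echo "MolCraftDiff is not on PATH; is the molcraft environment active?"
fi
```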
Pre-trained models
Pre-trained checkpoints are available on Hugging Face. We recommend starting from these for any downstream application.