MolCraftDiffusion

A unified generative-AI framework for 3D molecular design.

MolCraftDiffusion is an open-source framework for 3D molecular generation using diffusion models, designed for data-driven molecular design and computational chemistry. The framework enables researchers to train generative models that produce chemically meaningful 3D molecular structures while supporting property optimization, scaffold modification, and exploration of chemical space.

By combining modular training pipelines, flexible guidance strategies, and integrated analysis tools, MolCraftDiffusion provides a complete workflow for developing and deploying molecular diffusion models in research applications such as drug discovery, catalyst discovery, materials design, and molecular property optimization.

Workflow overview


Key Features

MolCraftDiffusion is modular by design: every stage of the workflow, from data preparation to analysis, is driven by a single CLI and YAML configuration files.

  • Data Module — Preprocess raw .xyz files, compile them into a unified .db (ASE Database) file, and annotate the structures with properties.

  • Training & Fine-Tuning Module — Flexibly train (or fine-tune) diffusion models, property regressors, and time-aware guidance models.

  • Generation & Guidance Module — Generate 3D molecules using a variety of guidance mechanisms:

    • Unconditional Generation: Generate 3D molecules without any specific constraints or guidance.

    • Property-Targeted Guidance: Steer generation toward desired properties using Classifier-Free Guidance (CFG), Gradient Guidance (GG), or a hybrid approach.

    • Structure-Guided Generation: Perform inpainting (scaffold decoration) and outpainting (fragment extension) with precise 3D geometric constraints.

  • Analysis & Evaluation Module — Assess the quality of generated 3D molecules. Includes tools for structural validity metrics, xTB geometry optimization, RMSD comparisons, and quantum-chemical property calculation/prediction.
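
To give a feel for what Property-Targeted Guidance with CFG does at each denoising step, the sketch below combines an unconditional and a conditional noise prediction using one common classifier-free guidance formulation. It is illustrative only: the function name cfg_combine, the scale parameter, and the toy vectors are assumptions made for this example and are not part of the MolCraftDiffusion API, which may implement guidance differently.

```python
from typing import List, Sequence


def cfg_combine(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    scale: float,
) -> List[float]:
    """Blend two noise predictions with classifier-free guidance.

    Hypothetical helper, not a MolCraftDiffusion function.
    Uses the common form: eps = eps_uncond + scale * (eps_cond - eps_uncond).
    scale = 0 gives the unconditional prediction, scale = 1 the conditional
    one, and scale > 1 extrapolates further toward the condition.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("both predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions for three coordinates (made-up numbers).
eps_uncond = [0.0, 1.0, -2.0]
eps_cond = [1.0, 1.0, 0.0]

guided = cfg_combine(eps_uncond, eps_cond, scale=2.0)
print(guided)
```

In a real sampler the two predictions would come from the same network evaluated with and without the property condition, and the blended result would replace the plain prediction in the reverse-diffusion update.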


Quick Start

# Train a diffusion model
MolCraftDiff train my_config

# Generate molecules
MolCraftDiff generate my_gen_config

# Analyze outputs
MolCraftDiff analyze metrics -i generated_molecules/

Ready-to-use template configuration files for every workflow are provided in docs/cfg_examples/. Copy the relevant file, fill in your paths, and run:

# Example: unconditional generation with the template
cp docs/cfg_examples/gen_unconditional.yaml my_gen.yaml
# edit my_gen.yaml → set chkpt_directory
MolCraftDiff generate my_gen

Contents

Getting Started

API Reference