Tutorial 7: Property-Directed Generation
This tutorial covers advanced generation techniques that steer the diffusion sampling process towards desired chemical properties.
Contents
Introduction: The concept of directing generation with external models or guidance schemes.
Classifier-Free Guidance (CFG): How to use CFG to amplify the effect of training conditions.
Gradient Guidance (GG): How to use a trained regressor model (from Tutorial 2) to guide generation towards a specific property value.
Hybrid CFG/GG Guidance: How to combine both CFG and GG for multi-objective guidance.
1. Introduction
Property-directed generation guides the diffusion model towards molecules with specific desired properties by supplying an additional signal during sampling. This tutorial covers three techniques: Classifier-Free Guidance (CFG), Gradient Guidance (GG), and a hybrid of the two.
You can create your experiment configuration files in any directory, as the base templates are bundled with the package.
2. Classifier-Free Guidance (CFG)
Classifier-Free Guidance is a technique that amplifies the learned conditional distribution of the diffusion model. It uses two forward passes of the model: one with the condition and one without. The difference between the two outputs is then used to guide the generation process.
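The two-pass idea can be sketched in a few lines of plain Python. This is only an illustration, not the package's implementation: the function name cfg_combine is hypothetical, and the weighting convention (reference prediction plus a scaled difference) is an assumption, since other implementations weight the two passes differently. Treating the "negative prompt" prediction as the reference in place of the unconditional one is likewise an assumption.

```python
def cfg_combine(eps_cond, eps_ref, cfg_scale):
    """Blend two noise predictions element-wise (hypothetical helper).

    eps_cond: prediction from the forward pass with the condition.
    eps_ref:  prediction from the forward pass without the condition
              (assumption: a negative-target prediction could be used here).
    cfg_scale: strength of the push along (eps_cond - eps_ref).
    """
    return [r + cfg_scale * (c - r) for c, r in zip(eps_cond, eps_ref)]


# Toy predictions for a three-dimensional sample.
eps_cond = [0.5, -0.25, 1.0]
eps_uncond = [0.25, 0.0, 1.0]

# Under this convention a scale of 1 reproduces the conditional prediction.
same = cfg_combine(eps_cond, eps_uncond, 1.0)

# A larger scale extrapolates past it, amplifying the conditional signal.
stronger = cfg_combine(eps_cond, eps_uncond, 3.0)
```

Components where the two passes agree (the last entry here) are left untouched at any scale, so only the part of the prediction that depends on the condition is amplified.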
Configuration
The configuration for CFG typically inherits from the interference: gen_cfg template.
| Parameter | Description |
|---|---|
| | Must be set to |
| target_values | A list of positive target values for the properties specified in property_names. |
| negative_target_value | (Optional) A list of values to use as a “negative prompt”. The model is guided away from these property values. |
| property_names | A list of property names that the model was trained on. |
| cfg_scale | A scaling factor that controls the strength of the guidance. A higher value will result in a stronger push towards the target properties. |
Example my_cfg.yaml
defaults:
  - tasks: diffusion
  - interference: gen_cfg # Base template bundled with package
  - _self_
name: "akatsuki"
chkpt_directory: "models/edm_formed_s1t1/"
atom_vocab: [H,B,C,N,O,F,Al,Si,P,S,Cl,As,Se,Br,I,Hg,Bi]
diffusion_steps: 600
seed: 9
interference:
  num_generate: 100
  target_values: [3,1.5]
  property_names: ["S1_exc", "T1_exc"]
  output_path: generated_mol
  condition_configs:
    cfg_scale: 1
    negative_target_value: [1.0, 3.0] # Push away from S1=1.0, T1=3.0
Running CFG Generation
MolCraftDiff generate my_cfg
3. Gradient Guidance (GG)
Gradient Guidance uses a separate, pre-trained regressor or guidance model (like the one from Tutorial 2 or 3) to estimate the gradient of a desired property with respect to the molecule’s latent representation. This gradient is then used to guide the diffusion process towards molecules with the desired property value.
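A toy version of this update is sketched below. It assumes a linear stand-in for the regressor so that the gradient can be written by hand; the real guidance model is a neural network whose gradient would come from automatic differentiation. All function names are hypothetical, and the squared-error objective is an assumption about how "towards a target value" is scored. The gg_scale and max_norm arguments mirror the configuration keys of the same name, but the values used here are chosen for the toy problem only.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale grad so that its Euclidean norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    return [g * max_norm / norm for g in grad]


def toy_property(x, weights):
    """Hypothetical linear regressor standing in for the guidance model."""
    return sum(w * xi for w, xi in zip(weights, x))


def guidance_step(x, weights, target, gg_scale, max_norm):
    """One guided update: descend the squared error to the target value."""
    err = toy_property(x, weights) - target
    # Analytic gradient of (f(x) - target)^2 for the linear toy regressor.
    grad = [2.0 * err * w for w in weights]
    grad = clip_by_norm(grad, max_norm)
    return [xi - gg_scale * g for xi, g in zip(x, grad)]


weights = [1.0, -0.5]
x = [0.0, 0.0]
target = 2.0
for _ in range(200):
    x = guidance_step(x, weights, target, gg_scale=0.1, max_norm=1.0)

final_error = abs(toy_property(x, weights) - target)
```

Early steps are limited by the norm clip, which keeps a poorly calibrated regressor from throwing the sample far off course; once the error is small the clip no longer applies and the update shrinks with the error.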
Configuration
The configuration for GG typically inherits from the interference: gen_gg template.
| Parameter | Description |
|---|---|
| | Must be set to |
| target_function | Specifies the guidance model to use. This is configured using Hydra’s instantiation syntax. |
| gg_scale | A scaling factor for the gradient. |
| max_norm | The maximum norm of the gradient to prevent exploding gradients. |
| scheduler | A learning rate scheduler for the guidance. |
| guidance_at | The timestep at which to start applying the guidance. |
| guidance_stop | The timestep at which to stop applying the guidance. |
| n_backwards | The number of backward steps to take for the guidance. |
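For intuition about the scheduler entry, the standard cosine annealing formula is sketched below. Whether the package's CosineAnnealing class follows exactly this closed form, and whether it is applied to gg_scale, are assumptions; the function name and the base_scale argument are hypothetical. T_max and eta_min mirror the configuration keys of the same name.

```python
import math


def cosine_annealing(step, base_scale, T_max, eta_min=0.0):
    """Standard cosine annealing: base_scale at step 0, eta_min at step T_max."""
    cos_term = 1.0 + math.cos(math.pi * step / T_max)
    return eta_min + 0.5 * (base_scale - eta_min) * cos_term


base_scale = 1e-3  # plays the role of gg_scale in this sketch
T_max = 1000

start = cosine_annealing(0, base_scale, T_max)
middle = cosine_annealing(T_max // 2, base_scale, T_max)
end = cosine_annealing(T_max, base_scale, T_max)
```

The scale falls slowly at first, fastest around the midpoint, and flattens out again near eta_min, so guidance fades smoothly instead of being switched off abruptly.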
Example my_gg.yaml
defaults:
  - tasks: diffusion
  - interference: gen_gg # Base template bundled with package
  - _self_
name: "akatsuki"
chkpt_directory: "models/edm_formed_s1t1/"
atom_vocab: [H,B,C,N,O,F,Al,Si,P,S,Cl,As,Se,Br,I,Hg,Bi]
diffusion_steps: 600
seed: 9
interference:
  num_generate: 100
  output_path: generated_mol
  condition_configs:
    cfg_scale: 0
    target_function:
      _target_: scripts.gradient_guidance.sf_energy_score.SFEnergyScore
      _partial_: true
      chkpt_directory: trained_models/egcl_guidance_s1t1.ckpt
    gg_scale: 1e-3
    max_norm: 1e-3
    scheduler:
      _target_: scripts.gradient_guidance.scheduler.CosineAnnealing
      _partial_: true
      T_max: 1000
      eta_min: 0
    guidance_ver: 2
    guidance_at: 1
    guidance_stop: 0
    n_backwards: 0
Running GG Generation
MolCraftDiff generate my_gg
4. Hybrid CFG/GG Guidance
It is also possible to combine CFG and GG to guide the generation with both the internal conditional model and an external guidance model.
Configuration
The configuration for hybrid CFG/GG typically inherits from the interference: gen_cfggg template. It combines the parameters from both CFG and GG.
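One way to picture the combination is a single guided noise estimate that first blends the conditional and unconditional predictions and then adds a scaled property gradient. The sketch below is an assumption about how the two signals could be composed, not a description of the package's internals: the function name hybrid_noise is hypothetical, and both the CFG weighting and the sign of the gradient term follow one common convention (adding the gradient of a loss to the predicted noise moves the sample in the direction that lowers the loss).

```python
def hybrid_noise(eps_cond, eps_uncond, loss_grad, cfg_scale, gg_scale):
    """Compose a CFG-blended noise estimate with an external gradient term."""
    # Internal signal: push along the conditional-minus-unconditional difference.
    blended = [u + cfg_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]
    # External signal: add the (already clipped) gradient from the guidance model.
    return [b + gg_scale * g for b, g in zip(blended, loss_grad)]


eps_cond = [0.5, -0.25]
eps_uncond = [0.25, 0.0]
loss_grad = [2.0, -4.0]  # toy gradient from a hypothetical guidance model

both = hybrid_noise(eps_cond, eps_uncond, loss_grad, cfg_scale=1.0, gg_scale=0.25)
cfg_only = hybrid_noise(eps_cond, eps_uncond, loss_grad, cfg_scale=1.0, gg_scale=0.0)
```

Setting gg_scale to zero leaves only the CFG term, which makes it easy to compare a hybrid run against a CFG-only baseline with otherwise identical settings.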
Example my_cfggg.yaml
defaults:
  - tasks: diffusion
  - interference: gen_cfggg # Base template bundled with package
  - _self_
name: "akatsuki"
chkpt_directory: "models/edm_formed_s1t1/"
atom_vocab: [H,B,C,N,O,F,Al,Si,P,S,Cl,As,Se,Br,I,Hg,Bi]
diffusion_steps: 600
seed: 9
interference:
  num_generate: 100
  target_values: [3,1.5]
  property_names: ["S1_exc", "T1_exc"]
  output_path: generated_mol
  condition_configs:
    cfg_scale: 1
    target_function:
      _target_: scripts.gradient_guidance.sf_energy_score.SFEnergyScore
      _partial_: true
      chkpt_directory: trained_models/egcl_guidance_s1t1.ckpt
    gg_scale: 1e-3
    max_norm: 1e-3
    scheduler:
      _target_: scripts.gradient_guidance.scheduler.CosineAnnealing
      _partial_: true
      T_max: 1000
      eta_min: 0
    guidance_ver: 2
    guidance_at: 1
    guidance_stop: 0
    n_backwards: 3
Running Hybrid CFG/GG Generation
MolCraftDiff generate my_cfggg