Tutorial: Transfer Learning¶

Transfer Learning (TL) retrains a prior on a focused SMILES dataset, producing an agent biased toward that chemical series. The output model can be used directly for sampling or as the starting point for a reinforcement learning run.

When to Use TL¶

You have a set of known active molecules for a target and want the generator to produce similar structures.
You want to reduce the RL exploration burden by starting from a focused agent rather than a broad prior.
You don’t have a well-defined scoring function for RL but want to bias generation toward a specific chemical series.

Input Data¶

TL reads SMILES from a plain text file, one per line. The first column is used; a second column (e.g. compound name or ID) is ignored.

Split your data into training and validation sets before running — REINVENT4 does not do this automatically. A typical split is 80/20.

Key Parameters¶

Parameter	Description
`input_model_file`	Prior or checkpoint to start from
`smiles_file`	Training SMILES (first column used)
`validation_smiles_file`	Validation SMILES for monitoring overfitting
`output_model_file`	Path for the resulting agent (`.model`)
`num_epochs`	Number of training epochs; start with 50–100 and adjust based on validation loss
`save_every_n_epochs`	Frequency of intermediate checkpoints; useful for resuming or selecting the best epoch
`batch_size`	Number of SMILES per gradient step; smaller batches give noisier but faster updates
`tb_logdir`	Directory for TensorBoard logs (training and validation loss curves)

Configuration¶

Reinvent¶

run_type = "transfer_learning"
device = "cuda:0"          # or "cpu"
tb_logdir = "tb_TL"        # TensorBoard log directory

[parameters]
input_model_file = "priors/reinvent.prior"
smiles_file = "doc/data/tl_reinvent.smi"
validation_smiles_file = "doc/data/tl_reinvent_val.smi"
output_model_file = "tl_agent.model"

num_epochs = 50
save_every_n_epochs = 10   # write a checkpoint every N epochs
batch_size = 50

Mol2Mol¶

Mol2Mol is a conditional generator: at inference time it takes an input molecule and generates similar ones. TL therefore trains on (source, target) pairs rather than individual SMILES. The pairs block controls how these pairs are constructed from your dataset using Tanimoto similarity.

run_type = "transfer_learning"
device = "cuda:0"
tb_logdir = "tb_TL"

[parameters]
input_model_file = "priors/mol2mol_medium_similarity.prior"
smiles_file = "doc/data/tl_reinvent.smi"
validation_smiles_file = "doc/data/tl_reinvent_val.smi"
output_model_file = "tl_mol2mol.model"

num_epochs = 50
save_every_n_epochs = 10
batch_size = 50

pairs.type = "tanimoto"
pairs.lower_threshold = 0.7   # only pair molecules with Tanimoto >= 0.7
pairs.upper_threshold = 1.0   # set < 1.0 to exclude identical molecules from pairing
pairs.min_cardinality = 1     # discard source molecules that have fewer than N valid targets
pairs.max_cardinality = 199   # discard source molecules that have more than N valid targets (prevents very promiscuous sources from dominating training)

Note: This tutorial covers Reinvent and Mol2Mol only. LibInvent and LinkInvent use a two-column SMILES format and TL only affects the learned part (R-groups or linker) rather than the constrained scaffold — making it of limited practical value in most cases.

Output¶

File	Description
`output_model_file`	Final trained agent — use as `model_file` in sampling or as `agent_file` in RL
Checkpoints (`*.chkpt`)	Intermediate snapshots saved every `save_every_n_epochs`
TensorBoard logs	Loss curves for training and validation (written to `tb_logdir`)

What to Check¶

Training and validation loss converge: a widening gap means overfitting — stop earlier or use a checkpoint with lower validation loss.
Valid SMILES rate stays high: sample from the trained agent; a drop below ~90% means the model drifted too far from the prior.
Sampled structures resemble training set: if outputs are too dissimilar or degenerate, reduce num_epochs.