MolecularDiffusion.modules.tasks.metrics

Functions

QED(pred)

Quantitative estimation of drug-likeness.

SA(pred)

Synthetic accessibility score.

accuracy(pred, target)

Classification accuracy.

area_under_prc(pred, target)

Area under precision-recall curve (PRC).

area_under_roc(pred, target)

Area under receiver operating characteristic curve (ROC).

chemical_validity(pred)

Chemical validity of molecules.

f1_max(pred, target)

F1 score with the optimal threshold.

logP(pred)

Logarithm of partition coefficient between octanol and water for a compound.

matthews_corrcoef(pred, target)

Matthews correlation coefficient between prediction and target.

no_rdkit_log()

Context manager to suppress all RDKit logging.

pearsonr(pred, target)

Pearson correlation between prediction and target.

penalized_logP(pred)

Logarithm of partition coefficient, penalized by cycle length and synthetic accessibility.

r2(pred, target)

\(R^2\) regression score.

spearmanr(pred, target)

Spearman correlation between prediction and target.

variadic_accuracy(input, target, size)

Classification accuracy for categories with variadic sizes.

variadic_area_under_prc(pred, target, size)

Area under precision-recall curve (PRC) for sets with variadic sizes.

variadic_area_under_roc(pred, target, size)

Area under receiver operating characteristic curve (ROC) for sets with variadic sizes.

variadic_top_precision(pred, target, size, k)

Top-k precision for sets with variadic sizes.

Module Contents

MolecularDiffusion.modules.tasks.metrics.QED(pred)

Quantitative estimation of drug-likeness.

Parameters:

pred (PackedMolecule) – molecules to evaluate

MolecularDiffusion.modules.tasks.metrics.SA(pred)

Synthetic accessibility score.

Parameters:

pred (PackedMolecule) – molecules to evaluate

MolecularDiffusion.modules.tasks.metrics.accuracy(pred, target)

Classification accuracy.

Suppose there are \(N\) samples and \(C\) categories.

Parameters:
  • pred (Tensor) – prediction of shape \((N, C)\)

  • target (Tensor) – target of shape \((N,)\)
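As a rough illustration of the shapes above, the following plain-Python sketch computes accuracy by taking the arg-max over the \(C\) scores of each row and comparing it with the target index. The name `accuracy_sketch` is hypothetical and the code uses nested lists instead of tensors; it is not the library's implementation.

```python
def accuracy_sketch(pred, target):
    """pred: N rows of C scores; target: N class indices (hypothetical helper)."""
    correct = 0
    for scores, label in zip(pred, target):
        # Arg-max over the C category scores of this row.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return correct / len(target)


pred = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
target = [1, 0, 1, 1]
print(accuracy_sketch(pred, target))  # 3 of 4 rows match -> 0.75
```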

MolecularDiffusion.modules.tasks.metrics.area_under_prc(pred, target)

Area under precision-recall curve (PRC).

Parameters:
  • pred (Tensor) – predictions of shape \((n,)\)

  • target (Tensor) – binary targets of shape \((n,)\)
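One common way to estimate the area under the precision-recall curve is average precision: sort by descending score and average the precision measured at each positive. The sketch below assumes that estimator; `auprc_sketch` is a hypothetical name, and the library may interpolate the curve differently.

```python
def auprc_sketch(pred, target):
    """Average precision over n scored items with binary targets (hypothetical helper)."""
    order = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if target[i] == 1:
            hits += 1
            # Precision among the top `rank` items, taken at each positive.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


print(auprc_sketch([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```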

MolecularDiffusion.modules.tasks.metrics.area_under_roc(pred, target)

Area under receiver operating characteristic curve (ROC).

Parameters:
  • pred (Tensor) – predictions of shape \((n,)\)

  • target (Tensor) – binary targets of shape \((n,)\)
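AUROC equals the probability that a randomly chosen positive is scored above a randomly chosen negative. The quadratic-time sketch below counts such pairs directly. `auroc_sketch` is a hypothetical name, and counting ties as one half is an assumption of this sketch rather than something the documentation states.

```python
def auroc_sketch(pred, target):
    """Pairwise AUROC for n scores with binary targets (hypothetical helper)."""
    pos = [p for p, t in zip(pred, target) if t == 1]
    neg = [p for p, t in zip(pred, target) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: ties count as half a win
    return wins / (len(pos) * len(neg))


print(auroc_sketch([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 3 of 4 pairs ordered correctly -> 0.75
```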

MolecularDiffusion.modules.tasks.metrics.chemical_validity(pred)

Chemical validity of molecules.

Parameters:

pred (PackedMolecule) – molecules to evaluate

MolecularDiffusion.modules.tasks.metrics.f1_max(pred, target)

F1 score with the optimal threshold.

This function first enumerates all possible thresholds for separating positive from negative samples, and then picks the threshold with the maximal F1 score.

Parameters:
  • pred (Tensor) – predictions of shape \((B, N)\)

  • target (Tensor) – binary targets of shape \((B, N)\)
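The threshold search described above can be sketched as follows. This simplified version pools all \(B \times N\) entries and computes one F1 score per threshold; the library may instead average precision and recall per row before combining them, so treat `f1_max_sketch` (a hypothetical name) as an illustration of the enumeration only.

```python
def f1_max_sketch(pred, target):
    """Best pooled F1 over all candidate thresholds (hypothetical helper)."""
    scores = [s for row in pred for s in row]
    labels = [t for row in target for t in row]
    best = 0.0
    # Every distinct score is a candidate threshold: predict positive if score >= threshold.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 1)
        fp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 0)
        fn = sum(1 for s, t in zip(scores, labels) if s < threshold and t == 1)
        denom = 2 * tp + fp + fn
        if denom:
            best = max(best, 2 * tp / denom)  # F1 = 2TP / (2TP + FP + FN)
    return best


pred = [[0.9, 0.2], [0.6, 0.4]]
target = [[1, 0], [0, 1]]
print(f1_max_sketch(pred, target))  # best threshold is 0.4: TP=2, FP=1, FN=0
```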

MolecularDiffusion.modules.tasks.metrics.logP(pred)

Logarithm of partition coefficient between octanol and water for a compound.

Parameters:

pred (PackedMolecule) – molecules to evaluate

MolecularDiffusion.modules.tasks.metrics.matthews_corrcoef(pred, target)

Matthews correlation coefficient between prediction and target.

The definition follows the multiclass (\(K\)-class) matthews_corrcoef in scikit-learn. For details, see: https://scikit-learn.org/stable/modules/model_evaluation.html#matthews-corrcoef

Parameters:
  • pred (Tensor) – prediction of shape \((N, K)\)

  • target (Tensor) – target of shape \((N,)\)
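The scikit-learn multiclass definition can be written with four quantities: the number of samples \(s\), the number of correct predictions \(c\), and the per-class predicted and true counts \(p_k\) and \(t_k\). The sketch below applies that formula to arg-max labels. `mcc_sketch` is a hypothetical name, and returning 0 for a zero denominator is an assumption borrowed from scikit-learn's convention.

```python
import math


def mcc_sketch(pred, target):
    """Multiclass MCC from (N, K) scores and N class indices (hypothetical helper)."""
    num_class = len(pred[0])
    labels = [max(range(num_class), key=row.__getitem__) for row in pred]
    s = len(target)
    c = sum(1 for y, t in zip(labels, target) if y == t)
    pred_count = [labels.count(k) for k in range(num_class)]
    true_count = [list(target).count(k) for k in range(num_class)]
    cov_yt = c * s - sum(p * t for p, t in zip(pred_count, true_count))
    cov_yy = s * s - sum(p * p for p in pred_count)
    cov_tt = s * s - sum(t * t for t in true_count)
    denom = math.sqrt(cov_yy * cov_tt)
    return cov_yt / denom if denom else 0.0


pred = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]
target = [0, 1, 0, 0]
print(mcc_sketch(pred, target))
```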

MolecularDiffusion.modules.tasks.metrics.no_rdkit_log()

Context manager to suppress all RDKit logging.

MolecularDiffusion.modules.tasks.metrics.pearsonr(pred, target)

Pearson correlation between prediction and target.

Parameters:
  • pred (Tensor) – prediction of shape \((N,)\)

  • target (Tensor) – target of shape \((N,)\)
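For reference, Pearson correlation is the covariance of the two vectors divided by the product of their standard deviations. A minimal plain-Python sketch follows; `pearsonr_sketch` is a hypothetical name, and it assumes neither input is constant.

```python
import math


def pearsonr_sketch(pred, target):
    """Pearson correlation of two length-N sequences (hypothetical helper)."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    # Assumes non-constant inputs, otherwise the denominator is zero.
    return cov / math.sqrt(var_p * var_t)


print(pearsonr_sketch([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```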

MolecularDiffusion.modules.tasks.metrics.penalized_logP(pred)

Logarithm of partition coefficient, penalized by cycle length and synthetic accessibility.

Parameters:

pred (PackedMolecule) – molecules to evaluate
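A widely used form of this score subtracts the synthetic accessibility score and a large-ring penalty from logP, where the penalty is the size of the largest ring beyond six atoms. The sketch below assumes that unnormalized form and takes the three ingredients as precomputed numbers. `penalized_logp_sketch` and its arguments are hypothetical, and the library may additionally standardize each term with dataset statistics.

```python
def penalized_logp_sketch(logp, sa, ring_sizes):
    """Combine precomputed logP, SA score, and ring sizes (hypothetical helper)."""
    largest_ring = max(ring_sizes, default=0)
    # Assumption: only rings larger than six atoms are penalized.
    cycle_penalty = max(largest_ring - 6, 0)
    return logp - sa - cycle_penalty


# A molecule with logP 2.5, SA score 3.0, and rings of size 6 and 8.
print(penalized_logp_sketch(2.5, 3.0, [6, 8]))  # 2.5 - 3.0 - 2 -> -2.5
```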

MolecularDiffusion.modules.tasks.metrics.r2(pred, target)

\(R^2\) regression score.

Parameters:
  • pred (Tensor) – predictions of shape \((n,)\)

  • target (Tensor) – targets of shape \((n,)\)
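The usual definition is \(R^2 = 1 - SS_{res} / SS_{tot}\), where \(SS_{res}\) is the sum of squared errors and \(SS_{tot}\) is the sum of squared deviations of the targets from their mean. A minimal sketch under that definition, with the hypothetical name `r2_sketch`:

```python
def r2_sketch(pred, target):
    """Coefficient of determination for n predictions (hypothetical helper)."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, target))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    # Assumes the targets are not all identical, otherwise ss_tot is zero.
    return 1.0 - ss_res / ss_tot


print(r2_sketch([2.5, 0.0, 2.0, 8.0], [3.0, -0.5, 2.0, 7.0]))
```

A perfect prediction gives 1, and always predicting the target mean gives 0.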

MolecularDiffusion.modules.tasks.metrics.spearmanr(pred, target)

Spearman correlation between prediction and target.

Parameters:
  • pred (Tensor) – prediction of shape \((N,)\)

  • target (Tensor) – target of shape \((N,)\)
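Spearman correlation is the Pearson correlation of the ranks of the two vectors. The sketch below assigns tied values their average rank, which is the conventional choice but an assumption here; `average_ranks` and `spearmanr_sketch` are hypothetical names.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for j in range(start, end + 1):
            ranks[order[j]] = mean_rank
        start = end + 1
    return ranks


def spearmanr_sketch(pred, target):
    """Pearson correlation computed on ranks (assumes non-constant inputs)."""
    rp, rt = average_ranks(pred), average_ranks(target)
    n = len(rp)
    mp, mt = sum(rp) / n, sum(rt) / n
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    return cov / math.sqrt(var_p * var_t)


print(spearmanr_sketch([0.1, 0.4, 0.35, 0.8], [1.0, 2.0, 3.0, 4.0]))
```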

MolecularDiffusion.modules.tasks.metrics.variadic_accuracy(input, target, size)

Classification accuracy for categories with variadic sizes.

Suppose there are \(N\) samples, and the numbers of categories across all samples sum to \(B\).

Parameters:
  • input (Tensor) – prediction of shape \((B,)\)

  • target (Tensor) – target of shape \((N,)\). Each target is an index relative to the start of its own sample.

  • size (Tensor) – number of categories of shape \((N,)\)
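The flattened layout can be easier to see with concrete numbers. In the sketch below, `size` cuts the length-\(B\) score vector into \(N\) consecutive segments, and each target indexes into its own segment. `variadic_accuracy_sketch` is a hypothetical name, the first argument is called `scores` to avoid shadowing Python's built-in `input`, and returning one 0/1 value per sample (rather than their mean) is an assumption, since the return shape is not stated here.

```python
def variadic_accuracy_sketch(scores, target, size):
    """Per-sample correctness for variable-length category lists (hypothetical helper)."""
    results = []
    offset = 0
    for n_categories, label in zip(size, target):
        sample = scores[offset:offset + n_categories]
        # Arg-max within this sample's segment; `label` is relative to the segment start.
        predicted = max(range(n_categories), key=sample.__getitem__)
        results.append(1.0 if predicted == label else 0.0)
        offset += n_categories
    return results


scores = [0.2, 0.8, 0.5, 0.1, 0.4, 0.3, 0.7]  # B = 7
size = [2, 3, 2]                              # N = 3 samples
target = [1, 0, 0]
per_sample = variadic_accuracy_sketch(scores, target, size)
print(per_sample)                             # [1.0, 1.0, 0.0]
print(sum(per_sample) / len(per_sample))      # mean accuracy over samples
```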

MolecularDiffusion.modules.tasks.metrics.variadic_area_under_prc(pred, target, size)

Area under precision-recall curve (PRC) for sets with variadic sizes.

Suppose there are \(N\) sets, and the sizes of all sets sum to \(B\).

Parameters:
  • pred (Tensor) – prediction of shape \((B,)\)

  • target (Tensor) – target of shape \((B,)\)

  • size (Tensor) – size of sets of shape \((N,)\)

MolecularDiffusion.modules.tasks.metrics.variadic_area_under_roc(pred, target, size)

Area under receiver operating characteristic curve (ROC) for sets with variadic sizes.

Suppose there are \(N\) sets, and the sizes of all sets sum to \(B\).

Parameters:
  • pred (Tensor) – prediction of shape \((B,)\)

  • target (Tensor) – target of shape \((B,)\)

  • size (Tensor) – size of sets of shape \((N,)\)
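Conceptually, the variadic version splits the flat `pred` and `target` vectors into \(N\) consecutive sets according to `size` and scores each set separately. The sketch below does this with a pairwise AUROC per set. `variadic_auroc_sketch` is a hypothetical name, and both the tie handling and the per-set return value are assumptions.

```python
def variadic_auroc_sketch(pred, target, size):
    """One pairwise AUROC per set (hypothetical helper).

    Assumes every set contains at least one positive and one negative.
    """
    results = []
    offset = 0
    for n in size:
        scores = pred[offset:offset + n]
        labels = target[offset:offset + n]
        pos = [s for s, t in zip(scores, labels) if t == 1]
        neg = [s for s, t in zip(scores, labels) if t == 0]
        # Count positive-negative pairs ranked correctly; ties count as one half.
        wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
        results.append(wins / (len(pos) * len(neg)))
        offset += n
    return results


pred = [0.9, 0.1, 0.8, 0.3, 0.7, 0.5, 0.6]  # B = 7
target = [1, 0, 0, 1, 1, 0, 0]
size = [3, 4]                               # N = 2 sets
print(variadic_auroc_sketch(pred, target, size))  # [1.0, 0.5]
```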

MolecularDiffusion.modules.tasks.metrics.variadic_top_precision(pred, target, size, k)

Top-k precision for sets with variadic sizes.

Suppose there are \(N\) sets, and the sizes of all sets sum to \(B\).

Parameters:
  • pred (Tensor) – prediction of shape \((B,)\)

  • target (Tensor) – target of shape \((B,)\)

  • size (Tensor) – size of sets of shape \((N,)\)

  • k (LongTensor) – the k in “top-k” for each set, of shape \((N,)\)
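Because `k` has one entry per set, each set can use a different cut-off. The sketch below takes the `k[i]` highest-scoring items of set `i` and reports the fraction that are positive. `variadic_top_precision_sketch` is a hypothetical name, and raising an error when a set is smaller than its `k` is a choice made for this sketch; the library may handle that case differently.

```python
def variadic_top_precision_sketch(pred, target, size, k):
    """Top-k precision per set, with a separate k for each set (hypothetical helper)."""
    results = []
    offset = 0
    for n, top in zip(size, k):
        if top > n:
            raise ValueError("k must not exceed the set size in this sketch")
        indices = range(offset, offset + n)
        # Indices of the `top` highest-scoring items in this set.
        best = sorted(indices, key=lambda i: pred[i], reverse=True)[:top]
        results.append(sum(target[i] for i in best) / top)
        offset += n
    return results


pred = [0.9, 0.1, 0.8, 0.3, 0.7]  # B = 5
target = [1, 0, 0, 0, 1]
size = [3, 2]                     # N = 2 sets
k = [2, 1]
print(variadic_top_precision_sketch(pred, target, size, k))  # [0.5, 1.0]
```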