MolecularDiffusion.modules.models.tabasco.data.components.lmdb_base

Classes

BaseLMDBDataset

Abstract base class for LMDB-based datasets.

Module Contents

class MolecularDiffusion.modules.models.tabasco.data.components.lmdb_base.BaseLMDBDataset(split: str, lmdb_dir: str, transform: Callable | None = None, single_sample: bool = False, limit_samples: int | None = None, pre_filter: List[Callable] | None = None)

Bases: torch.utils.data.Dataset, abc.ABC

Abstract base class for LMDB-based datasets. It derives from abc.ABC and is intended to be subclassed by concrete dataset classes.

Credits: Charlie Harris.

Parameters:
  • split (str) – The split of the dataset to use (e.g., ‘train’, ‘val’, ‘test’).

  • lmdb_dir (str) – The directory containing the LMDB database.

  • transform (Callable, optional) – A function/transform that takes in a sample and returns a transformed version. Defaults to None.

  • single_sample (bool, optional) – Whether to return a single sample. Defaults to False.

  • limit_samples (int, optional) – The maximum number of samples to keep in the dataset. Defaults to None (no limit).

  • pre_filter (List[Callable], optional) – A list of functions that each take a sample and return True if the sample should be included in the dataset, and False otherwise. Defaults to None.
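The callables passed as pre_filter and transform can be sketched as plain functions. The example below is a minimal illustration, not the class's actual implementation: the dictionary sample layout, the helper names (has_few_atoms, has_no_fluorine, add_atom_count, select), and the order in which filtering, limiting, and transforming happen are all assumptions made for this sketch.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical sample layout; the real sample type is not documented here.
Sample = Dict[str, Any]


def has_few_atoms(sample: Sample) -> bool:
    """pre_filter entry: keep samples with at most three atoms."""
    return len(sample["atoms"]) <= 3


def has_no_fluorine(sample: Sample) -> bool:
    """pre_filter entry: drop samples that contain fluorine."""
    return "F" not in sample["atoms"]


def add_atom_count(sample: Sample) -> Sample:
    """transform: return a copy of the sample with an extra field."""
    return {**sample, "num_atoms": len(sample["atoms"])}


def select(
    samples: List[Sample],
    pre_filter: Optional[List[Callable[[Sample], bool]]] = None,
    transform: Optional[Callable[[Sample], Sample]] = None,
    limit_samples: Optional[int] = None,
) -> List[Sample]:
    """Stand-in for the dataset's selection logic (assumed behaviour)."""
    # Assumption: a sample is kept only if every filter returns True.
    kept = [s for s in samples if all(f(s) for f in (pre_filter or []))]
    # Assumption: the limit is applied after filtering.
    if limit_samples is not None:
        kept = kept[:limit_samples]
    return [transform(s) if transform is not None else s for s in kept]


samples = [
    {"atoms": ["C", "H", "H"]},
    {"atoms": ["C", "F"]},
    {"atoms": ["C", "O", "H", "H", "H"]},
    {"atoms": ["N", "H"]},
    {"atoms": ["O"]},
]
result = select(
    samples,
    pre_filter=[has_few_atoms, has_no_fluorine],
    transform=add_atom_count,
    limit_samples=2,
)
```

With a concrete subclass of BaseLMDBDataset, the same functions would presumably be passed as pre_filter=[has_few_atoms, has_no_fluorine] and transform=add_atom_count, alongside split and lmdb_dir.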

update_data(update_fn: Callable)

Update data in the LMDB database using a provided update function.

Parameters:
  • update_fn (Callable) – A function that takes a single data item and returns an updated version of it.
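A possible shape for update_fn is sketched below. The item layout, the add_charge function, and the apply_update helper are hypothetical; the in-memory dictionary with pickled values only stands in for the LMDB database, and the way update_data actually serializes and writes items is not documented here.

```python
import pickle
from typing import Any, Callable, Dict


def add_charge(item: Dict[str, Any]) -> Dict[str, Any]:
    """Example update_fn: return a copy of the item with a default charge."""
    updated = dict(item)
    updated.setdefault("charge", 0)
    return updated


def apply_update(store: Dict[bytes, bytes], update_fn: Callable) -> None:
    """Stand-in for update_data: rewrite every stored item in place.

    Assumption: items are stored as pickled values under byte keys.
    """
    for key in list(store):
        item = pickle.loads(store[key])
        store[key] = pickle.dumps(update_fn(item))


# In-memory stand-in for the database contents.
store = {
    b"0": pickle.dumps({"atoms": ["C", "O"]}),
    b"1": pickle.dumps({"atoms": ["N"], "charge": 1}),
}
apply_update(store, add_charge)
```

Returning a new object from update_fn, rather than mutating the argument, keeps the function easy to test on its own before it is run against a real database.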

db = None
index = None
keys = None
limit_samples = None
lmdb_dir
lmdb_path
pre_filter = None
single_sample = False
split
transform = None