MolecularDiffusion.utils.comm

Attributes

cpu_group

gpu_group

Functions

cat(obj[, dst])

Concatenate any nested container of tensors along the 0-th axis.

get_cpu_count()

Get the number of CPUs on this node.

get_group(device)

Get the process group corresponding to the given device.

get_rank()

Get the rank of this process in distributed processes.

get_world_size()

Get the total number of distributed processes.

init_process_group(backend[, init_method])

Initialize CPU and/or GPU process groups.

reduce(obj[, op, dst])

Reduce any nested container of tensors.

stack(obj[, dst])

Stack any nested container of tensors. The new dimension is inserted at the 0-th axis.

synchronize()

Synchronize among all distributed processes.

Module Contents

MolecularDiffusion.utils.comm.cat(obj, dst=None)

Concatenate any nested container of tensors along the 0-th axis.

Parameters:
  • obj (Object) – any container object. Can be nested list, tuple or dict.

  • dst (int, optional) – rank of destination worker. If not specified, broadcast the result to all workers.

Example:

>>> # assume 4 workers
>>> rank = comm.get_rank()
>>> rng = torch.arange(10)
>>> obj = {"range": rng[rank * (rank + 1) // 2: (rank + 1) * (rank + 2) // 2]}
>>> obj = comm.cat(obj)
>>> assert torch.allclose(obj["range"], rng)

MolecularDiffusion.utils.comm.get_cpu_count()

Get the number of CPUs on this node.
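A minimal sketch of what this helper likely wraps, assuming it defers to the standard library (the actual implementation may differ):

```python
import os

def get_cpu_count():
    # os.cpu_count() reports the number of CPUs on this node; it can return
    # None on exotic platforms, so fall back to 1 in that case.
    return os.cpu_count() or 1
```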

MolecularDiffusion.utils.comm.get_group(device)

Get the process group corresponding to the given device.

Parameters:
  • device (torch.device) – query device
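The module-level cpu_group and gpu_group attributes listed at the end of this page suggest a simple dispatch on the device type. A torch-free sketch of that lookup (an assumption, not the actual implementation):

```python
# Hypothetical caches mirroring the module attributes cpu_group/gpu_group;
# init_process_group would populate them.
cpu_group = None
gpu_group = None

def get_group(device_type):
    # The real function takes a torch.device; this sketch accepts its
    # ``.type`` string ("cpu" or "cuda") to stay torch-free.
    if device_type == "cpu":
        return cpu_group
    if device_type == "cuda":
        return gpu_group
    raise ValueError("unknown device type: %s" % device_type)
```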

MolecularDiffusion.utils.comm.get_rank()

Get the rank of this process in distributed processes.

Returns 0 in the single-process case.

MolecularDiffusion.utils.comm.get_world_size()

Get the total number of distributed processes.

Returns 1 in the single-process case.
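Both helpers follow the usual fallback pattern for distributed utilities: defer to torch.distributed when a process group is initialized, otherwise return the single-process defaults documented above. A sketch (assumption; the real implementation may differ):

```python
def get_rank():
    # Defer to torch.distributed when a process group is initialized,
    # else fall back to 0 (the single-process case).
    try:
        import torch.distributed as dist
        if dist.is_available() and dist.is_initialized():
            return dist.get_rank()
    except ImportError:
        pass
    return 0

def get_world_size():
    # Same fallback pattern: 1 when running as a single process.
    try:
        import torch.distributed as dist
        if dist.is_available() and dist.is_initialized():
            return dist.get_world_size()
    except ImportError:
        pass
    return 1
```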

MolecularDiffusion.utils.comm.init_process_group(backend, init_method=None, **kwargs)

Initialize CPU and/or GPU process groups.

Parameters:
  • backend (str) – Communication backend. Use nccl for GPUs and gloo for CPUs.

  • init_method (str, optional) – URL specifying how to initialize the process group
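A sketch of a typical call, following the backend convention stated above. The pick_backend helper is illustrative, not part of this module:

```python
def pick_backend(have_gpu):
    # Convention from the docstring: nccl for GPUs, gloo for CPUs.
    # ``have_gpu`` stands in for a torch.cuda.is_available() check to
    # keep this sketch torch-free.
    return "nccl" if have_gpu else "gloo"

# Typical launch (sketch): a launcher such as torchrun exports RANK,
# WORLD_SIZE, MASTER_ADDR and MASTER_PORT, and init_method="env://"
# (the torch.distributed default) reads them:
#
#     comm.init_process_group(pick_backend(have_gpu=True), init_method="env://")
```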

MolecularDiffusion.utils.comm.reduce(obj, op='sum', dst=None)

Reduce any nested container of tensors.

Parameters:
  • obj (Object) – any container object. Can be nested list, tuple or dict.

  • op (str, optional) – element-wise reduction operator. Available operators are sum, mean, min, max, product.

  • dst (int, optional) – rank of destination worker. If not specified, broadcast the result to all workers.

Example:

>>> # assume 4 workers
>>> rank = comm.get_rank()
>>> x = torch.rand(5)
>>> obj = {"polynomial": x ** rank}
>>> obj = comm.reduce(obj)
>>> assert torch.allclose(obj["polynomial"], x ** 3 + x ** 2 + x + 1)

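"Any nested container" implies a recursive traversal: containers are walked structurally and only the leaves are combined across workers. A torch-free sketch of that recursion pattern, using plain numbers in place of tensors (an assumption about the shape of the implementation, not the implementation itself):

```python
def reduce_trees(objs, op=sum):
    """Combine a list of structurally identical nested containers leaf-wise.

    ``objs`` models the per-worker copies of the same container; the real
    comm.reduce combines tensors across distributed workers instead.
    """
    first = objs[0]
    if isinstance(first, dict):
        # Recurse into each key, gathering that key's value from every worker.
        return {k: reduce_trees([o[k] for o in objs], op) for k in first}
    if isinstance(first, (list, tuple)):
        # Preserve the container type while recursing into each position.
        return type(first)(
            reduce_trees([o[i] for o in objs], op) for i in range(len(first))
        )
    # Leaf: combine the per-worker values element-wise.
    return op(objs)
```

For example, reducing two workers' copies of `{"a": 1, "b": [2, 3]}` and `{"a": 10, "b": [20, 30]}` with the default sum yields `{"a": 11, "b": [22, 33]}`.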
MolecularDiffusion.utils.comm.stack(obj, dst=None)

Stack any nested container of tensors. The new dimension is inserted at the 0-th axis.

Parameters:
  • obj (Object) – any container object. Can be nested list, tuple or dict.

  • dst (int, optional) – rank of destination worker. If not specified, broadcast the result to all workers.

Example:

>>> # assume 4 workers
>>> rank = comm.get_rank()
>>> x = torch.rand(5)
>>> obj = {"exponent": x ** rank}
>>> obj = comm.stack(obj)
>>> truth = torch.stack([torch.ones_like(x), x, x ** 2, x ** 3])
>>> assert torch.allclose(obj["exponent"], truth)

MolecularDiffusion.utils.comm.synchronize()

Synchronize among all distributed processes.
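Synchronization here means a barrier: no worker proceeds until every worker has reached the call. A minimal model of that behavior using 4 threads in place of 4 workers (illustration only; the real call wraps a distributed barrier):

```python
import threading

events = []
lock = threading.Lock()
barrier = threading.Barrier(4)

def worker(rank):
    with lock:
        events.append(("before", rank))
    barrier.wait()  # <- plays the role of comm.synchronize()
    with lock:
        events.append(("after", rank))

threads = [threading.Thread(target=worker, args=(r,)) for r in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All "before" events precede all "after" events: the barrier guarantees
# every worker has arrived before any worker continues.
assert all(tag == "before" for tag, _ in events[:4])
assert all(tag == "after" for tag, _ in events[4:])
```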

MolecularDiffusion.utils.comm.cpu_group = None
MolecularDiffusion.utils.comm.gpu_group = None