Utilities API
- class layeredrl.utils.ToDeviceReplayBuffer(target_device=device(type='cpu'), *args, **kwargs)[source]
Bases:
VectorReplayBuffer
Replay buffer that moves each sampled batch to the target device.
- __getitem__(index: slice | int | List[int] | ndarray) Batch[source]
Return a data batch: self[index].
- layeredrl.utils.sample_truncated_normal(loc, scale, lower_bound, upper_bound, n_samples, device)[source]
Sample from a truncated normal distribution.
Note that everything is assumed to have a batch dimension. Sampling is done via the inverse CDF method rather than rejection sampling.
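The inverse CDF method mentioned above can be sketched in pure Python. This is only an illustration for scalar inputs, not the library's batched implementation; the function name `sample_truncated_normal_sketch` and the `rng` argument are assumptions made for this example.

```python
import random
from statistics import NormalDist


def sample_truncated_normal_sketch(loc, scale, lower, upper, n_samples, rng):
    """Draw n_samples from N(loc, scale^2) restricted to [lower, upper]."""
    std_normal = NormalDist()  # standard normal: mean 0, std 1
    # CDF values of the standardized bounds.
    cdf_lo = std_normal.cdf((lower - loc) / scale)
    cdf_hi = std_normal.cdf((upper - loc) / scale)
    samples = []
    for _ in range(n_samples):
        # Uniform draw on [cdf_lo, cdf_hi], mapped back through the inverse CDF.
        u = cdf_lo + (cdf_hi - cdf_lo) * rng.random()
        # Keep u strictly inside (0, 1), where inv_cdf is defined.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        x = loc + scale * std_normal.inv_cdf(u)
        # Clamp to guard against floating-point round-off at the bounds.
        samples.append(min(max(x, lower), upper))
    return samples


rng = random.Random(0)
draws = sample_truncated_normal_sketch(0.0, 1.0, -0.5, 2.0, 1000, rng)
```

Unlike rejection sampling, every uniform draw yields exactly one sample, so the cost does not grow when the interval covers little probability mass.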
- layeredrl.utils.get_normal_prob(x: Tensor, mean: Tensor, std: Tensor) Tensor[source]
Get the probability (density) of x under a Normal distribution with the given mean and standard deviation.
- Parameters:
x – The value.
mean – The mean of the Normal distribution.
std – The standard deviation of the Normal distribution.
- Returns:
The probability (density) of x under the Normal distribution.
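For reference, the density being described is the standard Normal density. A minimal scalar sketch follows; the name `normal_pdf` is hypothetical, and the library function operates on tensors instead of floats.

```python
import math


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Density of the standard normal at its mean, which equals 1 / sqrt(2 * pi).
peak = normal_pdf(0.0, 0.0, 1.0)
```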
- layeredrl.utils.get_normal_log_prob(x: Tensor, mean: Tensor, std: Tensor, beta: float = 0.0) Tensor[source]
Get the log probability (density) of x under a 1D Normal distribution with the given mean and standard deviation.
Optionally multiplies the second term of the log probability by a scale-invariance factor, which removes the gradient of the log probability with respect to the scale of x and mean.
The result is returned per component; no sum over the last dimension is performed.
- Parameters:
x – The value.
mean – The mean of the Normal distribution.
std – The standard deviation of the Normal distribution.
beta – The beta parameter for β-NLL.
- Returns:
The log probability of x under the Normal distribution.
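The "no sum over the last dimension" behaviour can be illustrated with a small pure-Python sketch that returns one log-density per component. It covers the default beta = 0.0 case only; how the beta-dependent factor is applied is not specified here, so it is left out. The name `normal_log_prob_per_dim` is an assumption for this example.

```python
import math


def normal_log_prob_per_dim(x, mean, std):
    """Return one log-density per component; the components are not summed."""
    out = []
    for x_i, m_i, s_i in zip(x, mean, std):
        z = (x_i - m_i) / s_i
        # log N(x | m, s^2) = -z^2 / 2 - log(s) - log(2 * pi) / 2
        out.append(-0.5 * z * z - math.log(s_i) - 0.5 * math.log(2.0 * math.pi))
    return out


log_probs = normal_log_prob_per_dim([0.0, 1.0], [0.0, 0.0], [1.0, 2.0])
```

A caller who needs the joint log probability of independent components would sum the returned values.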
- layeredrl.utils.get_writer(logdir: Path | str, wandb_project: str | None = None, wandb_name: str | None = None, id: str | None = None)[source]
Get a TensorBoard writer for logging.
If a project name is given, also log to Weights & Biases.
- Parameters:
logdir – The directory in which to store the Weights & Biases and TensorBoard logs. If None, no logs are written.
wandb_project – The name of the Weights & Biases project to log to. If None, no logs are written to Weights & Biases.
wandb_name – The name of the run in Weights & Biases. If None, a two-word name is generated by wandb.
id – The id of an existing Weights & Biases run to continue. If None, a new run is created.
- Returns:
A TensorBoard writer or None if logdir is None.
- class layeredrl.utils.VideoLogger(log_dir: str, fps: int = 10, max_episodes: int = 10, camera_name: str | None = None)[source]
Bases:
object
Class for logging video clips of rollouts.
- __init__(log_dir: str, fps: int = 10, max_episodes: int = 10, camera_name: str | None = None)[source]
Initialize the logger.
- Parameters:
log_dir – The directory to save the videos.
fps – The frames per second of the video.
- class layeredrl.utils.RangesMap(ranges: List[Tuple[int | None, int | None]])[source]
Bases:
object
Keeps only specified index ranges of the input tensor.
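A rough sketch of what keeping index ranges could look like, using plain Python lists in place of tensors. It assumes each (start, end) pair behaves like a Python slice, with None meaning "open-ended"; the class name `RangesMapSketch` is hypothetical and the actual class may differ.

```python
from typing import List, Optional, Sequence, Tuple


class RangesMapSketch:
    """Keep only the given (start, end) slices of a flat sequence."""

    def __init__(self, ranges: List[Tuple[Optional[int], Optional[int]]]):
        self.ranges = ranges

    def __call__(self, x: Sequence[float]) -> List[float]:
        kept: List[float] = []
        for start, end in self.ranges:
            # None is passed straight to the slice, so (4, None) means "from 4 on".
            kept.extend(x[start:end])
        return kept


keep = RangesMapSketch([(0, 2), (4, None)])
result = keep([10, 11, 12, 13, 14, 15])  # keeps indices 0, 1, 4, 5
```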
- class layeredrl.utils.ConcatMap(maps: List[Callable])[source]
Bases:
object
Concatenates multiple maps.
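One plausible reading is that each map is applied to the same input and the outputs are concatenated. The sketch below illustrates that reading with lists; the interpretation and the name `ConcatMapSketch` are assumptions, not confirmed behaviour.

```python
from typing import Callable, List, Sequence


class ConcatMapSketch:
    """Apply every map to the same input and concatenate the outputs."""

    def __init__(self, maps: List[Callable[[Sequence[float]], Sequence[float]]]):
        self.maps = maps

    def __call__(self, x: Sequence[float]) -> List[float]:
        out: List[float] = []
        for m in self.maps:
            out.extend(m(x))
        return out


def first_two(v):
    return v[:2]


def last_one(v):
    return v[-1:]


concat = ConcatMapSketch([first_two, last_one])
combined = concat([1, 2, 3, 4])
```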
- layeredrl.utils.to_torch(array: ndarray | Tensor, device: device = device(type='cpu')) Tensor[source]
Convert an array to a torch tensor.
- Parameters:
array – The array.
- Returns:
The torch tensor.
- layeredrl.utils.to_numpy(array: ndarray | Tensor, device: device = device(type='cpu')) ndarray[source]
Convert an array to a numpy array.
- Parameters:
array – The array.
- Returns:
The numpy array.
- layeredrl.utils.copy_torch_or_numpy(array: ndarray | Tensor) ndarray[source]
Copy a torch tensor or numpy array.
- Parameters:
array – The array.
- Returns:
The copy.
- layeredrl.utils.get_key_indices(input_space: Dict) Tuple[Dict, Dict][source]
Get the index ranges that correspond to the keys of a Dict space in the flattened vector.
Also returns a dictionary containing the shapes of these observation components.
- Returns:
- A dictionary with keys corresponding to the observation components and values being tuples of the form (start, end), where start and end are the indices at which the observation component starts and ends. The nested dictionary structure of the observation is preserved.
- A dictionary of the same structure but with values being the shapes of the observation components.
- layeredrl.utils.get_obs_indices(env: Env) Tuple[Dict, Dict][source]
Get the index ranges that correspond to the keys of the observation Dict space in the flattened vector.
Also returns a dictionary containing the shapes of these observation components.
- Parameters:
env – The environment with Dict observation space.
- Returns:
- A dictionary with keys corresponding to the observation components and values being tuples of the form (start, end), where start and end are the indices at which the observation component starts and ends. The nested dictionary structure of the observation is preserved.
- A dictionary of the same structure but with values being the shapes of the observation components.
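The two returned dictionaries can be illustrated with a sketch that works on a plain nested dict of shapes in place of a Dict space. It assumes that components are flattened in dictionary order and that end is exclusive; neither is stated above. The helper `key_indices_sketch` is hypothetical and returns the running offset as a third value for the recursion.

```python
import math
from typing import Any, Dict


def key_indices_sketch(shapes: Dict[str, Any], offset: int = 0):
    """Map each leaf key to its (start, end) range in the flattened vector."""
    indices: Dict[str, Any] = {}
    out_shapes: Dict[str, Any] = {}
    for key, value in shapes.items():
        if isinstance(value, dict):
            # Recurse so that the nested structure is preserved.
            indices[key], out_shapes[key], offset = key_indices_sketch(value, offset)
        else:
            size = math.prod(value)  # number of scalars in this component
            indices[key] = (offset, offset + size)  # end is exclusive here
            out_shapes[key] = tuple(value)
            offset += size
    return indices, out_shapes, offset


shapes = {"robot": {"pos": (3,), "vel": (3,)}, "image": (2, 2)}
indices, out_shapes, total = key_indices_sketch(shapes)
```

With these assumptions, `indices["robot"]["vel"]` is `(3, 6)` and the flattened vector has `total == 10` entries.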
- class layeredrl.utils.RunningBatchNorm(num_features: int, eps=1e-05, momentum=0.1, device=None, dtype=None, track_mean=True, freeze_after=None)[source]
Bases:
Module
Batch normalization using running estimates of mean and variance.
- __init__(num_features: int, eps=1e-05, momentum=0.1, device=None, dtype=None, track_mean=True, freeze_after=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: Tensor)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
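To make "running estimates of mean and variance" concrete, here is a single-feature sketch in pure Python. It assumes the exponential-moving-average convention of torch.nn.BatchNorm1d for momentum, and it ignores track_mean and freeze_after, whose semantics are not described here. The class `RunningNormSketch` is hypothetical.

```python
import math


class RunningNormSketch:
    """Normalize with running statistics instead of per-batch statistics."""

    def __init__(self, eps=1e-5, momentum=0.1):
        self.eps = eps
        self.momentum = momentum
        self.running_mean = 0.0
        self.running_var = 1.0

    def update(self, batch):
        mean = sum(batch) / len(batch)
        var = sum((v - mean) ** 2 for v in batch) / len(batch)  # biased variance
        # Exponential moving average: new = (1 - momentum) * old + momentum * batch.
        self.running_mean += self.momentum * (mean - self.running_mean)
        self.running_var += self.momentum * (var - self.running_var)

    def normalize(self, batch):
        scale = math.sqrt(self.running_var + self.eps)
        return [(v - self.running_mean) / scale for v in batch]


norm = RunningNormSketch()
for _ in range(200):
    norm.update([1.0, 3.0])  # batch mean 2.0, biased variance 1.0
normalized = norm.normalize([2.0])
```

After enough updates the running mean approaches the batch mean, so an input equal to that mean is mapped close to zero.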
- class layeredrl.utils.Standardizer(latent_state_dim: int, device: device)[source]
Bases:
Module
Standardize a random vector (trained externally).
- __init__(latent_state_dim: int, device: device)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class layeredrl.utils.RampSchedule(start: float, end: float, warm_up: int, duration: int)[source]
Bases:
Schedule
- class layeredrl.utils.ZeroThenLinearSchedule(zero_duration: int, duration: int, start_value: float, end_value: float)[source]
Bases:
Schedule
- class layeredrl.utils.PiecewiseLinearSchedule(points: List[Tuple[int, float]])[source]
Bases:
Schedule
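The schedule classes carry no description here, so the following is only a guess at what a piecewise linear schedule over (step, value) points might compute: linear interpolation between neighbouring points, with constant values outside the covered range. The function `piecewise_linear` is hypothetical and does not use the Schedule base class.

```python
from typing import List, Tuple


def piecewise_linear(points: List[Tuple[int, float]], step: int) -> float:
    """Interpolate linearly between (step, value) points sorted by step."""
    if step <= points[0][0]:
        return points[0][1]  # before the first point: hold the first value
    for (s0, v0), (s1, v1) in zip(points, points[1:]):
        if step <= s1:
            frac = (step - s0) / (s1 - s0)
            return v0 + frac * (v1 - v0)
    return points[-1][1]  # after the last point: hold the last value


points = [(0, 0.0), (100, 1.0), (200, 0.5)]
values = [piecewise_linear(points, s) for s in (0, 50, 100, 150, 300)]
```

Under the same assumptions, a ramp or a zero-then-linear schedule would be a special case with two or three points.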