Optimizers API

class layeredrl.optimizers.Optimizer(n_samples: int, device: device = device(type='cpu'))

Bases: ABC

__init__(n_samples: int, device: device = device(type='cpu'))

Initialize the optimizer.

Parameters:

n_samples – The number of samples to use for optimization, for example in CEM. If the algorithm does not use samples, this should be 1. This is defined for all optimizers in order to have a consistent number of dimensions for the input to the cost function.
device – The device to use.

abstractmethod optimize(cost: Callable[[Tensor], Tensor]) → Tuple[Tensor, Dict]

Optimize the given cost function (starting from the initial guess) and return the optimal x.

Note that everything is assumed to have a batch dimension. That includes x and the cost. The cost function can vary across the batch dimension.

Parameters: cost – A function that takes in a tensor x and returns a tensor with a scalar cost for each environment instance.
Returns: The optimal x, and an info dict.
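To illustrate the batched cost contract, here is a minimal sketch of a quadratic cost whose target differs per environment instance. It assumes that x arrives with shape (batch_size, n_samples, x_dim), which this reference does not state explicitly. The names make_quadratic_cost and targets are hypothetical and not part of layeredrl.

```python
from typing import Callable

import torch
from torch import Tensor


def make_quadratic_cost(targets: Tensor) -> Callable[[Tensor], Tensor]:
    """Build a cost with one target per environment instance.

    targets has shape (batch_size, x_dim). The returned function assumes
    x has shape (batch_size, n_samples, x_dim); this layout is an
    assumption, not something the reference guarantees.
    """

    def cost(x: Tensor) -> Tensor:
        # Broadcast each environment's target over its samples.
        diff = x - targets[:, None, :]
        # One scalar cost per sample: shape (batch_size, n_samples).
        return (diff ** 2).sum(dim=-1)

    return cost


targets = torch.tensor([[0.0, 0.0], [1.0, -1.0]])  # batch_size=2, x_dim=2
cost = make_quadratic_cost(targets)
x = torch.zeros(2, 5, 2)                           # n_samples=5
costs = cost(x)                                    # shape (2, 5)
```

Because the targets are captured per batch row, the same x receives a different cost in each environment instance.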

abstractmethod reset(initial_x: Tensor, **kwargs) → None

Reset the optimizer to the given initial guess.

If this is not called, the optimizer will derive the initial guess from its internal state.

Parameters: initial_x – The initial guess for the optimal x.
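The following sketch shows one way a subclass might implement optimize and reset. To stay runnable without layeredrl installed, it defines a stand-in Optimizer that only mirrors the documented signatures. RandomSearch, its sigma parameter, the "cost" info key, and the assumed candidate shape (batch_size, n_samples, x_dim) are all hypothetical. Unlike the documented behavior, this simplified version requires one call to reset before the first optimize.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Tuple

import torch
from torch import Tensor


class Optimizer(ABC):
    """Stand-in mirroring the documented interface (not the library class)."""

    def __init__(self, n_samples: int, device: torch.device = torch.device("cpu")):
        self.n_samples = n_samples
        self.device = device

    @abstractmethod
    def optimize(self, cost: Callable[[Tensor], Tensor]) -> Tuple[Tensor, Dict]: ...

    @abstractmethod
    def reset(self, initial_x: Tensor, **kwargs) -> None: ...


class RandomSearch(Optimizer):
    """Hypothetical optimizer: perturb the current guess, keep the best sample."""

    def __init__(self, sigma: float, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sigma = sigma
        self.x = None  # internal state, shape (batch_size, x_dim)

    def reset(self, initial_x: Tensor, **kwargs) -> None:
        self.x = initial_x.to(self.device)

    def optimize(self, cost: Callable[[Tensor], Tensor]) -> Tuple[Tensor, Dict]:
        if self.x is None:
            raise RuntimeError("call reset() once before the first optimize()")
        batch_size, x_dim = self.x.shape
        noise = torch.randn(batch_size, self.n_samples, x_dim, device=self.device)
        candidates = self.x[:, None, :] + self.sigma * noise
        # Keep the current guess as sample 0 so the result never gets worse.
        candidates[:, 0, :] = self.x
        costs = cost(candidates)                    # (batch_size, n_samples)
        best = costs.argmin(dim=1)                  # (batch_size,)
        self.x = candidates[torch.arange(batch_size, device=self.device), best]
        return self.x, {"cost": costs.min(dim=1).values}


torch.manual_seed(0)
opt = RandomSearch(sigma=0.5, n_samples=64)
opt.reset(torch.ones(3, 2))                         # batch_size=3, x_dim=2
quadratic = lambda x: (x ** 2).sum(dim=-1)
for _ in range(20):
    # Later calls start from the previous result (the internal state).
    x_opt, info = opt.optimize(quadratic)
```

Calling optimize repeatedly without reset continues from the stored guess, which is how this sketch reads "derive the initial guess from its internal state".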

class layeredrl.optimizers.CEM(n_iterations: int, n_samples: int, initial_sigma: Tensor, elite_ratio: float = 0.2, lower_bound: Tensor | None = None, upper_bound: Tensor | None = None, clip: bool = False, momentum: float = 0.0, return_mode: str = 'mean', record_samples: bool = False, *args, **kwargs)

Bases: Optimizer

__init__(n_iterations: int, n_samples: int, initial_sigma: Tensor, elite_ratio: float = 0.2, lower_bound: Tensor | None = None, upper_bound: Tensor | None = None, clip: bool = False, momentum: float = 0.0, return_mode: str = 'mean', record_samples: bool = False, *args, **kwargs)

Initialize the CEM optimizer.

Parameters:

n_iterations – The number of iterations to run CEM for.
n_samples – The number of samples to draw in each iteration of CEM.
initial_sigma – The initial standard deviation of the samples. Shape: (batch_size, x_dim)
elite_ratio – The fraction of samples to keep as elites in each iteration.
lower_bound – The lower bound of the samples.
upper_bound – The upper bound of the samples. lower_bound and upper_bound must either both be None or both be given.
clip – Whether to clip the samples to the bounds.
momentum – Momentum factor for updating the mean.
return_mode – Whether to return the mean of the elite samples (“mean”), the best sample (“best”), or a random sample from the last Gaussian distribution (“random”).
record_samples – Whether to record the samples drawn during optimization.
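How these parameters could interact is easiest to see in a compact loop. The sketch below is a generic Cross Entropy Method written for illustration, not the layeredrl implementation. The function name cem_sketch, the momentum blending rule, the choice to take "best" from the last iteration, and the sample layout (batch_size, n_samples, x_dim) are all assumptions.

```python
import torch


def cem_sketch(cost, initial_x, initial_sigma, n_iterations, n_samples,
               elite_ratio=0.2, lower_bound=None, upper_bound=None,
               clip=False, momentum=0.0, return_mode="mean"):
    """Illustrative CEM loop; not the layeredrl implementation."""
    mean, sigma = initial_x.clone(), initial_sigma.clone()  # (batch_size, x_dim)
    # At least two elites so the standard deviation is defined.
    n_elite = max(2, int(elite_ratio * n_samples))
    for _ in range(n_iterations):
        # Draw candidates: (batch_size, n_samples, x_dim).
        noise = torch.randn(mean.shape[0], n_samples, mean.shape[1])
        samples = mean[:, None, :] + sigma[:, None, :] * noise
        if clip and lower_bound is not None:
            samples = torch.clamp(samples, min=lower_bound, max=upper_bound)
        costs = cost(samples)                               # (batch_size, n_samples)
        # Keep the elite_ratio fraction with the lowest cost.
        elite_idx = costs.topk(n_elite, dim=1, largest=False).indices
        elites = torch.gather(
            samples, 1, elite_idx[:, :, None].expand(-1, -1, samples.shape[-1])
        )
        # Assumed momentum rule: blend the old mean with the elite mean.
        mean = momentum * mean + (1.0 - momentum) * elites.mean(dim=1)
        sigma = elites.std(dim=1)
    if return_mode == "best":
        # Assumption: the best sample of the last iteration.
        x = samples[torch.arange(samples.shape[0]), costs.argmin(dim=1)]
    elif return_mode == "random":
        x = mean + sigma * torch.randn_like(mean)
    else:
        x = mean
    return x, {"mean": mean, "std": sigma}


torch.manual_seed(0)
target = torch.tensor([[1.0, -1.0], [0.5, 0.5]])            # one target per batch row
x_opt, info = cem_sketch(
    cost=lambda x: ((x - target[:, None, :]) ** 2).sum(dim=-1),
    initial_x=torch.zeros(2, 2),
    initial_sigma=torch.ones(2, 2),
    n_iterations=10,
    n_samples=200,
)
```

With momentum set to 0.0 the mean jumps straight to the elite mean; larger values would smooth the update across iterations under the blending rule assumed here.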

optimize(cost: Callable[[Tensor], Tensor], verbose: bool = False) → Tuple[Tensor, Dict]

Optimize the given cost function using the Cross Entropy Method and return the optimal x.

Note that everything is assumed to have a batch dimension. That includes x and the cost.

Parameters: cost – A function that takes in a tensor x and returns a tensor with a scalar cost for each particle in each environment instance. Output shape: (batch_size, n_samples)
Returns: The optimal x, with shape (batch_size, dim), and an info dict containing the keys “mean” and “std” for the mean and standard deviation of the final distribution.
Return type: Tuple[Tensor, Dict]

reset(initial_x: Tensor) → None

Reset the optimizer to the given initial guess.

If this is not called, the optimizer will derive the initial guess from its internal state.

Parameters: initial_x – The initial guess for the optimal x. Shape: (batch_size, dim)

class layeredrl.optimizers.ICEM(beta: float = 1.0, action_dim: int = 2, *args, **kwargs)

Bases: CEM

__init__(beta: float = 1.0, action_dim: int = 2, *args, **kwargs)

Initialize the iCEM optimizer.

For details on the method, see https://proceedings.mlr.press/v155/pinneri21a/pinneri21a.pdf