Hierarchies API
- class layeredrl.hierarchies.Hierarchy(levels: List[Level], env: Env, env_obs_maps: List[Callable[[Tensor], Tensor]] | None = None, mapped_env_obs_shapes: List[Tuple] | None = None, keep_params: bool = False, device: device = device(type='cpu'), writer: SummaryWriter | None = None)
Bases: object
- __init__(levels: List[Level], env: Env, env_obs_maps: List[Callable[[Tensor], Tensor]] | None = None, mapped_env_obs_shapes: List[Tuple] | None = None, keep_params: bool = False, device: device = device(type='cpu'), writer: SummaryWriter | None = None)
Initialize the hierarchy. Makes sure the action a level emits fits the input expected by the level below it.
- Parameters:
levels – The levels of the hierarchy.
env – The environment.
env_obs_maps – A list of functions, each mapping the environment observation to the vector that is provided to the corresponding level. This can be used to implement information hiding and to move a trained level from one environment to another with a different observation space. If None, the identity map is used.
mapped_env_obs_shapes – The shapes of the outputs of the env_obs_maps, one per level. If None, the dimension of the environment observation space is used. If a value is negative, it is added to the dimension of the environment observation space (for example, -2 means two components fewer than the environment observation). This is useful when the map from the environment observation to the level observation drops some components.
keep_params – Whether to keep the parameters of the levels instead of resetting them. Setting this to True is only valid if the levels were already initialized before.
device – The device to use.
writer – The TensorBoard writer to use for logging. If None, no logging is done.
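To illustrate how env_obs_maps and mapped_env_obs_shapes might fit together, here is a minimal, library-independent sketch. Plain nested lists stand in for batched tensors, and `drop_last` and `resolve_obs_dim` are hypothetical helpers that are not part of layeredrl. The handling of negative values follows one reading of the parameter description above and is an assumption.

```python
from typing import Callable, List, Optional

# Stand-in for a tensor of shape (n_env_instances, obs_dim).
Batch = List[List[float]]


def drop_last(n: int) -> Callable[[Batch], Batch]:
    """Build an observation map that hides the last n components (n > 0).

    Hypothetical helper, shown only to illustrate information hiding.
    """
    def obs_map(obs: Batch) -> Batch:
        return [row[:-n] for row in obs]
    return obs_map


def resolve_obs_dim(env_obs_dim: int, mapped_dim: Optional[int]) -> int:
    """Resolve the observation size a level sees (assumed semantics).

    None     -> the environment observation dimension is used unchanged.
    negative -> the value is added to the environment observation dimension.
    positive -> the value is taken as the mapped dimension itself.
    """
    if mapped_dim is None:
        return env_obs_dim
    if mapped_dim < 0:
        return env_obs_dim + mapped_dim
    return mapped_dim


# Two environment instances with five observation components each.
obs = [[0.1, 0.2, 0.3, 0.4, 0.5], [1.0, 2.0, 3.0, 4.0, 5.0]]
hide_two = drop_last(2)
mapped = hide_two(obs)
```

With this map, a level would see three of the five components, which matches `resolve_obs_dim(5, -2)`.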
- get_action(obs: Tensor) → Tensor
Get an action for the given observation.
Note that obs and the returned action have a batch dimension corresponding to environment instances.
The method descends the hierarchy from top to bottom, starting with the active level. From there on, each level produces an action that is passed to the level below. The action of the lowest level is returned (to be executed in the environment).
- Parameters:
obs – The environment observation.
- Returns:
The action for the environment.
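The descent described for get_action can be sketched with toy levels. This is not the library's implementation: `ToyLevel` and `descend` are hypothetical, the example handles a single environment instance with scalar observations, and treating index 0 as the lowest level is an assumption.

```python
from typing import List, Optional


class ToyLevel:
    """Hypothetical proportional controller standing in for a real level."""

    def __init__(self, gain: float, command: float = 0.0) -> None:
        self.gain = gain
        self.command = command  # last action received from the level above

    def act(self, obs: float, command: Optional[float] = None) -> float:
        # A new command from above replaces the stored one; otherwise keep it.
        if command is not None:
            self.command = command
        return self.gain * (self.command - obs)


def descend(levels: List[ToyLevel], active: int, obs: float) -> float:
    """Query levels from `active` down to index 0 and return the lowest action."""
    action: Optional[float] = None
    for level in reversed(levels[: active + 1]):
        action = level.act(obs, action)
    if action is None:
        raise ValueError("active must be a valid level index")
    return action


low = ToyLevel(gain=1.0)
high = ToyLevel(gain=0.5, command=4.0)  # 4.0 plays the role of the task goal
levels = [low, high]

# The high level is active: it emits 0.5 * (4.0 - 2.0) = 1.0 as a command,
# and the low level turns it into 1.0 * (1.0 - 2.0) = -1.0.
first_action = descend(levels, active=1, obs=2.0)

# Only the low level is active: it keeps following the stored command 1.0.
second_action = descend(levels, active=0, obs=2.5)
```

In the second call the high level is not queried at all, which mirrors the idea that the descent starts at the active level rather than always at the top.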
- get_copy() → Hierarchy
Return a copy of the hierarchy.
No models are copied, only the structure and state of the hierarchy.
For example, the copy can be used for test rollouts without influencing the state of the original hierarchy. Because the models are not copied, learning with the copy does influence the original hierarchy and is not recommended.
- process_transition(obs_next: Tensor, rew: Tensor, terminated: Tensor, truncated: Tensor) → None
Process the environment transition and return control to the higher levels where appropriate.
Starting from the lowest level, the active level can pass control back to the level above. This can continue until a level stays active or the highest level is reached.
While ascending the hierarchy, register the (semi-MDP) transitions with the levels.
- Parameters:
obs_next – The next environment observation.
rew – The reward of the environment transition.
terminated – Whether the episode terminated. Tensor with one entry per environment instance.
truncated – Whether the episode was truncated. Tensor with one entry per environment instance.
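The ascent performed by process_transition can be pictured with the following toy sketch for a single environment instance. `ascend` and its `wants_to_return` flags are hypothetical stand-ins, index 0 is assumed to be the lowest level, and the registration of semi-MDP transitions is reduced to recording which levels gave up control.

```python
from typing import List, Tuple


def ascend(wants_to_return: List[bool]) -> Tuple[int, List[int]]:
    """Return the index of the level left active and the levels that handed back control.

    wants_to_return[i] says whether toy level i passes control to level i + 1.
    The ascent starts at the lowest level (index 0) and stops as soon as a
    level stays active or the highest level is reached.
    """
    top = len(wants_to_return) - 1
    active = 0
    returned: List[int] = []
    while active < top and wants_to_return[active]:
        # The library registers semi-MDP transitions during the ascent;
        # this sketch only records the level that gave up control.
        returned.append(active)
        active += 1
    return active, returned


stays_low = ascend([False, True, True])
one_up = ascend([True, False, True])
to_the_top = ascend([True, True, True])
```

Note that the highest level never hands control further up in this sketch, regardless of its flag, because there is no level above it.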
- reset() → None
Reset the hierarchy.
Call this at the beginning of the session. The highest level is active at the beginning for all environment instances.
Do not call at the end of episodes.
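The following sketch shows how reset, get_action, and process_transition might be combined in an interaction loop. `run_session` is a hypothetical helper, not part of layeredrl. It assumes a Gymnasium-style, auto-resetting vectorized environment whose outputs are already in the format the hierarchy expects; the stub classes exist only to make the example runnable.

```python
def run_session(hierarchy, env, n_steps):
    """Interact for n_steps and return the list of per-step rewards.

    Assumes env.reset() returns (obs, info) and env.step(action) returns
    (obs, rew, terminated, truncated, info), as in Gymnasium.
    """
    obs, _info = env.reset()
    hierarchy.reset()  # once per session; not repeated at episode ends
    rewards = []
    for _ in range(n_steps):
        action = hierarchy.get_action(obs)
        obs, rew, terminated, truncated, _info = env.step(action)
        hierarchy.process_transition(obs, rew, terminated, truncated)
        rewards.append(rew)
    return rewards


class StubEnv:
    """Stand-in for a single-instance vectorized environment."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        episode_over = self.t % 2 == 0  # every second step ends an episode
        return [float(self.t)], [1.0], [episode_over], [False], {}


class StubHierarchy:
    """Stand-in that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def reset(self):
        self.calls.append("reset")

    def get_action(self, obs):
        self.calls.append("get_action")
        return [0.0]

    def process_transition(self, obs_next, rew, terminated, truncated):
        self.calls.append("process_transition")


stub = StubHierarchy()
rewards = run_session(stub, StubEnv(), n_steps=3)
```

Even though an episode ends during the three steps, reset is called only once, in line with the note above.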
- set_n_env_instances(n_env_instances: int, propagate_to_levels: bool = True) → None
Set the number of environment instances.
- Parameters:
n_env_instances – The number of environment instances.
- class layeredrl.hierarchies.RandomHierarchy(env, device)
Bases: Hierarchy
A hierarchy consisting of a single level returning random actions.
The actions are sampled uniformly from the action space of the environment if the action space is finite or a bounded interval.
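As a rough picture of uniform sampling from a bounded, box-shaped action space, consider the sketch below. It uses only the standard library and is not how RandomHierarchy is implemented; `sample_uniform_actions` and its arguments are hypothetical.

```python
import random
from typing import List, Sequence


def sample_uniform_actions(
    low: Sequence[float],
    high: Sequence[float],
    n_env_instances: int,
    rng: random.Random,
) -> List[List[float]]:
    """Draw one action per environment instance, uniform in each component's bounds."""
    if len(low) != len(high):
        raise ValueError("low and high must have the same length")
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        for _ in range(n_env_instances)
    ]


rng = random.Random(0)  # seeded so that runs are reproducible
actions = sample_uniform_actions(
    low=[-1.0, 0.0], high=[1.0, 2.0], n_env_instances=4, rng=rng
)
```

The result has one row per environment instance, matching the batch dimension that get_action uses.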
- __init__(env, device)
Initialize the hierarchy.
- Parameters:
env – The environment.
device – The device to use.