Environments and Wrappers API
- class layeredrl.envs.LogRewWrapper(env, r_offset: float = 30.0, r_scale: float = 0.02)
Bases: RewardWrapper
Applies a logarithm to the reward.
- __init__(env, r_offset: float = 30.0, r_scale: float = 0.02)
Constructor for the Reward wrapper.
- Parameters:
env – Environment to be wrapped.
- reward(r: SupportsFloat) → SupportsFloat
Returns a modified environment reward.
- Parameters:
r – The env step() reward
- Returns:
The modified reward
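The exact transform is not spelled out above. As a minimal, dependency-free sketch of the reward-wrapper pattern, the code below assumes the form log(r_scale * (r + r_offset)). That formula, the `ConstantRewardEnv` stand-in, and the `LogRewardSketch` class are illustrative assumptions, not the layeredrl implementation.

```python
import math


class ConstantRewardEnv:
    """Hypothetical stand-in environment that always returns the same reward."""

    def __init__(self, reward):
        self._reward = reward

    def step(self, action):
        # Gymnasium-style tuple: (obs, reward, terminated, truncated, info)
        return None, self._reward, False, False, {}


class LogRewardSketch:
    """Illustrative wrapper: forwards step() and transforms only the reward."""

    def __init__(self, env, r_offset=30.0, r_scale=0.02):
        self.env = env
        self.r_offset = r_offset
        self.r_scale = r_scale

    def reward(self, r):
        # ASSUMPTION: log(r_scale * (r + r_offset)); the real formula may differ.
        shifted = float(r) + self.r_offset
        if shifted <= 0.0:
            raise ValueError("r + r_offset must be positive to take the log")
        return math.log(self.r_scale * shifted)

    def step(self, action):
        obs, r, terminated, truncated, info = self.env.step(action)
        return obs, self.reward(r), terminated, truncated, info


env = LogRewardSketch(ConstantRewardEnv(reward=20.0))
_, log_r, _, _, _ = env.step(action=None)
```

Under this assumed form, the offset keeps the argument of the logarithm positive for rewards above -r_offset, which is one plausible reason for a non-zero default.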
- class layeredrl.envs.AntFlippedWrapper(env: Env[ObsType, ActType])
Bases: Wrapper
Wrapper around the Ant Maze environment that terminates the episode when the Ant has flipped over.
- class layeredrl.envs.AntNoWallFlippedWrapper(env: Env[ObsType, ActType])
Bases: Wrapper
Wrapper around the Ant environment that terminates the episode when the Ant has flipped over.
- class layeredrl.envs.AffineRewWrapper(env, r_offset: SupportsFloat = 0.0, r_scale: SupportsFloat = 1.0)
Bases: RewardWrapper
Applies an affine linear transformation to the reward signal.
- __init__(env, r_offset: SupportsFloat = 0.0, r_scale: SupportsFloat = 1.0)
Constructor for the Reward wrapper.
- Parameters:
env – Environment to be wrapped.
- reward(r: SupportsFloat) → SupportsFloat
Returns a modified environment reward.
- Parameters:
r – The env step() reward
- Returns:
The modified reward
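An affine transformation scales the reward and shifts it by a constant. The sketch below assumes the order r_scale * r + r_offset. The documentation does not say whether the offset is applied before or after scaling, so both the order and the `affine_reward` helper are assumptions for illustration.

```python
def affine_reward(r, r_offset=0.0, r_scale=1.0):
    """Hypothetical helper: scale the reward, then shift it."""
    # ASSUMPTION: r_scale * r + r_offset; the wrapper may apply the offset first.
    return r_scale * float(r) + r_offset


# With the default arguments the reward passes through unchanged.
unchanged = affine_reward(2.5)

# Map rewards in [-10, 0] onto [0, 1] using r_scale=0.1 and r_offset=1.0.
low = affine_reward(-10.0, r_offset=1.0, r_scale=0.1)
high = affine_reward(0.0, r_offset=1.0, r_scale=0.1)
```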
- class layeredrl.envs.Maze2DEnv(maze_layout: ndarray | None = None, maze_size: Tuple[int, int] = (10, 10), cell_size: float = 1.0, start_pos: List[Tuple[float, float]] | None = None, goal_pos: List[Tuple[float, float]] | None = None, goal_radius: float = 0.3, max_velocity: float = 1.0, dt: float = 0.1, max_episode_steps: int = 400, dense_reward: bool = True, render_mode: str | None = None, pixel_size: int = 600)
Bases: Env
A simple 2D maze environment with a velocity-controlled point mass.
The agent controls velocity directly. The environment includes:
- Collision detection with walls
- Configurable maze layouts
- Pygame-based visualization for rendering and planning overlays
- Observation:
- Type: Dict with keys:
‘observation’: Box(2) - current position [x, y]
‘achieved_goal’: Box(2) - current position [x, y]
‘desired_goal’: Box(2) - goal position [x, y]
- Each Box has:
Min: [0, 0]
Max: [maze_width, maze_height]
- Action:
Type: Box(2)
Num  Action      Min            Max
0    x velocity  -max_velocity  max_velocity
1    y velocity  -max_velocity  max_velocity
- Reward:
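Sparse reward of 1.0 when reaching the goal, 0.0 otherwise. If dense_reward is True (the default), a dense reward based on the distance to the goal is provided instead. Can be customized with a reward function.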
- Episode Termination:
Agent reaches within goal_radius of the goal position
Episode length is greater than max_episode_steps
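The description above amounts to an Euler step on the position with a clipped velocity, followed by a goal check. A minimal sketch under stated assumptions: wall collisions are left out, the dense reward is taken to be the negative Euclidean distance to the goal, and `point_mass_step` is a hypothetical helper rather than a method of Maze2DEnv.

```python
import math


def point_mass_step(pos, action, goal, *, max_velocity=1.0, dt=0.1,
                    goal_radius=0.3, bounds=(10.0, 10.0), dense_reward=True):
    """Hypothetical one-step update for a velocity-controlled point mass.

    Wall collisions are deliberately omitted from this sketch.
    """
    # Clip each velocity component to [-max_velocity, max_velocity].
    vx, vy = (max(-max_velocity, min(max_velocity, a)) for a in action)
    # Euler integration, then keep the position inside [0, width] x [0, height].
    x = max(0.0, min(bounds[0], pos[0] + vx * dt))
    y = max(0.0, min(bounds[1], pos[1] + vy * dt))
    dist = math.hypot(goal[0] - x, goal[1] - y)
    terminated = dist <= goal_radius
    if dense_reward:
        # ASSUMPTION: dense reward is the negative distance to the goal.
        reward = -dist
    else:
        reward = 1.0 if terminated else 0.0
    obs = {
        "observation": (x, y),
        "achieved_goal": (x, y),
        "desired_goal": tuple(goal),
    }
    return obs, reward, terminated


# The requested x velocity of 5.0 is clipped to max_velocity before integrating.
obs, reward, terminated = point_mass_step(
    pos=(1.0, 1.0), action=(5.0, 0.0), goal=(1.2, 1.0), dense_reward=False
)
```

The step limit is not modeled here. A caller would count steps separately and end the episode once max_episode_steps is exceeded.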
- __init__(maze_layout: ndarray | None = None, maze_size: Tuple[int, int] = (10, 10), cell_size: float = 1.0, start_pos: List[Tuple[float, float]] | None = None, goal_pos: List[Tuple[float, float]] | None = None, goal_radius: float = 0.3, max_velocity: float = 1.0, dt: float = 0.1, max_episode_steps: int = 400, dense_reward: bool = True, render_mode: str | None = None, pixel_size: int = 600)
Initialize the Maze2D environment.
- Parameters:
maze_layout – Binary array where 1 = wall, 0 = free space. If None, creates empty maze.
maze_size – Size of the maze in cells (height, width) if maze_layout is None
cell_size – Size of each cell in world coordinates
start_pos – List of starting positions (x, y). If None, uses all empty cells.
goal_pos – List of goal positions (x, y). If None, uses all empty cells.
goal_radius – Distance threshold for reaching the goal
max_velocity – Maximum velocity magnitude in each dimension
dt – Time step for integration
max_episode_steps – Maximum steps per episode
dense_reward – If True, provide dense reward based on distance to goal
render_mode – “human” or “rgb_array”
pixel_size – Size of the rendering window in pixels
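When start_pos or goal_pos is None, candidate positions come from the empty cells of the layout. One way this could work is sketched below. The mapping of column to x and row to y, the use of cell centers, and the `free_cell_centers` helper are all assumptions, and a nested list stands in for the ndarray.

```python
def free_cell_centers(layout, cell_size=1.0):
    """Hypothetical helper: world-coordinate centers of all free cells."""
    centers = []
    for row, cells in enumerate(layout):
        for col, is_wall in enumerate(cells):
            if not is_wall:
                # ASSUMPTION: column index maps to x, row index maps to y.
                centers.append(((col + 0.5) * cell_size, (row + 0.5) * cell_size))
    return centers


# 1 = wall, 0 = free space, matching the maze_layout convention.
layout = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
candidates = free_cell_centers(layout, cell_size=2.0)
```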
- reset(seed: int | None = None, options: Dict[str, Any] | None = None) → Tuple[ndarray, Dict[str, Any]]
Reset the environment to its initial state.
- layeredrl.envs.create_simple_maze(size: int = 10) → ndarray
Create a simple maze with some walls.
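The wall pattern this function produces is not documented. For orientation, the sketch below builds one possible layout in the same binary convention (1 = wall, 0 = free space). The `bordered_maze` name and the specific walls are hypothetical, and a nested list is used in place of an ndarray to keep the example dependency-free.

```python
def bordered_maze(size=10):
    """Hypothetical layout: border walls plus one inner wall with a gap."""
    layout = [[0] * size for _ in range(size)]
    for i in range(size):
        layout[0][i] = layout[size - 1][i] = 1  # top and bottom walls
        layout[i][0] = layout[i][size - 1] = 1  # left and right walls
    mid = size // 2
    for row in range(1, size - 1):
        layout[row][mid] = 1  # vertical inner wall
    layout[mid][mid] = 0  # gap so both halves stay connected
    return layout


maze = bordered_maze(6)
```

A nested list like this can be converted with `numpy.asarray(maze)` before being passed as maze_layout.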