Agent names

walker_i for i in [0, 2]

Action Space

Box(-1.0, 1.0, (4,), float32)

Observation Space

Box(-inf, inf, (31,), float32)

Reward Space

Box([-2.100e+02 -1.567e-02], [-209.54 0. ], (2,), float32)



A sister environment to MO-Multiwalker, which is the multi-objective (MO) adaptation of the Multiwalker environment from PettingZoo.

Observation Space

See PettingZoo documentation.

Action Space

The action space is a continuous 4-element vector representing the force exerted at each of the 4 available joints (hips and knees). Each element has a lower bound of -1 and an upper bound of 1.
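As a minimal sketch of the bounds described above, assuming only NumPy, a raw policy output can be coerced into a valid action for one walker. The helper name `clip_action` is hypothetical and not part of the environment's API, and the ordering of the joints within the vector is not specified here.

```python
import numpy as np

ACTION_LOW, ACTION_HIGH = -1.0, 1.0  # bounds stated in the action space description
NUM_JOINTS = 4  # hips and knees


def clip_action(raw_action):
    """Hypothetical helper: turn a raw policy output into a valid action."""
    action = np.asarray(raw_action, dtype=np.float32)
    if action.shape != (NUM_JOINTS,):
        raise ValueError(f"expected shape ({NUM_JOINTS},), got {action.shape}")
    # Values outside [-1, 1] are saturated at the nearest bound.
    return np.clip(action, ACTION_LOW, ACTION_HIGH)


clipped = clip_action([1.7, -0.3, 0.0, -2.5])
```

Clipping is only one way to handle out-of-range values; squashing the policy output with `tanh` is a common alternative.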

Reward Space

The reward space is a 2D vector. The first value contains the following reward:

  • Maximizing distance traveled towards the end of the level during one step. [-0.46, 0.46]

and the second value contains:

  • A penalty based on the change of angle of the package, to avoid shaking the package. [-0.01567, 0]

Both of these objectives are additionally penalized with:

  • Penalty for agent falling. [-110, 0]

  • Penalty for the package falling. [-100, 0]
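The composition described above can be sketched as follows. This is an illustration under stated assumptions, not the environment's actual implementation: the function `compose_reward` and its parameters are hypothetical, and it assumes that each fall penalty is applied at its full magnitude, to both objectives, on the step where the event occurs. The bounds reported by the environment's reward space remain authoritative.

```python
import numpy as np

AGENT_FALL_PENALTY = -110.0    # lower end of the range stated above
PACKAGE_FALL_PENALTY = -100.0  # lower end of the range stated above


def compose_reward(forward_progress, angle_penalty,
                   agent_fell=False, package_fell=False):
    """Hypothetical sketch of the per-step 2D reward vector."""
    # Assumption: penalties are all-or-nothing and shared by both objectives.
    fall_penalty = 0.0
    if agent_fell:
        fall_penalty += AGENT_FALL_PENALTY
    if package_fell:
        fall_penalty += PACKAGE_FALL_PENALTY
    return np.array(
        [forward_progress + fall_penalty,  # objective 1: distance traveled
         angle_penalty + fall_penalty],    # objective 2: package stability
        dtype=np.float32,
    )


normal_step = compose_reward(0.2, -0.005)
worst_case = compose_reward(0.0, 0.0, agent_fell=True, package_fell=True)
```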

Episode Termination

The episode terminates if the package is dropped. If terminate_on_fall is True (default), the episode also terminates as soon as a single agent falls, even if the package has not been dropped.
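The termination rule above can be expressed as a small predicate. This is a sketch of the logic only; the function `is_terminated` and its first two parameters are hypothetical names, and only terminate_on_fall comes from the description above.

```python
def is_terminated(package_dropped, any_walker_fallen, terminate_on_fall=True):
    """Hypothetical predicate mirroring the termination rule described above."""
    # A dropped package always ends the episode.
    if package_dropped:
        return True
    # A fallen walker ends the episode only when terminate_on_fall is enabled.
    return terminate_on_fall and any_walker_fallen


ends_on_fall = is_terminated(package_dropped=False, any_walker_fallen=True)
continues_on_fall = is_terminated(
    package_dropped=False, any_walker_fallen=True, terminate_on_fall=False
)
```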


See PettingZoo documentation.