MO-MultiwalkerStability¶
Agents names |
Action Space | Box(-1.0, 1.0, (4,), float32)
Observation Space | Box(-inf, inf, (31,), float32)
Reward Space | Box([-2.100e+02 -1.567e-02], [-209.54 0. ], (2,), float32)
Import |
MO-MultiwalkerStability is a sister environment to MO-Multiwalker, the multi-objective (MO) adaptation of the Multiwalker environment from PettingZoo.
Observation Space¶
Action Space¶
Each action is a continuous 4-element vector representing the force exerted at the 4 available joints (hips and knees).
The upper bound of each element is 1 and the lower bound is -1.
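As a small illustration of these bounds, the sketch below clamps a candidate action into the valid range before it would be passed to the environment. The helper `clip_action` and the constants are hypothetical names introduced here for illustration; they are not part of the environment's API, and whether the environment clips out-of-range actions itself is not stated in this page.

```python
ACTION_LOW, ACTION_HIGH = -1.0, 1.0
NUM_JOINTS = 4  # hips and knees, as described above


def clip_action(action):
    """Clamp each joint force into [-1, 1]; reject vectors of the wrong length.

    Hypothetical helper for illustration only.
    """
    if len(action) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} values, got {len(action)}")
    return [min(ACTION_HIGH, max(ACTION_LOW, float(a))) for a in action]


clipped = clip_action([0.3, -1.7, 1.2, 0.0])
print(clipped)  # [0.3, -1.0, 1.0, 0.0]
```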
Reward Space¶
The reward is a 2D vector. The first value contains the following reward:
Maximizing the distance traveled towards the end of the level during one step, in the range [-0.46, 0.46].
The second value contains:
A penalty based on the change in the angle of the package, to discourage shaking the package, in the range [-0.01567, 0].
Both of these objectives are additionally penalized with:
A penalty for an agent falling, in the range [-110, 0].
A penalty for the package falling, in the range [-100, 0].
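The components of the reward vector can be combined as in the following sketch. It assumes the fall penalties are simply added to both objectives, which is one reading of the description above rather than the environment's confirmed implementation. The function `stability_reward` and its parameter names are hypothetical.

```python
def stability_reward(forward_progress, package_angle_penalty,
                     agent_fall_penalty=0.0, package_fall_penalty=0.0):
    """Assemble a 2D reward from the components listed above.

    Assumption: the fall penalties are added to both objectives.
    Expected ranges, taken from the text:
      forward_progress      in [-0.46, 0.46]
      package_angle_penalty in [-0.01567, 0]
      agent_fall_penalty    in [-110, 0]
      package_fall_penalty  in [-100, 0]
    """
    shared = agent_fall_penalty + package_fall_penalty
    return (forward_progress + shared, package_angle_penalty + shared)


# An ordinary step: some forward progress, slight package wobble, no falls.
print(stability_reward(0.2, -0.005))  # (0.2, -0.005)

# Worst case for the shared penalties: an agent and the package both fall.
print(stability_reward(0.0, 0.0, -110.0, -100.0))  # (-210.0, -210.0)
```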
Episode Termination¶
The episode is terminated if the package is dropped. If terminate_on_fall is True (the default), the episode is also terminated if a single agent falls, even if the package has not been dropped.
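The termination rule can be written as a small predicate. This is a minimal sketch of the logic as described, not the environment's source code. Only `terminate_on_fall` comes from the text; `is_terminated`, `package_dropped`, and `agents_fallen` are hypothetical names.

```python
def is_terminated(package_dropped, agents_fallen, terminate_on_fall=True):
    """Return True if the episode should end under the rule described above.

    package_dropped: bool, whether the package has been dropped.
    agents_fallen: iterable of bools, one per agent.
    terminate_on_fall: if True, a single fallen agent ends the episode.
    """
    if package_dropped:
        return True
    return bool(terminate_on_fall and any(agents_fallen))


# One agent has fallen but the package is still carried.
print(is_terminated(False, [False, True, False]))  # True
print(is_terminated(False, [False, True, False], terminate_on_fall=False))  # False
```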