Flatten Discrete to Box is problematic #355

rusu24edward · 2022-10-28T00:22:56Z

Wrappers in Abmarl make a deepcopy of the agents. The original agents in the simulation have their original spaces, and the new, copied agents have the new spaces. In the case of the flatten wrapper, these new spaces are Boxes. The flatten wrapper converts Discrete to Box as a one-hot encoding. Suppose the original space is Discrete(3), then:

0 maps to [1, 0, 0]
1 maps to [0, 1, 0]
2 maps to [0, 0, 1]

When we sample the action space for random actions, it samples the Box, which can produce any of the eight combinations of 0s and 1s in a three-element array, namely:

[0, 0, 0],
[0, 0, 1], *
[0, 1, 0], *
[0, 1, 1],
[1, 0, 0], *
[1, 0, 1],
[1, 1, 0],
[1, 1, 1]

Only the three starred arrays are usable in the strict sense of the mapping. The unflatten function for a Discrete space uses np.nonzero(x)[0][0], and here is a table of what the above arrays map to:

+ ------------------ + ---------------- + --------------------------------------------- +
| In Flattened Space | np.nonzero(x)[0] | np.nonzero(x)[0][0] (aka discrete equivalent) |
+ ------------------ + ---------------- + --------------------------------------------- +
| 0, 0, 0            | []               | IndexError                                    |
| 0, 0, 1            | [2]              | 2                                             |
| 0, 1, 0            | [1]              | 1                                             |
| 0, 1, 1            | [1, 2]           | 1                                             |
| 1, 0, 0            | [0]              | 0                                             |
| 1, 0, 1            | [0, 2]           | 0                                             |
| 1, 1, 0            | [0, 1]           | 0                                             |
| 1, 1, 1            | [0, 1, 2]        | 0                                             |
+ ------------------ + ---------------- + --------------------------------------------- +

Implications

[0, 0, 0] will fail because it has no nonzero element, so np.nonzero(x)[0][0] raises an IndexError.
Importantly, only one eighth of the random samples will map to 2. One fourth will map to 1, and one half will map to 0 (the remaining eighth is the failing [0, 0, 0]). This biases exploration toward action 0, which matters especially if action 2 is the “correct action” throughout much of the simulation.
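
To check these proportions, here is a minimal sketch that enumerates all eight arrays and decodes each one with np.nonzero(x)[0][0]. It uses only NumPy; the helper name unflatten_discrete is hypothetical and stands in for the wrapper's real unflatten function, which is not shown in this issue.

```python
from collections import Counter
from itertools import product

import numpy as np


def unflatten_discrete(x):
    """Hypothetical stand-in: decode by taking the first nonzero index."""
    return int(np.nonzero(x)[0][0])


counts = Counter()
for bits in product([0, 1], repeat=3):
    x = np.array(bits)
    try:
        counts[unflatten_discrete(x)] += 1
    except IndexError:
        # [0, 0, 0] has no nonzero element, so the index array is empty.
        counts["error"] += 1

# Number of the eight arrays that decode to each action (or fail).
print(dict(counts))
```

If each of the eight arrays is equally likely under random sampling, these counts divided by eight give the fractions described above.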

Solution

This is unique to Discrete spaces. Instead of mapping to a one-hot encoding we could just map to a box of a single element with the appropriate range. Discrete(n) maps to Box(0, n-1, (1,), int) instead of Box(0, 1, (n,), int).
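
A rough sketch of how the proposed mapping would behave, under some assumptions: the helper names flatten_discrete and unflatten_discrete are hypothetical, and rng.integers stands in for sampling a Box(0, n-1, (1,), int). This is not the code from the eventual fix.

```python
import numpy as np

n = 3  # size of the original Discrete(n) space


def flatten_discrete(action):
    """Hypothetical: Discrete value -> single-element integer array."""
    return np.array([action], dtype=int)


def unflatten_discrete(x):
    """Hypothetical: single-element array -> Discrete value."""
    return int(x[0])


# Stand-in for sampling Box(0, n-1, (1,), int): integers in [0, n-1].
rng = np.random.default_rng(0)
samples = rng.integers(low=0, high=n, size=(3000, 1))

decoded = [unflatten_discrete(x) for x in samples]
counts = np.bincount(decoded, minlength=n)
print(counts)  # how often each action was decoded
```

With this mapping, every sample decodes to exactly one valid action, so there is no failing case and no bias toward action 0.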


rusu24edward · 2022-10-31T19:01:10Z

Wait a bit to hear back from rllib and gym before making a change here.

rusu24edward added the bug label Oct 28, 2022

rusu24edward added this to the Coming soon! milestone Oct 28, 2022

rusu24edward mentioned this issue Nov 4, 2022

Abmarl 355 flatten discrete to box #356

Merged

rusu24edward closed this as completed in #356 Nov 4, 2022

rusu24edward self-assigned this Nov 4, 2022

rusu24edward modified the milestones: Coming soon!, Next release Nov 4, 2022
