GymWrapper observation spec

Hey!
I’m wrapping a simple MultiArmedBandit Gymnasium env I created.
The original Gymnasium env always returns the state 0, and its observation space is set to spaces.Discrete(1).

However, when I wrap it with GymWrapper, the observations are always 1 instead of 0.

It seems the issue has something to do with TorchRL’s usage of OneHotDiscreteTensorSpec, which changes the state to 1 when calling spec.encode in TorchRL’s internals.

How can it be fixed?
Much appreciated.

Try passing categorical_action_encoding=True to GymWrapper; that should work.
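
For anyone wondering why the flag helps, here is a rough plain-Python sketch of the two encodings. It is only an illustration, not TorchRL's actual code: one_hot_encode and categorical_encode are hypothetical helpers made up for this post. With a Discrete(1) space, the one-hot vector for state 0 has a single entry and that entry is 1, which is presumably the "1" showing up in the observations. A categorical encoding keeps the integer index as-is.

```python
def one_hot_encode(value: int, n: int) -> list[int]:
    """Hypothetical helper: one-hot encode an integer in [0, n) as a length-n list."""
    if not 0 <= value < n:
        raise ValueError(f"value {value} is outside [0, {n})")
    return [1 if i == value else 0 for i in range(n)]


def categorical_encode(value: int, n: int) -> int:
    """Hypothetical helper: categorical encoding keeps the integer index unchanged."""
    if not 0 <= value < n:
        raise ValueError(f"value {value} is outside [0, {n})")
    return value


state = 0      # the only state the bandit env ever returns
n_states = 1   # size of spaces.Discrete(1)

one_hot_obs = one_hot_encode(state, n_states)
categorical_obs = categorical_encode(state, n_states)

print(one_hot_obs)      # [1]
print(categorical_obs)  # 0
```

So the wrapped env is not changing the state; the one-hot representation of index 0 in a space of size 1 just happens to be a lone 1. Passing categorical_action_encoding=True when constructing GymWrapper should make it use the integer form instead.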


Completely missed that flag, thanks!