Dear all, I am new to reinforcement learning and I want to create a 4D gym environment whose space is 468x225x182x54. Every location in this space has a unique reward (or penalty). My agent (e.g. a rabbit) can jump anywhere in this space and is rewarded based on where it lands. I thought my action space could be defined as:
    import gym

    class CustomEnv(gym.Env):
        def __init__(self):
            self.action_space = gym.spaces.MultiDiscrete([468, 225, 182, 54])

    env = CustomEnv()  # action_space is set in __init__, so an instance is needed
    print(env.action_space.sample())  # e.g. [172 54 101 37]
So, in this example, my agent collects the reward of the location [172 54 101 37].
Within an episode, I want the step function to work like this: the rabbit makes a jump, collects the reward, and the function returns that reward together with the state, i.e. where the rabbit landed in this 4D space.
However, I don't know how I should define my observation space, and I would really appreciate your help.
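To make the question concrete, here is a rough sketch of the behaviour I have in mind. It rests on assumptions: `JumpEnv` and `toy_reward` are hypothetical names, the reward function is a placeholder for the real per-location rewards, and the observation space simply mirrors the action space (the agent's current cell), which is exactly the part I am unsure about. The `step` and `reset` return values follow the five-value API of gym 0.26+ / gymnasium; older gym versions return `(obs, reward, done, info)` instead.

```python
import numpy as np

try:
    import gymnasium as gym  # maintained successor of gym, same core API
except ImportError:
    import gym

DIMS = (468, 225, 182, 54)  # grid size from the post


def toy_reward(position):
    # Hypothetical stand-in for the real per-location reward lookup.
    return -float(np.sum(position))


class JumpEnv(gym.Env):
    """Agent jumps to any cell of a 4D grid and is rewarded for where it lands."""

    def __init__(self, dims=DIMS, reward_fn=toy_reward):
        self.dims = tuple(dims)
        self.reward_fn = reward_fn
        # An action is the target cell: one index per dimension.
        self.action_space = gym.spaces.MultiDiscrete(list(self.dims))
        # Assumption: the observation is the agent's current cell, so it
        # has the same shape and bounds as an action.
        self.observation_space = gym.spaces.MultiDiscrete(list(self.dims))
        self.state = None

    def reset(self, seed=None, options=None):
        if seed is not None:
            self.observation_space.seed(seed)
        # Start the episode in a random cell.
        self.state = self.observation_space.sample()
        return self.state, {}

    def step(self, action):
        action = np.asarray(action, dtype=np.int64)
        assert self.action_space.contains(action), "action outside the grid"
        self.state = action  # the jump: the new state is the chosen cell
        reward = self.reward_fn(self.state)
        terminated, truncated = False, False  # episode end rule not decided yet
        return self.state, reward, terminated, truncated, {}
```

With this sketch, `env = JumpEnv()`, `env.reset()` and then `env.step(env.action_space.sample())` would return the landing cell as the observation along with its reward. I deliberately used a reward function rather than a full lookup array, since the grid has about 1.03 billion cells (468 * 225 * 182 * 54) and a dense table would need several gigabytes of memory.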