Mario RL example with newer versions of libraries

I’m working through the Mario RL tutorial and have found that it’s a bit out of date with the latest versions of the libraries. I’ve worked around a few issues, such as updating to the new env API, where done is split into terminated and truncated.

next_state, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
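If the same script has to run against both older and newer library versions, one option is a small adapter around `env.step`. The sketch below is only an illustration: `step_compat` and `_FakeEnv` are hypothetical names, not part of the tutorial or of any library, and the code assumes the older API returns a 4-tuple and the newer one a 5-tuple.

```python
def step_compat(env, action):
    """Step `env` and return (obs, reward, done, info) for either API shape."""
    result = env.step(action)
    if len(result) == 5:
        # Newer API: obs, reward, terminated, truncated, info
        obs, reward, terminated, truncated, info = result
        done = terminated or truncated
    else:
        # Older API: obs, reward, done, info
        obs, reward, done, info = result
    return obs, reward, done, info


class _FakeEnv:
    """Hypothetical stand-in for a real environment, used only for this demo."""

    def step(self, action):
        # Mimics the newer 5-tuple return: the episode was truncated.
        return [0, 0, 0, 0], 1.0, False, True, {}


obs, reward, done, info = step_compat(_FakeEnv(), action=0)
print(done)
```

With this in place the training loop can keep using a single `done` flag without caring which API version is installed.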

I’m now stuck on the part where LazyFrames are converted to PyTorch tensors, which raises the following error:

state = torch.FloatTensor(state).cuda() if self.use_cuda else torch.FloatTensor(state)
expected sequence of length 4 at dim 1 (got 0)
  File "/home/user/stats/mario_rl/MadMario/mario_pytorch_example.py", line 342, in cache
    state = torch.FloatTensor(state).cuda() if self.use_cuda else torch.FloatTensor(state)
  File "/home/user/stats/mario_rl/MadMario/mario_pytorch_example.py", line 726, in <module>
    mario.cache(state, next_state, action, reward, done)

Can anyone suggest how to proceed with the conversion? I’ve tried casting state with np.ndarray(state), as suggested elsewhere, but that didn’t work. Thanks in advance.

It seems the states returned by the environment have changed: they are now tuples that carry an extra info object alongside the observation.

state = self.observation(None), info

I was able to change the code to use just the observation part of the state, and things now appear to be running. I will have to read more about what the info object contains. After I clean things up, is there any interest in a pull request with the changes for the latest library versions? I have also set things up in a Poetry environment, because the conda environment from the linked GitHub repository was not working for me; I think some of its packages are missing or out of date.
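For anyone hitting the same error, here is a minimal sketch of the workaround described above. It would also likely explain the message "expected sequence of length 4 at dim 1 (got 0)": the state passed to torch.FloatTensor was a two-element tuple whose first item has length 4 (the stacked frames) and whose second is an empty info dict. The helper name `observation_only` is hypothetical, and the check assumes the info object is always a dict.

```python
def observation_only(reset_result):
    """Return just the observation from the output of `env.reset()`.

    Assumes the newer API returns a 2-tuple `(observation, info)` with `info`
    a dict, and the older API returns the observation on its own.
    """
    if (
        isinstance(reset_result, tuple)
        and len(reset_result) == 2
        and isinstance(reset_result[1], dict)
    ):
        return reset_result[0]
    return reset_result


# Stand-in data for this demo: four stacked "frames" plus an empty info dict.
frames = [[0.0], [0.0], [0.0], [0.0]]
state = observation_only((frames, {}))
print(len(state))  # 4
```

In the tutorial loop this would be applied to the result of `env.reset()` before the state reaches `mario.cache(...)`, so that only the observation is converted to a tensor.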

PR is here: fix: update versions and readme by rohitpid · Pull Request #16 · yfeng997/MadMario · GitHub