Inplace operation errors when implementing the A2C algorithm

Hi,
I’m implementing the A2C algorithm from scratch, but I keep running into: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 1]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead.
I run 8 worker networks in parallel to accumulate experience and then optimize the global network via the optimize function once per episode. The first episode runs fine, but the second one fails. The error seems to come from the critic layer, and since the actor and critic share all non-output layers, that is probably where the bug is triggered.
My PyTorch version is 1.10.2+cu113.
Thanks,

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# LR, GAMMA and ENTROPY_WEIGHT are hyperparameters defined elsewhere in my script.

class ActorCriticNet(nn.Module):
    def __init__(self, scope, n_channels, n_actions):
        super(ActorCriticNet, self).__init__()
        self.scope = scope
        self.net = nn.Sequential(
            nn.Conv2d(n_channels, 32, 8, 4),
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2),
            nn.ReLU(),
            nn.Conv2d(64, 32, 3, 1),
            nn.ReLU(),
        )
        self.fc = nn.Linear(7*7*32, 512)
        self.actor = nn.Linear(512, n_actions)
        self.critic = nn.Linear(512, 1)
        self.optimizer = optim.Adam(self.parameters(), lr=LR)

    def forward(self, x):
        x = self.net(x)
        x = x.view(-1, 7*7*32)
        x = F.relu(self.fc(x))
        policy = F.softmax(self.actor(x), dim=-1)
        value = self.critic(x)
        return policy, value

    def optimize(self, workers):
        if self.scope == 'global':
            for worker in workers:
                self.optimizer.zero_grad()
                r = 0
                for reward, proba, val in worker.data[::-1]:
                    r = reward + GAMMA*r
                    policy_loss = -torch.log(proba) * (r - val)
                    entropy_loss = -ENTROPY_WEIGHT * (proba * torch.log(proba))
                    value_loss = (r - val) ** 2
                    loss = policy_loss + entropy_loss + value_loss
                    loss.backward(retain_graph=True)
                self.optimizer.step()

I guess the error is raised, since you are retaining the graph in the backward call:

loss.backward(retain_graph=True)

Usually, this argument is used as a workaround to try to fix another error, and it then creates these “inplace manipulation” errors, since the parameters were already updated in-place in the previous iteration.
Could you explain why you are using retain_graph=True in your code?
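For illustration, this kind of failure can be reproduced with a small toy example (completely made up, not taken from your code): the graph saves a parameter for the backward pass, optimizer.step() then updates that parameter in-place, and the second backward through the retained graph hits the version check:

import torch
import torch.nn as nn

# Two layers, so the second layer's weight is saved for the backward pass.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

out = model(torch.randn(2, 4)).sum()
out.backward(retain_graph=True)  # first backward works
opt.step()                       # updates the saved weight in-place
out.backward()                   # raises the "modified by an inplace operation" error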

When I remove the retain_graph I encounter another RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
So I followed the hint, since loss.backward() is called many times in my loop, but I’m not really sure retain_graph is the best option.

I think we should focus on fixing the first issue properly without using retain_graph=True unless this is really your use case.

RuntimeError: Trying to backward through the graph a second time 

is raised e.g. if you are appending the current computation graph to the previous one (which was already freed during the previous backward call).
This is often the case e.g. if you are using a recurrent structure without detaching the inputs in the new iteration.
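As a toy illustration of that recurrent case (again a made-up snippet, not your code): if the hidden state is not detached, each iteration extends the old graph, and the next backward() fails because that part was already freed:

import torch
import torch.nn as nn

cell = nn.Linear(4, 4)
h = torch.zeros(1, 4)

for step in range(3):
    x = torch.randn(1, 4)
    h = torch.tanh(cell(x) + h)
    loss = h.pow(2).sum()
    loss.backward()
    # Without this detach, h would still carry the previous graph and the next
    # loss.backward() would raise "Trying to backward through the graph a second time".
    h = h.detach()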
Based on your code I assume that all workers are independent, or do they share some parameters, data, etc.?
I would probably start checking if

        for worker in workers:
            self.optimizer.zero_grad()
            r = 0
            for reward, proba, val in worker.data[::-1]:
                r = reward + GAMMA*r
                policy_loss = -torch.log(proba) * (r - val)
                entropy_loss = -ENTROPY_WEIGHT * (proba * torch.log(proba))
                value_loss = (r - val) ** 2
                loss = policy_loss + entropy_loss + value_loss
                loss.backward(retain_graph=True)
            self.optimizer.step()

reuses some tensors (and thus also the computation graph).

I found the problem. It’s because I did not reset worker.data after every episode, so the network kept seeing the same data (and its old computation graph) again and again. retain_graph is unnecessary anyway.
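For anyone hitting the same thing, here is a stripped-down sketch of the failure mode and the fix (just a toy net and a plain list standing in for worker.data, not my actual worker code):

import torch
import torch.nn as nn

net = nn.Linear(4, 2)
opt = torch.optim.SGD(net.parameters(), lr=0.01)
buffer = []                       # plays the role of worker.data

for episode in range(2):
    out = net(torch.randn(3, 4))  # "experience" that is still attached to the graph
    buffer.append(out.sum())

    opt.zero_grad()
    loss = sum(buffer)
    loss.backward()               # without the clear() below, episode 1 would try to
    opt.step()                    # backward through episode 0's already-freed graph

    buffer.clear()                # the fix: reset the stored data after every episode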
Thanks for helping me solve the issue!


Hello @DungNguyen

It appears as if I’m facing the exact same problem as you while implementing multiple A2C agents in an RL gym env. I have encountered the same errors, and I suspect your solution might be helpful for my use case.

Can you explain what worker.data represents conceptually and how you went about resetting it after every episode?

Thanks in advance 🙂

Hi @fahmyadan
Can you check here if one of the errors relates to what you’re seeing?
https://pytorch.org/rl/reference/generated/knowledge_base/PRO-TIPS.html

Hi @vmoens ,

I figured out that the issue I was facing was a result of how I was training the A2C model.

The agent, in my case, is set up to train on experience collected within each episode and to update the parameters based on the loss of that episode.

The problem was that I wasn’t resetting the buffer at the end of the episodes, so the gradients were being computed from the same experience multiple times, which led to the error.

By adding some del action and del reward statements to clear my buffer at the end of the training logic, I was able to train the algorithm.

Thanks for all the support @vmoens