Error with gradient backprop when not doing batch

Fredrik_Lundberg · June 26, 2021, 9:46am

Hi!
I have implemented a form of option-critic network that updates the critic in batches but the actor iteratively. The first update is done on both once a batch has accumulated; after that the actor is updated iteratively and the critic only intermittently (when enough new samples have accumulated). The first update works fine, but when I then update only the actor I get the error below, and the traceback points to this line:
nonspatial_latent_output = torch.unsqueeze(self.latent_nonspatial(nonspatial_input),0)
…
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [262176, 256]], which is output 0 of TBackward, is at version 2; expected version 1 instead

The only difference from the batch update is that there the variable is squeezed instead, but from what I gather neither squeeze nor unsqueeze should be in-place operations.
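One way to check this claim is to look at the version counter that autograd compares in the error message ("is at version 2; expected version 1"). The sketch below is only an illustration: `version_bump` is a hypothetical helper name, and `_version` is an internal tensor attribute, so treat it as a debugging aid rather than a stable API.

```python
import torch

def version_bump(op):
    """Return how much `op` increases the version counter of a fresh tensor.

    `version_bump` is a hypothetical helper for illustration; `_version` is an
    internal attribute that autograd uses to detect in-place modification.
    """
    x = torch.zeros(3)
    before = x._version
    op(x)
    return x._version - before

# Out-of-place ops return a new tensor or view and leave the counter alone.
unsqueeze_bump = version_bump(lambda t: t.unsqueeze(0))
squeeze_bump = version_bump(lambda t: t.squeeze())
# A true in-place op (trailing underscore) increments it.
add_inplace_bump = version_bump(lambda t: t.add_(1))
```

If the first two values come out as zero, the squeeze/unsqueeze call on the reported line is unlikely to be the culprit, and the in-place change probably happens somewhere else between the forward and backward pass.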

Any help would be greatly appreciated!!

Ayman_Al_Jabri · June 29, 2021, 10:06am

Try this:

nonspatial_latent_output = self.latent_nonspatial(nonspatial_input)
nonspatial_latent_output = nonspatial_latent_output.unsqueeze(0)

Fredrik_Lundberg · June 29, 2021, 4:47pm

Thank you for your reply. I think I tried that, but it did not work. I have, however, found a solution: I was also using that feature network to produce the state for the next observation before computing gradients. When I changed this so that only one observation had been put through the feature network before computing the gradients, the error was no longer raised. I am not quite sure why, since I did not use that state for any loss or anything, but maybe it confused its histories, since two observations had now been put through the feature network.
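For anyone landing here later, one common way to hit this error is sketched below. It is an assumption that this matches the setup described above, not a confirmed diagnosis: a second forward pass keeps its graph alive, an optimizer step then updates the weights in place, and a later backward through the older graph finds the saved weights changed. The network, shapes, and function names are all made up for illustration.

```python
import torch
import torch.nn as nn

def make_net():
    # Hypothetical stand-in for the feature network; sizes are arbitrary.
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
    return net, torch.optim.SGD(net.parameters(), lr=0.1)

def stale_graph_backward():
    """Backward through a graph built before opt.step(); returns the error text or None."""
    net, opt = make_net()
    obs, next_obs = torch.randn(1, 4), torch.randn(1, 4)
    out = net(obs)
    next_out = net(next_obs)      # second forward pass, graph stays alive
    opt.zero_grad()
    out.sum().backward()
    opt.step()                    # updates the weights in place
    try:
        next_out.sum().backward() # graph still refers to the old weights
    except RuntimeError as err:
        return str(err)
    return None

def no_stale_graph():
    """Same steps, but the extra forward pass records no graph."""
    net, opt = make_net()
    obs, next_obs = torch.randn(1, 4), torch.randn(1, 4)
    with torch.no_grad():
        next_out = net(next_obs)  # value only, nothing saved for backward
    for _ in range(2):
        out = net(obs)            # rebuild the graph after every step
        opt.zero_grad()
        out.sum().backward()
        opt.step()
    return next_out.requires_grad

stale_error = stale_graph_backward()
kept_graph = no_stale_graph()
```

If the next-observation state is only needed as a value, wrapping that forward pass in `torch.no_grad()` (or detaching its output) may be enough. Calling `torch.autograd.set_detect_anomaly(True)` before the update can also help locate the forward operation whose saved tensor was modified.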