RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 3]], which is output 0 of TBackward, is at version 4; expected version 3 instead

I’m facing this problem — some help, please? I have already tried several fixes, but none of them worked. My setup: I have a number N of agents, and each agent owns an independent actor and critic. Each agent receives different states according to the label assigned to it.

```
all_agents = []
all_agents.append(Agent(actor_dims, critic_dims))

for agent_idx, agent in enumerate(all_agents):
    i = agent.agent_label

    critic_value_ = agent.target_critic.forward(states_[i], new_actions_cluster[i]).flatten()
    critic_value = agent.critic.forward(states[i], old_actions_cluster[i]).flatten()

    target = rewards[:, agent_idx] + agent.gamma * critic_value_
    critic_loss = F.mse_loss(critic_value.float(), target.float())

    agent.critic.optimizer.zero_grad()
    critic_loss.backward(retain_graph=True)

    actor_loss = agent.critic.forward(states[i], mu_cluster[i]).flatten()
    actor_loss = -(T.mean(actor_loss))

    agent.actor.optimizer.zero_grad()
    actor_loss.backward()

    agent.critic.optimizer.step()
    agent.actor.optimizer.step()
```
The full error message:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 3]], which is output 0 of TBackward, is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
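For reference, here is a minimal, self-contained sketch of the restructuring that is usually suggested for this error: compute the target branch under `torch.no_grad()`, drop `retain_graph=True`, and rebuild the actor loss from a fresh forward pass after the critic has already been stepped, so no `backward()` ever walks through parameters that an earlier `step()` modified in place. The `Agent` class, the linear networks, and all dimensions below are toy stand-ins, not my real code.

```python
# Hedged sketch: toy stand-ins for the real Agent / actor / critic classes.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_AGENTS, BATCH, STATE_DIM, ACT_DIM = 2, 64, 8, 3

class Agent:
    def __init__(self):
        self.actor = nn.Linear(STATE_DIM, ACT_DIM)
        self.critic = nn.Linear(STATE_DIM + ACT_DIM, 1)
        self.target_critic = nn.Linear(STATE_DIM + ACT_DIM, 1)
        self.actor_opt = torch.optim.Adam(self.actor.parameters(), lr=1e-3)
        self.critic_opt = torch.optim.Adam(self.critic.parameters(), lr=1e-3)
        self.gamma = 0.99

agents = [Agent() for _ in range(N_AGENTS)]
states = torch.randn(BATCH, STATE_DIM)
states_ = torch.randn(BATCH, STATE_DIM)        # next states
rewards = torch.randn(BATCH, N_AGENTS)
old_actions = torch.randn(BATCH, ACT_DIM)      # actions from the replay buffer

for i, ag in enumerate(agents):
    # ---- critic update ----
    with torch.no_grad():                       # target branch: no graph, so no retain_graph
        new_actions = ag.actor(states_)         # real code would use the target actor here
        q_next = ag.target_critic(torch.cat([states_, new_actions], dim=1)).flatten()
        target = rewards[:, i] + ag.gamma * q_next
    q = ag.critic(torch.cat([states, old_actions], dim=1)).flatten()
    critic_loss = F.mse_loss(q, target)
    ag.critic_opt.zero_grad()
    critic_loss.backward()                      # graph is used once, then discarded
    ag.critic_opt.step()

    # ---- actor update: a FRESH forward pass, after the critic step ----
    mu = ag.actor(states)
    actor_loss = -ag.critic(torch.cat([states, mu], dim=1)).mean()
    ag.actor_opt.zero_grad()
    actor_loss.backward()
    ag.actor_opt.step()
```

Because every loss is rebuilt from a fresh forward pass after any in-place parameter update, autograd never sees a tensor whose version counter changed between forward and backward — which is exactly what the "is at version 4; expected version 3" message complains about.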

Double post from here.