Multi-Agent Advantage calculation is leading to in-place gradient error

Hi Aaron!

Starting with the forward-call traceback, look at the last couple of lines
in your own code (those that are then followed by calls into PyTorch
infrastructure).

I would certainly drill down into what self.critic (x) is doing.

[Edit: Some further words of clarification / explanation: As I’ve come to
understand it, anomaly detection’s forward-call traceback flags the
operation in the forward pass whose backward pass is being blocked
by the inplace modification of some tensor required by the backward
pass (rather than flagging the operation that modifies that tensor).

Given that you are using retain_graph = True, I speculate that you
are doing something like:

critic_loss.backward (retain_graph = True)
critic_optimizer.step()   # modifies critic's parameters inplace
...
actor_loss.backward()     # where actor loss depends on critic
actor_optimizer.step()

If so, you will try to backpropagate again through critic, which has had
its parameters modified inplace by its optimizer. Whether or not modifying
a tensor inplace will cause an inplace-modification error depends on
whether that tensor is needed in the backward-pass computation, but it is
likely that at least some of critic’s parameters will be needed in
the backward pass.

If you have identified the tensor that is causing the inplace-modification
error – likely one of critic’s parameters – print out its ._version before
and after calling critic_optimizer.step(). If you can’t identify the tensor
in question, you can still test this theory by commenting out the call to
critic_optimizer.step() and seeing whether this particular
inplace-modification error goes away. (There may be others.)]
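To make this failure mode concrete, here is a minimal, self-contained sketch
of that pattern. (The names critic, critic_loss, actor_loss, and so on are
stand-ins for whatever your actual code has.)

```python
import torch

# a stand-in "critic" -- your actual model will differ
critic = torch.nn.Linear(4, 1)
critic_optimizer = torch.optim.SGD(critic.parameters(), lr=0.1)

x = torch.randn(8, 4, requires_grad=True)
value = critic(x)              # critic.weight is saved for the backward pass

critic_loss = value.pow(2).mean()
critic_loss.backward(retain_graph=True)
critic_optimizer.step()        # modifies critic.weight inplace

actor_loss = value.mean()      # "actor loss" that depends on critic's output

caught = False
try:
    actor_loss.backward()      # backpropagates through critic a second time
except RuntimeError as err:
    caught = True
    print('inplace-modification error:', err)
```

Here the second .backward() needs the saved (unmodified) critic.weight, so
the inplace update performed by critic_optimizer.step() triggers the error.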

Also, look at the inplace-modification error itself. It is telling you that a
FloatTensor of shape [256, 1] is the tensor that is being modified
inplace. Where in your code do you have a tensor of that shape (one that
occurs somewhere in the forward pass)? Look closely at how it is being
used and see if you can spot an inplace modification.
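As a toy, self-contained illustration (not your code) of how inplace
modification of a [256, 1] tensor that is needed in the backward pass
produces exactly this kind of error:

```python
import torch

x = torch.randn(256, 1, requires_grad=True)
y = torch.sigmoid(x)     # sigmoid saves its output y for the backward pass
loss = y.sum()
y += 1.0                 # inplace modification of y (bumps y._version)

caught = False
msg = ''
try:
    loss.backward()      # backward needs the unmodified y
except RuntimeError as err:
    caught = True
    msg = str(err)
    print(msg)           # mentions a FloatTensor of shape [256, 1]
```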

Note that the error message is complaining that it should be of “version 3”
rather than of “version 4.” Print out the tensor’s ._version property at
various strategic places in your code. The inplace modification is occurring
somewhere between ._version values of 3 and 4. You can insert
intermediate ._version print statements to perform a binary search to
locate exactly where the inplace modification is occurring.
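As a self-contained illustration of how the ._version counter behaves:

```python
import torch

t = torch.zeros(256, 1)
v0 = t._version          # 0 for a freshly created tensor
t.add_(1.0)              # inplace op: bumps the version counter
v1 = t._version          # now 1
t = t + 1.0              # out-of-place op: a brand-new tensor
v2 = t._version          # back to 0 (new tensor, fresh counter)
print(v0, v1, v2)
```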

For example, you might try printing out ._version just before and just
after the call to self.critic (x) upon which the forward-call traceback
casts some suspicion.

Note also that you are calling .backward (retain_graph = True).
First, make sure that this is correct logic for your use case. If it is, be
aware that calling optimizer.step() performs an inplace modification
of the parameters being optimized by optimizer. Again, you can check
this by printing out ._version for the problematic tensor before and after
the call to optimizer.step().
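For example, assuming a plain SGD optimizer over a stand-in module, you
would see something like:

```python
import torch

critic = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(critic.parameters(), lr=0.1)

critic(torch.randn(8, 4)).sum().backward()

v_before = critic.weight._version
optimizer.step()                 # inplace update of critic's parameters
v_after = critic.weight._version
print(v_before, v_after)         # v_after > v_before
```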

For some examples that illustrate these inplace-modification debugging
techniques, see this post:

Good luck!

K. Frank