Unable to locate inplace operation which is causing a RuntimeError in gradient computation

saintlyk1d · July 4, 2021, 12:01pm

Hello,
I’m getting the following error:

/home/code-base/runtime/dev/InverseCooking_1.7/src/modules/multihead_attention.py:94: UserWarning: Output 0 of SplitBackward is a view and is being modified inplace. This view is an output of a function that returns multiple views. Inplace operators on such views are being deprecated and will be forbidden starting from version 1.8. Consider using `unsafe_` version of the function that produced this view or don't modify this view inplace. (Triggered internally at  /pytorch/torch/csrc/autograd/variable.cpp:547.)
  q *= self.scaling
Traceback (most recent call last):
  File "train.py", line 536, in <module>
    main(args)
  File "train.py", line 462, in main
    loss.backward()
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/autograd/__init__.py", line 147, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.LongTensor [1, 20]] is at version 31; expected version 30 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
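
For anyone unfamiliar with the "version" wording in that message, here is a minimal sketch of what I understand it to mean. As far as I can tell, every tensor carries a `_version` counter that in-place operations increment, and autograd compares it against the value recorded when the tensor was saved for the backward pass. The toy `nn.Embedding` and the `idx` tensor below are made up for illustration and are not taken from my code base.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins (hypothetical, not the InverseCooking model).
embed = nn.Embedding(num_embeddings=10, embedding_dim=4)
idx = torch.tensor([[1, 2, 3]])      # LongTensor of token ids

version_before = idx._version        # counter value when autograd saves idx
loss = embed(idx).sum()              # the embedding op saves idx for backward

idx[0, 0] = 0                        # in-place write increments the counter
version_after = idx._version

try:
    loss.backward()                  # saved idx no longer matches its version
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(version_before, version_after)
print(error_message)
```

If this sketch matches my case, the tensor reported in the error is an index tensor that gets written to somewhere between the embedding lookup and `loss.backward()`, which is the part I have not been able to find.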

The following is the backtrace printed when I enable torch.autograd.set_detect_anomaly(True):

[W python_anomaly_mode.cpp:104] Warning: Error detected in EmbeddingBackward. Traceback of forward call that caused the error:
  File "train.py", line 536, in <module>
    main(args)
  File "train.py", line 385, in main
    losses = model(img_inputs, captions, ingr_gt, keep_cnn_gradients=keep_cnn_gradients)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/code-base/runtime/dev/InverseCooking_1.7/src/model.py", line 245, in forward
    outputs, ids = self.recipe_decoder(ingr_feat, target_ingr_mask_array[i], captions[i], img_features_multi[i], incremental_state = incremental_state, use_last_word = False)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/code-base/runtime/dev/InverseCooking_1.7/src/modules/transformer_decoder.py", line 311, in forward
    x = self.embed_scale *self.embed_tokens(captions)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 158, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/code-base/runtime/Internship/.myenv/lib/python3.6/site-packages/torch/nn/functional.py", line 1916, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
 (function _print_stack)
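
For reference, a minimal sketch of how anomaly detection can be scoped to a single training step instead of being switched on globally. The tiny embedding model and the `tokens` tensor are hypothetical placeholders, not the actual decoder from my code base.

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the real decoder.
embed = nn.Embedding(num_embeddings=10, embedding_dim=4)
tokens = torch.tensor([[1, 2, 3]])

# Context-manager form of anomaly mode: if backward() fails inside this
# block, PyTorch also prints the traceback of the forward call involved.
# torch.autograd.set_detect_anomaly(True) turns the same mode on globally.
with torch.autograd.detect_anomaly():
    loss = embed(tokens).sum()
    loss.backward()

grad_shape = tuple(embed.weight.grad.shape)
print(grad_shape)
```

Anomaly mode slows training down noticeably, so wrapping only the failing step keeps the rest of the run at normal speed.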

I’m having trouble locating the in-place operation that is supposedly happening. It would be a great help if someone could point me to it.
Thanks in advance!