Hi Zhouker!
This line of code calling self.qkv().reshape().permute() doesn’t match the call to self.qkv() quoted in the forward-call traceback produced by detect_anomaly().
Are you sure that you’re looking at the right sections or versions of your code here?
This looks like a problem. As I understand it, x (at this point in the code) is a tensor in the computation graph.
Note that writing into a tensor with indexing is an inplace modification, so it looks like x[i] = ... is introducing an inplace modification into the computation graph.
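Here is a minimal sketch of that failure mode, not your actual model. The names w, x, y and the helper backward_after_write() are made up for illustration. It assumes x is a non-leaf tensor that a later operation saves for backward, and it also shows one common workaround, writing into a clone instead of into x itself:

```python
import torch


def backward_after_write(clone_first: bool):
    """Hypothetical helper: index-write into a graph tensor, then backprop."""
    w = torch.randn(3, requires_grad=True)
    x = w * 2.0                # x is a non-leaf tensor in the computation graph
    y = (x ** 2).sum()         # pow() saves x for use in the backward pass
    target = x.clone() if clone_first else x
    target[0] = 0.0            # indexed write: an inplace modification of target
    y.backward()
    return w, w.grad


# Writing into x itself invalidates the tensor that pow() saved.
try:
    backward_after_write(clone_first=False)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Writing into a clone leaves x untouched, so backward() succeeds.
w, grad = backward_after_write(clone_first=True)
print(torch.allclose(grad, 8.0 * w.detach()))  # d/dw sum((2w)^2) = 8w
```

Whether a clone is appropriate in your code depends on what the modified tensor is used for afterwards, so treat this as a pattern to adapt rather than a drop-in fix.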
What is the shape of x at this point as confirmed by printing out x.shape? Does it match the shape of the modified tensor reported in the original error message?
What are your print statements showing for x._version? Is it changing, indicating an inplace modification? Do the values of x._version line up with those in the original error message?
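As a rough illustration of the kind of print statements I have in mind, the sketch below records x.shape and x._version around a few operations. The helper tensor_state() and the toy tensor are hypothetical stand-ins for whatever x is in your code:

```python
import torch


def tensor_state(x):
    """Hypothetical helper: the two values worth printing while debugging."""
    return tuple(x.shape), x._version


x = torch.zeros(4)
history = [tensor_state(x)]       # starting point

x[0] = 1.0                        # indexed write: inplace, version goes up
history.append(tensor_state(x))

y = x + 1.0                       # out-of-place: x's version is unchanged
history.append(tensor_state(x))

x.add_(1.0)                       # trailing-underscore op: inplace, version goes up
history.append(tensor_state(x))

for shape, version in history:
    print(shape, version)
```

Sprinkling calls like this between the suspect lines of your forward pass should show which statement bumps the version.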
You can find a discussion of how to debug inplace-modification errors in this post.
Inplace-modification errors can often be “fixed” (that is, swept under the rug) with pytorch’s allow-mutation context manager, torch.autograd.graph.allow_mutation_on_saved_tensors().
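A minimal sketch of how that context manager is used, assuming a pytorch version that provides torch.autograd.graph.allow_mutation_on_saved_tensors() (2.0 or later, as far as I know). The tensors a, b, and out are illustrative, and the example uses a whole-tensor inplace op rather than an indexed write:

```python
import torch

a = torch.ones(2, 3, requires_grad=True)

# Both the forward and the backward pass run inside the context manager.
with torch.autograd.graph.allow_mutation_on_saved_tensors():
    b = a.clone()
    out = (b ** 2).sum()   # pow() saves b for backward
    b.sin_()               # inplace change to the saved tensor
    out.backward()         # uses the value b had when it was saved

print(a.grad)
```

This trades extra memory (a copy of the mutated tensor) for not having to restructure the code. I would check that it actually covers an indexed write like x[i] = ... in your model, and that the resulting gradients are correct, before relying on it.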
Best.
K. Frank