One of the variables needed for gradient computation has been modified by an inplace operation:

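You might be facing an issue similar to the ones described here and here.
Could you check whether you are calling `backward()` through a graph whose parameters were already updated in place (e.g. by `optimizer.step()`)? In that case the gradients would be stale, since they would be computed from a forward pass that used the old parameter values.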
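
In case it helps with debugging, here is a minimal sketch of how this error is often triggered. It is not taken from your code: the model, the optimizer, and the names `loss_a` and `loss_b` are made up for illustration, and I'm assuming a reasonably recent PyTorch release. Two losses share one forward pass, and `opt.step()` runs between the two `backward()` calls, so the second call needs weights that were already changed in place.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup, only meant to reproduce the error pattern.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(16, 4)

# One forward pass, two losses that share the same graph.
out = model(x)
loss_a = out.mean()
loss_b = (out ** 2).mean()

opt.zero_grad()
loss_a.backward(retain_graph=True)
opt.step()  # updates the parameters in place

failed = False
msg = ""
try:
    # The retained graph still refers to the pre-update weights,
    # so autograd detects the in-place change and raises.
    loss_b.backward()
except RuntimeError as e:
    failed = True
    msg = str(e)

print("second backward failed:", failed)

# One possible fix: rerun the forward pass with the updated parameters
# before computing the second loss.
out = model(x)
loss_b = (out ** 2).mean()
opt.zero_grad()
loss_b.backward()
opt.step()
fixed_ok = True
```

If both losses are supposed to use the same parameter values, another option would be to call `(loss_a + loss_b).backward()` once and then `opt.step()` a single time.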