One of the variables needed for gradient computation has been modified by an in-place operation

Check this, if it helps. Give me some time to go through your code. Somewhere in it you are changing a tensor in place in some subtle way, after autograd has already saved that tensor for the backward pass.
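In the meantime, here is a minimal sketch of how this error typically shows up. It is not taken from your code: the function names and the `exp`/`add_` combination are just assumptions picked to reproduce the message in a few lines.

```python
import torch


def inplace_backward_message():
    """Trigger the error on purpose; return its message, or None if nothing was raised."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()   # exp keeps its output around because backward needs it
    y.add_(1)     # in-place edit of that saved output
    try:
        y.sum().backward()
    except RuntimeError as err:
        return str(err)
    return None


def out_of_place_grad():
    """Same computation, but the add creates a new tensor instead of editing y."""
    x = torch.ones(3, requires_grad=True)
    y = x.exp()
    y = y + 1     # out-of-place: the tensor saved by exp is left untouched
    y.sum().backward()
    return x.grad


if __name__ == "__main__":
    print(inplace_backward_message())
    print(out_of_place_grad())
```

Things like `x += 1`, `x[mask] = 0`, or any method ending in an underscore (`add_`, `relu_`, ...) are the usual suspects. If you can't spot the line, calling `torch.autograd.set_detect_anomaly(True)` before the forward pass should make the traceback point at the forward operation whose backward failed.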