RuntimeError: Variables needed for gradient computation has been modified by an inplace operation

This error is raised by the loss function shown below, where pos_predict is the position predicted by the model and pos_relaxed is the ground truth. The traceback points at this line:

err += torch.sqrt(torch.sum(diff_pos[s] ** 2))

How can I rewrite it in an out-of-place form?


def loss_func(model, batch, h):
    pos_relaxed = batch["pos_relaxed"]
    count_i_atom = batch["count_i_atom"]
    cell = batch["cell"]
    inv_cell = batch["inv_cell"]
    # pos_predict is the data predicted by the model
    pos_predict = model(h, batch)

    
    n_obj = len(count_i_atom) - 1
    frac_pos_a = torch.zeros_like(pos_relaxed)
    frac_pos_b = torch.zeros_like(pos_predict)
    for i in range(n_obj):
        s = slice(count_i_atom[i], count_i_atom[i+1])
        a_inv_cell = inv_cell[(i*3):(i*3+3), :]
        frac_pos_a[s] = torch.matmul(pos_relaxed[s], a_inv_cell)
        frac_pos_b[s] = torch.matmul(pos_predict[s], a_inv_cell)
    
    diff_pos = frac_pos_a - frac_pos_b
    flag = diff_pos > 0.5
    diff_pos[flag] -= 1.0
    flag = diff_pos < -0.5
    diff_pos[flag] += 1.0

    err = 0
    for i in range(n_obj):
        s = slice(count_i_atom[i], count_i_atom[i+1])
        a_cell = cell[(i*3):(i*3+3), :]
        diff_pos[s] = torch.matmul(diff_pos[s], a_cell)
        err += torch.sqrt(torch.sum(diff_pos[s] ** 2))
    
    return err / n_obj 

The full traceback is:

/mnt/d/software_install/pytorch/lib/python3.12/site-packages/torch/autograd/__init__.py:266: UserWarning: Error detected in PowBackward0. Traceback of forward call that caused the error:
  File "/mnt/e/workdir/ML_ADS/my_code/test_egnn.py", line 263, in <module>
    loss = loss_func(model, batch, h)
  File "/mnt/e/workdir/ML_ADS/my_code/test_egnn.py", line 210, in loss_func
    err += torch.sqrt(torch.sum(diff_pos[s] ** 2))
  File "/mnt/d/software_install/pytorch/lib/python3.12/site-packages/torch/_tensor.py", line 40, in wrapped
    return f(*args, **kwargs)
 (Triggered internally at /opt/conda/conda-bld/pytorch_1708025569485/work/torch/csrc/autograd/python_anomaly_mode.cpp:113.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "/mnt/e/workdir/ML_ADS/my_code/test_egnn.py", line 265, in <module>
    loss.backward()
  File "/mnt/d/software_install/pytorch/lib/python3.12/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/mnt/d/software_install/pytorch/lib/python3.12/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [86, 3]], which is output 0 of AsStridedBackward0, is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

Assuming the mentioned line of code is indeed the culprit, replace the in-place accumulation with its out-of-place equivalent:

# err += torch.sqrt(torch.sum(diff_pos[s] ** 2))
err = err + torch.sqrt(torch.sum(diff_pos[s] ** 2))
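For context, the failure mode itself is easy to reproduce: `** 2` makes autograd save its input for `PowBackward0`, and any later in-place write to that tensor (including a slice assignment like `diff_pos[s] = ...`) bumps its version counter, so `backward()` refuses to run. A minimal sketch (tensor names are purely illustrative):

```python
import torch

x = torch.randn(4, 3, requires_grad=True)
y = x * 2                      # non-leaf tensor in the autograd graph
z = (y[0:2] ** 2).sum()        # PowBackward0 saves the slice of y
y[2:4] = 0.0                   # in-place write bumps y's version counter
try:
    z.backward()
except RuntimeError as e:
    print("backward failed:", e)
```

This is the same "expected version N" mismatch your traceback reports, which is why moving the result of the matmul into a fresh tensor (rather than writing it back into `diff_pos`) resolves it.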

Thanks for your reply. After some trial and error, I found that introducing an extra variable solves the problem:

for i in range(n_obj):
    s = slice(count_i_atom[i], count_i_atom[i+1])
    a_cell = cell[(i*3):(i*3+3), :]
    # write the Cartesian difference to a new tensor instead of back into diff_pos
    diff_pos_cartesian = torch.matmul(diff_pos[s], a_cell)
    err += torch.sqrt(torch.sum(diff_pos_cartesian ** 2))
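For what it's worth, the masked updates `diff_pos[flag] -= 1.0` / `+= 1.0` are also in-place writes; `torch.where` gives an out-of-place minimum-image wrap. A sketch of the whole loss with no in-place writes at all (the function name `periodic_loss` is illustrative; the math matches the original, since `(a - b) @ inv_cell == a @ inv_cell - b @ inv_cell`):

```python
import torch

def periodic_loss(pos_relaxed, pos_predict, cell, inv_cell, count_i_atom):
    """Per-object wrapped Cartesian distance, written without in-place ops."""
    n_obj = len(count_i_atom) - 1
    per_obj = []
    for i in range(n_obj):
        s = slice(count_i_atom[i], count_i_atom[i + 1])
        a_cell = cell[(i * 3):(i * 3 + 3), :]
        a_inv_cell = inv_cell[(i * 3):(i * 3 + 3), :]
        # fractional-coordinate difference; fresh tensor, no preallocated buffer
        diff = (pos_relaxed[s] - pos_predict[s]) @ a_inv_cell
        # minimum-image wrap via torch.where instead of masked in-place -=/+=
        diff = torch.where(diff > 0.5, diff - 1.0, diff)
        diff = torch.where(diff < -0.5, diff + 1.0, diff)
        # back to Cartesian; the result is a new tensor, diff is never written into
        per_obj.append(torch.sqrt(torch.sum((diff @ a_cell) ** 2)))
    return torch.stack(per_obj).mean()
```

`torch.where` is differentiable through whichever branch is selected, so gradients flow exactly as they would through the masked version, without ever touching a saved tensor.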