Hello, I'm trying to train a neural network related to the differentiable neural computer (DNC). The loss is a cosine similarity loss between two memory slots stored in a register buffer, which (as I understand it) means it cannot be optimized by ordinary gradient descent. During training the model outputs the memory, and the loss is computed as:
loss = cos_loss(memory[j].detach(), memory[k].detach())
loss.backward()
But I get the following error:
RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
If I wrap the cos_loss computation in with torch.no_grad():, will the memory parameter still be updated?
I have no idea how to fix this error. Does anyone have an idea? Thanks!
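For context, here is a minimal sketch (hypothetical names and shapes, not the actual DNC code) that reproduces the symptoms I'm describing, assuming memory is a leaf tensor with requires_grad=True:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for the DNC memory: a leaf tensor requiring grad.
memory = torch.zeros(4, 8, requires_grad=True)

# 1) The reported RuntimeError comes from an in-place write into a view
#    (a slice) of a leaf tensor that requires grad:
try:
    memory[0].add_(1.0)  # in-place op on memory[0], a view of the leaf
except RuntimeError as e:
    print(e)  # the "view of a leaf Variable ... in-place operation" error

# 2) Detaching both slots cuts the loss off from the autograd graph, so
#    backward() could not propagate anything to `memory` anyway:
loss = 1.0 - F.cosine_similarity(memory[0].detach(), memory[1].detach(), dim=0)
print(loss.requires_grad)  # False
```

This only reproduces the error and the detach issue; it is not the training code itself.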