I ran into a “RuntimeError: Checkpointing is not compatible with .grad(), please use .backward() if possible” in the following code:
```python
W = model.layers.conv.mlp[-1].weight
train_loss = [loss1, loss2]
for i, t in enumerate(tasks):
    gygw = torch.autograd.grad(train_loss[i], W, retain_graph=True)
    norms.append(torch.norm(torch.mul(Weights.weights[i], gygw[0])))
```
I was trying to calculate the gradient of the loss w.r.t. W, as in pytorch-grad-norm/train.py at master · brianlan/pytorch-grad-norm · GitHub.
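For reference, here is a minimal self-contained sketch of what I'm trying to do (model and variable names are made up, not from my actual code). It runs without the error when the checkpointed calls pass `use_reentrant=False`, since the default reentrant checkpoint only supports `.backward()`, not `torch.autograd.grad()`:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical two-layer setup standing in for the real model.
torch.manual_seed(0)
layer1 = nn.Linear(4, 4)
layer2 = nn.Linear(4, 1)
W = layer2.weight  # the weight we differentiate w.r.t.

x = torch.randn(2, 4, requires_grad=True)
# Non-reentrant checkpointing is compatible with torch.autograd.grad().
h = checkpoint(layer1, x, use_reentrant=False)
loss = layer2(h).pow(2).mean()

# With the default reentrant checkpoint this line raises
# "Checkpointing is not compatible with .grad()"; here it works.
(gygw,) = torch.autograd.grad(loss, W, retain_graph=True)
print(gygw.shape)  # same shape as W
```

Note that `torch.autograd.grad` returns a tuple, so the gradient tensor itself is the first element.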
Thank you very much for any help!