```
optimizer = optim(model.parameters())
```
where `regular_loss.grad(batch[i])` denotes the gradient of `regular_loss` with respect to sample `i` in `batch`.
The problem is that when I call `loss.backward()`, `regular_loss.grad(batch[i])` does not have `requires_grad`, so the optimizer treats it as a constant (its gradients are zero).
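A minimal sketch of one common fix, under the assumption that the goal is a loss term built from per-sample input gradients: compute the inner gradient with `torch.autograd.grad(..., create_graph=True)` so it stays in the autograd graph, which lets `loss.backward()` differentiate through it. The model, optimizer, and penalty below are placeholders, not the original code.

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Inputs must track gradients so we can differentiate w.r.t. each sample
batch = torch.randn(8, 4, requires_grad=True)
out = model(batch)
regular_loss = out.pow(2).mean()

# Gradient of regular_loss w.r.t. the batch, kept in the graph
# (create_graph=True is what makes this term differentiable)
(grad_batch,) = torch.autograd.grad(regular_loss, batch, create_graph=True)

# Example composite loss: regular loss plus a gradient penalty on each sample
loss = regular_loss + grad_batch.pow(2).sum(dim=1).mean()

optimizer.zero_grad()
loss.backward()  # backpropagates through grad_batch as well
optimizer.step()
```

Without `create_graph=True` (or with `.detach()` / `torch.no_grad()` around the inner gradient), the gradient tensor is a constant in the graph, which would explain the zero gradients seen in the optimizer.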