Higher order gradient

I am trying to implement a network trained with higher-order gradients.

torch.manual_seed(48)
for epoch in range(100):
    for i, (train_x, train_y) in enumerate(train_loader):

        train_x = train_x.to(device)
        train_y = train_y.to(device)

        mypredict = my_model(train_x)
        loss = myloss(mypredict, train_y)
        grad_norm = 0
        # first-order gradients of the loss; create_graph=True keeps the graph
        # so grad_norm below can itself be differentiated again
        grad_params = torch.autograd.grad(loss, my_model.parameters(), create_graph=True)
        for grad in grad_params:
            grad_norm += grad.pow(2).sum()

        grad_norm = grad_norm.sqrt()

        # take the gradients wrt grad_norm. backward() will accumulate
        # the gradients into the .grad attributes

        # do an optimization step
        optim.zero_grad()
        grad_norm.backward()
        optim.step()


    print("Loss in Epoch {}/100 = {}".format(epoch+1,loss.item()))

But when I plot the loss:

[plot of the training loss curve]

the loss value fluctuates.

What should I do? Thank you 🙂

Can you let us know why you are doing this?
You are essentially taking the L2 norm of the gradients of all the parameters and calling backward on that norm.
I feel the gradient values would be too large in your case at every parameter update, which might be causing the problem. Maybe clipping the gradients would solve it; a minimal sketch follows.
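
Something along these lines, reusing the names from your snippet (my_model, optim, grad_norm); max_norm=1.0 is only an assumed threshold that you would need to tune for your model:

optim.zero_grad()
grad_norm.backward()
# clip the accumulated .grad values to a maximum global L2 norm
# before the optimizer update; the threshold is just an example value
torch.nn.utils.clip_grad_norm_(my_model.parameters(), max_norm=1.0)
optim.step()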