I am using Adam together with LBFGS. The loss doesn't change from epoch to epoch when I call `optimizer.step()` with the closure function. If I use only Adam with a plain `optimizer.step()`, the loss converges (albeit slowly, which is why I decided to try LBFGS). Can you tell me where my code is wrong?

```
optimizer1 = torch.optim.Adam(net.parameters(), lr=0.0001)
optimizer2 = torch.optim.LBFGS(net.parameters(), lr=0.001)

## Training
iterations = 10
loss_array = np.zeros((iterations))
for epoch in range(iterations):
    def closure():
        optimizer1.zero_grad()  # to make the gradients zero
        optimizer2.zero_grad()  # to make the gradients zero
        # # Data driven/boundary loss
        # pt_x_bc1 = Variable(torch.from_numpy(x_bc1).float(), requires_grad=False).to(device)
        # pt_x_bc2 = Variable(torch.from_numpy(x_bc2).float(), requires_grad=False).to(device)
        # pt_u_bc = Variable(torch.from_numpy(u_bc).float(), requires_grad=False).to(device)
        # net_bc_out1 = net(pt_x_bc1)
        # net_bc_out2 = net(pt_x_bc2)
        # mse_u1 = mse_cost_function(net_bc_out1, pt_u_bc)
        # mse_u2 = mse_cost_function(net_bc_out2, pt_u_bc)
        # mse_u = mse_u1 + mse_u2
        ## Physics informed loss
        all_zeros = np.zeros((500, 1))
        pt_x_collocation = Variable(torch.from_numpy(x_collocation).float(), requires_grad=True).to(device)
        pt_all_zeros = Variable(torch.from_numpy(all_zeros).float(), requires_grad=False).to(device)
        f_out = f(pt_x_collocation, net)  # output of f(x,t)
        mse_pinn = mse_cost_function(f_out, pt_all_zeros)
        ## Training data loss
        u_train = net(pt_x_collocation)
        pt_u_true = Variable(torch.from_numpy(u_true).float(), requires_grad=False).to(device)
        mse_training = mse_cost_function(u_train, pt_u_true)
        # Combining the loss functions
        loss = mse_pinn + mse_training
        loss_array[epoch] = loss
        loss.backward()
        return loss

    if epoch < 5000:
        optimizer1.step(closure)
    else:
        optimizer2.step(closure)
    with torch.autograd.no_grad():
        print(epoch, "Traning Loss:", loss.data)
```
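For reference, this is my understanding of the closure API, tried on a throwaway toy problem (a plain linear fit; the network, data, and learning rate here are made up for the test and have nothing to do with my actual PINN setup). In this sketch the loss does go down with each `step(closure)`:

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 50).unsqueeze(1)
y = 3.0 * x + 1.0  # toy target: a straight line

net = torch.nn.Linear(1, 1)
mse = torch.nn.MSELoss()
optimizer = torch.optim.LBFGS(net.parameters(), lr=0.1)

def closure():
    # LBFGS re-evaluates the closure several times per step(),
    # so everything that builds the loss must live inside it.
    optimizer.zero_grad()
    loss = mse(net(x), y)
    loss.backward()
    return loss

for epoch in range(10):
    loss = optimizer.step(closure)  # step() returns the closure's loss
    print(epoch, "Loss:", loss.item())
```

In this toy version I print the value that `step(closure)` returns instead of relying on a `loss` variable from inside the closure, since the closure's locals aren't visible in the loop body.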

This is the output:

```
0 Traning Loss: tensor(0.4883)
1 Traning Loss: tensor(0.4883)
2 Traning Loss: tensor(0.4883)
3 Traning Loss: tensor(0.4883)
4 Traning Loss: tensor(0.4883)
5 Traning Loss: tensor(0.4883)
6 Traning Loss: tensor(0.4883)
7 Traning Loss: tensor(0.4883)
8 Traning Loss: tensor(0.4883)
9 Traning Loss: tensor(0.4883)
```

Thanks