I need to compute the gradient of each element of the loss w.r.t. the parameters, and then use those per-element gradients to build the gradient vector and the Hessian matrix (in the Gauss-Newton sense, as J^T r and J^T J).
My code is below:
optimiser.zero_grad()
Error_Train_1 = PDE_loss(train_x, train_y, loss_func)
size_hess = q * p
MyHess = torch.zeros([size_hess, size_hess])
MyGrad = torch.zeros([size_hess, 1])
for j in range(Error_Train_1.shape[0]):
    for i in range(q + 1):
        jac = []
        optimiser.zero_grad()
        Error_Train_1[j, i].backward(retain_graph=True)
        for params in self.parameters():
            # substitute zeros when a parameter has no gradient,
            # so the algorithm does not stop with an error
            if params.grad is None:
                Pg = torch.zeros_like(params).reshape(-1)
            else:
                Pg = params.grad.reshape(-1)
            if len(jac) == 0:
                jac = Pg
            else:
                jac = torch.cat((jac, Pg))
        jac = torch.reshape(jac, (1, -1))
        Hess = torch.matmul(torch.transpose(jac, 0, 1), jac)
        Par_grad = torch.transpose(jac, 0, 1) * Error_Train_1[j, i]
        MyGrad += Par_grad
        MyHess += Hess
There are a lot of for loops and if conditions, which is bad for performance, and it takes a long time to run.
Could you please tell me whether there is a better way to avoid these loops and conditionals?
Thanks.
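Edit: for reference, here is a minimal vectorised sketch of the same quantities using torch.func (available in PyTorch >= 2.0), which avoids the nested Python loops by computing all per-element Jacobians in one call. The small Linear model, the residual function, and the tensor shapes below are stand-ins I made up for illustration, not my real network:

```python
import torch
from torch.func import functional_call, vmap, jacrev

# Hypothetical toy model standing in for the real network.
model = torch.nn.Linear(3, 2, bias=False)
params = dict(model.named_parameters())

x = torch.randn(5, 3)  # 5 samples, 3 features (made-up shapes)
y = torch.randn(5, 2)  # targets

def residual(p, xi, yi):
    # Per-sample residual vector, analogous to one row Error_Train_1[j, :].
    return functional_call(model, p, (xi,)) - yi

# Jacobian of the residuals w.r.t. every parameter, for every sample,
# in a single vectorised call instead of a double loop with backward().
jac_dict = vmap(jacrev(residual), in_dims=(None, 0, 0))(params, x, y)

# Flatten each parameter's Jacobian and concatenate:
# J has shape (n_samples, n_outputs, n_params).
n, q_out = x.shape[0], y.shape[1]
J = torch.cat([j.reshape(n, q_out, -1) for j in jac_dict.values()], dim=-1)
r = residual(params, x, y)  # residuals, shape (n_samples, n_outputs)

# Collapse the sample and output axes so each row of J2 is one
# per-element gradient (what the inner loops were accumulating).
J2 = J.reshape(-1, J.shape[-1])
r2 = r.reshape(-1, 1)

grad = J2.T @ r2  # Gauss-Newton gradient, J^T r  (replaces MyGrad)
hess = J2.T @ J2  # Gauss-Newton Hessian,  J^T J  (replaces MyHess)
```

If a parameter is frozen, it can simply be left out of the `params` dict instead of being special-cased with zeros.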