Not getting a symmetric Hessian when computing it with torch.autograd

Mrinmay_sen · September 16, 2022, 5:51am

Hi,
I am using the following code to compute the Hessian. The loss function is L2-regularised categorical cross-entropy, so the Hessian should be symmetric. But the matrix I get is not symmetric.

import torch
torch.set_default_dtype(torch.float64)
import torchvision
import torchvision.models as models
from torchvision import transforms
import torch.utils.data
import torch.nn as nn
import numpy as np
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
#defining data transform
transform = transforms.Compose([transforms.ToTensor(), transforms.Lambda(lambda x: x.repeat(3, 1, 1) ), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# download trainset and testset

trainset = torchvision.datasets.MNIST(root='data', train=True, download=True, transform=transform)
train_loader=torch.utils.data.DataLoader(trainset, batch_size=512, shuffle=True, num_workers=2)

testset = torchvision.datasets.MNIST(root='data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=True, num_workers=2)

#defining model
cnn = models.resnet18(pretrained=False)
classifier = nn.Linear(in_features=512, out_features=10, bias=True)
cnn.fc=classifier
cnn=cnn.to(device)

##loss function
loss_fn = nn.CrossEntropyLoss()

##flattening parameters
p= torch.cat([gi.data.view(-1) for gi in cnn.parameters()])

# calculating hessian

for batch, (X, y) in enumerate(train_loader):
    cnn.train()

    ### creating hessian matrix with zeros
    H = torch.zeros(len(p), len(p)).to(device)
    X = X.to(device)
    y = y.to(device)
    pred = cnn(X)
    loss = loss_fn(pred, y)
    l2_norm = sum(p.pow(2.0).sum() for p in cnn.parameters())
    loss = loss + l2_lambda * l2_norm

    env_grads = torch.autograd.grad(loss, cnn.parameters(), retain_graph=True, create_graph=True)
    g = torch.cat([gi.reshape(-1) for gi in env_grads])

    for i in range(len(p)):
        h_col = torch.autograd.grad(g[column_indx[i]], cnn.parameters(), retain_graph=True, create_graph=False)
        H[i] = torch.cat([gi.reshape(-1) for gi in h_col])

After creating H, I checked whether it is symmetric, and it is not. Please suggest where the problem is.
I tried setting torch.set_default_dtype(torch.float64) first, but that did not solve it.
I cannot use PyTorch's built-in Hessian function because I need only part of the Hessian and want to avoid computing the whole matrix.

KFrank · September 16, 2022, 10:21pm

Hi Mrinmay!

Mrinmay_sen:

 env_grads = torch.autograd.grad(loss, cnn.parameters(), retain_graph=True, create_graph=True)
 g= torch.cat([gi.reshape(-1) for gi in env_grads])

 for i in range(len(p)):
       h_col=torch.autograd.grad(g[column_indx[i]], cnn.parameters(), retain_graph=True, create_graph=False)
       H[i]= torch.cat([gi.reshape(-1) for gi in h_col])

In the code you posted you are using column_indx when you compute
the second derivative, but not when you compute the first. On its face,
this will make your result for H not symmetric. (I also note that column_indx
isn’t defined anywhere in your post.)
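
As a minimal sketch of this point, assuming column_indx holds a handful of randomly chosen parameter indices: the tiny model, the random data, and the value of l2_lambda below are all hypothetical stand-ins, not the ResNet-18 / MNIST setup from the post. Each computed row is the derivative of one selected gradient entry with respect to all parameters, so the result is rectangular. Only after selecting the same indices along the other axis do you get a square block that should be symmetric.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny model and random data (stand-ins for ResNet-18 and MNIST).
model = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 2)).double()
X = torch.randn(8, 4, dtype=torch.float64)
y = torch.randint(0, 2, (8,))
l2_lambda = 1e-3  # assumed value; the original post does not give one

params = list(model.parameters())
n = sum(p.numel() for p in params)  # total number of parameters

loss = nn.functional.cross_entropy(model(X), y)
loss = loss + l2_lambda * sum(p.pow(2).sum() for p in params)

# First derivatives, kept in the graph so they can be differentiated again.
grads = torch.autograd.grad(loss, params, create_graph=True)
g = torch.cat([gi.reshape(-1) for gi in grads])

column_indx = torch.randperm(n)[:5]  # 5 randomly chosen parameter indices


def hessian_row(i):
    """Derivative of the gradient entry g[i] with respect to every parameter."""
    second = torch.autograd.grad(g[i], params, retain_graph=True)
    return torch.cat([s.reshape(-1) for s in second])


# rows[k, j] = d^2 loss / d theta[column_indx[k]] d theta[j]
rows = torch.stack([hessian_row(i) for i in column_indx])  # shape (5, n)

# Apply the same index selection to the second axis to get a square block.
sub_block = rows[:, column_indx]  # shape (5, 5)
sub_block_symmetric = torch.allclose(sub_block, sub_block.T, rtol=1e-8, atol=1e-10)
print(rows.shape, sub_block.shape, sub_block_symmetric)
```

The comparison uses torch.allclose rather than an exact test, since small round-off differences between the two triangles are expected.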

As an aside, you shouldn’t expect the Hessian you compute to be exactly
numerically symmetric (even though it is mathematically symmetric); due
to floating-point round-off error, it will be numerically slightly asymmetric.
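
A small sketch of how one might check for symmetry up to round-off instead of exactly. The matrix here is a hypothetical stand-in for a Hessian, and the perturbation of 1e-12 is an assumed, deliberately exaggerated imitation of floating-point error; the tolerances are illustrative choices, not recommended values.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for a Hessian: a + a.T is exactly symmetric.
a = torch.randn(5, 5, dtype=torch.float64)
hess = a + a.T

# Imitate round-off by nudging a single off-diagonal entry (assumed size).
noisy = hess.clone()
noisy[0, 1] += 1e-12

# An exact comparison fails, a tolerance-based comparison passes.
exactly_symmetric = torch.equal(noisy, noisy.T)
nearly_symmetric = torch.allclose(noisy, noisy.T, rtol=1e-8, atol=1e-10)
max_asymmetry = (noisy - noisy.T).abs().max().item()
print(exactly_symmetric, nearly_symmetric, max_asymmetry)

# If later code needs an exactly symmetric matrix, average the two triangles.
symmetrized = 0.5 * (noisy + noisy.T)
print(torch.equal(symmetrized, symmetrized.T))
```

Looking at the largest absolute difference between the matrix and its transpose is often more informative than a yes/no test, because it shows whether the asymmetry is at round-off scale or points to a real indexing problem.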

Best.

K. Frank

Mrinmay_sen · September 17, 2022, 4:20am

Dear Frank,
column_indx is the list of column indices, selected at random from all the parameters. I only need to calculate the Hessian for these columns.

p= torch.cat([gi.data.view(-1) for gi in cnn.parameters()])
column_indx= np.random.choice(len(p), 10, replace= False)

Mrinmay_sen · September 17, 2022, 4:27am

But when I use all the columns, that is, the full Hessian, it is still not symmetric.

cnn.train()
p= torch.cat([gi.data.view(-1) for gi in cnn.parameters()])
column_indx= np.random.choice(len(p), len(p), replace= False)
for batch, (X, y) in enumerate(train_loader):
    cnn.train()
    X = X.to(device)
    y = y.to(device)
    pred = cnn(X)
    loss = loss_fn(pred, y)
    env_grads = torch.autograd.grad(loss, cnn.parameters(), retain_graph=True, create_graph=True)
    g = torch.cat([gi.reshape(-1) for gi in env_grads])
    C = torch.zeros(len(p), len(p)).to(device)
    cnn.zero_grad()
    for i in range(len(p)):
        h_col = torch.autograd.grad(g[column_indx[i]], cnn.parameters(), retain_graph=True, create_graph=False)
        C[i] = torch.cat([gi.reshape(-1) for gi in h_col])

This is the code I used to calculate the Hessian.
Kindly have a look at the last two lines of this code and check whether this is a correct way to calculate the Hessian.

KFrank · September 17, 2022, 2:42pm

Hi Mrinmay!

As I alluded to in my first post, when you use column_indx just for the
columns of your Hessian, you are, in effect permuting the columns of the
Hessian but not permuting the rows.

If you permute just the columns of a symmetric matrix, it won’t be symmetric
anymore.

Consider:

>>> import torch
>>> print (torch.__version__)
1.12.0
>>>
>>> _ = torch.manual_seed (2022)
>>>
>>> mat = torch.randn (5, 5)
>>> mat = mat + mat.T   # make symmetric matrix
>>> torch.equal (mat, mat.T)   # verify symmetric
True
>>>
>>> column_indx = torch.randperm (5)   # random permutation of column indices
>>>
>>> mat_cperm = mat[:, column_indx]   # permute columns
>>> torch.equal (mat_cperm, mat_cperm.T)   # verify asymmetric
False
>>>
>>> mat_cperm_rperm = mat_cperm[column_indx, :]   # also permute rows
>>> torch.equal (mat_cperm_rperm, mat_cperm_rperm.T)   # matrix is now symmetric again
True
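
The same idea can be sketched with autograd itself. This is only an illustration under stated assumptions: the scalar function below is a hypothetical stand-in for the loss, and theta, A, C and H are names made up for the example. Storing each computed row at position i reproduces the permuted matrix, while storing it at position column_indx[i] puts it back where it belongs.

```python
import torch

torch.manual_seed(0)

# Hypothetical smooth scalar function of 6 parameters (stand-in for the loss).
theta = torch.randn(6, dtype=torch.float64, requires_grad=True)
A = torch.randn(6, 6, dtype=torch.float64)
loss = torch.sin(A @ theta).sum() + 0.5 * theta.pow(2).sum()

# Gradient, kept in the graph so that it can be differentiated again.
(g,) = torch.autograd.grad(loss, theta, create_graph=True)

column_indx = torch.randperm(6)  # visit the gradient entries in random order

C = torch.zeros(6, 6, dtype=torch.float64)  # row stored at i, as in the thread
H = torch.zeros(6, 6, dtype=torch.float64)  # row stored at its parameter index
for i in range(6):
    (row,) = torch.autograd.grad(g[column_indx[i]], theta, retain_graph=True)
    C[i] = row
    H[column_indx[i]] = row

# H is the Hessian in natural order; C is H with its first axis permuted.
h_symmetric = torch.allclose(H, H.T)
c_is_permuted_h = torch.equal(H[column_indx], C)

# Permuting the second axis of C in the same way restores symmetry.
C_fixed = C[:, column_indx]
c_fixed_symmetric = torch.allclose(C_fixed, C_fixed.T)
print(h_symmetric, c_is_permuted_h, c_fixed_symmetric)
```

Writing each row to H[column_indx[i]] is probably the simpler fix when all indices are visited; indexing the second axis afterwards is the one that also works when only a subset of indices is used.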

Best.

K. Frank

Mrinmay_sen · September 17, 2022, 4:04pm

Hi Frank,

Thanks a lot for the solution.
Now I understand the problem. Thanks again, Frank.