Taking autograd of model.parameters()

I am trying to get the Hessian after a model has finished estimation (in order to then use it to calculate standard errors). Given a model object, my thought was that something like the following should work:

d1 = torch.autograd.grad(loss, model.parameters(), create_graph=True)[0]
d2 = torch.autograd.grad(loss, d1, create_graph=True)[0]

However, this gives the error “One of the differentiated Tensors appears to not have been used in the graph.” Could someone point me in the right direction?

The model class, for what it’s worth, is the following:

class MultinomialLogitWithEmbeddings(torch.nn.Module):

    def __init__(self, m, n_fe):
        super(MultinomialLogitWithEmbeddings, self).__init__()
        self.n_features = m
        self.n_fe_cats = n_fe
        self.emb_dim = 1

        # Layers
        self.emb = torch.nn.Embedding(self.n_fe_cats, self.emb_dim)  # Store FEs as an embedding lookup
        self.linear = torch.nn.Linear(self.n_features + self.emb_dim, 1, bias=False)  # Linear utility layer
        self.lsmax = torch.nn.LogSoftmax(dim=1)  # Log-softmax activation

        # Initialize the layer weights
        torch.nn.init.kaiming_normal_(self.linear.weight)

    def forward(self, x_cont, x_cat):
        x = torch.cat([x_cont.float(), self.emb(x_cat).squeeze(dim=-1).float()], dim=2)
        y_pred = self.linear(x)
        y_pred = self.lsmax(y_pred)
        return y_pred.squeeze()
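
For reference, the forward pass expects three-dimensional inputs: x_cont of shape (batch, n_alternatives, n_features) and x_cat of shape (batch, n_alternatives, 1). A dummy call looks roughly like this (the sizes are placeholders, not my real data):

model = MultinomialLogitWithEmbeddings(m=3, n_fe=10)
x_cont = torch.randn(4, 5, 3)
x_cat = torch.randint(0, 10, (4, 5, 1))
log_probs = model(x_cont, x_cat)  # (4, 5): one log-probability per alternative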

How about using torch.autograd.functional.hessian (see the PyTorch 1.9.0 documentation)?
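
For reference, torch.autograd.functional.hessian takes a function that maps a tensor to a scalar and returns the matrix of second derivatives with respect to that input. A minimal call looks like this (scalar_fn is just a stand-in, not the model from this thread):

import torch

def scalar_fn(w):
    return (w ** 2).sum()  # any tensor-to-scalar function works here

w0 = torch.randn(3)
H = torch.autograd.functional.hessian(scalar_fn, w0)  # shape (3, 3), equals 2 * I here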

The error comes from the fact that loss does not depend on d1.
You probably want to do something like d2 = torch.autograd.grad(d1, model.parameters(), create_graph=True)[0], but I haven’t tested whether that works.
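
Spelling that idea out: d1 is a tensor rather than a scalar, so each of its entries has to be differentiated separately. A rough sketch for a single parameter tensor such as model.linear.weight (untested against the model above):

w = model.linear.weight
d1 = torch.autograd.grad(loss, w, create_graph=True)[0]

rows = []
for g in d1.reshape(-1):  # one gradient entry at a time
    row = torch.autograd.grad(g, w, retain_graph=True)[0]
    rows.append(row.reshape(-1))
hessian = torch.stack(rows)  # (w.numel(), w.numel())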

Ah, sorry, I should have clarified that it’s the first line that creates an error.

I’ve gotten a bit closer by doing:

d1 = torch.autograd.grad(loss, model.linear.weight, create_graph=True)

Although this currently gives me another error: “one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [13, 1]], which is output 0 of TBackward, is at version 151; expected version 150 instead.” (I’m trying to sort out what’s gone awry now.)
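
In case it helps: that error usually means a tensor saved for the backward pass was changed in place after loss was computed, for example by optimizer.step() in the training loop (which isn’t shown here, so this is a guess). One thing to try is recomputing the loss once training has finished and differentiating that fresh graph; criterion, x_cont, x_cat, and y below are stand-ins for the actual objective and data:

# Recompute the loss after the final parameter update, then differentiate.
loss = criterion(model(x_cont, x_cat), y)
d1 = torch.autograd.grad(loss, model.linear.weight, create_graph=True)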

Here’s a solution that’s a bit hacky but should give you what you want:

def f(weight):
    # The sizes here are placeholders; they need to match the real model and data.
    m = MultinomialLogitWithEmbeddings(1, 1)
    # Swap the input tensor in as the layer's weight so autograd tracks how the
    # output depends on `weight` (assigning to .data would hide that dependence).
    del m.linear.weight
    m.linear.weight = weight
    x_cont = torch.ones(1, 1, 1)
    x_cat = torch.LongTensor([[[0]]])
    return m(x_cont, x_cat)

hessian = torch.autograd.functional.hessian(f, model.linear.weight)
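
One caveat for the standard-error use case from the original post: f should return the scalar loss (e.g. the negative log-likelihood over the data) rather than the raw model output. Assuming that, the classical standard errors are the square roots of the diagonal of the inverse Hessian (the usual maximum-likelihood result, not something spelled out in this thread):

k = model.linear.weight.numel()
H = hessian.reshape(k, k)  # flatten the (1, k, 1, k) result into a k x k matrix
standard_errors = torch.sqrt(torch.diag(torch.linalg.inv(H)))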