How to calculate 2nd derivative of a likelihood function

That’s a bit tricky, I think. But it’s doable, of course.

It is tricky because, by default, PyTorch only computes derivatives of scalars with respect to (possibly multidimensional) Tensors. Thus, you have to iterate through every single scalar parameter in your model (i.e., every entry in every parameter matrix) and, for each one, compute the derivative of its first derivative with respect to that same entry. These values are the diagonal of the Hessian of the loss.
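
To make the restriction concrete: differentiating a scalar output works, but calling grad() on a non-scalar (such as the gradient itself) raises an error unless you pass grad_outputs, which is why we take one scalar entry at a time. A minimal sketch (w and the squared-sum output are made up for illustration):

import torch
from torch.autograd import grad

w = torch.randn(3, requires_grad=True)
out = (w ** 2).sum()                    # scalar output: fine
g, = grad(out, w, create_graph=True)    # g has shape (3,)

# grad(g, w) would fail ("grad can be implicitly created only for
# scalar outputs"), so differentiate one scalar entry instead:
d2, = grad(g[0], w, retain_graph=True)  # d(2*w[0])/dw = (2., 0., 0.)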

An easy way to iterate through every entry in an n-D Tensor for arbitrary n is to flatten it first with .reshape(-1) and enumerate the result (in numpy, nditer does a similar job). With that, the task is straightforward.
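
For example, this toy snippet (t is just a made-up 2-D tensor) visits every scalar entry regardless of dimensionality:

import torch

t = torch.arange(6.).reshape(2, 3)
# flattening gives a 1-D view, so a single loop covers all entries
for idx, entry in enumerate(t.reshape(-1)):
    print(idx, entry.item())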

Let me show some code that puts this together.

import torch
from torch import nn
from torch.autograd import grad

# some toy data
x = torch.tensor([4., 2.])
y = torch.tensor([1.])

# linear model and squared difference loss
model = nn.Linear(2, 1)
loss = torch.sum((y - model(x))**2)

# instead of using loss.backward(), use torch.autograd.grad(), so that
# create_graph=True builds the graph of the gradient itself
loss_grads = grad(loss, model.parameters(), create_graph=True)

# compute the second-order derivative w.r.t. each parameter entry:
# differentiate each scalar gradient entry w.r.t. the whole parameter
# tensor, then read off the matching entry (the diagonal of the Hessian)
d2loss = []
for param, grd in zip(model.parameters(), loss_grads):
    for idx, g in enumerate(grd.reshape(-1)):
        drv = grad(g, param, retain_graph=True)[0].reshape(-1)[idx]
        d2loss.append(drv)
        print(param, drv)
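
As a side note: on newer PyTorch versions (1.5 or later; whether that fits your setup is an assumption), torch.autograd.functional.hessian can compute the full Hessian in one call if you phrase the loss as an explicit function of the parameters. A minimal sketch for the same toy problem:

import torch
from torch.autograd.functional import hessian

x = torch.tensor([4., 2.])
y = torch.tensor([1.])

# the same squared-difference loss, written as a function of weight and bias
def loss_fn(w, b):
    return torch.sum((y - (x @ w.t() + b))**2)

w = torch.randn(1, 2)
b = torch.randn(1)
# returns a tuple of tuples of Hessian blocks, e.g. h[0][0] = d2loss/dw dw
h = hessian(loss_fn, (w, b))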