How to create computational graphs for updated parameters

I want to create computational graphs for updated parameters so that I can calculate their higher-order derivatives later. The code below shows how I solve the problem when the model is simple.

import torch

# 1: Define model parameters (w, T)
w = torch.tensor(1., requires_grad=True)
T = torch.tensor(2., requires_grad=True)

# 2: Define output(L)
L = T * w

# 3: Calculate dL/dw (by fixing T)
grad = torch.autograd.grad(L, w, create_graph=True)[0]

# 4: Update w --> w2
lr = 0.1
w2 = w - lr*grad
# I use w2 instead of optimizer.step() to update w
# because I want to keep the computational graph connecting w2, w, and T

# 5: Calculate the new value of the output
L2 = T * w2
# Note that w2 depends on both w and T:
# w2 = w - lr*grad = w - lr*T

# 6: Calculate dL2/dT (by fixing w)
dT = torch.autograd.grad(L2, T)[0]
# Because L2 = T*(w - lr*T), dL2/dT = w - 2*lr*T = 0.6
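As a sanity check on the approach above: because `create_graph=True` makes `grad` itself a differentiable node rather than a constant, you can differentiate through it directly. Here `grad = dL/dw = T`, so `d(grad)/dT` should be exactly 1 (a minimal check built from the same scalar example):

```python
import torch

w = torch.tensor(1., requires_grad=True)
T = torch.tensor(2., requires_grad=True)

L = T * w
# create_graph=True keeps `grad` attached to the graph, not detached
grad = torch.autograd.grad(L, w, create_graph=True)[0]  # grad = dL/dw = T

# Because the graph was kept, `grad` can itself be differentiated:
d_grad_dT = torch.autograd.grad(grad, T)[0]
print(d_grad_dT)  # tensor(1.) since d(dL/dw)/dT = d(T)/dT = 1
```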

The problem is that as my model gets larger, I can't create new variables (#4) and assign them back to the model (#5) without breaking the computational graph. For example, I can only update the model parameters via

with torch.no_grad():
    for param in model.parameters():
        param -= lr*param.grad

or use

optimizer.step()

instead. But neither of these methods creates a computational graph for the parameters. So, is there a way to update the parameters of a large model without discarding their computational graphs?
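One way the scalar pattern can be scaled to a full `nn.Module` is to keep the updated parameters in a plain dict and run the model functionally with them, so the `w - lr*grad` update stays in the graph. This is only a sketch, assuming PyTorch >= 2.0 for `torch.func.functional_call`; the toy model, `lr`, and data are made up for illustration:

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # PyTorch >= 2.0

model = nn.Linear(3, 1, bias=False)  # toy model for illustration
x = torch.randn(4, 3)
y = torch.randn(4, 1)
lr = 0.1

# Run the model with an explicit parameter dict instead of its own storage
params = dict(model.named_parameters())
loss = nn.functional.mse_loss(functional_call(model, params, (x,)), y)
grads = torch.autograd.grad(loss, tuple(params.values()), create_graph=True)

# The updated parameters are new tensors that stay connected
# to the originals in the autograd graph (like w2 = w - lr*grad)
new_params = {name: p - lr * g
              for (name, p), g in zip(params.items(), grads)}

# Evaluate the loss again with the updated parameters
loss2 = nn.functional.mse_loss(functional_call(model, new_params, (x,)), y)

# loss2 is differentiable w.r.t. the *original* parameters
meta_grads = torch.autograd.grad(loss2, tuple(params.values()))
```

The key point is that the model's own parameters are never mutated in place; each "update" just produces fresh tensors that are fed back in through `functional_call`.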


You might want to take a look at the higher library, which is built to do just that.


Thank you. I didn’t know they made a library for meta learning.