I want to create computation graphs for updated parameters so that I can calculate their higher-order derivatives later. The code below shows how I solve the problem when the model is small.

```
import torch

# 1: Define model parameters (w, T)
w = torch.tensor(1., requires_grad=True)
T = torch.tensor(2., requires_grad=True)

# 2: Define the output (L)
L = T * w

# 3: Calculate dL/dw (holding T fixed)
grad = torch.autograd.grad(L, w, create_graph=True)[0]

# 4: Update w --> w2
lr = 0.1
w2 = w - lr * grad
# I use w2 instead of optimizer.step() to update w because I want
# a computational graph connecting w2, w, and T.

# 5: Calculate the new value of the output
L2 = T * w2
# Note that w2 depends on both w and T:
# w2 = w - lr*grad = w - lr*T

# 6: Calculate dL2/dT (holding w fixed)
dT = torch.autograd.grad(L2, T)[0]
# Because L2 = T*(w - lr*T), dL2/dT = w - 2*lr*T = 0.6
```
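To confirm that the graph actually survives the update, `w2` should carry a `grad_fn` (it is no longer a leaf), and the second gradient should match the hand-computed value. A quick check with the same toy setup:

```python
import torch

w = torch.tensor(1., requires_grad=True)
T = torch.tensor(2., requires_grad=True)
L = T * w

# create_graph=True records the graph of the gradient itself
grad = torch.autograd.grad(L, w, create_graph=True)[0]  # grad = T

w2 = w - 0.1 * grad            # out-of-place update keeps the graph
assert w2.grad_fn is not None  # w2 is not a leaf; its history is recorded

L2 = T * w2
dT = torch.autograd.grad(L2, T)[0]
assert abs(dT.item() - 0.6) < 1e-6  # dL2/dT = w - 2*lr*T = 1 - 0.4 = 0.6
```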

The problem is that as my model grows, I can't create a new variable (#4) and assign it back to the model (#5) without detaching the computational graph. For example, I can only update the model parameters via

```
with torch.no_grad():
    for param in model.parameters():
        param -= lr * param.grad
```
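To illustrate what I mean, after this in-place update under `no_grad` the parameters are still graph leaves with no `grad_fn`, so a later loss cannot be differentiated through the update back to the original values (minimal sketch, with a tiny `nn.Linear` standing in for a larger model):

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 1, bias=False)  # tiny stand-in for a larger model
x = torch.randn(4, 2)
lr = 0.1

loss = model(x).pow(2).mean()
loss.backward()  # populates param.grad

with torch.no_grad():
    for param in model.parameters():
        param -= lr * param.grad

# The in-place update under no_grad records no history: the parameters
# remain leaves with grad_fn == None, unlike w2 = w - lr*grad above.
print([p.grad_fn for p in model.parameters()])  # [None]
```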

or use

```
optimizer.step()
```

instead. But neither of these methods creates a computational graph for the updated parameters. So, is there a way to update the parameters of a large model without detaching their computational graphs?
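For context, here is the kind of graph-preserving update I'm after, sketched with `torch.func.functional_call` (available in PyTorch 2.x, as far as I know) and a tiny `nn.Linear` standing in for a larger model. The parameters are updated out of place into a dict, and the unchanged module is evaluated with the replacement parameters; I'm not sure this is the recommended approach at scale:

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 1, bias=False)  # tiny stand-in for a larger model
x = torch.randn(4, 2)
lr = 0.1

params = dict(model.named_parameters())
loss = model(x).pow(2).mean()
grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)

# Out-of-place update: each new tensor keeps a graph back to the
# original parameters, exactly like w2 = w - lr*grad in the toy example.
new_params = {name: p - lr * g
              for (name, p), g in zip(params.items(), grads)}

# Evaluate the unchanged module with the updated parameters.
loss2 = torch.func.functional_call(model, new_params, (x,)).pow(2).mean()

# loss2 differentiates through the update back to the original parameters.
meta_grads = torch.autograd.grad(loss2, list(params.values()))
```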