Keeping track of a variable during function optimization using autograd

I am optimizing this function using torch.autograd:

import torch

def f(z):
    return (z*z).sum()

with the following code, starting from some initial z:

z = torch.empty(3)
torch.nn.init.uniform_(z, -5, 5)
z.requires_grad = True
optimizer = torch.optim.Adam([z], lr=0.1)
losses = []
zs = []  # to keep track of the variable as it gets optimized
print(z)  # initial value of z
for i in range(50):
    optimizer.zero_grad()
    loss = f(z)
    loss.backward(retain_graph=True)
    optimizer.step()
    losses.append(loss.detach().numpy())
    zs.append(z.detach().data.numpy())
print(z)  # final value of z

which gives output:

tensor([-1.4410,  4.9476,  3.3615], requires_grad=True) 
tensor([-0.0837,  0.8619, -0.0701], requires_grad=True)

but the list zs ends up containing

[array([-0.08373591,  0.8618752 , -0.07005788], dtype=float32),
 array([-0.08373591,  0.8618752 , -0.07005788], dtype=float32),
...
 array([-0.08373591,  0.8618752 , -0.07005788], dtype=float32)]

which is just the final value repeated. Why does this happen, and how can I append the value of z at each iteration to the list?

I’m new to PyTorch and learning the basics.

Note that calling .numpy() doesn’t actually create a copy: for a CPU tensor the returned array shares memory with the tensor (see torch.Tensor.numpy — PyTorch 2.0 documentation). Since the optimizer updates z in place on every step, all of the arrays you appended point at the same storage and therefore all show the final value. You probably want something like .clone() instead (see also the standalone sketch after the example below):

import torch

def f(z):
    return (z*z).sum()

z = torch.empty(3)
torch.nn.init.uniform_(z, -5, 5)
z.requires_grad = True
optimizer = torch.optim.Adam([z], lr=0.1)
losses = []
zs = []  # to keep track of the variable as it gets optimized
print(z)  # initial value of z
for i in range(5):
    optimizer.zero_grad()
    loss = f(z)
    loss.backward(retain_graph=True)
    optimizer.step()
    losses.append(loss.detach().clone())
    zs.append(z.detach().clone())  # clone() makes an independent copy
print(z)  # final value of z
print(zs)
which prints:

tensor([-1.8938, -3.1788, -4.9571], requires_grad=True)
tensor([-1.3970, -2.6804, -4.4581], requires_grad=True)
[tensor([-1.7938, -3.0788, -4.8571]), tensor([-1.6940, -2.9789, -4.7572]), tensor([-1.5945, -2.8791, -4.6573]), tensor([-1.4954, -2.7796, -4.5576]), tensor([-1.3970, -2.6804, -4.4581])]
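
To see the sharing in isolation, here is a minimal standalone sketch (the tensor and values are made up purely for illustration): an in-place update, similar to what optimizer.step() does to z, also changes the NumPy array that .numpy() returned earlier, while a cloned copy is unaffected.

import torch

t = torch.tensor([1.0, 2.0, 3.0])
view = t.numpy()          # shares memory with the CPU tensor t
copy = t.clone().numpy()  # clone() first, so this array is independent

t.add_(10.0)              # in-place update, similar to optimizer.step()

print(view)  # [11. 12. 13.] -- the previously returned array changed too
print(copy)  # [1. 2. 3.]    -- the cloned copy kept the old values

If you later want the whole trajectory as one array for plotting, something like torch.stack(zs) turns the list of clones into a single (num_steps, 3) tensor.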

Thanks eqy, this is the solution I was looking for.