Torch parameter gradient is None

Ido_Avrahami · July 8, 2024, 9:15pm

I’m trying to optimize a torch parameter using a pre-trained network.
This is my code.

single_frame = image[0,:,:,:].to(device)
modif = torch.Tensor(1, 2, 34, 34).fill_(1).to(device)
modifier = torch.nn.Parameter(modif, requires_grad=True)
optimizer = torch.optim.Adam([modifier], lr=learning_rate)
criterion = torch.nn.CrossEntropyLoss()
iters = 10
for i in range(iters):

    # single_frame = single_frame.to(device)
    optimizer.zero_grad()
    modified_frame = single_frame + modifier
    if label.dim() == 0:
        label = label.unsqueeze(0).to(device)
    label = label.to(device)
    with torch.no_grad():
        output = model(modified_frame)
    output = output.requires_grad_(True)
    loss = criterion(output,label)
    loss.backward()
    optimizer.step()
    print(modifier.grad)

The output shows that the modifier does not change, and the print shows that modifier’s grad is None.
What might be the cause?

ptrblck · July 9, 2024, 1:01am

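You are running the forward pass of your model inside a no_grad() context, so no computation graph is recorded for it. Calling requires_grad_(True) on the output afterwards won’t reattach it to the graph; it only makes output a new leaf tensor, so backward() stops at output and never reaches modifier. Perform the forward pass outside the no_grad() block, with gradient tracking enabled, and it should work.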
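For reference, here is a minimal sketch of the loop with the forward pass moved out of no_grad(). The model, input, and label are hypothetical stand-ins (a small linear classifier and random data), since the original network and dataset are not shown. Freezing the model's weights with requires_grad_(False) is an assumption about the intent: it keeps the pre-trained weights fixed while gradients still flow back to modifier.

```python
import torch

torch.manual_seed(0)
device = torch.device("cpu")

# Hypothetical stand-in for the pre-trained network (2x34x34 input, 10 classes).
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(2 * 34 * 34, 10),
).to(device)
model.eval()

# Freeze the weights instead of wrapping the forward pass in no_grad():
# the graph from modifier to the loss is still recorded.
for p in model.parameters():
    p.requires_grad_(False)

# Placeholder data; the real frame and label come from the dataset.
single_frame = torch.rand(1, 2, 34, 34, device=device)
label = torch.tensor([3], device=device)

modifier = torch.nn.Parameter(torch.ones(1, 2, 34, 34, device=device))
optimizer = torch.optim.Adam([modifier], lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

initial = modifier.detach().clone()
for i in range(10):
    optimizer.zero_grad()
    modified_frame = single_frame + modifier
    output = model(modified_frame)  # gradient tracking stays enabled here
    loss = criterion(output, label)
    loss.backward()                 # gradients now reach modifier
    optimizer.step()

output_has_graph = output.grad_fn is not None
grad_is_set = modifier.grad is not None
modifier_changed = not torch.equal(initial, modifier.detach())
print(output_has_graph, grad_is_set, modifier_changed)
```

If the goal of no_grad() was to save memory or avoid updating the network, freezing the parameters as above (and passing only modifier to the optimizer) should cover that without cutting modifier out of the graph.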