Changing one parameter of a model causes a UserWarning

I manually changed the first parameter of a ResNet-18 and ran inference with the modified model, and I got this warning:

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
  if param.grad is not None:

I don't understand it. Could anyone explain why it happens or what causes it?
My code is below.

import torch
import torch.nn.functional as F
from torchvision import datasets, transforms
import time
model = torch.load('resnet-18/resnet_18-cifar_10.pth')  # load the model
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
for layerPara in model.parameters():
    layerPara[0][0][0][0] = -6.0173e+3
    break
model.to(device)
model.eval()
lossSum = 0.0
preLoss = 0.0
curLoss = 0.0
accuracy = 0.0
correctNum = 0
data_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
image_datasets = datasets.CIFAR10(root='./data', train=False,
                                  download=True, transform=data_transforms)
test_loader = torch.utils.data.DataLoader(image_datasets, batch_size=500,
                                          shuffle=True, num_workers=0, prefetch_factor=2)
since = time.time()
for data, target in test_loader:
    data = data.to(device)
    target = target.to(device)
    output = model(data)
    lossSum = lossSum + F.cross_entropy(output, target, reduction='sum').item()
    pred = output.data.max(1, keepdim=True)[1]  # get the index of the max log-probability
    correctNum += pred.eq(target.data.view_as(pred)).cpu().sum()  

I’m not sure where this warning is raised, since I cannot see any usage of the .grad attribute.
However, your current code should fail with:

RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
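For context, here is a minimal sketch that reproduces this error (assuming torchvision's resnet18, but any module would behave the same):

import torch
from torchvision import models

model = models.resnet18()

# indexing a parameter produces a view of a leaf tensor that requires grad,
# so the in-place assignment below raises the RuntimeError outside torch.no_grad()
for layerPara in model.parameters():
    layerPara[0][0][0][0] = -6.0173e+3  # RuntimeError raised here
    break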

You would need to manipulate the parameter inside a torch.no_grad() block instead, which seems to work fine:

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18()

with torch.no_grad():
    # modify the parameter in place without autograd tracking the change
    for layerPara in model.parameters():
        layerPara[0][0][0][0] = -6.0173e+3
        break

x = torch.randn(2, 3, 224, 224)
target = torch.randint(0, 1000, (2,))
criterion = nn.CrossEntropyLoss()

out = model(x)
loss = criterion(out, target)
loss.backward()
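Regarding the original UserWarning itself: it is raised when the .grad attribute is read on a tensor that autograd does not treat as a leaf. A minimal sketch of that situation, using plain tensors rather than your model:

import torch

x = torch.randn(3, requires_grad=True)  # leaf tensor, created by the user
y = x * 2                               # non-leaf tensor, result of an operation
y.sum().backward()
print(y.grad)  # triggers the UserWarning; y.grad stays None
# call y.retain_grad() before backward() if you really need y.grad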

You’re so kind. Thank you!