Hey all,
I’m a beginner experimenting with transfer learning on resnet50, and I’ve been getting the runtime error “element 0 of tensors does not require grad and does not have a grad_fn” when attempting a training run. I’m working in a Google Colab environment, and here’s the code:
train_session_epochs = 12
cur_epochs = 0  # total epochs trained across sessions
i = 0           # batch counter
with torch.enable_grad():
    for epoch in range(train_session_epochs):
        running_loss = 0.0
        cur_epochs = cur_epochs + 1
        for inputs, labels in trainloader:
            i = i + 1
            inputs = inputs.to(device)
            labels = labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
Criterion is cross-entropy loss, and I’m using SGD as the optimizer. I’ve managed to identify that loss.requires_grad is False at the point of the error. I’ve tried setting loss = Variable(criterion(outputs, labels), requires_grad=True), and I’ve used other methods to force requires_grad to be True; however, those are only workarounds that make the error go away without producing any actual loss reduction.
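My understanding of why the workaround fails (happy to be corrected): wrapping the loss in a new tensor with requires_grad=True creates a fresh leaf node that is disconnected from the model, so backward() never reaches the model’s parameters. A minimal repro of that behavior, with a plain tensor standing in for the model weights:

```python
import torch

x = torch.randn(3, requires_grad=True)   # stands in for a model parameter
y = (x * 2).detach()                      # simulate a loss that lost its graph

# the "force requires_grad" workaround: build a new leaf from the value
loss = torch.tensor(y.sum().item(), requires_grad=True)
loss.backward()                           # runs without error...

print(x.grad)                             # ...but x.grad is still None
```

So the error disappears, but no gradients ever flow to the original parameters, which matches the “no loss reduction” I’m seeing.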
The strange thing is that I have a separate Colab with nearly the exact same setup of training data, optimizer, model, etc., and that Colab throws no errors and makes learning progress when I train with it. The primary difference is that the Colab file I’m currently working on calls optimizer.load_state_dict when loading the model to train further.
Am I missing something? I’m happy to provide more code if needed.