RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn kitti dataset

allenpeng · February 3, 2020, 10:16am

Hello everyone:

I’m trying to paint a point cloud with segmentation results.

To do so, I load the DeepLabv3+ model while processing the KITTI dataset and run inference on each image. After the points are painted, I use them to train a 3D object detection model.

During training I encounter the following error:

  File "./tools/train.py", line 129, in <module>
    main()
  File "./tools/train.py", line 124, in main
    logger=logger,
  File "/root/Det3D/det3d/torchie/apis/train.py", line 343, in train_detector
    trainer.run(data_loaders, cfg.workflow, cfg.total_epochs, local_rank=cfg.local_rank)
  File "/root/Det3D/det3d/torchie/trainer/trainer.py", line 536, in run
    epoch_runner(data_loaders[i], self.epoch, **kwargs)
  File "/root/Det3D/det3d/torchie/trainer/trainer.py", line 411, in train
    self.call_hook("after_train_iter")
  File "/root/Det3D/det3d/torchie/trainer/trainer.py", line 325, in call_hook
    getattr(hook, fn_name)(self)
  File "/root/Det3D/det3d/torchie/trainer/hooks/optimizer.py", line 17, in after_train_iter
    trainer.outputs["loss"].backward()
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I suspect the problem is that I’m using the segmentation model in the wrong way.

The following code is how I use the segmentation model for inference; I don’t need to train it.

def seg_inference(image):
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

    checkpoint = torch.load("/root/Det3D/deeplab/run/cityscapes/deeplab-resnet/model_best.pth.tar")

    seg_model = DeepLab(num_classes=19, backbone='resnet', output_stride=16, sync_bn=True, freeze_bn=False)
    seg_model.load_state_dict(checkpoint['state_dict'])
    seg_model.eval()
    seg_model.to(device)

    def transform(image):
        return tr.Compose([tr.Resize((513, 513)), tr.ToTensor(),
                           tr.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))])(image)

    torch.set_grad_enabled(False)
    for param in seg_model.parameters():
        param.requires_grad = False
    inputs = transform(image).to(device)
    output = seg_model(inputs.unsqueeze(0)).squeeze().cpu().numpy()
    output = np.resize(output, (19, image.size[0], image.size[1]))[[13, 11, 12, 18]]
    output = np.transpose(output, (2, 1, 0))
    # pred = np.argmax(output, axis=0)
    return output

Hope someone can help me. Thank you very much!

ptrblck · February 4, 2020, 5:59am

It seems your code tries to train the model using some trainer class, which isn’t defined in your code snippet.
If you freeze all parameters or disable the gradient calculation via torch.set_grad_enabled(False), you won’t be able to calculate the gradients with a backward call.
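The failure described above can be reproduced in isolation. This is only a minimal sketch: the `nn.Linear` layer is a hypothetical stand-in for the detection model, not anything from Det3D, and the variable names are made up for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)  # hypothetical stand-in for the model being trained
x = torch.randn(2, 4)

# Globally disable autograd, as the inference helper in the question does.
torch.set_grad_enabled(False)
loss = model(x).sum()
print(loss.requires_grad, loss.grad_fn)  # False None

try:
    loss.backward()
    error_message = None
except RuntimeError as exc:
    # Same message as in the traceback above.
    error_message = str(exc)
print(error_message)

# With autograd re-enabled, the same forward pass builds a graph again.
torch.set_grad_enabled(True)
loss = model(x).sum()
loss.backward()
grad_ok = model.weight.grad is not None
```

The parameters of `model` still have `requires_grad=True` in the first half; the loss has no `grad_fn` only because the forward pass ran while gradient tracking was switched off.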

Could you recheck your code and see where the trainer comes from?

allenpeng · February 4, 2020, 7:17am

Thank you very much!

I found that when I use the first model for inference, I call torch.set_grad_enabled(False).
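One possible way to avoid this is to scope the switch to the inference call with the `torch.no_grad()` context manager, so that gradient tracking is restored before the detection model runs. The sketch below rests on assumptions: the `nn.Conv2d` and `nn.Linear` layers are hypothetical stand-ins for the DeepLab and detection models, and the pooling step only exists to connect the two in a few lines.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the segmentation model (19 output classes).
seg_model = nn.Conv2d(3, 19, kernel_size=1)
seg_model.eval()

def seg_inference(image_tensor):
    # Gradient tracking is disabled only inside this block and is
    # restored automatically when the block exits.
    with torch.no_grad():
        return seg_model(image_tensor.unsqueeze(0)).squeeze(0)

scores = seg_inference(torch.randn(3, 8, 8))
grad_mode_after = torch.is_grad_enabled()
print(scores.requires_grad, grad_mode_after)

# Hypothetical stand-in for the detection model that is trained afterwards.
det_model = nn.Linear(19, 1)
features = scores.mean(dim=(1, 2))  # one value per class, shape (19,)
loss = det_model(features).sum()
loss.backward()  # works: autograd is active again for this forward pass
```

Calling `torch.set_grad_enabled(True)` after inference would also restore training, but the context manager keeps the change local to the function that needs it.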