I am getting the error: RuntimeError: expected type torch.cuda.FloatTensor but got torch.FloatTensor
when running: scores, classification, transformed_anchors = retinanet(batch)
See code below:
import torch
import model
from torchvision import transforms
from PIL import Image

torch.set_default_tensor_type(torch.cuda.FloatTensor)

def image_loader(loader, image_name):
    image = Image.open(image_name)
    image = loader(image).float()
    image = image.unsqueeze(0)
    return image

def main():
    retinanet = model.resnet50(num_classes=80, pretrained=True, device='cuda:0')
    state_dict_path = 'coco_resnet_50_map_0_335_state_dict.pt'
    retinanet.load_state_dict(torch.load(state_dict_path))
    retinanet = retinanet.cuda()
    retinanet.eval()
    for name, param in retinanet.named_parameters():
        if param.device.type != 'cuda':
            print('param {}, not on GPU'.format(name))
    image_path = 'image.jpg'
    data_transforms = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor()
    ])
    batch = image_loader(data_transforms, image_path).cuda()
    print(batch.type())
    print(batch.device)
    with torch.no_grad():
        scores, classification, transformed_anchors = retinanet(batch)

if __name__ == '__main__':
    main()
There is no output when running:
for name, param in retinanet.named_parameters():
    if param.device.type != 'cuda':
        print('param {}, not on GPU'.format(name))
This implies that every parameter has device type 'cuda'.
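Note that named_parameters() only covers registered parameters, so a check like the one above can pass while other tensors still sit on the CPU. A minimal sketch of a broader check follows, assuming a hypothetical helper find_tensors_off_device and a hypothetical Demo module (neither is part of the retinanet code); it also looks at buffers and at tensors stored as plain attributes:

```python
import torch
import torch.nn as nn


def find_tensors_off_device(module, device_type):
    """Return names of tensors in `module` whose device type differs from `device_type`.

    Hypothetical helper for illustration: checks registered parameters,
    registered buffers, and tensors stored as plain attributes.
    """
    stray = []
    registered = list(module.named_parameters()) + list(module.named_buffers())
    for name, tensor in registered:
        if tensor.device.type != device_type:
            stray.append(name)
    # Plain tensor attributes live in each submodule's __dict__ and are
    # invisible to named_parameters() and named_buffers().
    for mod_name, sub in module.named_modules():
        for attr, value in vars(sub).items():
            if isinstance(value, torch.Tensor) and value.device.type != device_type:
                stray.append("{}.{}".format(mod_name, attr) if mod_name else attr)
    return stray


class Demo(nn.Module):
    """Hypothetical module with one registered layer and one plain tensor attribute."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)
        self.mean = torch.zeros(4)  # plain attribute, not a buffer
```

For example, find_tensors_off_device(retinanet, 'cuda') would list every such tensor that is not on the GPU. It cannot see tensors that are created inside forward(), so those still need a print statement or a debugger.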
Could you rerun the code with CUDA_LAUNCH_BLOCKING=1 python script.py args and post the stack trace here, please?
I cannot find any obvious error, so I hope the error might point to the line of code causing this issue.
/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
Traceback (most recent call last):
  File "min.py", line 41, in <module>
    main()
  File "min.py", line 38, in main
    scores, classification, transformed_anchors = retinanet(batch)
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nvidia/projects/pytorch-retinanet/model.py", line 268, in forward
    transformed_anchors = self.regressBoxes(anchors, regression)
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nvidia/projects/pytorch-retinanet/utils.py", line 105, in forward
    pred_ctr_x = ctr_x + dx * widths
RuntimeError: expected type torch.cuda.FloatTensor but got torch.FloatTensor
Thanks for the information.
Could you add a print statement to BBoxTransform and check the device of these tensors?
I would assume that self.mean and self.std are not pushed to the device correctly, as they are not registered as buffers or parameters.
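The general mechanism behind this suspicion can be shown in a small sketch. Module-wide conversions such as .cuda(), .to(), and .double() only touch registered parameters and buffers, so a tensor stored as a plain attribute is left as it was. The example uses .double() instead of .cuda() so that it runs without a GPU; the classes PlainAttr and WithBuffer are hypothetical and are not taken from the retinanet repository:

```python
import torch
import torch.nn as nn


class PlainAttr(nn.Module):
    """Hypothetical module that stores a tensor as a plain attribute."""

    def __init__(self):
        super().__init__()
        self.mean = torch.zeros(4)  # not registered, ignored by conversions


class WithBuffer(nn.Module):
    """Hypothetical module that registers the same tensor as a buffer."""

    def __init__(self):
        super().__init__()
        self.register_buffer("mean", torch.zeros(4))  # follows the module


plain = PlainAttr().double()
buffered = WithBuffer().double()

print(plain.mean.dtype)     # torch.float32: the plain attribute was skipped
print(buffered.mean.dtype)  # torch.float64: the buffer was converted
```

The same distinction applies to devices: after model.cuda(), a registered buffer is on the GPU, while a plain tensor attribute stays wherever it was created.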
Mean is of type: torch.cuda.FloatTensor
std is of type: torch.cuda.FloatTensor
However, when I run the program I still get the same error message as before. I went through and checked whether any other variables were of the wrong type and found that ctr_x, ctr_y, widths, and heights were also not of the correct type. By changing the device for these variables I was able to run inference.
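One way to make this kind of fix, sketched here under assumptions, is to move the box tensor onto the device of the regression output once, so that everything derived from it (ctr_x, ctr_y, widths, heights) lands on the same device. The function apply_deltas and its argument layout are hypothetical and simplified; this is not the actual BBoxTransform code from the repository:

```python
import torch


def apply_deltas(boxes, deltas):
    """Shift box centers by predicted offsets (hypothetical, simplified).

    boxes:  (N, 4) tensor of x1, y1, x2, y2, possibly created on the CPU
    deltas: (N, 2) tensor of dx, dy, on the model's device
    """
    # Align device and dtype once; all derived tensors inherit them.
    boxes = boxes.to(device=deltas.device, dtype=deltas.dtype)

    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights

    dx, dy = deltas[:, 0], deltas[:, 1]
    pred_ctr_x = ctr_x + dx * widths
    pred_ctr_y = ctr_y + dy * heights
    return torch.stack([pred_ctr_x, pred_ctr_y], dim=1)
```

Aligning the input at the top of the function avoids having to patch each derived variable separately.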
That’s true. Before I posted this, I also looked through that repo for related issues and couldn’t find any. I tested this on multiple devices running different versions of PyTorch and got effectively the same error on each device.