A common pitfall with cv2 is that it loads images as BGR instead of RGB. If you trained using PIL, I would recommend doing img_tensor = torch.from_numpy(img_arr)[:,:, ::-1].
Another is forgetting to put the network into eval mode (model.eval()). You need to do this after loading the model, in particular if you have BatchNorm layers.
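A minimal sketch of the eval-mode point, with a toy BatchNorm model standing in for your own network (the model and shapes here are placeholders, not from your script):

```python
# Switch to eval mode so BatchNorm uses its running statistics
# (and Dropout, if present, is disabled) during inference.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

model.eval()  # do this after loading a state_dict, before inference
with torch.no_grad():  # also skip autograd bookkeeping at test time
    out = model(torch.randn(1, 3, 112, 112))
print(out.shape)  # torch.Size([1, 8, 112, 112])
```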
Hi, it shows me the following error; it seems the normalize step has an issue:
File "test_xy_final.py", line 62, in <module>
File "test_xy_final.py", line 43, in main
img_tensor = compose_img(img_tensor)
File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
img = t(img)
File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 166, in __call__
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 217, in normalize
tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 112, 112] doesn't match the broadcast shape [3, 112, 112]
Ah, sorry, it should have been [:, :, ::-1], but negative strides like that don't work on PyTorch tensors; you need .flip(2) instead (after that, the three image channels should be swapped). You can use matplotlib's pyplot.imshow to check whether the result looks OK.
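A small sketch of that channel swap, using a synthetic all-blue array in place of a real cv2.imread result (the array here is just for illustration):

```python
# BGR -> RGB on a tensor: negative strides like [:, :, ::-1] are not
# supported on torch tensors, so reverse the channel dim with .flip(2).
import numpy as np
import torch

img_arr = np.zeros((4, 4, 3), dtype=np.uint8)
img_arr[..., 0] = 255  # channel 0 is blue in BGR order

img_tensor = torch.from_numpy(img_arr).flip(2)  # reverse H x W x C channel dim
# blue has moved from channel 0 to channel 2
print(img_tensor[0, 0])  # tensor([  0,   0, 255], dtype=torch.uint8)
```

Alternatively, you can reverse in NumPy first and copy, e.g. torch.from_numpy(img_arr[:, :, ::-1].copy()), which avoids the negative-stride restriction the same way.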
Thank you, I’ll try this. Can you explain why, when I use cv2 during training, the validation result is good? Is there some mechanism in PyTorch? Also, if I want to fine-tune a public model, the RGB/BGR order should be consistent with their training data (their defined dataloader), right?
Oh, if you used cv2 during training, you might have also trained with BGR (the original Detectron did, I think). I had understood your reply as saying that you used the usual TorchVision pipeline, which uses RGB images. I should have asked better, sorry.
In general, the TorchVision models work with RGB, so most pretrained models do too.
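So for fine-tuning, the simplest fix is usually to convert once at load time so the channel order matches the pretrained weights. cv2.cvtColor(img, cv2.COLOR_BGR2RGB) does this; in plain NumPy it is just a channel reversal. A sketch with a synthetic array standing in for cv2.imread output:

```python
import numpy as np

bgr = np.zeros((4, 4, 3), dtype=np.uint8)
bgr[..., 2] = 255  # red lives in channel 2 under BGR

rgb = bgr[:, :, ::-1].copy()  # .copy() so torch.from_numpy works on it later
print(rgb[0, 0])  # [255   0   0]
```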