Test only one image using a list, result is bad

Hi, can I know: if I don't use a Dataset and DataLoader, and instead just load the images into a list and do the transforms etc. myself, will the test result be affected?

Is defining a test DataLoader a must?

Because in my code the training and validation accuracy reach about 95%, but when I run inference on single images the accuracy is only about 10%.

Any idea?

Here is a snippet of my testing code:

for img_name in img_list:
    img_path = os.path.join(test_dir, img_name)
    img_arr = cv2.imread(img_path, 1)
    img_tensor = torch.from_numpy(img_arr)
    compose_img = transforms.Compose([transforms.ToPILImage(),
                                      transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    img_tensor = compose_img(img_tensor)
    print('Predicting image: ' + img_name)
    with torch.no_grad():
        img_tensor = img_tensor.unsqueeze(0)

        output = model(img_tensor)
        m = nn.Softmax(dim=1)
        output_a = m(output)
        preds = torch.argmax(output_a.data, 1)

A common pitfall with cv2 is that it loads images as BGR instead of RGB. If you trained using PIL, I would recommend doing img_tensor = torch.from_numpy(img_arr)[:,:, ::-1].
Another pitfall is forgetting to put the network into eval mode (model.eval()). You need to do this after loading the model, in particular if you have BatchNorm.
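To see why eval mode matters, here is a minimal sketch using a single Dropout layer as a stand-in for a full network: in train mode Dropout randomly zeroes activations (and scales the survivors by 1/(1-p)), while in eval mode it is a no-op, so forgetting model.eval() changes the predictions.

```python
import torch

# Minimal illustration of why model.eval() matters: in train mode,
# Dropout randomly zeroes activations; in eval mode it is a no-op.
torch.manual_seed(0)
model = torch.nn.Dropout(p=0.5)

x = torch.ones(1, 8)
model.train()
print(model(x))   # entries randomly zeroed, survivors scaled by 2

model.eval()
print(model(x))   # identical to the input
```

The same applies to BatchNorm, which switches from batch statistics to its stored running statistics in eval mode.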

Hi @tom, thank you for your reply and suggestion. My training data is also RGB images, and I already set model.eval(). Should I still use img_tensor = torch.from_numpy(img_arr)[:,:, -1]?

By BGR I mean that OpenCV returns the image with the order of the three channels reversed compared to what PIL does. So yes, try it and see if it helps.

Hi, it shows me the following error; it seems the normalization step has an issue:

File "test_xy_final.py", line 62, in <module>
  File "test_xy_final.py", line 43, in main
    img_tensor = compose_img(img_tensor)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 166, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 217, in normalize
    tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 112, 112] doesn't match the broadcast shape [3, 112, 112]

Ah, sorry, it should have been [:, :, ::-1], but that doesn't work in PyTorch; you need .flip(2) instead. After that, the three image channels should be swapped. You can use matplotlib's pyplot.imshow to check whether the image looks OK.
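A quick check that .flip(2) does what the NumPy slice [:, :, ::-1] would: for a tensor in (H, W, C) layout, as torch.from_numpy gives you from a cv2 image, dimension 2 is the channel axis, so flipping it reverses the channel order.

```python
import torch

# Stand-in for an (H, W, C) image tensor, as torch.from_numpy(cv2_img)
# would produce; dimension 2 is the channel axis.
t = torch.arange(2 * 2 * 3).reshape(2, 2, 3)
flipped = t.flip(2)   # same effect as NumPy's [:, :, ::-1]

print(t[0, 0].tolist())        # [0, 1, 2]
print(flipped[0, 0].tolist())  # [2, 1, 0] -- channel order reversed
```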

Thank you, I'll try this. Can you explain why the validation result is good during training even though I use cv2? Is there some mechanism in PyTorch? Also, if I want to finetune a public model, the channel order (RGB vs. BGR) should be consistent with their training data (their defined dataloader), right?

Oh, if you used cv2 in training, you might have also trained with BGR (e.g. the original Detectron was, I think). I had understood your reply to mean that you used the usual TorchVision pipeline, which uses RGB images. I should have asked more precisely, sorry.
In general, the TorchVision models work with RGB, so most pretrained models do.
