VOCDetection __getitem__() problem

vihas_makwana · July 17, 2019, 10:59am

import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt
import torchvision
import torchvision.transforms as transforms

data2 = torchvision.datasets.VOCDetection("./", download=True, transform=transforms.ToTensor(), target_transform=transforms.ToTensor())
img, tar = data2[0]
I get the following error:
----> 1 img,tar=data1[0]

/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py in to_tensor(pic)
     48     """
     49     if not(_is_pil_image(pic) or _is_numpy_image(pic)):
---> 50         raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
     51
     52     if isinstance(pic, np.ndarray):

TypeError: pic should be PIL Image or ndarray. Got <class 'dict'>

ptrblck · July 17, 2019, 11:34am

The VOCDetection target is a dictionary of the XML tree, as stated in the docs.
Since ToTensor only accepts PIL Images or NumPy ndarrays, you cannot use it as a target_transform on that dictionary.

vihas_makwana · July 17, 2019, 2:58pm

What should I do in order to get (image,target) pair?

ptrblck · July 17, 2019, 3:50pm

You would have to remove the target_transform and use the XML tree for your detection task.
Have a look at this tutorial.
While another dataset is used, it might be a good starter for your VOCDetection.
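
As a rough sketch of what "using the XML tree" could look like, the snippet below converts the parsed annotation dictionary into bounding boxes and integer labels. The helper name voc_target_to_boxes and the label numbering (0 reserved for background) are my own assumptions, not part of torchvision. It also assumes the usual VOC layout, where target["annotation"]["object"] holds one entry per object with a "name" and a "bndbox". Depending on the torchvision version, a single object may come back as a dict instead of a list, so both cases are handled.

```python
# Hypothetical helper (not a torchvision API): turn the VOC annotation
# dictionary into plain lists of boxes and class indices.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)
# Assumption: index 0 is kept free for a background class.
CLASS_TO_IDX = {name: i + 1 for i, name in enumerate(VOC_CLASSES)}


def voc_target_to_boxes(target):
    """Return {"boxes": [[xmin, ymin, xmax, ymax], ...], "labels": [...]}."""
    objects = target["annotation"]["object"]
    # Some versions return a single dict when the image has one object.
    if isinstance(objects, dict):
        objects = [objects]

    boxes, labels = [], []
    for obj in objects:
        bndbox = obj["bndbox"]
        # Coordinates are stored as strings in the parsed XML.
        boxes.append([float(bndbox[k]) for k in ("xmin", "ymin", "xmax", "ymax")])
        labels.append(CLASS_TO_IDX[obj["name"]])
    return {"boxes": boxes, "labels": labels}


# Hand-written example in the same shape as a VOCDetection target.
example_target = {
    "annotation": {
        "filename": "example.jpg",
        "object": [
            {"name": "dog",
             "bndbox": {"xmin": "48", "ymin": "240", "xmax": "195", "ymax": "371"}},
            {"name": "person",
             "bndbox": {"xmin": "8", "ymin": "12", "xmax": "352", "ymax": "498"}},
        ],
    }
}
parsed = voc_target_to_boxes(example_target)
print(parsed)
```

You could pass a function like this as target_transform=voc_target_to_boxes and then convert the lists with torch.as_tensor(parsed["boxes"], dtype=torch.float32) and torch.as_tensor(parsed["labels"], dtype=torch.int64) if your model expects tensors.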
