Hi,
Im doing an image segmentation task, and for that, within the Dataset, Im using a function which generates a stick model of a human based on the xy points of places of interest (head, joints etc). I have the xy points, and my Dataset class looks like the following.
class MyDataset(Dataset):
def __init__(self, json_file_dir, image_dir, transform=None):
# json_file_dir :: String ; Path to json file with xy coordinates
# image_dir :: String ; Path to image data
# transform :: Optional transform for dataset
self.json_file_dir = json_file_dir
with open(json_file_dir) as f:
self.json_file = json.loads(f.read())
self.image_dir = image_dir
self.transform = transform
def __len__(self):
return len(self.json_file)
def __getitem__(self, index):
# index :: index number for input,label pair
image_name = self.json_file[index]["filename"] #needed to read the image
image_coordinates = self.json_file[index]["keypoints"] #needed to generate the labelled image
image = cv2.imread(image_name)
labelled_image = LabelMaker(image, image_coordinates, args="LINES") #outputs n channel label, n is the number of classes
# image :: np.array 256x256x3
# labelled_image :: np.array 256x256x4
sample = {"image": image, "labelled_image": labelled_image}
if self.transform:
sample = self.transform(sample["image"], sample["labelled_image"])
return sample
def transform(image, label):
image = cv2.resize(image, (256, 256)).astype(np.float64) # uint8 with RGB mode
image -= rgb_mean #[R, G, B] where R, G, B denote the mean values
image = image.astype(np.float64)
image = image.astype(float) / rgb_sd #[R, G, B] where R, G, B are the standard deviations of them
image = torch.from_numpy(image).float()
label = torch.from_numpy(label).float()
return {"image": image, "labelled_image": label}
After making an instance of this for the training dataset, I make a dataloader, with the parameters - batch size=5, shuffle=True and number of workers=4.
However
During the training process, I get an error, which states
AttributeError: Traceback (most recent call last):
File "dir/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "<ipython-input-20-e1502ecaff7b>", line 32, in __getitem__
labelled_image = LabelMaker(image, image_coordinates, args="LINES")
File "<ipython-input-19-820c82037e68>", line 23, in PointAndLineMaker
height, width = image.shape[0], image.shape[1]
AttributeError: 'NoneType' object has no attribute 'shape'
Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f13cacc8c90>> ignored
Here, the LabelMaker is the function that takes an input image, and the required coordinates to generate the label. As you can see from my Dataset class, the input to this function is fed from that class after loading it from there.
I checked with print statements going through the whole dataset trying to see whether anything is None but to no avail.
Whats more annoying is that this error pops up randomly. For debugging purposes Im running 5 epochs, and it chooses to spit out this None at different times, and I also ran it for 5 epochs successfully without the error coming.
What am I doing wrong here?
EDIT:
I have reason to believe this is a problem in openCV, specifically the imread function, since the problem is avoided after converting the input images to .npy files, and loading these into the dataset