Num_workers on single GPU

I am experimenting with the Xception network on my PC, which has a single GeForce GTX 1660 GPU. The batch size is 8. The dataset directory contains train and validation directories, each of which contains seven class subdirectories.
I first create a function that returns a list of image paths (images_path) and a dict mapping each path to its label index (labels_dict):

import os

def imagee(data_path):
    # Sort so class indices are deterministic across runs and machines.
    classes = sorted(os.listdir(data_path))

    images_path = []
    labels_dict = {}
    for i, class_name in enumerate(classes):
        class_path = os.path.join(data_path, class_name)
        for image_name in os.listdir(class_path):
            image_path = os.path.join(class_path, image_name)
            images_path.append(image_path)
            labels_dict[image_path] = i

    return images_path, labels_dict

Then, for both the train and validation datasets, I get the returned images_path and labels_dict:

train_images_path, train_labels_dict = imagee(train_data_path)
test_images_path, test_labels_dict = imagee(test_data_path)

My custom Dataset class is as follows:

from PIL import Image
from torch.utils.data import Dataset

class FaceRecognationDataset(Dataset):
    def __init__(self, images_path, labels_dict, transforms=None):
        super(FaceRecognationDataset, self).__init__()
        self.images_path = images_path
        self.labels_dict = labels_dict
        self.transforms = transforms

    def __len__(self):
        return len(self.images_path)

    def __getitem__(self, index):
        image_path = self.images_path[index]
        label = self.labels_dict[image_path]
        image = Image.open(image_path).convert("RGB")

        if self.transforms is not None:
            image = self.transforms(image)

        return image, label

Transforms, datasets and dataloaders:

from torchvision import transforms

my_transforms = transforms.Compose([
    transforms.Resize(308),
    transforms.RandomCrop(299),  # Xception expects 299x299 inputs
    transforms.ToTensor()])

train_dataset = FaceRecognationDataset(train_images_path, train_labels_dict, transforms=my_transforms)
test_dataset = FaceRecognationDataset(test_images_path, test_labels_dict, transforms=my_transforms)

from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size, num_workers=2)
test_loader = DataLoader(test_dataset, shuffle=True, batch_size=batch_size, num_workers=2)

I try to get the first batch from test_loader as follows:

images, labels = next(iter(test_loader))

And it takes forever.
After trying for some hours, I removed num_workers from the DataLoader and it finally worked.
Why does it take so long with num_workers enabled in my case?

I assume your code hangs because of a multiprocessing issue rather than simply being slow.
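A common culprit (an assumption, since you haven't said which platform you're on): on Windows, or on any platform using the spawn start method, each DataLoader worker re-imports your main script, so any top-level code that creates loaders or iterates over them must sit behind an `if __name__ == "__main__":` guard, otherwise the workers hang or crash. The same rule applies to plain multiprocessing, which this minimal sketch illustrates:

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # Without this guard, spawn-based platforms re-import this module in
    # every worker, re-executing any top-level code -- the same failure
    # mode that DataLoader workers hit when num_workers > 0.
    mp.set_start_method("spawn", force=True)
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
```

If you run your script directly, moving the DataLoader creation and the `next(iter(test_loader))` call under such a guard (or into a function called from it) typically resolves the hang. In a Jupyter notebook on Windows, num_workers > 0 can hang regardless, since the workers cannot re-import the notebook.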

I guess so too. I thought it was just taking too long to finish, so I interrupted it after waiting for 5 minutes. What could be the issue?