I have the same problem reported in the post "Different batches take different times to load". I am using 3 DataLoader workers to load images from a local folder. As you can see below, one worker seems to be consistently slower than the other two.
The accepted answer suggests using an SSD, which I already do. It also suggests increasing the number of workers. When I increased num_workers to 8, I observed the same pattern: one out of every 8 batches consistently takes longer to load.
Below is my code for loading. Is anything missing, or are there optimizations I should make to fix this?
import os

import torch
from PIL import Image
from torch.utils import data


class MyDataset(data.Dataset):
    def __init__(self, datasets, transform=None):
        self.datasets = datasets
        self.transform = transform

    def __len__(self):
        return len(self.datasets)

    def __getitem__(self, index):
        image = Image.open(os.path.join(self.datasets[index]))
        if self.transform:
            image = self.transform(image)
        return image, torch.tensor(self.datasets[index], dtype=torch.long)


# create Dataset object
training_dataset = MyDataset(training_set, transformer)
training_dataloader = torch.utils.data.DataLoader(
    training_dataset,
    batch_size=batch_size,
    num_workers=8,
    shuffle=True,
    pin_memory=True,
)
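For reference, here is a minimal sketch of how the per-batch wait could be measured to make the "one slow batch every N" pattern visible. The helper names `time_batches` and `slow_batch_indices` are hypothetical, not part of PyTorch, and the 2x-median threshold is an arbitrary assumption. The sketch only times how long the main process waits for each batch; it does not show what happens inside each worker.

```python
import time


def time_batches(loader, max_batches=None):
    """Return the seconds spent waiting for each batch of any iterable.

    `loader` can be a DataLoader or any other iterable (hypothetical helper).
    """
    timings = []
    iterator = iter(loader)
    while max_batches is None or len(timings) < max_batches:
        start = time.perf_counter()
        try:
            next(iterator)  # blocks until the next batch is available
        except StopIteration:
            break
        timings.append(time.perf_counter() - start)
    return timings


def slow_batch_indices(timings, factor=2.0):
    """Indices of batches whose wait exceeds `factor` times the median wait.

    The threshold is an assumption chosen for illustration only.
    """
    if not timings:
        return []
    median = sorted(timings)[len(timings) // 2]
    return [i for i, t in enumerate(timings) if t > factor * median]


# Possible usage with the loader from the question:
# timings = time_batches(training_dataloader, max_batches=40)
# print(slow_batch_indices(timings))
```

If the slow indices come out spaced exactly num_workers apart, that would confirm the periodic pattern described above.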