If you have ever trained a model, you may have seen the DataLoader (I am using the PyTorch DataLoader) become the bottleneck, because every time you fetch a training sample from the dataset, the data transformation is performed on-the-fly. Here I take the __getitem__ function of DatasetFolder from torchvision.datasets as an example.
def __getitem__(self, index):
    path, target = self.samples[index]
    sample = self.loader(path)
    if self.transform is not None:
        sample = self.transform(sample)
    if self.target_transform is not None:
        target = self.target_transform(target)
    return sample, target
I wonder whether we can pre-process the images (e.g. ImageNet) into tensors in advance and save them to disk. Then we modify the __getitem__ function to read these tensors directly from disk. How efficient is this approach? Has anyone tried this solution before?
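To make the idea concrete, here is a minimal sketch of such a dataset. It assumes a hypothetical directory layout where each sample was transformed once offline and saved as a `(tensor, target)` pair with torch.save; the class name and file naming are my own, not from any library.

```python
import os
import torch
from torch.utils.data import Dataset

class PreprocessedTensorDataset(Dataset):
    """Reads samples that were transformed once and saved as .pt files."""

    def __init__(self, tensor_dir):
        # Hypothetical layout: one .pt file per sample in tensor_dir.
        self.paths = sorted(
            os.path.join(tensor_dir, f)
            for f in os.listdir(tensor_dir)
            if f.endswith(".pt")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # No decoding or transforms here: each file already holds a
        # (sample_tensor, target) pair produced by the offline pass.
        sample, target = torch.load(self.paths[index])
        return sample, target
```

Whether this is faster than on-the-fly transforms depends on whether reading the (much larger) uncompressed tensors from disk beats decoding, which the replies below address.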
I think the loading from disk will be a burden and will likely become a new bottleneck (instead of the data transform we had before). Another concern is size: for example, one ImageNet image takes 74 MB when saved as a tensor using the standard transformations.
I think you are spot-on here. In my experience, decoding compressed image files is often much faster than loading uncompressed tensors from disk.
That said, you might look into pre-resizing to a common size (maybe a bit larger than the final one), saving the results as images, doing only ToTensor in the dataset, and running everything else on the GPU. That can give great speedups.