Loading a dataset in Google Colab takes much longer than it does locally in a Jupyter Notebook
|-- datasets
| |-- train_folder
| | |-- 00
| | | |-- 0
| | | | |-- file2161.jpg
| | | | |-- file2162.jpg
| | | | |-- file2163.jpg
| | | | |-- file2164.jpg
| | |
| |-- test_folder
| | |-- 01
| | | |-- 1
| | | | |-- file1161.jpg
| | | | |-- file1162.jpg
| | | | |-- file1163.jpg
| | | | |-- file1164.jpg
| | |
This is the organization of my dataset.
import torch
from torch.utils.data import Dataset
from torchvision import transforms
from torchvision.datasets import ImageFolder

class SmileDataset(Dataset):
    def __init__(self, data_root):
        # ImageFolder expects data_root to contain one subfolder per class
        self.samples = ImageFolder(data_root)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx1):
        # Each sample is a (PIL image, class index) pair
        return self.samples[idx1]

ds_rude = SmileDataset('../mindnotix/smile-detection-master/datasets/train_folder/00')
ds_smile = SmileDataset('../mindnotix/smile-detection-master/datasets/train_folder/01')

# Decode and convert every image to a tensor, one file at a time
trans1 = transforms.ToTensor()
ds_rude = [trans1(img) for img, l in ds_rude]
ds_smile = [trans1(img) for img, l in ds_smile]
ds_rude, ds_smile = torch.stack(ds_rude), torch.stack(ds_smile)
I used the code above to load the dataset with PyTorch.
When I run it locally in a Jupyter Notebook, it finishes within a few seconds, since there are only 2000 images in the train and test folders.
But when I run it in Google Colab, it takes several minutes.
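In case the one-image-at-a-time loop is part of the problem, this is a batched variant I could switch to, where a DataLoader's worker processes decode the JPEGs in parallel (just a sketch; the batch_size and num_workers values are arbitrary):

import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import ImageFolder

# Let ImageFolder apply ToTensor itself, then batch through a DataLoader
# so worker processes do the JPEG decoding in parallel
folder = ImageFolder('../mindnotix/smile-detection-master/datasets/train_folder/00',
                     transform=transforms.ToTensor())
loader = DataLoader(folder, batch_size=64, num_workers=2)
ds_rude = torch.cat([imgs for imgs, labels in loader])  # same for ds_smile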
Is there any way to speed up the data loading in Google Colab?
I did change the runtime to GPU, but that doesn’t speed it up.
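If the slowdown comes from reading thousands of small files off a mounted Google Drive, would copying the dataset to the Colab VM's local disk first and then loading from there help? Roughly this (the /content/drive/MyDrive path is only an assumption about where the files live):

import shutil

# Assumption: the dataset sits on a mounted Google Drive.
# Copy it once to the Colab VM's local disk, then load from there.
src = '/content/drive/MyDrive/smile-detection-master/datasets'
dst = '/content/datasets'
shutil.copytree(src, dst)

ds_rude = SmileDataset('/content/datasets/train_folder/00')
ds_smile = SmileDataset('/content/datasets/train_folder/01')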
Thanks in advance!