I am trying to train on around 200 GB of .npy files. I have a custom dataset class that subclasses ImageFolder:
class CustomImageFolder(ImageFolder):
    def __init__(self, root, transform=None):
        super(CustomImageFolder, self).__init__(str(root), transform)

    def __getitem__(self, index):
        path = self.imgs[index][0]
        img = np.load(path)
        img /= 255  # normalization
        return img

root = Path(dset_dir).joinpath('ZebraFish/train/')
transform = None
train_kwargs = {'root': root, 'transform': transform}
dset = CustomImageFolder
train_dataset = dset(**train_kwargs)
train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=True,
                          num_workers=num_workers,
                          pin_memory=True,
                          drop_last=True)

I’m getting the following error:

RuntimeError: Found 0 files in subfolders of: data/ZebraFish/train
Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif.

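As far as I can tell, the error comes from a filename-extension filter applied while the folder is scanned, so no .npy file is ever reached by my `__getitem__`. The following is only a rough standard-library sketch of that kind of filter, not torchvision's actual code: the helper name `find_files` and the `fish` class folder are hypothetical, and the extension list is copied from the error message above.

```python
import os
import tempfile

# Extensions copied from the error message above.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif")


def find_files(root, extensions):
    """Hypothetical helper: list files under `root` whose names end with one of `extensions`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # str.endswith accepts a tuple of suffixes.
            if name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return matches


def demo():
    """Create a throwaway class folder holding two empty .npy files and scan it twice."""
    with tempfile.TemporaryDirectory() as root:
        class_dir = os.path.join(root, "fish")  # hypothetical class subfolder
        os.makedirs(class_dir)
        for name in ("a.npy", "b.npy"):
            open(os.path.join(class_dir, name), "wb").close()
        image_hits = len(find_files(root, IMG_EXTENSIONS))
        npy_hits = len(find_files(root, (".npy",)))
        return image_hits, npy_hits


image_count, npy_count = demo()
print(image_count, npy_count)
```

With only .npy files on disk, the image-extension scan matches nothing, while a scan for `.npy` finds both files. That would be consistent with the "Found 0 files" message.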
I see that the default loader function creates a PIL image. Since I’m working with .npy files, is there a simple way around this?
Is there a way to get the same DataLoader functionality (batching, shuffling, multiple workers, pinned memory) with .npy files?
All the best