I have the ImageNet 32x32 dataset as a set of npz batch files.
How can I concatenate these batches and make one dataloader?
And does the npz file contain the label information too?
Please help.
Thank you!
The .npz file format is usually written by numpy.savez, so we cannot know what's inside the data without looking.
You can use np.load to load each file and inspect it.
Once you have the numpy arrays, you could transform them to tensors via torch.from_numpy and create your Dataset.
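As a rough sketch of those steps (inspect, concatenate, wrap): the key names "data" and "labels", the flattened 3x32x32 row layout, and the file names below are assumptions for illustration, so check the output of the inspection step against your own files first. The helpers describe_npz and load_batches are hypothetical names, and the stand-in files are random data written only so the example runs on its own.

```python
import tempfile
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def describe_npz(path):
    """Return {key: (shape, dtype)} for every array stored in an .npz file."""
    with np.load(path) as archive:
        return {key: (archive[key].shape, archive[key].dtype) for key in archive.files}


def load_batches(paths, data_key="data", label_key="labels"):
    """Concatenate the arrays of several .npz batch files along axis 0.

    data_key and label_key are assumptions; use describe_npz() to find the real keys.
    """
    images, labels = [], []
    for path in paths:
        with np.load(path) as archive:
            images.append(archive[data_key])
            labels.append(archive[label_key])
    return np.concatenate(images), np.concatenate(labels)


# Stand-in batch files (random data) so the sketch runs without the real dataset.
tmp_dir = Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
for i in range(2):
    np.savez(
        tmp_dir / f"train_data_batch_{i + 1}.npz",
        data=rng.integers(0, 256, size=(5, 3 * 32 * 32), dtype=np.uint8),
        labels=rng.integers(1, 1001, size=5),
    )

paths = sorted(tmp_dir.glob("*.npz"))
summary = describe_npz(paths[0])  # look at this before trusting the key names
print(summary)

x, y = load_batches(paths)
# Assumed layout: each row is one flattened 3x32x32 image with values in 0..255.
x = torch.from_numpy(x).view(-1, 3, 32, 32).float() / 255.0
y = torch.from_numpy(y).long()

dataset = TensorDataset(x, y)
loader = DataLoader(dataset, batch_size=4, shuffle=True)
```

If the archive has no label array among its keys, the labels are not in the file and have to come from somewhere else.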
Here’s an NPZ loader I wrote for my own dataset…
Like Peter said, it’s going to be different for each archive structure… but perhaps this will be useful to get started…
import torch
import numpy as np
from pathlib import Path
from torch.utils.data import Dataset

class NPZLoader(Dataset):
    # One sample per .npz file; files are expected one directory below `path`.
    def __init__(self, path, transform=None):
        self.path = path
        self.files = list(Path(path).glob('*/*.npz'))
        self.transform = transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, item):
        # 'arr_0' is the default key numpy.savez gives an unnamed array.
        numpy_array = np.load(str(self.files[item]))['arr_0']
        torch_array = torch.from_numpy(numpy_array)
        if self.transform is not None:
            torch_array = self.transform(torch_array)
        # Dummy label 0; replace with the real label if your archive stores one.
        return torch_array, 0
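For batch files that each hold many samples plus labels, one possible variation is a Dataset per file combined with torch.utils.data.ConcatDataset. This is only a sketch: NPZBatch is a hypothetical class, and the "data" and "labels" keys and the stand-in files are assumptions made so the example is runnable.

```python
import tempfile
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class NPZBatch(Dataset):
    """One .npz batch file holding many samples (hypothetical helper)."""

    def __init__(self, path, data_key="data", label_key="labels"):
        # Key names are assumptions; inspect the archive to find the real ones.
        with np.load(path) as archive:
            self.data = torch.from_numpy(archive[data_key])
            self.labels = torch.from_numpy(archive[label_key])

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index], self.labels[index]


# Two small stand-in files so the sketch runs on its own.
tmp_dir = Path(tempfile.mkdtemp())
for i, n in enumerate([3, 4]):
    np.savez(
        tmp_dir / f"batch_{i}.npz",
        data=np.zeros((n, 3072), dtype=np.uint8),
        labels=np.full(n, i, dtype=np.int64),
    )

# ConcatDataset chains the per-file datasets into one indexable dataset.
combined = ConcatDataset([NPZBatch(p) for p in sorted(tmp_dir.glob("*.npz"))])
loader = DataLoader(combined, batch_size=2)
```

Here index 3 of combined maps to the first sample of the second file, so a single DataLoader iterates across all batch files.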