I have .npz files in my dataset. Each file represents a sequence of images (15 frames) “X” and its target “Y”:
video1: [array(img1, img2, img3, …, img10)], [Y1]
video2: [array(img1, img2, img3, …, img10)], [Y2]
How can I load this custom data with DataLoader, if that is possible?
This could be done in the following way:

    import os
    import numpy
    import torch

    class VideoDataset(torch.utils.data.Dataset):
        def __init__(self, path_to_npy_files):
            # Keep full paths so numpy.load can find the files
            self.video_files = [os.path.join(path_to_npy_files, f) for f in os.listdir(path_to_npy_files)]
            # Anything else here

        def __getitem__(self, idx):
            video_file = self.video_files[idx]
            # Assumes each file stores [frames, label]; change this if your format differs
            video, label = numpy.load(video_file, allow_pickle=True)
            # I am assuming video is of shape D, H, W, C (result: D, C, H, W). If not so, please change accordingly
            video_tensor = torch.from_numpy(video).permute(0, 3, 1, 2)
            label = torch.tensor(label, dtype=torch.long)  # I assume you are classifying video frames
            return video_tensor, label

        def __len__(self):
            return len(self.video_files)

I am not sure what exactly is in the npz file. From what I understood, the file stores an array whose first entry is an array containing the images and whose second entry is the class, and I have written the dataset accordingly. If this is not the case, please change it accordingly; only the video-loading part would change.
The dataloader can be created as follows:

    def GetDataloader(path_to_npy_files, batch_size, num_workers):
        dataset = VideoDataset(path_to_npy_files)
        dataloader = torch.utils.data.DataLoader(
            dataset=dataset,
            batch_size=batch_size,
            num_workers=num_workers,
            shuffle=True,
        )
        return dataloader  # return it so the caller can iterate over batches

PS - Please excuse any indentation errors, I have directly typed the code here.
But I need to use .npz files; each .npz file represents (X, label).
Is it correct like this?
```
import numpy as np
import torch
from torchvision import datasets

def npy_loader(path):
    # Each .npz file holds the frames under 'x' and the label under 'y'
    with np.load(path) as train_data:
        train_examples = train_data['x']
        train_labels = train_data['y']
    X = torch.from_numpy(train_examples)
    Y = torch.from_numpy(train_labels)
    return X, Y

dataset = datasets.DatasetFolder(
    root='data/npz_file',
    loader=npy_loader,
    extensions='.npz'
)
```
Oh yeah this works too, I forgot you already have an array of frames
So I believe this should work.
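One thing to double-check with `DatasetFolder`: it expects `root` to contain one subdirectory per class and returns `(sample, class_index)`, so with the loader above each sample would itself be the `(X, Y)` tuple. If the `.npz` files sit directly in one folder, a small custom `Dataset` may be simpler. Below is a minimal sketch, assuming every file has the keys `'x'` and `'y'` as in your loader; the class name `NpzVideoDataset` and its `folder` argument are only illustrative.

```python
import glob
import os

import numpy as np
import torch
from torch.utils.data import Dataset


class NpzVideoDataset(Dataset):
    """Hypothetical dataset: one .npz file per video, with arrays 'x' and 'y'."""

    def __init__(self, folder):
        # Sorted so that the index -> file mapping is reproducible
        self.files = sorted(glob.glob(os.path.join(folder, "*.npz")))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Assumption: 'x' holds the frames and 'y' the label, as in the loader above
        with np.load(self.files[idx]) as data:
            frames = torch.from_numpy(data["x"])
            label = torch.from_numpy(np.asarray(data["y"]))
        return frames, label
```

It can then be wrapped with `torch.utils.data.DataLoader(NpzVideoDataset('data/npz_file'), batch_size=4, shuffle=True)`, which stacks the samples along a new first (batch) dimension as long as all videos have the same shape.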
Are you facing any trouble with this approach?
With ResNet 3D?
Not yet, I am still on the first task (data preparation).
I am not familiar with ResNet3D, but if it has 3D convolutions, then the input to the network should be in the shape:
[batch, num_channels, num_frames/depth, H, W]
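Note that my dataset above permutes each video to [num_frames, num_channels, H, W], so its batches come out as [batch, num_frames, num_channels, H, W]; you would need to swap the frame and channel dimensions to get the shape above.
I am not so familiar with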
If this is the shape getting loaded, then you are good to go
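To check the layout before feeding a real model, a quick sanity test like the one below can help. It is only a sketch: it assumes each video comes as [num_frames, H, W, C] (the D, H, W, C layout assumed earlier), uses random data in place of real frames, and a single `nn.Conv3d` layer stands in for the ResNet 3D model.

```python
import torch
import torch.nn as nn

# Fake batch: 2 videos, 15 frames, 32x32 pixels, 3 channels (assumed layout B, D, H, W, C)
batch = torch.randn(2, 15, 32, 32, 3)

# Move channels in front of the frame dimension: B, D, H, W, C -> B, C, D, H, W
batch = batch.permute(0, 4, 1, 2, 3).contiguous()

# Stand-in for the first layer of a 3D CNN; it expects [batch, channels, depth, H, W]
conv = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
out = conv(batch)

print(batch.shape)
print(out.shape)
```

If the stored frames are `uint8`, they also need to be converted with `.float()` before they are passed to a convolution.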