Dear all,
I have .npz files in my dataset, where each file represents a sequence of images (15 frames) “X” and its target “Y”:
video1: [array(img1, img2, img3, …, img10)], [Y1]
video2: [array(img1, img2,img3, …, img10)], [Y2]
…
I am looking for a way to load this custom data with DataLoader, if that is possible.
Thanks
Hey -
This could be done in the following way -
import os

import numpy
import torch


class VideoDataset(torch.utils.data.Dataset):
    def __init__(self, path_to_npy_files):
        self.root = path_to_npy_files
        self.video_files = os.listdir(path_to_npy_files)
        '''
        Anything else here
        '''

    def __getitem__(self, idx):
        # os.listdir returns bare file names, so join them with the folder
        video_file = os.path.join(self.root, self.video_files[idx])
        video = numpy.load(video_file)
        # I am assuming the frames are of shape D, H, W, C; this gives C, D, H, W. If not so, please change accordingly
        image_tensor = torch.from_numpy(video[0]).permute(3, 0, 1, 2)
        label = torch.tensor(video[1], dtype=torch.long)  # I assume you are classifying video frames
        return image_tensor, label

    def __len__(self):
        return len(self.video_files)
I am not sure what exactly is in the npz file. From what I understood, the npz file stores an array whose first entry is an array containing the images and whose second entry is the class, and I have written the dataset accordingly. If this is not the case, then please change accordingly; only the video[0] and video[1] parts would change.
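If it is unclear what a file contains, something like the following should list it. This is only a sketch: the file is generated on the spot, and the key names `x` and `y` and the frame shape are assumptions, not taken from the real data.

```python
import os
import tempfile

import numpy as np

# Build a small stand-in file; the keys "x" and "y" are assumptions.
frames = np.zeros((15, 32, 32, 3), dtype=np.uint8)  # D, H, W, C
label = np.array(1, dtype=np.int64)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "video1.npz")
np.savez(path, x=frames, y=label)

# np.load on an .npz file returns a dict-like NpzFile, indexed by key name.
with np.load(path) as data:
    keys = sorted(data.files)
    shapes = {k: data[k].shape for k in keys}

print(keys)    # names of the stored arrays
print(shapes)  # shape of each stored array
```

Since an .npz archive is indexed by key name rather than by position, the video[0] and video[1] lookups would probably become lookups by those key names.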
The dataloader can be created as follows -
def GetDataloader(path_to_npy_files, batch_size, num_workers):
    dataset = VideoDataset(path_to_npy_files)
    dataloader = torch.utils.data.DataLoader(dataset=dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    return dataloader
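To see what a batch looks like once it comes out of a DataLoader, here is a small sketch with made-up data. The TensorDataset and the sizes are stand-ins I picked for illustration, not the real videos.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 8 videos of shape (C, D, H, W) = (3, 15, 32, 32).
videos = torch.zeros(8, 3, 15, 32, 32)
labels = torch.arange(8) % 2

loader = DataLoader(TensorDataset(videos, labels), batch_size=4, shuffle=True)

# The default collate function stacks the samples along a new first axis.
batch_videos, batch_labels = next(iter(loader))
print(batch_videos.shape)
print(batch_labels.shape)
```

Because the samples are stacked, every video in a batch needs the same shape, so clips with different frame counts would have to be padded or cropped first.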
PS - Please excuse any indentation errors, I have directly typed the code here.
But I need to use .npz files; each .npz file represents (X, label).
Is it correct like this?
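```python
import numpy as np
import torch
from torchvision import datasets


def npy_loader(path):
    # Each .npz file stores the frames under 'x' and the label under 'y'
    with np.load(path) as train_data:
        train_examples = train_data['x']
        train_labels = train_data['y']
    X = torch.from_numpy(train_examples)
    Y = torch.from_numpy(train_labels)
    return X, Y


dataset = datasets.DatasetFolder(
    root='data/npz_file',
    loader=npy_loader,
    extensions=('.npz',)
)
```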
Oh yeah, this works too; I forgot you already have an array of frames.
So I believe this should work.
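One thing that may be worth checking: as far as I know, DatasetFolder expects root to contain one subfolder per class and returns (loader output, class index), so with a loader that returns (X, Y) each item would come out as ((X, Y), class_index). If the files sit in a single flat folder, a plain Dataset may be simpler. A minimal sketch, assuming the keys are `x` and `y`; the class name NpzVideoDataset and the generated test files are hypothetical:

```python
import glob
import os
import tempfile

import numpy as np
import torch
from torch.utils.data import Dataset


class NpzVideoDataset(Dataset):
    """Hypothetical dataset: one .npz file per video, with keys "x" and "y"."""

    def __init__(self, root):
        self.paths = sorted(glob.glob(os.path.join(root, "*.npz")))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        with np.load(self.paths[idx]) as data:
            x = torch.from_numpy(data["x"])
            y = torch.as_tensor(data["y"]).long()
        return x, y


# Tiny self-check with generated files (shapes are made up).
root = tempfile.mkdtemp()
for i in range(3):
    np.savez(os.path.join(root, f"video{i}.npz"),
             x=np.zeros((15, 8, 8, 3), dtype=np.float32), y=np.int64(i))

dataset = NpzVideoDataset(root)
x0, y0 = dataset[0]
print(len(dataset), x0.shape, y0)
```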
Are you facing any trouble with this approach?
With ResNet 3D?
Not yet, I am still on the first task (data preparation).
I am not familiar with ResNet3D, but if it has 3D convolutions, then the input to the network should be in the shape [batch, num_channels, num_frames/depth, H, W].
That is the shape my dataloader is intended to load. I am not so familiar with DatasetFolder.
If this is the shape getting loaded, then you are good to go.
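A quick way to check the shape is to push one batch through a single nn.Conv3d layer. The sketch below assumes the frames are stored channels-last as (D, H, W, C); the sizes and the layer settings are made up for illustration and have nothing to do with the actual ResNet3D.

```python
import torch
import torch.nn as nn

# Hypothetical batch as stored on disk: (batch, D, H, W, C).
batch = torch.zeros(2, 15, 32, 32, 3)

# Move channels in front of the frame axis: (batch, C, D, H, W).
batch = batch.permute(0, 4, 1, 2, 3).contiguous()

# A single 3D convolution; padding=1 with kernel_size=3 keeps D, H, W unchanged.
conv = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
out = conv(batch)
print(batch.shape, out.shape)
```

If the channel axis ends up in the wrong position, the layer should raise an error about the number of input channels, which makes this a cheap sanity check before training.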