How to upload a sequence of images for video classification

Yes, I think the cleanest way would be to define a custom Dataset.

I’m not sure what your use case is, but I understood that you’ve already split the input images into patches. If that’s the case, you could create a mapping, store the prediction for each patch, and “combine” them afterwards. E.g. if you are working on a classification use case, you could use majority voting.
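As a rough illustration of the mapping-plus-majority-voting idea, here is a minimal sketch. It assumes each patch prediction is available as an (image_id, predicted_class) pair; the function name `majority_vote` and the sample ids are hypothetical and only there for the example.

```python
from collections import Counter, defaultdict


def majority_vote(patch_predictions):
    """Combine per-patch class predictions into one prediction per image.

    patch_predictions: iterable of (image_id, predicted_class) pairs
    (an assumed format for this sketch).
    """
    # Mapping from image id to the list of predictions of its patches
    votes = defaultdict(list)
    for image_id, pred in patch_predictions:
        votes[image_id].append(pred)

    # Pick the most frequent class per image; on a tie, Counter keeps
    # the class that was encountered first
    return {
        image_id: Counter(preds).most_common(1)[0][0]
        for image_id, preds in votes.items()
    }


patch_preds = [
    ("img0", 1), ("img0", 1), ("img0", 2),
    ("img1", 0), ("img1", 3), ("img1", 3),
]
result = majority_vote(patch_preds)
print(result)
```

Other ways of combining the patches, such as averaging the per-patch probabilities, would fit the same mapping structure.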

Hi @ptrblck. First of all, I would like to thank you for your very functional code (MyDataset). However, I noticed that it is slow and makes training very slow. Here are the times for one pass through the network and for one pass through MyDataset (4 passes for a batch of 4).

Time for one pass through the network on GPU (with update):
0.025026798248291016s

Time for each pass through MyDataset, so 4 passes for batch=4 (sequence length=32):

0.6877198219299316s
0.6699411869049072s
0.6670119762420654s
0.6949927806854248s

To obtain those times I have already made one modification, basically transforming the for loop into a list comprehension:
images = [self.transform(Image.open(self.image_paths[i][0])) for i in indices]

How can I speed up the MyDataset code? Thanks!

How can I modify this code if my sequences don’t all have the same length?

You could pad the sequences to the same length, e.g. in a custom collate_fn. Also, take a look at this post, which discusses some approaches.
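A minimal sketch of such a collate_fn, assuming each dataset sample is a (sequence, label) pair where the sequence is a tensor of shape (T, C, H, W) and T varies between samples. The name `pad_collate` and the choice to also return the original lengths are assumptions for this example, not part of the original MyDataset code.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_collate(batch):
    """Pad variable-length image sequences to the longest one in the batch.

    batch: list of (sequence, label) pairs, sequence shape (T_i, C, H, W).
    """
    sequences, labels = zip(*batch)
    # Keep the true lengths so padded frames can be masked out later
    lengths = torch.tensor([seq.shape[0] for seq in sequences])
    # Zero-pad along the time dimension -> (N, T_max, C, H, W)
    padded = pad_sequence(list(sequences), batch_first=True, padding_value=0.0)
    return padded, lengths, torch.tensor(labels)


# Two dummy sequences with 3 and 5 frames
batch = [(torch.randn(3, 3, 8, 8), 0), (torch.randn(5, 3, 8, 8), 1)]
padded, lengths, labels = pad_collate(batch)
print(padded.shape, lengths, labels)
```

It could then be passed to the loader via `DataLoader(dataset, batch_size=4, collate_fn=pad_collate)`.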

Please correct this to the PyTorch format: I think your code returns N D C H W, whereas the expected order is:

  1. N → number of sequences (mini-batch size)
  2. Cin → number of channels (3 for RGB)
  3. D → number of images in a sequence
  4. H → height of one image in the sequence
  5. W → width of one image in the sequence
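If the batch does come out as N D C H W, one way to get the order listed above is to swap the two dimensions with `permute`. This is only a sketch with made-up tensor sizes; the `nn.Conv3d` layer is included just to show a module that expects the N Cin D H W layout.

```python
import torch
import torch.nn as nn

# Dummy batch in N, D, C, H, W order (sizes chosen only for illustration)
x = torch.randn(2, 8, 3, 16, 16)

# Swap the sequence and channel dimensions -> N, Cin, D, H, W
y = x.permute(0, 2, 1, 3, 4).contiguous()

# nn.Conv3d expects input of shape (N, Cin, D, H, W)
conv = nn.Conv3d(in_channels=3, out_channels=4, kernel_size=3, padding=1)
out = conv(y)
print(y.shape, out.shape)
```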