How to upload a sequence of images for video classification

ptrblck · April 11, 2021, 3:48am

Yes, I think the cleanest way would be to define a custom Dataset.

I’m not sure what your use case is, but I understand that you’ve already split the input images into patches. If that’s the case, you could create a mapping, store the prediction for each patch, and “combine” them afterwards. For example, in a classification use case you could use majority voting.
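
As a rough illustration of the majority-voting idea, here is a minimal sketch in plain Python. The function name `majority_vote` and the `(image_id, predicted_class)` pair format are assumptions made for this example, not part of the original code; in practice the predicted class would come from e.g. `output.argmax(dim=1)` for each patch.

```python
from collections import Counter, defaultdict


def majority_vote(patch_predictions):
    """Combine per-patch class predictions into one label per image.

    patch_predictions: iterable of (image_id, predicted_class) pairs
    (hypothetical format, chosen for this sketch).
    """
    # Mapping: image id -> list of class predictions of its patches
    votes = defaultdict(list)
    for image_id, predicted_class in patch_predictions:
        votes[image_id].append(predicted_class)

    # Pick the most frequent class per image. On a tie, the class that
    # was seen first among the tied ones wins.
    return {
        image_id: Counter(classes).most_common(1)[0][0]
        for image_id, classes in votes.items()
    }


preds = [("img_0", 1), ("img_0", 1), ("img_0", 2), ("img_1", 0), ("img_1", 0)]
result = majority_vote(preds)
print(result)
```

If the patches differ in how reliable they are, averaging the softmax probabilities per image instead of counting hard votes might be worth trying as an alternative.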

S_dB · September 2, 2021, 9:57pm

Hi @ptrblck. First of all, thank you for your very functional code (MyDataset). However, I noticed that it is slow and makes training very slow. Here are the times for one pass through the network and for each pass through MyDataset (4 passes for a batch of 4).

Time for one pass through the network on GPU (including the update):
0.025026798248291016s

Time for each pass through MyDataset, so 4 passes for batch=4 (sequence length=32):

0.6877198219299316s
0.6699411869049072s
0.6670119762420654s
0.6949927806854248s

To obtain these times I have already made one modification, basically replacing the for loop with a list comprehension:
images = [self.transform(Image.open(self.image_paths[i][0])) for i in indices]

How could I speed up your MyDataset code? Thanks!

Manasi_Kasande · May 8, 2022, 2:41am

How can I modify this code if my sequences don’t all have the same length?

ptrblck · May 8, 2022, 4:43pm

You could pad the sequences to the same length, e.g. in a custom collate_fn. Also, take a look at this post, which discusses some approaches.
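
A minimal sketch of such a collate_fn, assuming each sample from the Dataset is a `(sequence, label)` pair where `sequence` has the shape `[T_i, C, H, W]` and `label` is an integer. The name `pad_collate` and the returned `lengths` tensor are choices made for this example only.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_collate(batch):
    """Zero-pad variable-length image sequences to the longest one in the batch.

    batch: list of (sequence, label) pairs, sequence shaped [T_i, C, H, W]
    (assumed sample format for this sketch).
    """
    sequences, labels = zip(*batch)
    # Keep the original lengths so the padding can be masked out later
    lengths = torch.tensor([seq.size(0) for seq in sequences])
    # Pads along the first (time) dimension -> [N, T_max, C, H, W]
    padded = pad_sequence(list(sequences), batch_first=True)
    return padded, torch.tensor(labels), lengths


# Two dummy sequences with 3 and 5 frames
batch = [(torch.randn(3, 3, 8, 8), 0), (torch.randn(5, 3, 8, 8), 1)]
padded, labels, lengths = pad_collate(batch)
print(padded.shape, labels, lengths)
```

It would then be passed to the loader via `DataLoader(dataset, batch_size=4, collate_fn=pad_collate)`. Whether zero padding is appropriate depends on the model; the `lengths` tensor is returned so that the padded frames can be ignored if needed.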

Maths_Electronics_Tu · January 11, 2023, 6:08pm

Please correct this to the PyTorch format [N, Cin, D, H, W]. I think your code returns [N, D, C, H, W].

N → number of sequences (mini batch)
Cin → number of channels (3 for rgb)
D → Number of images in a sequence
H → Height of one image in the sequence
W → Width of one image in the sequence
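
A small sketch of the conversion, assuming the batch comes out of the DataLoader as `[N, D, C, H, W]` and is fed to an `nn.Conv3d` layer, which expects `[N, Cin, D, H, W]`. The tensor sizes and the layer settings are made up for illustration.

```python
import torch
import torch.nn as nn

# Dummy batch in the assumed loader layout [N, D, C, H, W]:
# 4 sequences, 32 frames each, RGB, 16x16 pixels
batch = torch.randn(4, 32, 3, 16, 16)

# Swap the D and C dimensions -> [N, C, D, H, W]
batch_ncdhw = batch.permute(0, 2, 1, 3, 4).contiguous()

# Illustrative layer only; padding=1 with kernel_size=3 keeps D, H, W unchanged
conv = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
out = conv(batch_ncdhw)
print(batch_ncdhw.shape, out.shape)
```

The same permute could also be done once inside the Dataset's `__getitem__` (on the `[D, C, H, W]` sample, with `permute(1, 0, 2, 3)`) instead of in the training loop.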