Hi! I’ve been learning PyTorch for a month, and I have managed to process video input data to classify videos or regress a vector from them.
I did this by overfitting the data with the following structure:
(PackedSequence with ALL the videos) -> (Conv2D layer processes all the frames) -> (new PackedSequence with the features extracted from the frames) -> (LSTM) -> (Linear layer takes the output of the last frame) -> (Final output)
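To make the structure concrete, here is a minimal sketch of that pipeline. It is not my actual model: the class name VideoNet, the layer sizes, the pooling step, and the random 8x8 "videos" are placeholders made up for this post. It assumes each video is a tensor of shape (frames, channels, height, width), and it rebuilds the PackedSequence by hand from the fields of the input one.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import PackedSequence, pack_sequence


class VideoNet(nn.Module):
    """Hypothetical stand-in for the pipeline described above."""

    def __init__(self, in_channels=3, feat_dim=16, hidden_dim=32, out_dim=5):
        super().__init__()
        # Per-frame feature extractor: (N, C, H, W) -> (N, feat_dim)
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(feat_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, out_dim)

    def forward(self, packed):
        # packed.data stacks the frames of all videos: (total_frames, C, H, W)
        feats = self.conv(packed.data)
        # New PackedSequence with the same layout, now holding frame features
        packed_feats = PackedSequence(
            feats,
            packed.batch_sizes,
            packed.sorted_indices,
            packed.unsorted_indices,
        )
        # h_n[-1] is the last layer's hidden state at the last valid frame
        # of each video, returned in the original batch order
        _, (h_n, _) = self.lstm(packed_feats)
        return self.fc(h_n[-1])


torch.manual_seed(0)
# Three fake videos with 4, 6 and 2 frames of 3x8x8 pixels
videos = [torch.randn(t, 3, 8, 8) for t in (4, 6, 2)]
packed = pack_sequence(videos, enforce_sorted=False)

model = VideoNet()
out = model(packed)  # one output vector per video
```

With enforce_sorted=False the videos do not need to be sorted by length, and the outputs come back in the same order as the input list.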
I’ve been able to load the videos manually as batches of PackedSequences, but now I want to use torch.utils.data.Dataset and I don’t know how to implement it, or whether it’s even possible.
Should __getitem__ return a PackedSequence? Should I stop using PackedSequence?
Thanks in advance.