I’m currently using ResNet152 to extract features from a video, where the extracted features are of shape (1842, 2048). I would like to also extract features for the same video, but horizontally flipped. I see 2 ways to do so:
- Flip the video horizontally, then re-compute features
- Flip the ResNet152 features for each frame from the already-computed features on the original video
The 2nd option is attractive when considering computing resources - how would that be achieved? Can ResNet152 features be “flipped” like this? If so, I’m guessing it can be done with torch.flip?
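To make the two options concrete, here is a rough sketch of what I have in mind. The tensors are random stand-ins, and the frame layout (num_frames, channels, height, width) is an assumption for illustration. It is not my actual pipeline. As far as I can tell, features of shape (1842, 2048) have no width axis left, so torch.flip on them can only reverse the order of the frames or of the 2048 channels:

```python
import torch

# Stand-in data; shapes are assumptions for illustration only.
frames = torch.rand(4, 3, 224, 224)  # (num_frames, channels, height, width)
features = torch.rand(4, 2048)       # (num_frames, feature_dim), like (1842, 2048)

# Option 1: flip each frame horizontally (reverse the width axis),
# then pass flipped_frames through ResNet152 again.
flipped_frames = torch.flip(frames, dims=[-1])

# Option 2 as I guessed it: flip the already-computed features.
# With a (num_frames, 2048) tensor this only reverses the channel order.
flipped_features = torch.flip(features, dims=[1])

print(flipped_frames.shape)    # same shape as frames
print(flipped_features.shape)  # same shape as features
```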
Thank you for reading! Hoping someone can guide me in the right direction.