Hi,
I’m training a video classification model with 8 classes. Each video contains 64 frames, and each frame is 600x600 pixels.
Since every video is quite big, I can only use a batch size of 16 on 8 V100 GPUs (each GPU gets 2 randomly chosen videos). Therefore, the BatchNormalization layers compute their statistics over 2 videos rather than over the entire batch of 16, which gives me poor results.
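The effect can be illustrated with a toy NumPy sketch. It is not the actual training code: the tensor shapes, the per-video offset, and the helper `batch_stats` are assumptions made up for illustration. It only shows that per-channel statistics computed on each GPU's 2 videos can differ from the statistics of the full batch of 16.

```python
import numpy as np

def batch_stats(x):
    """Per-channel mean and variance over every axis except the last (channels)."""
    axes = tuple(range(x.ndim - 1))
    return x.mean(axis=axes), x.var(axis=axes)

rng = np.random.default_rng(0)

# Toy stand-in for activations: (videos, frames, height, width, channels).
# Real inputs would be far larger (64 frames of 600x600); sizes here are assumed.
global_batch = rng.normal(size=(16, 4, 8, 8, 3))
# Hypothetical per-video offset so that videos differ from one another.
global_batch += rng.normal(size=(16, 1, 1, 1, 3))

num_gpus = 8
shards = np.split(global_batch, num_gpus, axis=0)  # 2 videos per GPU

# Statistics a batch-norm layer would see with the whole batch of 16.
global_mean, global_var = batch_stats(global_batch)

# Statistics each GPU sees when it normalizes only its own 2 videos.
shard_means = np.stack([batch_stats(s)[0] for s in shards])
shard_vars = np.stack([batch_stats(s)[1] for s in shards])

# Largest per-channel deviation of any GPU's statistics from the global ones.
max_mean_gap = np.abs(shard_means - global_mean).max()
max_var_gap = np.abs(shard_vars - global_var).max()

print("max |per-GPU mean - global mean|:", max_mean_gap)
print("max |per-GPU var  - global var |:", max_var_gap)
```

With only 2 videos per GPU, each device normalizes with a noisier estimate of the mean and variance than it would get from all 16 videos.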
Does anyone have an idea how to solve this?
Best Regards,
Yana