I am a software engineer trying to learn PyTorch and deep learning.
I am trying to train a resnext50_32x4d model.
I have limited resources, so I cannot increase the batch size beyond 16.
As an experiment, I used a high-capacity cloud machine for a single day, increased the batch size to 128, and got much better results.
I think this is because batch normalization estimates its mean and variance more reliably from a larger batch.
I am OK with slow training because of the limited resources, but I am not OK with the degraded normalization.
Is there a way to compute the batch normalization statistics across 8 batches (8 batches of 16, so 128 samples), or something like that?
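To make the question concrete, here is a minimal sketch of the closest thing I know of, gradient accumulation. The tiny model, the random data, and the names `micro_batch_size` and `accum_steps` are made up for illustration and are not my real setup. As far as I understand, this gives the optimizer an effective batch of 128, but each BatchNorm layer still normalizes over only 16 samples per forward pass, which is the part I would like to fix.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real network: a tiny conv block with BatchNorm.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

micro_batch_size = 16  # what fits in memory (assumed)
accum_steps = 8        # 8 * 16 = 128 samples per optimizer step

# Random tensors standing in for one "large" batch of 128 images.
images = torch.randn(micro_batch_size * accum_steps, 3, 32, 32)
labels = torch.randint(0, 10, (micro_batch_size * accum_steps,))

model.train()
optimizer.zero_grad()
for x, y in zip(images.split(micro_batch_size), labels.split(micro_batch_size)):
    # BatchNorm computes its mean/variance over these 16 samples only.
    loss = criterion(model(x), y) / accum_steps  # scale so gradients average over 8 chunks
    loss.backward()                              # gradients accumulate in .grad
optimizer.step()  # one weight update for all 128 samples
```

So the weight update sees 128 samples, but the normalization inside each forward pass does not.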
Thanks in advance!
Edit: I am not sure whether I posted according to the forum rules, or which category to select. Please let me know if I did something wrong.