Can PyTorch update the parameters only after a relatively large batch, one that would exceed GPU memory if fed in all at once?
My model can currently only be fed batch_size=32 samples at a time because the GPU has 11 GB of memory. The loss fluctuates heavily when the batch size is small, because there are 4000 categories. So I want to update the parameters after more samples, e.g. 128 samples. Does anyone have any advice?
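To make the idea concrete, here is a minimal sketch of what I have in mind, usually called gradient accumulation. It relies on the fact that `backward()` adds into `param.grad` until the gradients are zeroed, so `optimizer.step()` can be called once every few small batches. The toy `nn.Linear` model, the random data, and the names `micro_batch_size`, `accum_steps` and `accumulate_gradients` are all made up for illustration; this is not my real model, and I am not sure it is the recommended way to do it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: 16 features, 10 classes (the real model has 4000 classes).
model = nn.Linear(16, 10)
criterion = nn.CrossEntropyLoss()  # averages the loss over the samples it is given
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

micro_batch_size = 32  # what fits into GPU memory at once
accum_steps = 4        # 4 * 32 = 128 samples per parameter update

# Random stand-in data for one "large" batch of 128 samples.
inputs = torch.randn(micro_batch_size * accum_steps, 16)
targets = torch.randint(0, 10, (micro_batch_size * accum_steps,))


def accumulate_gradients(model, inputs, targets):
    """Backward on each micro-batch; the gradients add up in param.grad."""
    model.zero_grad()
    for x, y in zip(inputs.split(micro_batch_size), targets.split(micro_batch_size)):
        # Divide by accum_steps so the summed gradient matches the mean
        # over all 128 samples instead of being 4 times too large.
        loss = criterion(model(x), y) / accum_steps
        loss.backward()


# Gradient obtained from 4 micro-batches of 32 samples.
accumulate_gradients(model, inputs, targets)
accumulated_grad = model.weight.grad.clone()

# Reference: gradient from feeding all 128 samples at once
# (only possible here because the toy model is tiny).
model.zero_grad()
criterion(model(inputs), targets).backward()
full_batch_grad = model.weight.grad.clone()

# One actual parameter update using the accumulated gradient.
weight_before = model.weight.detach().clone()
accumulate_gradients(model, inputs, targets)
optimizer.step()
optimizer.zero_grad()
```

For a plain linear layer like this, the accumulated gradient should match the full-batch gradient up to floating-point error. Layers that depend on batch statistics, such as batch normalization, would still only see 32 samples at a time, so the result would not be identical to a true batch of 128 in that case.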
By the way, has anyone tried this? I am not sure whether it actually performs better than training with a small batch size.