My GPU has limited memory, so to get a large effective batch size, I want to accumulate gradients and backpropagate only after a few iterations.
Does anyone know how to do that? I tried a few approaches, but they failed.
You could accumulate the gradients over a few small batches and call the optimizer's update step only once after these iterations.
This post might help.
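
To make the idea concrete, here is a minimal sketch in plain Python with no framework at all, so every name in it (`grad_mse`, `train_step_accumulated`, `train_step_full`, the toy data) is made up for illustration. It assumes a tiny linear model with a mean squared error loss and equally sized micro-batches. Each micro-batch gradient is divided by the number of accumulation steps, so the accumulated gradient matches the gradient of one large batch. In a PyTorch-style framework the same pattern would roughly correspond to calling `backward()` on the scaled loss for every small batch, and calling `optimizer.step()` followed by `optimizer.zero_grad()` only every few iterations.

```python
def grad_mse(w, b, batch):
    """Gradient of the mean squared error of y = w * x + b over one batch."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def train_step_accumulated(w, b, micro_batches, lr):
    """One weight update built from several small batches."""
    accum_steps = len(micro_batches)
    gw_acc, gb_acc = 0.0, 0.0  # start from zeroed gradients
    for micro_batch in micro_batches:
        gw, gb = grad_mse(w, b, micro_batch)
        # Scale each contribution so the sum is the mean over the full batch
        # (this assumes all micro-batches have the same size).
        gw_acc += gw / accum_steps
        gb_acc += gb / accum_steps
    # Single update after all micro-batches have been processed.
    return w - lr * gw_acc, b - lr * gb_acc


def train_step_full(w, b, batch, lr):
    """Reference: one weight update from the whole batch at once."""
    gw, gb = grad_mse(w, b, batch)
    return w - lr * gw, b - lr * gb


# Toy data for y = 3x + 1, split into 4 micro-batches of 2 samples each.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
micro_batches = [data[i:i + 2] for i in range(0, len(data), 2)]

w_acc, b_acc = train_step_accumulated(0.0, 0.0, micro_batches, lr=0.01)
w_full, b_full = train_step_full(0.0, 0.0, data, lr=0.01)
print(abs(w_acc - w_full) < 1e-9, abs(b_acc - b_full) < 1e-9)
```

Only one micro-batch has to fit in memory at a time, which is what makes this useful on a small GPU. If you skip the division by `accum_steps`, the accumulated gradient is the sum rather than the mean, which acts like a larger learning rate.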