CUDA out of memory - Transfer Learning

ptrblck · April 9, 2019, 10:01am

You could try accumulating the gradients over several smaller batches, using @albanD’s suggestions posted here, and thus artificially create a larger effective batch size without the memory cost of actually loading a larger batch. This might help your model converge.
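
As a rough illustration of the idea (not necessarily the exact code from the linked post), gradient accumulation could look something like the sketch below. The toy `nn.Linear` model, the random data, the helper name `accumulate_gradients`, and the values of `accumulation_steps` and the learning rate are all assumptions made up for this example; in your case you would plug in your own model and DataLoader.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions for this sketch): replace with your own
# pretrained model, loss, optimizer, and data.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32,))

accumulation_steps = 4  # 4 micro-batches of 8 samples -> effective batch of 32


def accumulate_gradients(model, criterion, inputs, targets, accumulation_steps):
    """Hypothetical helper: run forward/backward on each micro-batch and
    let the gradients add up in the parameters' .grad attributes."""
    for x, y in zip(inputs.chunk(accumulation_steps),
                    targets.chunk(accumulation_steps)):
        # Scale the loss so the summed gradients match the mean over
        # the full effective batch.
        loss = criterion(model(x), y) / accumulation_steps
        loss.backward()  # gradients accumulate; only one micro-batch is in memory


# One optimizer update for the whole effective batch.
optimizer.zero_grad()
accumulate_gradients(model, criterion, inputs, targets, accumulation_steps)
optimizer.step()
```

Only one micro-batch has to fit into GPU memory at a time, while the optimizer sees gradients averaged over all of them. Note that layers such as batchnorm would still compute their statistics per micro-batch, so the result is not necessarily identical to training with a truly larger batch.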