I am training a ResNet-18 on MNIST images on a machine with 16 GB of GPU memory. Everything works as expected when I do not attach the privacy engine to my optimizer: each batch takes about 1–2 GB of GPU memory, which is freed once the batch is processed, so total GPU memory consumption stays around 2 GB throughout training.
However, once I attach the privacy engine to my optimizer, GPU memory usage grows with every batch instead of being released, and after some number of batches I hit a CUDA out-of-memory error.
I cannot understand why GPU memory is no longer freed at the end of each batch once the privacy engine is attached. A sketch of my setup follows below.
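For reference, here is a minimal sketch of the training loop, assuming the Opacus PrivacyEngine with its attach-style (0.x) API; the hyperparameter values are placeholders rather than my exact settings, and the BatchNorm-to-GroupNorm conversion is included only because Opacus rejects BatchNorm layers:

```python
import torch
import torchvision
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from opacus import PrivacyEngine
from opacus.utils import module_modification

device = torch.device("cuda")

# ResNet-18 with a 1-channel first conv so it accepts MNIST input
model = torchvision.models.resnet18(num_classes=10)
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Opacus does not support BatchNorm; replace it with GroupNorm
model = module_modification.convert_batchnorm_modules(model)
model = model.to(device)

train_loader = DataLoader(
    datasets.MNIST(".", train=True, download=True, transform=transforms.ToTensor()),
    batch_size=64,
    shuffle=True,
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = torch.nn.CrossEntropyLoss()

# Without the block below, memory stays flat at ~2 GB per batch;
# with it, usage grows every batch until CUDA runs out of memory.
privacy_engine = PrivacyEngine(
    model,
    batch_size=64,                                  # placeholder values
    sample_size=60_000,
    alphas=[1 + x / 10.0 for x in range(1, 100)],
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)
privacy_engine.attach(optimizer)

for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```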