PyTorch Model Memory Leak Issue

Hello folks,

I had a requirement to convert a Keras model to PyTorch, and I have done that. However, training the Keras model consumes about 100MB of GPU memory, while training the PyTorch model on the same dataset consumes almost 1GB. I did not change the dataset.

When I move the PyTorch model to the GPU with to(device), it occupies 800MB of GPU memory, but this does not happen with Keras.

I would appreciate a quick response.


The first CUDA operation will create the CUDA context, which will use ~700-800MB of device memory.
Are you seeing a difference in the memory usage while running the training or only during the model loading?
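
To separate the context overhead from what the model and training actually use, something like the following rough sketch may help. It compares the memory tracked by PyTorch's caching allocator (`torch.cuda.memory_allocated` and `torch.cuda.memory_reserved`) with the device-wide usage from `torch.cuda.mem_get_info`. The helper names `context_overhead_mb` and `report` are made up for this illustration, and the device-wide number also includes other processes on the same GPU, so treat the result as an estimate only.

```python
try:
    import torch
except ImportError:  # lets the arithmetic helper be used without PyTorch installed
    torch = None

MB = 1024 ** 2


def context_overhead_mb(used_mb, reserved_bytes):
    """Device memory in use that the caching allocator does not account for."""
    return used_mb - reserved_bytes / MB


def report(device=0):
    """Return a memory breakdown in MB, or None if no CUDA device is usable."""
    if torch is None or not torch.cuda.is_available():
        return None
    # The first CUDA operation creates the context on this device.
    torch.zeros(1, device=f"cuda:{device}")
    free, total = torch.cuda.mem_get_info(device)  # device-wide, in bytes
    allocated = torch.cuda.memory_allocated(device)  # live tensors
    reserved = torch.cuda.memory_reserved(device)  # cached by the allocator
    used_mb = (total - free) / MB
    return {
        "allocated_mb": allocated / MB,
        "reserved_mb": reserved / MB,
        "device_used_mb": used_mb,
        "overhead_mb": context_overhead_mb(used_mb, reserved),
    }


if __name__ == "__main__":
    print(report())
```

Calling `report()` once right after `model.to(device)` and again in the middle of training should show whether the extra memory is mostly the fixed context overhead or grows with the training itself.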

Yes, I can see a difference while training the model in Keras and in PyTorch. I haven't specified a GPU device in Keras, but it is still using a little GPU memory. Could you tell me the reason?

What CUDA and cuDNN versions are used in Keras/TF, and how large is the allocated memory during the run in comparison to PyTorch?
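
In case it helps with collecting that information, here is a small sketch that prints the CUDA and cuDNN versions each framework was built against. It assumes a TensorFlow release that provides `tf.sysconfig.get_build_info()` (2.3 or newer, as far as I know), and the function name `framework_versions` is just a placeholder. Frameworks that are not installed are reported as `None`.

```python
def framework_versions():
    """Collect the CUDA/cuDNN versions reported by PyTorch and TensorFlow."""
    info = {}
    try:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # None on CPU-only builds
        info["torch_cudnn"] = torch.backends.cudnn.version()  # None if unavailable
    except ImportError:
        info["torch"] = None
    try:
        import tensorflow as tf
        build = tf.sysconfig.get_build_info()
        info["tensorflow"] = tf.__version__
        # These keys are only present on GPU builds, hence .get().
        info["tf_cuda"] = build.get("cuda_version")
        info["tf_cudnn"] = build.get("cudnn_version")
    except (ImportError, AttributeError):
        info["tensorflow"] = None
    return info


if __name__ == "__main__":
    for name, value in framework_versions().items():
        print(f"{name}: {value}")
```

For the memory numbers, the per-process usage shown by `nvidia-smi` during each training run would give a comparable figure for both frameworks.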