Thanks!
As you can see in the memory_summary() output, PyTorch reserves ~2GB, so given the model size + the CUDA context + the PyTorch cache, the memory usage is expected:
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| GPU reserved memory | 2038 MB | 2038 MB | 2038 MB | 0 B |
| from large pool | 2036 MB | 2036 MB | 2036 MB | 0 B |
| from small pool | 2 MB | 2 MB | 2 MB | 0 B |
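For reference, here is a minimal sketch of how to read these counters yourself; the Linear layer is just a stand-in for your actual model:

```python
import torch
import torch.nn as nn

# Stand-in model for illustration only; substitute your own.
model = nn.Linear(16384, 16384).cuda()  # ~1GB of fp32 parameters

# memory_allocated() counts tensors currently in use;
# memory_reserved() counts the (usually larger) pool the caching
# allocator has grabbed from CUDA and not yet returned.
print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.0f} MB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**2:.0f} MB")

# Full breakdown, including the reserved-memory rows quoted above.
print(torch.cuda.memory_summary())
```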
If you want to release the cache, use torch.cuda.empty_cache(). This will synchronize your code and thus slow it down, but it will allow other applications to use this memory.
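Something like this would show the effect:

```python
import torch

print(f"reserved before: {torch.cuda.memory_reserved() / 1024**2:.0f} MB")

# Returns all *unoccupied* cached blocks to the driver; tensors that
# are still referenced stay allocated. The call synchronizes the device.
torch.cuda.empty_cache()

print(f"reserved after:  {torch.cuda.memory_reserved() / 1024**2:.0f} MB")
```

Note that empty_cache() only frees the cached-but-unused portion; memory held by live tensors (your model parameters, activations, etc.) stays allocated.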