May I know why PyTorch always needs to hold on to GPU memory? Or is there any way to release it?
PyTorch uses a caching allocator so it can reuse already allocated memory instead of calling the expensive cudaMalloc/cudaFree for every tensor.
You can check the reserved and allocated memory, e.g. via print(torch.cuda.memory_summary()),
to see how much memory is currently in use by tensors and how much is held in the cache.
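A minimal sketch of how to inspect this (assuming a CUDA device is available):

```python
import torch

# Allocate a tensor on the GPU
x = torch.randn(1024, 1024, device="cuda")

# Allocated = memory used by live tensors;
# reserved = allocated + memory held in PyTorch's cache
print(torch.cuda.memory_allocated() / 1024**2, "MB allocated")
print(torch.cuda.memory_reserved() / 1024**2, "MB reserved")

# Full human-readable report
print(torch.cuda.memory_summary())
```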
To release the unused memory from the cache so that other processes can use it, you can call torch.cuda.empty_cache(). Note that memory freed this way will have to be reallocated later via synchronizing cudaMalloc calls, which can slow your code down.
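A small sketch illustrating this: deleting a tensor only returns its memory to the cache, and empty_cache() is what actually hands it back to the driver (the values printed will vary by device):

```python
import torch

x = torch.randn(1024, 1024, device="cuda")
print(torch.cuda.memory_reserved())  # nonzero: memory is reserved

# Deleting the tensor returns its memory to the cache,
# but PyTorch still holds on to it
del x
print(torch.cuda.memory_reserved())  # still nonzero

# empty_cache releases the unused cached memory back to the GPU,
# so other processes (visible e.g. in nvidia-smi) can use it
torch.cuda.empty_cache()
print(torch.cuda.memory_reserved())  # now (close to) zero
```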