Current memory use is 2.6 GiB of 7.7 GiB, as seen in the Ubuntu System Monitor.
a = torch.tensor(1).cuda(0)
After running the command above, memory use jumps to 4.7 GiB.
I’m not able to release this memory other than by restarting the kernel. I’ve tried del a, running the CUDA work inside a function, and torch.cuda.empty_cache(), but none of these work.
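For reference, this is the release attempt in full (a minimal sketch; it assumes a CUDA-capable PyTorch install, and the comments state what empty_cache() is documented to do, not a fix for the host-RAM issue):

```python
import torch

if torch.cuda.is_available():
    # Moving a tensor to the GPU; the first CUDA call also creates the
    # CUDA context, which is what reserves the large chunk of host RAM.
    a = torch.tensor(1).cuda(0)

    # Drop the tensor and return PyTorch's cached GPU memory to the driver.
    del a
    torch.cuda.empty_cache()
    # Note: empty_cache() only releases *cached GPU* memory; the CUDA
    # context itself (and its host-RAM reservation) persists until the
    # process exits, which is why the kernel restart "works".
```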
This only happens the first time I send a tensor to CUDA. This memory use is counterproductive: I’m working on the GPU precisely to offload work, yet it limits how much I can send to CUDA, because the committed RAM grows as batch sizes grow.
Working with tensors on the CPU (via .cpu()) has none of these memory-commitment issues.
My questions are:
Why is so much memory committed when sending a single-int tensor to the GPU?
How can I release this memory again without killing the kernel?
Sending a tensor to the GPU should not allocate that much system RAM.
Could you post some information about your setup?
I.e., which GPU you are using, the PyTorch version, how you installed it (built from source or from binaries), and the local CUDA and cudnn versions, if installed.
I suspect some just-in-time compilation might be going on in the background.
How long does this command take when you first run it?
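One quick way to measure this (a sketch; it assumes a CUDA-capable setup, and the timing simply brackets the first CUDA call, which is where any just-in-time kernel compilation would show up):

```python
import time
import torch

if torch.cuda.is_available():
    start = time.perf_counter()
    a = torch.tensor(1).cuda(0)     # first CUDA call: context creation (and any JIT work)
    torch.cuda.synchronize()        # wait for the GPU so the timing is meaningful
    print(f"first .cuda() call took {time.perf_counter() - start:.2f}s")
```

If JIT compilation for the GPU architecture is happening, this first call will be noticeably slow compared to subsequent ones.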
NVidia Driver 430.64
GeForce RTX 2070
torchvision 0.3.0 (py37_cu10.0.130_1, pytorch channel), installed with conda as part of the fastai install.
Cuda compilation tools, release 10.1, V10.1.105
I’m having trouble verifying the cudnn version.
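If this is the binary (conda) install, the CUDA and cudnn versions the binaries ship with can be read from PyTorch itself (a small sketch; the printed values depend on the local build):

```python
import torch

print(torch.version.cuda)               # CUDA version the binaries were built with
print(torch.backends.cudnn.version())   # e.g. 7501 would mean cudnn 7.5.1
print(torch.backends.cudnn.enabled)     # whether cudnn is enabled at all
```

Note that these report the versions bundled with the PyTorch binaries, which can differ from the locally installed CUDA toolkit (10.1 above).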