Torch tensor initialization consumes a lot of memory

Hi all,

I am trying to figure out why torch allocates so much memory for a tensor that, at least to me, doesn’t look like it should need this much:

  weights = torch.tensor([0.3242]).cuda()


This tensor allocates more than 737MB on my GPU and I have absolutely no idea why this would happen.

I am using torch 1.1, but also tried torch 1.3, which results in more than 790MB of memory being allocated.

Neither

weights = weights.cpu()
torch.cuda.empty_cache()

nor

weights = weights.detach().cpu()
torch.cuda.empty_cache()

nor

del weights
torch.cuda.empty_cache()

has any effect. The memory stays allocated.

Does anyone know what to do in this case?

Thanks a lot.
Christian

The CUDA context is created on the device by the first CUDA call (here the .cuda() call that creates the first tensor), and the context itself uses memory.
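For example (a minimal sketch; it assumes a recent PyTorch build where torch.cuda.memory_reserved is available — in older releases such as 1.1 the same counter is called torch.cuda.memory_cached), you can confirm that the tensor itself accounts for only a few bytes and the rest of the footprint comes from the context:

  import torch

  # A single float32 element needs 4 bytes; the caching allocator
  # rounds this up to one small block (typically 512 bytes).
  weights = torch.tensor([0.3242]).cuda()

  print(torch.cuda.memory_allocated())  # bytes used by tensors, ~512
  print(torch.cuda.memory_reserved())   # bytes held by the caching allocator

  # The several hundred MB shown by nvidia-smi on top of this are the
  # CUDA context (driver state, loaded kernel images, library handles),
  # which is created on the first CUDA call and is not tracked by
  # torch.cuda.memory_allocated().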


Hi!
Sorry for posting in the old thread, but is there a way to calculate how much this CUDA context will take from GPU memory?

Unfortunately that’s not easily doable, as it depends on the CUDA version, the number of native PyTorch kernels, the compute capabilities the binaries were built for, the 3rd-party libs (such as cuDNN and NCCL), etc.
The best way would be to perform a CUDA operation on your system and check the memory usage via nvidia-smi.
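For instance (a sketch only; it assumes nvidia-smi is on your PATH and that no other process is using the GPU while you measure, so run it from a fresh Python process):

  import subprocess
  import torch

  def gpu_mem_used_mb(device_index=0):
      """Used GPU memory in MB, as reported by nvidia-smi."""
      out = subprocess.check_output([
          "nvidia-smi", "-i", str(device_index),
          "--query-gpu=memory.used", "--format=csv,noheader,nounits",
      ])
      return int(out.decode().splitlines()[0])

  before = gpu_mem_used_mb()

  torch.cuda.init()                  # force CUDA context creation
  x = torch.empty(1, device="cuda")  # tiny allocation, negligible by itself
  torch.cuda.synchronize()

  after = gpu_mem_used_mb()
  print(f"CUDA context + allocator overhead: ~{after - before} MB")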
