How to free GPU memory by deleting tensors?

Suppose I create a tensor and put it on the GPU, and later I no longer need it and want to free the GPU memory it occupies. How can I do that?

import torch
a = torch.randn(3, 4).cuda()  # nvidia-smi shows that some memory has been allocated
# do something with a
# now a should no longer exist, and nvidia-smi should show the memory as freed

I have tried:

  1. del a
  2. del a; torch.cuda.empty_cache()

But neither of them works.

For such a small tensor, most of the memory reported by nvidia-smi is actually the CUDA context created by the runtime on first use, and that context is not released until you exit the Python interpreter. Also note that `del a` only returns the tensor's memory to PyTorch's caching allocator; nvidia-smi keeps showing it as in use until `torch.cuda.empty_cache()` hands the cached blocks back to the driver.
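To make the distinction visible, here is a small sketch (assuming a CUDA device is available) that compares `torch.cuda.memory_allocated()`, which counts bytes used by live tensors, with `torch.cuda.memory_reserved()`, which counts bytes PyTorch's caching allocator still holds from the driver and which nvidia-smi reports as occupied:

```python
import torch

def report(label: str) -> None:
    # memory_allocated: bytes currently in use by live tensors
    # memory_reserved: bytes held by PyTorch's cache (still visible in nvidia-smi)
    print(f"{label}: allocated={torch.cuda.memory_allocated()} "
          f"reserved={torch.cuda.memory_reserved()}")

if torch.cuda.is_available():
    a = torch.randn(3, 4, device="cuda")
    report("after alloc")        # allocated > 0, reserved > 0

    del a                        # tensor memory goes back to the cache...
    report("after del")          # allocated drops to 0, reserved unchanged

    torch.cuda.empty_cache()     # ...cache returns unused blocks to the driver
    report("after empty_cache")  # reserved drops; the CUDA context itself
                                 # (often several hundred MB) remains until exit
```

So `del a; torch.cuda.empty_cache()` does free the tensor's memory; what remains in nvidia-smi afterwards is the context overhead, not the tensor.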