Given that people link this thread and appear to be looking for it:
As Simon says, when a Tensor goes out of scope (or, more precisely, when all Tensors referring to the same memory block, a Storage, go out of scope), the memory is returned to the cache that PyTorch keeps rather than to the CUDA driver. You can release the cached memory by including
#include <c10/cuda/CUDACachingAllocator.h>
and then calling
c10::cuda::CUDACachingAllocator::emptyCache();
(Of course, you could try the torch:: namespace instead of c10:: and see whether the symbol is re-exported there.)
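To make the scoping behavior concrete, here is a minimal sketch, assuming a libtorch build with CUDA support (the tensor shape and use of an inner scope are just for illustration):

```cpp
#include <torch/torch.h>
#include <c10/cuda/CUDACachingAllocator.h>

int main() {
  {
    // Allocate a tensor on the GPU; the memory comes from
    // PyTorch's caching allocator.
    auto t = torch::zeros({1024, 1024}, torch::device(torch::kCUDA));
  } // t (and its Storage) go out of scope here; the memory returns to
    // PyTorch's cache, not to the driver, so nvidia-smi still shows it
    // as allocated.

  // Hand the cached, currently unused blocks back to the CUDA driver.
  c10::cuda::CUDACachingAllocator::emptyCache();
  return 0;
}
```

Note that emptyCache() only releases blocks that are not backing any live Tensor; memory still referenced by a Tensor stays allocated.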
Best regards
Thomas