A Wild Tensor Appears after `to()`

The following code runs out of memory on my machine.

```python
import torch

a = torch.ones(2000, 1000, 1000)  # ~8 GB of float32 on the CPU
b = a.clone()
a.to('cuda:0')  # returns a GPU copy, but the result is never assigned
del a
torch.cuda.empty_cache()
b.to('cuda:0')
```

However, the following code works fine:

```python
import torch

a = torch.ones(2000, 1000, 1000)
b = a.clone()
a = a.to('cuda:0')  # rebinds `a` to the GPU copy
del a
torch.cuda.empty_cache()
b.to('cuda:0')
```

The only difference between them is `a.to('cuda:0')` versus `a = a.to('cuda:0')`. For the first snippet, I have no idea what to do to release the memory. Can anyone help me? Thanks.

Hi,

The `.to()` operation is out of place, so the first snippet does something odd: it sends the data to the GPU and then discards the result immediately, because the returned tensor is never assigned to anything.
The second one actually holds on to the result, which is only deleted on the next line by the `del`.

As for why one runs out of memory and not the other, it could be due to fragmentation, but I am not sure what the actual reason is here.
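For what it's worth, the out-of-place behaviour of `.to()` is easy to see even on CPU with a dtype conversion (a minimal sketch; assumes PyTorch is installed, no GPU needed):

```python
import torch

a = torch.ones(3, dtype=torch.float32)

# .to() never modifies its input; it returns a new tensor
# whenever the dtype or device actually changes.
b = a.to(torch.float64)

print(a.dtype)  # torch.float32 -- `a` is unchanged
print(b.dtype)  # torch.float64

# The same applies to device moves: a bare `a.to('cuda:0')` creates a
# GPU copy that nothing references, so you have to rebind the name:
#     a = a.to('cuda:0')
```

So in the first snippet the GPU allocation made by `a.to('cuda:0')` is unreachable from Python the moment the statement finishes, and there is no name left through which you could free it explicitly.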