How does PyTorch automatically free up GPU space?

Consider the following code snippet.

import torch

for i in range(10):
    x = torch.randn(10000, 10000).cuda()    # ~400 MB of float32 per tensor
    print(torch.cuda.max_memory_allocated())

On my computer, the output is

400556032
801112064
801112064
801112064
801112064
801112064
801112064
801112064
801112064
801112064

It seems that PyTorch is automatically freeing tensors on the GPU that will no longer be used. How does it know? Isn’t the name x only a reference to the tensor?
Consider the following code snippet.

import torch

x = torch.randn(100).cuda()
print(x.mean().item())
y = x    # Not a copy: y is a second reference to the same tensor
del x
torch.cuda.empty_cache()
print(y.mean().item())

On my computer, the two printed values are the same. This suggests that PyTorch does not actually “delete” tensors, even when explicitly told to.
My question is, how do memory allocation and deallocation work in PyTorch?

In your first example you are rebinding the name x in each iteration, so the “old” x tensor can be freed, or more specifically: its memory can be reused.
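You can watch this with torch.cuda.memory_allocated(), which reports what live tensors currently occupy, next to max_memory_allocated(), which is only the high-water mark. A minimal sketch of your loop; the exact byte counts will depend on your device:

import torch

for i in range(3):
    # The new tensor is allocated *before* the name x is rebound, so the old
    # and the new tensor briefly coexist; that is why the peak is roughly
    # twice the size of a single tensor from the second iteration on.
    x = torch.randn(10000, 10000, device='cuda')
    print(torch.cuda.memory_allocated(), torch.cuda.max_memory_allocated())

After each assignment the old tensor’s reference count drops to zero, its block goes back to the caching allocator, and the next iteration reuses it, which is why memory_allocated() should stay flat while the peak doubles once.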

PyTorch frees the memory of tensors that no longer have any valid reference pointing to them.
Since y points to the same data as x, that data cannot be freed even after x is deleted.
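To see the reference counting, and the difference between “freed” and “returned to the driver”, you can track allocated memory next to torch.cuda.memory_reserved(), the amount the caching allocator is holding on to (it was called memory_cached() in older releases). A minimal sketch, assuming no other CUDA tensors are alive:

import torch

x = torch.randn(10000, 10000, device='cuda')
y = x                                    # second name bound to the same tensor object
del x
print(torch.cuda.memory_allocated())     # still ~400 MB: y keeps the storage alive
del y
print(torch.cuda.memory_allocated())     # 0: last reference gone, block returned to the cache
print(torch.cuda.memory_reserved())      # still nonzero: the caching allocator keeps the block
torch.cuda.empty_cache()                 # hand cached (unoccupied) blocks back to the driver
print(torch.cuda.memory_reserved())      # now (close to) 0

This is also why empty_cache() had no visible effect in your second snippet: y still referenced the tensor, so its block was allocated, not merely cached, and empty_cache() only releases cached blocks.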