Setting 'requires_grad = True' increases the GPU allocated memory at each iteration

I illustrate my issue with the following simple example:

import torch
import sys

x = torch.randn(3, 4)
x = x.cuda()
x.requires_grad = True
q = x + 1
# q.requires_grad = True
z = q * q
z.sum().backward()
print('before del x', torch.cuda.memory_allocated(device=0))
del x
print('after del x', torch.cuda.memory_allocated(device=0))

When I run the above code, 'del x' does not actually delete the tensor x, and the allocated memory does not change. However, if I instead set 'x.requires_grad = False' and uncomment 'q.requires_grad = True', then 'del x' does release the memory held by x (see the variant below).
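For reference, this is a minimal sketch of that variant, assuming a CUDA device is available; the only change is that gradient tracking starts at q instead of x:

import torch

x = torch.randn(3, 4)
x = x.cuda()
x.requires_grad = False   # x stays a plain tensor, not tracked by autograd
q = x + 1
q.requires_grad = True    # gradients are tracked starting from q
z = q * q
z.sum().backward()
print('before del x', torch.cuda.memory_allocated(device=0))
del x                     # in this variant the memory for x is freed
print('after del x', torch.cuda.memory_allocated(device=0))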
The question is: how can I delete tensor x and free its memory while keeping 'x.requires_grad = True'?