Huge CPU RAM consumption when creating a tensor on the GPU

Hello!

I would like to know why so much CPU memory is used when I create a tensor directly on my GPU. For example:

t = torch.cuda.FloatTensor(10, 10000000)  # Creates a peak of about 4.7 GB in my CPU RAM
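
In case the legacy constructor is relevant: as far as I understand, the modern factory-function equivalent of that line would be the following (torch.empty allocates uninitialized memory directly on the CUDA device):

t = torch.empty((10, 10000000), dtype=torch.float32, device="cuda")  # same ~400 MB allocation on the GPU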

# I also tried to create the same array on the GPU with CuPy and pass it to PyTorch:
t_cupy = cupy.ndarray(shape=(10, 10000000), dtype=cupy.float32)  # Here no significant CPU RAM seems to be used, and it consumes approx. 400 MB of my GPU memory, as expected.

t = torch.utils.dlpack.from_dlpack(t_cupy.toDlpack())  # Here the same CPU RAM peak of 4.7 GB occurs
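
For reference, this is a minimal sketch of how the CPU-side growth around the allocation could be checked, assuming psutil is installed (it compares RSS before and after rather than catching the instantaneous peak, which is easier to see in a system monitor):

import os
import psutil
import torch

proc = psutil.Process(os.getpid())
rss_before = proc.memory_info().rss  # resident CPU memory in bytes

t = torch.cuda.FloatTensor(10, 10000000)  # ~400 MB allocation on the GPU
torch.cuda.synchronize()  # make sure CUDA work has finished before measuring

rss_after = proc.memory_info().rss
print(f"CPU RSS grew by {(rss_after - rss_before) / 1024**3:.2f} GiB")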

Is this expected? If so, why?