Creating PyTorch variables directly on GPU somehow uses up lots of CPU RAM during creation

I need to create variables directly on the GPU because I am very limited in CPU RAM.

I found a method to do this here, which mentions using

torch.set_default_tensor_type('torch.cuda.FloatTensor')

However, when I tried

pytorchGPUDirectCreate = torch.FloatTensor(20000000, 128).uniform_(-1, 1).cuda()

It still seemed to use mostly CPU RAM before being transferred to GPU RAM.

I am using Google Colab. To view RAM usage during the variable creation process, after running the cell, go to Runtime -> Manage Sessions

With and without torch.set_default_tensor_type('torch.cuda.FloatTensor'), CPU RAM climbs to 11.34 GB while GPU RAM stays low; then GPU RAM rises to 9.85 GB and CPU RAM drops back down.

It seems that torch.set_default_tensor_type('torch.cuda.FloatTensor') didn’t make a difference.

For convenience, here’s a direct link to a notebook anyone can run


You can directly create a tensor on a GPU by using the device argument:

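import torch

# Fall back to the CPU when no GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Allocate on `device` from the start, then fill in place with values in [-1, 1)
pytorchGPUDirectCreate = torch.rand(20000000, 128, device=device).uniform_(-1, 1).cuda()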

I just tried this in your notebook and got 1.76 GB of RAM used and 9.86 GB of GPU memory. That is still a lot of RAM, but ~10 GB less than originally.
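
As a sanity check on those numbers, here is a minimal sketch of the same direct-allocation idea with a size estimate. It assumes float32 elements and uses a much smaller, purely illustrative shape (rows = 1000) so it runs anywhere; torch.empty is swapped in for torch.rand because uniform_ overwrites the values anyway. The full-size tensor from the thread works out to roughly 9.54 GiB, which is in the same ballpark as the GPU figure reported above.

```python
import torch

# Fall back to the CPU when no GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Small illustrative shape (an assumption); the thread uses (20000000, 128)
rows, cols = 1000, 128

# torch.empty allocates uninitialized storage on `device`;
# uniform_ then fills that storage in place, on the same device
t = torch.empty(rows, cols, dtype=torch.float32, device=device).uniform_(-1, 1)

# float32 takes 4 bytes per element
expected_bytes = rows * cols * 4
actual_bytes = t.element_size() * t.nelement()

# Size estimate for the full tensor from the thread, in GiB
full_size_gib = 20_000_000 * 128 * 4 / 2**30

print(t.device, actual_bytes, round(full_size_gib, 2))
```

The estimate covers only the tensor data; it does not account for whatever host memory the CUDA runtime itself needs.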

Hope it helps!

Thanks! I am wondering whether the .cuda() is still necessary, since you’re already specifying the device on initialization.
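
On the .cuda() question: as far as the PyTorch documentation describes it, calling .cuda() on a tensor that already lives on the target CUDA device performs no copy and returns the original object, so the trailing call should be redundant rather than harmful. The sketch below checks this on a small, illustrative tensor; on a machine without a GPU it falls back to the equivalent check with .to(device), and the variable names are hypothetical.

```python
import torch

# Fall back to the CPU when no GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Small illustrative tensor created directly on `device`
x = torch.empty(4, 3, device=device).uniform_(-1, 1)

if device.type == 'cuda':
    # Already on the current CUDA device: no copy is expected
    y = x.cuda()
else:
    # CPU fallback: .to() returns self when device and dtype already match
    y = x.to(device)

same_object = y is x
same_storage = y.data_ptr() == x.data_ptr()
print(same_object, same_storage)
```

If both checks hold on your setup, dropping the .cuda() call should leave memory usage unchanged.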