Hi, so I am trying to write an architecture where I have to convert entire models to CUDA using model.cuda(). However, some of the elements are plain tensors initialised in the __init__() method of the nn.Module subclass. How do I convert them to CUDA as well? For example:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.xyz = torch.tensor([1,2,3,4...])
        # Convert this to cuda without using .cuda() on tensor xyz, but by using model.cuda()
        ...

    def forward(self, x):
        ...
I see. I also run code on both the CPU and the GPU, and my solution is generally something like:
def make_cuda(fun):
    # Push to the GPU only when one is actually available
    return fun.cuda() if torch.cuda.is_available() else fun

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.xyz = make_cuda(torch.tensor([1,2,3,4...]))
        ...

    def forward(self, x):
        ...
This automatically pushes tensors to CUDA, but only if you are on a machine with a GPU.
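For reference, the same idea is often written with an explicit torch.device object; here is a minimal sketch of that pattern (the device variable and the concrete tensor values are illustrative, not from the original post):

import torch
import torch.nn as nn

# Choose the device once; fall back to the CPU when no GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # .to(device) is a no-op when device is the CPU
        self.xyz = torch.tensor([1, 2, 3, 4]).to(device)

    def forward(self, x):
        return x + self.xyz

model = Net().to(device)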
This will push the plain tensors to CUDA when you call model.cuda():
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.xyz = torch.tensor([1,2,3,4])

    def forward(self, x):
        pass

    def cuda(self, device=None):
        # Move registered parameters and buffers first
        super().cuda(device)
        # Tensor.cuda() returns a *copy*; reassign the attribute,
        # otherwise the original CPU tensor stays in place
        for k, v in list(self.__dict__.items()):
            if isinstance(v, torch.Tensor):
                setattr(self, k, v.cuda(device))
        return self

a = Net()
a.cuda()
edit: I mistakenly wrote self.cuda() instead of super().cuda(). Fixed.
While this approach would work, the proper way to register tensors inside an nn.Module would be to either use nn.Parameter (if this tensor requires gradients and should be trained) or via self.register_buffer. Both approaches will make sure that this tensor will be pushed to the specified device (in model.to(device)) and will also be added to the state_dict (which would be important if you want to save and load this model).
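For completeness, here is a minimal sketch of both options (the attribute names xyz and scale are illustrative, not from the original posts):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Buffer: moved by model.cuda()/model.to(device) and stored in the
        # state_dict, but ignored by the optimizer
        self.register_buffer("xyz", torch.tensor([1, 2, 3, 4]))
        # Parameter: also moved and stored, and it receives gradients
        self.scale = nn.Parameter(torch.ones(4))

    def forward(self, x):
        return x * self.scale + self.xyz

model = Net()
model.cuda()                       # needs a CUDA-capable machine
print(model.state_dict().keys())   # contains both 'scale' and 'xyz'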