Move model to GPU but keep some attributes on CPU

I have a model defined as follows:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
    
        self.conv_0 = nn.Conv2d()
        self.conv_1 = nn.Conv2d()
        self.conv_2 = nn.Conv2d()

        self.make_coordinates()

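    def make_coordinates(self, H=512, W=512, D=512, normalize=True):
        # H, W, D and normalize were undefined in the snippet as posted;
        # the defaults here are assumptions matching the hard-coded 512.
        xs = torch.arange(H, device="cpu").float()
        ys = torch.arange(W, device="cpu").float()
        zs = torch.arange(D, device="cpu").float()
        if normalize:
            xs = xs / (H - 1) * 2 - 1 # (-1, 1)
            ys = ys / (W - 1) * 2 - 1 # (-1, 1)
            zs = zs / (D - 1) * 2 - 1 # (-1, 1)
        # Plain attribute: neither a parameter nor a registered buffer.
        self.coords = torch.stack(torch.meshgrid(xs, ys, zs, indexing="ij"), dim=-1)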

    def forward(self, x):
        pass


# move model to GPU
model = Model().cuda()

I then move the model to the GPU. My question is: is self.coords still on the CPU, or has it already been moved to the GPU because I moved the whole model? If the latter, is there a way to move the trainable parameters to the GPU but keep some attributes (self.coords in this case) on the CPU?

I only want to move batches of self.coords to the GPU when they are needed, because self.coords is extremely large and would take up almost 2GB if I initialized it on the GPU.

Another solution would be to create the coordinates in the dataloader instead of in the model and pass batches of them to forward. But in that case, if I want to pass all coordinates through forward, the forward pass has to run multiple times, which might take a lot of time.

self.coords will stay on the CPU. You create its inputs explicitly with device="cpu", and the resulting tensor is neither a registered buffer nor a parameter, so model.to() (or model.cuda()) won’t move it to the GPU.
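
To check this yourself, here is a minimal sketch. The Demo class and the attribute names conv, buf and plain are made up for illustration, and the layer sizes are arbitrary. If no GPU is available, the script falls back to a dtype cast, which goes through the same parameter and buffer traversal as a device move, so the plain attribute is skipped in both cases.

```python
import torch
import torch.nn as nn

class Demo(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, kernel_size=3)   # registered submodule with parameters
        self.register_buffer("buf", torch.zeros(4))  # registered buffer
        self.plain = torch.zeros(4)                  # plain tensor attribute, not registered

model = Demo()

if torch.cuda.is_available():
    model = model.to("cuda")          # moves parameters and buffers to the GPU
else:
    model = model.to(torch.float64)   # CPU-only fallback: casts parameters and buffers

# Parameters and buffers were touched by .to(); the plain attribute was not.
print("weight:", model.conv.weight.device, model.conv.weight.dtype)
print("buf:   ", model.buf.device, model.buf.dtype)
print("plain: ", model.plain.device, model.plain.dtype)

# Only registered tensors show up in the state dict.
print(sorted(model.state_dict().keys()))
```

The plain attribute also does not appear in state_dict(), so it is not saved with the model either.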


Thank you so much! So only registered buffers and parameters defined in the model are moved to the GPU via model.to()?

Yes, as well as registered submodules (with their submodules, buffers, parameters, etc.).
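
For the original goal of sending only batches of the coordinates to the GPU: a 512 x 512 x 512 x 3 float32 tensor holds 402,653,184 values, i.e. about 1.6 GB, which matches the estimate in the question. Since a plain attribute stays on the CPU, you can slice it and move each slice on demand. Below is a rough sketch of that idea; the helper name iter_coord_chunks and the chunk size are assumptions, and a small 8 x 8 x 8 grid stands in for the real one.

```python
import torch

def iter_coord_chunks(flat_coords, chunk_size, device):
    """Yield consecutive slices of a CPU tensor, each copied to `device`.

    Hypothetical helper, not part of PyTorch.
    """
    for start in range(0, flat_coords.shape[0], chunk_size):
        yield flat_coords[start:start + chunk_size].to(device)

# Small stand-in for self.coords (the real grid would be 512 x 512 x 512 x 3).
coords = torch.rand(8, 8, 8, 3)
flat = coords.view(-1, 3)  # one row per grid point; a view, so no extra copy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

num_points = 0
for chunk in iter_coord_chunks(flat, chunk_size=100, device=device):
    # Only this chunk lives on the target device; use it here, e.g. in forward.
    num_points += chunk.shape[0]

print(num_points, flat.device)
```

The full grid never leaves the CPU, and each chunk on the GPU can be freed once it goes out of scope.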