Dynamically set Conv2d based on input channels

I am currently writing a module that initializes a Conv2d and a BatchNorm2d outside of __init__. The code works fine on the CPU, but when the module is moved to CUDA these conv and bn layers aren't moved along with it. Code:

import torch
import torch.nn as nn

class ExpandChannels(nn.Module):
    def __init__(self, num_classes: int = None):
        super(ExpandChannels, self).__init__()
        self.num_classes = num_classes
        self.conv = None
        self.bn = None

    def reset_parameters(self, x):
        self.conv = nn.Conv2d(x.size(1), self.num_classes, kernel_size=1)
        self.bn = nn.BatchNorm2d(self.num_classes)

    def forward(self, x):
        if self.conv is None:
            self.reset_parameters(x)
        x = self.conv(x)
        x = self.bn(x)
        return x

m = ExpandChannels(100).cuda()
m(torch.randn(4, 3, 28, 28).cuda())

Error: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

You are creating the new self.conv and self.bn layers inside the forward pass without specifying the device, so they will be created on the CPU by default.
To properly push them to the GPU, you could use:

    def reset_parameters(self, x):
        self.conv = nn.Conv2d(x.size(1), self.num_classes, kernel_size=1).to(x.device)
        self.bn = nn.BatchNorm2d(self.num_classes).to(x.device)
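With that change the layers are created on the same device as the incoming tensor, so the original example should run as expected:

    m = ExpandChannels(100).cuda()
    out = m(torch.randn(4, 3, 28, 28).cuda())
    print(out.shape)  # torch.Size([4, 100, 28, 28])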

Additionally, you could also check out the lazy modules, e.g. nn.LazyConv2d, which use a similar approach and infer the input channels from the first forward pass.
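For reference, here is a minimal sketch of the same module written with a lazy layer (the class name ExpandChannelsLazy is just for illustration):

    import torch
    import torch.nn as nn

    class ExpandChannelsLazy(nn.Module):
        def __init__(self, num_classes: int):
            super().__init__()
            # in_channels is omitted and will be inferred from the
            # first input seen in forward
            self.conv = nn.LazyConv2d(num_classes, kernel_size=1)
            self.bn = nn.BatchNorm2d(num_classes)

        def forward(self, x):
            x = self.conv(x)
            x = self.bn(x)
            return x

    m = ExpandChannelsLazy(100).cuda()
    out = m(torch.randn(4, 3, 28, 28).cuda())
    print(out.shape)  # torch.Size([4, 100, 28, 28])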
