I want to use custom convolutional layers without modifying the model definition file (such as resnet.py). Instead, I want to override the forward method of every convolutional layer in a given nn.Module.
The apply function seems well suited to this. My implementation is as follows:
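import types

import torch
import torch.nn as nn
import torch.nn.functional as F

# args, device, and model are defined elsewhere in the script.

def channel_pruning(m):
    def new_forward(self, x):
        # Standard convolution using the layer's own parameters.
        y = F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
        # Number of output channels to prune.
        k = int(m.out_channels * m.rate)
        if k > 0:
            # Per-channel mean of |x|, used as input to the gate.
            s = F.adaptive_avg_pool2d(torch.abs(x), (1, 1)).view(x.size()[0], -1)
            g = F.relu(self.gate(s))
            # Zero the k smallest gate values, then rescale the rest.
            i = (-g).topk(k, 1)[1]
            t = g.scatter(1, i, 0)
            t = t / torch.sum(t, dim=1).unsqueeze(1) * self.out_channels
            y = y * t.unsqueeze(2).unsqueeze(3)
        return y

    name = m.__class__.__name__
    if 'Conv' in name:
        m.rate = args.rate
        m.gate = nn.Linear(in_features=m.in_channels, out_features=m.out_channels, bias=True).to(device)
        nn.init.constant_(m.gate.bias, 1)
        nn.init.kaiming_normal_(m.gate.weight)
        # Bind new_forward to this particular module instance.
        m.forward = types.MethodType(new_forward, m)
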
model.apply(channel_pruning)
This code runs fine on the CPU and on a single GPU. Unfortunately, when I try to parallelize the model across multiple GPUs, the program fails with “RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper__cudnn_convolution)”
The immediate cause seems clear: the layer's weights and the input end up on different GPUs. What should I change so that this code supports multi-GPU parallelism?
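A possible explanation, based on how nn.DataParallel replicates modules as far as I understand it: each replica starts from a shallow copy of the original module's attributes, so a forward stored on the instance with types.MethodType stays bound to the original layer (on cuda:0), and every replica ends up calling it with the original weights. The sketch below avoids the instance-level binding by defining forward on a subclass and switching each layer's class in place. GatedConv2d, the rate argument, and the small example model are hypothetical names introduced for illustration; this is a sketch under those assumptions rather than a verified multi-GPU fix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedConv2d(nn.Conv2d):
    """Hypothetical subclass: forward is defined on the class, not the instance."""

    def forward(self, x):
        y = F.conv2d(x, self.weight, self.bias, self.stride,
                     self.padding, self.dilation, self.groups)
        # Use self (the module actually being called), never a captured outer module.
        k = int(self.out_channels * self.rate)
        if k > 0:
            s = F.adaptive_avg_pool2d(torch.abs(x), (1, 1)).view(x.size(0), -1)
            g = F.relu(self.gate(s))
            i = (-g).topk(k, 1)[1]
            t = g.scatter(1, i, 0)
            t = t / torch.sum(t, dim=1).unsqueeze(1) * self.out_channels
            y = y * t.unsqueeze(2).unsqueeze(3)
        return y


def channel_pruning(m, rate=0.5):
    # rate stands in for args.rate from the original script (assumption).
    if type(m) is nn.Conv2d:
        m.rate = rate
        # Assigning an nn.Module attribute registers gate as a submodule.
        m.gate = nn.Linear(m.in_channels, m.out_channels, bias=True).to(m.weight.device)
        nn.init.constant_(m.gate.bias, 1)
        nn.init.kaiming_normal_(m.gate.weight)
        # Swap the class in place instead of storing a bound method on the instance.
        m.__class__ = GatedConv2d


# Small stand-in model for illustration; a real network such as a ResNet
# would be passed through model.apply in the same way.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 4, 3, padding=1),
)
model.apply(channel_pruning)
out = model(torch.randn(2, 3, 16, 16))
```

With forward looked up on the class, wrapping the result in nn.DataParallel(model) should let each replica run forward with its own copies of weight and gate, since gate is a registered submodule and is replicated together with the layer.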