nn.Module with multiple inputs

trypag · January 28, 2017, 4:39pm

Hey,
I am interested in building a network with multiple inputs. As I understand it, the forward function takes only one Variable as a parameter. I have two possible use cases here:

the same image at multiple resolutions is used
different images are used

I would like some advice on designing an nn.Module in the same fashion as AlexNet, for example.
I have no idea how to:

give multiple inputs to the nn.Module
join the fc layers together

I am following the ImageNet example, which looks like this:

import torch.nn as nn


class SimpleConv(nn.Module):
    def __init__(self, num_classes):
        super(SimpleConv, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x


def simple_conv(pretrained=False, num_classes=140):
    model = SimpleConv(num_classes)
    # if pretrained:
    #     model.load_state_dict(model_zoo.load_url(model_urls['alexnet']))
    return model

I hope it’s clear
Thanks

trypag · January 28, 2017, 4:53pm

Maybe I am mistaken, but I think the magic should happen in the forward call, where the input is a tensor, not a Variable as I was thinking?
For the second point, about merging the fc layers, I guess I should sum the layers' outputs into a final layer?

fmassa · January 28, 2017, 5:01pm

Hi,

You can pass multiple inputs to the forward call of the network. That is not a problem: just pass a Variable and you will be fine.
About merging the fc layers, you can do any operation you want, for example concatenating the outputs (via torch.cat([res1, res2], 1)), summing them, etc.
Here is a simplified example:

import torch
import torch.nn as nn


class SimpleConv(nn.Module):
    def __init__(self):
        super(SimpleConv, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, y):
        x1 = self.features(x)
        x2 = self.features(y)
        x = torch.cat((x1, x2), 1)
        return x
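
A minimal usage sketch for a two-input module like the one above. The class name `TwoInputConv` and the tensor sizes are made up for illustration. It assumes PyTorch 0.4 or later, where plain tensors take the place of Variable.

```python
import torch
import torch.nn as nn


class TwoInputConv(nn.Module):
    """Hypothetical name; same structure as the simplified example."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, y):
        # The same weights process both inputs.
        x1 = self.features(x)
        x2 = self.features(y)
        # Concatenate along the channel dimension (dim 1).
        return torch.cat((x1, x2), 1)


model = TwoInputConv()
x = torch.randn(4, 1, 8, 8)  # batch of 4 single-channel 8x8 images
y = torch.randn(4, 1, 8, 8)
out = model(x, y)  # positional arguments are passed straight to forward()
print(out.shape)   # one channel per input, so two channels in total
```

Calling `model(x, y)` rather than `model.forward(x, y)` keeps any registered hooks working.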

trypag · January 28, 2017, 5:05pm

oh nice !! Thanks @fmassa

Kalamaya · February 11, 2017, 10:35pm

Is there a way to put the x.view(x.size(0), 256 * 6 * 6) operation INSIDE the nn.Sequential? This has been bugging me for some time… thanks

apaszke · February 11, 2017, 11:35pm

No, there’s not. We don’t recommend that. Just write a custom container, with two sequential parts and a reshape in the middle. See torchvision models.

Russel_Russel · February 12, 2017, 7:36am

Would someone please explain what this function does? Does a pre-trained model mean that I can just use its weights right away without doing anything? Thanks for the help.

def simple_conv(pretrained=False, num_classes=140):
    model = SimpleConv(num_classes)
    # if pretrained:
    #     model.load_state_dict(model_zoo.load_url(model_urls['alexnet']))
    return model

apaszke · February 12, 2017, 1:49pm

Yes, pretrained models are ones that have been trained by someone earlier and that you can use in different applications.
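
To illustrate what the commented-out `load_state_dict` call would do, here is a small sketch. It uses an in-memory buffer as a stand-in for downloaded weights, and `make_model` is a hypothetical helper, not part of the thread's code. The idea is only that weights saved from one model instance can be copied into a fresh instance of the same architecture.

```python
import io

import torch
import torch.nn as nn


def make_model():
    # Hypothetical tiny architecture; any nn.Module works the same way.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


# Pretend this model has already been trained by someone else.
trained = make_model()

# Save its parameters (the state dict) to an in-memory buffer.
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# A fresh model starts with different random weights...
fresh = make_model()
# ...until the saved parameters are copied into it.
fresh.load_state_dict(torch.load(buffer))

x = torch.randn(3, 4)
with torch.no_grad():
    same = torch.allclose(trained(x), fresh(x))
print(same)
```

`load_state_dict` expects parameter names and shapes to match, so the architecture receiving the weights has to agree with the one that produced them.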

vodp · October 30, 2017, 8:11am

Regarding the second point of the question, concatenating features is not okay because the tensor’s size will grow, so it will no longer be compatible with the classifier.
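
A small sketch of the size issue described here, with made-up feature sizes. It suggests one possible workaround that is not stated in the thread: size the classifier's first layer for the concatenated width. Summing keeps the original width instead.

```python
import torch
import torch.nn as nn

feat_dim, num_classes = 16, 10  # illustrative sizes, not from the thread
res1 = torch.randn(4, feat_dim)  # features from the first input
res2 = torch.randn(4, feat_dim)  # features from the second input

merged_cat = torch.cat([res1, res2], 1)  # width grows to 2 * feat_dim
merged_sum = res1 + res2                 # width stays at feat_dim

# The classifier's in_features has to match whichever merge is used.
classifier_for_cat = nn.Linear(2 * feat_dim, num_classes)
classifier_for_sum = nn.Linear(feat_dim, num_classes)

print(merged_cat.shape, classifier_for_cat(merged_cat).shape)
print(merged_sum.shape, classifier_for_sum(merged_sum).shape)
```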

arcticriki · August 22, 2018, 3:18pm

What do you mean by:

…just pass a Variable and you will be fine.

? Can you show an example of usage in the train loop? Thanks a lot
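
A hedged sketch of what a training loop with a two-input module might look like. The model `TwoInputNet`, the synthetic batches, and the hyperparameters are all made up for illustration. It assumes PyTorch 0.4 or later, where tensors are passed directly instead of Variables.

```python
import torch
import torch.nn as nn


class TwoInputNet(nn.Module):
    """Hypothetical two-input classifier with a shared feature extractor."""

    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (N, 4, 1, 1)
        )
        # Two concatenated feature vectors of width 4 each.
        self.classifier = nn.Linear(2 * 4, num_classes)

    def forward(self, x, y):
        fx = self.features(x).flatten(1)
        fy = self.features(y).flatten(1)
        return self.classifier(torch.cat((fx, fy), dim=1))


torch.manual_seed(0)
model = TwoInputNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Synthetic stand-in for a DataLoader yielding (input_a, input_b, target).
batches = [
    (torch.randn(8, 1, 16, 16), torch.randn(8, 1, 16, 16), torch.randint(0, 3, (8,)))
    for _ in range(5)
]

losses = []
for x, y, target in batches:
    optimizer.zero_grad()
    output = model(x, y)  # both inputs go to forward() as positional arguments
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

With a real dataset, the `Dataset` would return both inputs and the target per item, and the loop body would stay the same.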

Joy_Chopra · October 17, 2018, 4:20am

But changing the definition of forward would constrain you, as you might not be able to use functions that assume forward takes one input, such as those in the tensorboardX library.
Maybe you could just pass a dictionary of your inputs.
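
A minimal sketch of the dictionary idea. The class name `DictInputConv` and the keys `"image_a"` and `"image_b"` are hypothetical. Whether a given third-party tool accepts a dictionary input is not something this sketch verifies.

```python
import torch
import torch.nn as nn


class DictInputConv(nn.Module):
    """Hypothetical module whose forward() takes a single dict argument."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
        )

    def forward(self, inputs):
        # `inputs` maps hypothetical names to tensors of equal spatial size.
        x1 = self.features(inputs["image_a"])
        x2 = self.features(inputs["image_b"])
        return torch.cat((x1, x2), 1)


model = DictInputConv()
batch = {
    "image_a": torch.randn(2, 1, 8, 8),
    "image_b": torch.randn(2, 1, 8, 8),
}
out = model(batch)  # forward() still receives exactly one argument
print(out.shape)
```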

lucf · April 7, 2019, 12:33pm

Hi there,

Sorry for jumping in, but I am also trying to write modules that can accept multiple inputs. Since I want them to be flexible with regard to the number of inputs, I tried passing a list of Variables() to the forward() method. Here is an example among other similar modules:

class Addition(Aggregation):
    """
    Add two input tensors, return a single output tensor of same dimensions. If input and output have different sizes,
    use largest in each dimension and zero-pad or interpolate (spatial dimensions), or convolve with a 1x1 filter
    (number of channels)
    """
    def __init__(self, in_channels: list, pad_or_interpolate: str = 'pad', pad_mode: str = 'replicate',
                 interpolate_mode: str = 'nearest'):

        assert pad_or_interpolate in ['pad', 'interpolate'], \
            "Error: Unknown value for `pad_or_interpolate` {}".format(pad_or_interpolate)

        super(Addition, self).__init__()
        self.ch_align = ChannelAlignment(in_channels)  # use 1x1 convolution to align n_channels

        if pad_or_interpolate == 'pad':
            self.sz_align = partial(self.align_sizes_pad, mode=pad_mode)
        else: 
            self.sz_align = partial(self.align_sizes_interpolate, mode=interpolate_mode)

    def forward(self, inputs: list):
        """
        Performs element-wise sum of inputs. If they have different dimensions, they are first adjusted to
        common dimensions by 1/ padding or interpolation (h and w axes) and/or 2/ 1x1 convolution.
        :param inputs: List of torch input tensors of dimensions (N, C_i, H_i, W_i)
        :return: A single torch Tensor of dimensions (N, max(C_i), max(H_i), max(W_i)), containing the element-
            wise sum of the input tensors (or their size-adjusted variants)
        """
        inputs = self.sz_align(inputs)  # Perform size alignment
        inputs = self.ch_align(inputs)  # Perform channel alignment
        stacked = torch.stack(inputs, dim=4)  # stack inputs along an extra axis (will be removed when summing up)

        return torch.sum(stacked, dim=4)  # summing over the extra axis removes it

However I am getting weird errors:

Models using these modules do not train if they are more than a few layers deep (accuracy does not increase and loss is “infinite”)
Sometimes they crash and I get error messages such as

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

or sometimes:

  File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered

I am not sure the issue comes from the fact that I am passing lists to forward(), but the reason I suspect this is that when I try viewing my models with pytorch-summary, I get the following message:

  File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torchsummary\torchsummary.py", line 19, in hook
    summary[m_key]["input_shape"] = list(input[0].size())
AttributeError: 'list' object has no attribute 'size'

(even though testing the forward pass with a simple tensor returns no error).

I am trying to generate CNNs automatically so it has a lot of boilerplate code which makes it difficult for me to provide a simple reproducible example, but I hope you can assist!

Many thanks
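
A small sketch that may help narrow down the pytorch-summary message. A forward hook receives the positional arguments of forward() as a tuple, so when a list is passed, `input[0]` is that list rather than a tensor. The module names `SumList` and `SumArgs` are hypothetical. The sketch says nothing about the CUDA errors, which could have a separate cause.

```python
import torch
import torch.nn as nn


class SumList(nn.Module):
    """Hypothetical module taking one list argument."""

    def forward(self, inputs):
        return torch.stack(inputs, dim=0).sum(dim=0)


class SumArgs(nn.Module):
    """Hypothetical variant taking the tensors as separate arguments."""

    def forward(self, *inputs):
        return torch.stack(inputs, dim=0).sum(dim=0)


seen = []


def hook(module, input, output):
    # `input` is the tuple of positional arguments given to forward().
    seen.append(type(input[0]).__name__)


a, b = torch.ones(2, 3), torch.ones(2, 3)

for layer, call in ((SumList(), lambda m: m([a, b])), (SumArgs(), lambda m: m(a, b))):
    handle = layer.register_forward_hook(hook)
    out = call(layer)
    handle.remove()

print(seen)  # first entry comes from SumList, second from SumArgs
```

With the variadic form, a hook that calls `input[0].size()` finds a tensor, so tools written around tensor inputs have a better chance of working.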