nn.Module with multiple inputs

Hi there,

Sorry for jumping in, but I am also trying to write modules that accept multiple inputs. Since I want them to be flexible with regard to the number of inputs, I tried passing a list of Variables (tensors) to the forward() method. Here is an example, one of several similar modules:

from functools import partial

import torch

# Aggregation and ChannelAlignment are other modules defined elsewhere in my project
class Addition(Aggregation):
    """
    Add the input tensors and return a single output tensor of the same dimensions. If the inputs have
    different sizes, use the largest size in each dimension and zero-pad or interpolate (spatial dimensions),
    or convolve with a 1x1 filter (number of channels).
    """
    def __init__(self, in_channels: list, pad_or_interpolate: str = 'pad', pad_mode: str = 'replicate',
                 interpolate_mode: str = 'nearest'):

        assert pad_or_interpolate in ['pad', 'interpolate'], \
            "Error: Unknown value for `pad_or_interpolate` {}".format(pad_or_interpolate)

        super(Addition, self).__init__()
        self.ch_align = ChannelAlignment(in_channels)  # use 1x1 convolution to align n_channels

        if pad_or_interpolate == 'pad':
            self.sz_align = partial(self.align_sizes_pad, mode=pad_mode)
        else:
            self.sz_align = partial(self.align_sizes_interpolate, mode=interpolate_mode)
    def forward(self, inputs: list):
        """
        Performs element-wise sum of inputs. If they have different dimensions, they are first adjusted to
        common dimensions by 1/ padding or interpolation (h and w axes) and/or 2/ 1x1 convolution.
        :param inputs: List of torch input tensors of dimensions (N, C_i, H_i, W_i)
        :return: A single torch Tensor of dimensions (N, max(C_i), max(H_i), max(W_i)), containing the element-
            wise sum of the input tensors (or their size-adjusted variants)
        """
        inputs = self.sz_align(inputs)  # Perform size alignment
        inputs = self.ch_align(inputs)  # Perform channel alignment
        stacked = torch.stack(inputs, dim=4)  # stack inputs along an extra axis (will be removed when summing up)
            
        return torch.sum(stacked, 4, keepdim=True).squeeze(4)
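
For context, this is roughly how I call such a module; the channel counts and shapes below are made up purely for illustration (in my real code the inputs come from other layers of the generated network):

add = Addition(in_channels=[16, 32, 64])
xs = [torch.randn(8, 16, 32, 32),
      torch.randn(8, 32, 16, 16),
      torch.randn(8, 64, 32, 32)]
out = add(xs)  # the whole list is passed to forward() as a single argument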

However, I am getting weird errors:

  • Models using these modules do not train if they are more than a few layers deep (accuracy does not increase and loss is “infinite”)
  • Sometimes they crash and I get error messages such as
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

or sometimes:

  File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered

I am not sure the issue comes from passing lists to forward(), but I suspect it does because, when I try to view my models with pytorch-summary, I get the following message:

  File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torchsummary\torchsummary.py", line 19, in hook
    summary[m_key]["input_shape"] = list(input[0].size())
AttributeError: 'list' object has no attribute 'size'

(even though testing the forward pass with a simple tensor raises no error).
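
If it helps, here is my understanding of why the torchsummary hook chokes on the list: the hook receives the tuple of positional arguments passed to forward(), so when forward() takes a list, input[0] is the list itself rather than a tensor. A minimal sketch (Dummy is just a throwaway module for illustration):

import torch
import torch.nn as nn

class Dummy(nn.Module):
    def forward(self, inputs: list):
        return sum(inputs)

def hook(module, input, output):
    # `input` is the tuple of positional arguments passed to forward(),
    # so input[0] is the whole list of tensors, not a tensor
    print(type(input[0]))      # <class 'list'>
    print(input[0][0].size())  # this works
    # input[0].size()          # AttributeError: 'list' object has no attribute 'size'

m = Dummy()
m.register_forward_hook(hook)
m([torch.randn(1, 3, 8, 8), torch.randn(1, 3, 8, 8)])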

I am trying to generate CNNs automatically, so my code contains a lot of boilerplate, which makes it difficult to provide a simple reproducible example, but I hope you can assist!

Many thanks