Hi there,
Sorry for jumping in, but I am also trying to write modules that can accept multiple inputs. Since I want them to be flexible in the number of inputs they take, I tried passing a list of tensors to the forward() method. Here is one example among several similar modules:
import torch
from functools import partial

# Aggregation and ChannelAlignment are other modules of mine

class Addition(Aggregation):
    """
    Add the input tensors and return a single output tensor of the same dimensions. If the inputs have
    different sizes, use the largest in each dimension and zero-pad or interpolate (spatial dimensions),
    or convolve with a 1x1 filter (number of channels).
    """
    def __init__(self, in_channels: list, pad_or_interpolate: str = 'pad', pad_mode: str = 'replicate',
                 interpolate_mode: str = 'nearest'):
        assert pad_or_interpolate in ['pad', 'interpolate'], \
            "Error: Unknown value for `pad_or_interpolate` {}".format(pad_or_interpolate)
        super(Addition, self).__init__()
        self.ch_align = ChannelAlignment(in_channels)  # use 1x1 convolution to align n_channels
        if pad_or_interpolate == 'pad':
            self.sz_align = partial(self.align_sizes_pad, mode=pad_mode)
        else:
            self.sz_align = partial(self.align_sizes_interpolate, mode=interpolate_mode)

    def forward(self, inputs: list):
        """
        Performs an element-wise sum of the inputs. If they have different dimensions, they are first
        adjusted to common dimensions by 1/ padding or interpolation (h and w axes) and/or 2/ 1x1 convolution.
        :param inputs: List of torch input tensors of dimensions (N, C_i, H_i, W_i)
        :return: A single torch Tensor of dimensions (N, max(C_i), max(H_i), max(W_i)), containing the
            element-wise sum of the input tensors (or their size-adjusted variants)
        """
        inputs = self.sz_align(inputs)  # perform size alignment
        inputs = self.ch_align(inputs)  # perform channel alignment
        stacked = torch.stack(inputs, dim=4)  # stack inputs along an extra axis
        return torch.sum(stacked, dim=4)  # sum over (and thereby remove) the extra axis
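For what it's worth, the stack-and-sum step works in isolation when the inputs already share a shape — here is the minimal check I ran (the tensor sizes are made up for illustration):

```python
import torch

# Two same-sized dummy inputs
a = torch.ones(2, 3, 4, 4)
b = torch.ones(2, 3, 4, 4) * 2

stacked = torch.stack([a, b], dim=4)  # shape (2, 3, 4, 4, 2)
summed = torch.sum(stacked, dim=4)    # back to (2, 3, 4, 4)
print(summed.shape)                   # torch.Size([2, 3, 4, 4])
print(summed[0, 0, 0, 0].item())      # 3.0
```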
However, I am getting weird errors:
- Models using these modules do not train when they are more than a few layers deep (the accuracy does not increase and the loss is “infinite”)
- Sometimes they crash and I get error messages such as
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
or sometimes:
File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered
I am not sure the issue comes from passing lists to forward(), but I suspect it because when I try to view my models with pytorch-summary, I get the following message:
File "C:\Users\Luc\Miniconda3\envs\pytorch\lib\site-packages\torchsummary\torchsummary.py", line 19, in hook
summary[m_key]["input_shape"] = list(input[0].size())
AttributeError: 'list' object has no attribute 'size'
(even though testing the forward pass with a simple tensor returns no error).
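If it helps narrow things down, I believe the torchsummary message simply reflects how forward hooks wrap arguments: `input` inside a hook is a tuple of the positional arguments passed to forward(), so `input[0]` is my whole list, not a tensor, and a list has no `.size()`. A small sketch to illustrate (ListModule is a made-up minimal module):

```python
import torch
import torch.nn as nn

class ListModule(nn.Module):
    def forward(self, inputs: list):
        return sum(inputs)

seen = {}

def hook(module, input, output):
    # `input` is the tuple of positional args to forward();
    # here input[0] is the *list* itself, not a tensor
    seen['type'] = type(input[0])

m = ListModule()
m.register_forward_hook(hook)
m([torch.ones(1), torch.ones(1)])
print(seen['type'])  # <class 'list'>
```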
I am trying to generate CNNs automatically, so my code contains a lot of boilerplate, which makes it difficult for me to provide a simple reproducible example, but I hope you can assist!
Many thanks