Conv2d Argument Error For Custom Dataset

My task is to create a CNN model that takes in an (1) image, (2) robot path, and (3) robot action, and outputs a (4) reward value.

(1), (2), (3) are all 2D binary matrices which are output from a simulator I made using Python.

E.g. of a data point containing (1), (2), (3) and (4):

[ [ [image], [robot_path], [robot_action] ] , reward_value ]

image is a list of 3 binary matrices = torch.Size([5, 5, 3])
robot_path is a single binary matrix = torch.Size([5, 5])
robot_actions is a list of 3 binary matrices = torch.Size([5, 5, 3])
reward_value is an integer.

I created a custom Dataset class and also the DataLoader.
Then I followed the code given here: Training a Classifier — PyTorch Tutorials 1.9.0+cu102 documentation

This is the output when I print a single data point from my Dataset object:

[[[tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]]), tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]]), tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]])], tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]]), [tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]]), tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]]), tensor([[1, 1, 1, 1, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1]])]], 7]

However, I get the following error when running the code:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_78598/322194925.py in <module>
     12 
     13         # forward + backward + optimize
---> 14         outputs = net(inputs)
     15 #         loss = criterion(outputs, labels)
     16 #         loss.backward()

~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/tmp/ipykernel_78598/3992303050.py in forward(self, x)
     13 
     14     def forward(self, x):
---> 15         x = self.pool(F.relu(self.conv1(x)))
     16         x = self.pool(F.relu(self.conv2(x)))
     17         x = torch.flatten(x, 1) # flatten all dimensions except batch

~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py in forward(self, input)
    441 
    442     def forward(self, input: Tensor) -> Tensor:
--> 443         return self._conv_forward(input, self.weight, self.bias)
    444 
    445 class Conv3d(_ConvNd):

~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    437                             weight, bias, self.stride,
    438                             _pair(0), self.dilation, self.groups)
--> 439         return F.conv2d(input, weight, bias, self.stride,
    440                         self.padding, self.dilation, self.groups)
    441 

TypeError: conv2d() received an invalid combination of arguments - got (list, Parameter, Parameter, tuple, tuple, tuple, int), but expected one of:
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!list!, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!list!, !Parameter!, !Parameter!, !tuple!, !tuple!, !tuple!, int)

I did realize that the tutorial's Conv2d takes images with 3 channels, whereas my input has 7 channels in total, but changing that still did not fix the error.

What can I do about this?

Hi Kavi!

I don’t know whether your specific use case makes sense, but to
answer your immediate question:

Let image be a three-channel image (say RGB) of shape
[height = 5, width = 5, nChannel = 3];
let robot_path be a single-channel image (with no explicit nChannel
dimension) of shape [height = 5, width = 5];
and let robot_actions be a three-channel image of shape
[height = 5, width = 5, nChannel = 3].

We will unsqueeze() robot_path to give it an explicit nChannel
dimension of size nChannel = 1, and then cat() the three images
together to produce a single seven-channel image, input, of shape
[height = 5, width = 5, nChannel = 7].

We then permute input's dimensions so that nChannel comes
before height and width and unsqueeze() it so that it has a leading
nBatch dimension of size nBatch = 1. We do this because Conv2d
requires an nBatch dimension and also requires that the nChannel
dimension precede the height and width dimensions.

You would now want your convolution neural network (CNN) to start
with a Conv2d layer that has in_channels = 7.

Here is an example that shows how to cat() your input images
together and pass them to an appropriate Conv2d. (The rest of
the CNN is left as an exercise for the reader.)

>>> import torch
>>> torch.__version__
'1.9.0'
>>> image = torch.randn (5, 5, 3)
>>> robot_path = torch.randn (5, 5)
>>> robot_actions = torch.randn (5, 5, 3)
>>> image.shape
torch.Size([5, 5, 3])
>>> robot_path.shape
torch.Size([5, 5])
>>> robot_actions.shape
torch.Size([5, 5, 3])
>>> input = torch.cat ([image, robot_path.unsqueeze (-1), robot_actions], dim = -1)
>>> input.shape
torch.Size([5, 5, 7])
>>> input = input.permute (2, 0, 1).unsqueeze (0)
>>> input.shape
torch.Size([1, 7, 5, 5])
>>> conv = torch.nn.Conv2d (in_channels = 7, out_channels = 2, kernel_size = 3)
>>> conv (input).shape
torch.Size([1, 2, 3, 3])
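Note that the error itself says conv2d() received a list rather than a Tensor: your Dataset's __getitem__ is returning nested Python lists, so the DataLoader can't collate them into batched tensors. A minimal sketch (the class name and data layout below are my assumptions, based on the printed data point in your post) of doing the stack/cat inside __getitem__, so the DataLoader yields a single seven-channel tensor per sample:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RobotRewardDataset(Dataset):  # hypothetical name
    def __init__(self, samples):
        # samples: list of [[image, robot_path, robot_actions], reward_value],
        # matching the data point printed above
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        (image, robot_path, robot_actions), reward = self.samples[idx]
        # stack()-ing a list of [5, 5] tensors puts the channel dim first,
        # so no permute() is needed here
        image = torch.stack(image)                  # [3, 5, 5]
        robot_path = robot_path.unsqueeze(0)        # [1, 5, 5]
        robot_actions = torch.stack(robot_actions)  # [3, 5, 5]
        x = torch.cat([image, robot_path, robot_actions], dim=0)  # [7, 5, 5]
        return x.float(), torch.tensor(reward, dtype=torch.float32)

# toy data in the same nested-list layout as the printed data point
mat = torch.ones(5, 5, dtype=torch.long)
sample = [[[mat, mat, mat], mat, [mat, mat, mat]], 7]
loader = DataLoader(RobotRewardDataset([sample] * 4), batch_size=4)
x, y = next(iter(loader))
print(x.shape)  # torch.Size([4, 7, 5, 5]) -- ready for Conv2d(in_channels=7, ...)
```

Because __getitem__ now returns tensors, the DataLoader's default collation produces a batched [nBatch, 7, 5, 5] input that Conv2d accepts directly.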

Best.

K. Frank

Thanks Frank, I solved this!

How did you solve it? I'm having a similar problem.