RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x13056 and 153600x2048)

You have applied self.pool twice, which halves the spatial size each time:

# x = 10x1x28x28
x = self.pool(F.relu(self.conv1(x))) # x = 10x6x12x12
x = self.pool(F.relu(self.conv2(x))) # x = 10x16x4x4
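A quick way to find the right in_features for the following linear layer is to print the activation shape right before flattening. Here is a minimal sketch assuming conv1 = nn.Conv2d(1, 6, 5) and conv2 = nn.Conv2d(6, 16, 5), which reproduce the commented shapes:

import torch
import torch.nn as nn
import torch.nn.functional as F

pool = nn.MaxPool2d(2, 2)
conv1 = nn.Conv2d(1, 6, 5)  # assumed definitions, adapt them to your model
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(10, 1, 28, 28)
x = pool(F.relu(conv1(x)))          # [10, 6, 12, 12]
x = pool(F.relu(conv2(x)))          # [10, 16, 4, 4]
print(x.view(x.size(0), -1).shape)  # [10, 256] -> use 256 as in_features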

Thanks!! Somehow I managed to miss that… :confused:

After fixing the if condition, the code runs fine and I don’t get any issues.
Based on your initial error I guess the shape mismatch is raised in a linear layer which is not used in your model.
If you want to use self.linear, uncomment it again, set in_features to 65536, and use it in the forward method.
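For reference, the 65536 comes from flattening the final activation: with a 256x256 input self.end outputs a single channel at full resolution, so a sketch of the flattening step looks like this:

x = self.end(x)            # [N, 1, 256, 256]
x = x.view(x.size(0), -1)  # [N, 1 * 256 * 256] = [N, 65536]
x = self.linear(x)         # needs nn.Linear(65536, 10)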


Thanks a lot, but I still get the same error with the same numbers when training :confused:

If it’s any help, here is the output of print(UNet):

UNet(
  (encoder): ModuleList(
    (0): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(3, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(112, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
    (1): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(112, 224, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(224, 224, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
    (2): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(224, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
  )
  (bottom): DoubleConv2d(
    (stack): Sequential(
      (0): Conv2d(448, 896, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU()
      (2): Conv2d(896, 896, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU()
    )
  )
  (decoder): ModuleList(
    (0): ConvTranspose2d(896, 448, kernel_size=(2, 2), stride=(2, 2))
    (1): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(896, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
    (2): ConvTranspose2d(448, 224, kernel_size=(2, 2), stride=(2, 2))
    (3): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(448, 224, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(224, 224, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
    (4): ConvTranspose2d(224, 112, kernel_size=(2, 2), stride=(2, 2))
    (5): DoubleConv2d(
      (stack): Sequential(
        (0): Conv2d(224, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(112, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
      )
    )
  )
  (end): Conv2d(112, 1, kernel_size=(1, 1), stride=(1, 1))
  (maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (linear): Linear(in_features=65536, out_features=10, bias=True)
)

This code works for me:

import torch
import torch.nn as nn


class DoubleConv2d(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(DoubleConv2d, self).__init__()

        self.stack = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, 3, 1, 1),
            nn.ReLU()
        )


    def forward(self, x):
        return self.stack(x)


class UNet(nn.Module):
    def __init__(self):
        super(UNet, self).__init__()
        
        inn = 3
        out = 1
        mid = [112, 224, 448]

        self.encoder = nn.ModuleList()
        self.bottom  = DoubleConv2d(mid[-1], 2*mid[-1]) # doubles the channels so the first ConvTranspose2d receives 2*mid[-1] inputs
        self.decoder = nn.ModuleList()
        self.end     = nn.Conv2d(mid[0], 1*out, 1)

        self.maxpool = nn.MaxPool2d(2, 2)
        self.linear  = nn.Linear(65536, 10)

        for dim in mid:
            self.encoder.append(DoubleConv2d(inn, dim))
            inn = dim

        for dim in mid[::-1]:
            self.decoder.append(nn.ConvTranspose2d(2*dim, dim, 2, 2))
            self.decoder.append(DoubleConv2d(2*dim, dim))


    def forward(self, x):
        connections = []

        for i in range(len(self.encoder)):
            module = self.encoder[i]

            x = module(x)
            connections.append(x)
            x = self.maxpool(x)

        x = self.bottom(x)

        for i in range(len(self.decoder)):
            module = self.decoder[i]
            x = module(x)

            if i % 2 == 0: # ConvTranspose2d
                connection = connections.pop()
                x = torch.cat((connection, x), dim=1)
        
        x = self.end(x)

        x = x.view(x.size(0), -1)
        x = self.linear(x)

        return x

model = UNet()
x = torch.randn(2, 3, 256, 256)
out = model(x) # out.shape == torch.Size([2, 10])

If you get stuck, please post a new minimal, executable code snippet showing the error.


This is exactly the code that I have now, but I just realized I had passed another model to the training function :man_facepalming:

I got a few errors I could fix, but now I’m stuck on this one that I moved to a new topic: Target size (torch.Size([3, 3, 256, 256])) must be the same as input size (torch.Size([3, 65536]))

Thank you so much for the help though!

I am getting a similar error. The input dimension is 6, and here is the observation space:

low = np.array([2, 0, 0, 0, 3, 2]).astype(np.float32)
high = np.array([3, 10, 10, 1, 30, 20]).astype(np.float32)
self.observation_space = spaces.Box(low, high)

The following is the code for the model:

import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import torch as T
import os
import torch.optim as optim


EPS = 0.003

def fanin_init(size, fanin=None):
	fanin = fanin or size[0]
	v = 1. / np.sqrt(fanin)
	return torch.Tensor(size).uniform_(-v, v)

class Critic(nn.Module):

	def __init__(self, state_dim, action_dim):
		"""
		:param state_dim: Dimension of input state (int)
		:param action_dim: Dimension of input action (int)
		:return:
		"""
		super(Critic, self).__init__()

		self.state_dim = state_dim
		self.action_dim = action_dim

		self.fcs1 = nn.Linear(state_dim,256)
		self.fcs1.weight.data = fanin_init(self.fcs1.weight.data.size())
		self.fcs2 = nn.Linear(256,128)
		self.fcs2.weight.data = fanin_init(self.fcs2.weight.data.size())

		self.fca1 = nn.Linear(action_dim,128)
		self.fca1.weight.data = fanin_init(self.fca1.weight.data.size())

		self.fc2 = nn.Linear(256,128)
		self.fc2.weight.data = fanin_init(self.fc2.weight.data.size())

		self.fc3 = nn.Linear(128,1)
		self.fc3.weight.data.uniform_(-EPS,EPS)

	def forward(self, state, action):
		"""
		returns Value function Q(s,a) obtained from critic network
		:param state: Input state (Torch Variable : [n,state_dim] )
		:param action: Input Action (Torch Variable : [n,action_dim] )
		:return: Value function : Q(S,a) (Torch Variable : [n,1] )
		"""
		s1 = F.relu(self.fcs1(state))
		s2 = F.relu(self.fcs2(s1))
		a1 = F.relu(self.fca1(action))
		x = torch.cat((s2,a1),dim=1)

		x = F.relu(self.fc2(x))
		x = self.fc3(x)

		return x


class Actor(nn.Module):

	def __init__(self, state_dim, action_dim, action_lim):
		"""
		:param state_dim: Dimension of input state (int)
		:param action_dim: Dimension of output action (int)
		:param action_lim: Used to limit action in [-action_lim,action_lim]
		:return:
		"""
		super(Actor, self).__init__()

		self.state_dim = state_dim
		self.action_dim = action_dim
		self.action_lim = action_lim

		self.fc1 = nn.Linear(state_dim,256)
		self.fc1.weight.data = fanin_init(self.fc1.weight.data.size())

		self.fc2 = nn.Linear(256,128)
		self.fc2.weight.data = fanin_init(self.fc2.weight.data.size())

		self.fc3 = nn.Linear(128,64)
		self.fc3.weight.data = fanin_init(self.fc3.weight.data.size())

		self.fc4 = nn.Linear(64,action_dim)
		self.fc4.weight.data.uniform_(-EPS,EPS)

	def forward(self, obs):
		"""
		returns policy function Pi(s) obtained from actor network
		this function is a gaussian prob distribution for all actions
		with mean lying in (-1,1) and sigma lying in (0,1)
		The sampled action can , then later be rescaled
		:param state: Input state (Torch Variable : [n,state_dim] )
		:return: Output action (Torch Variable: [n,action_dim] )
		"""


		# state = T.tensor(obs, dtype=T.float)
		x = F.relu(self.fc1(obs))
		x = F.relu(self.fc2(x))
		x = F.relu(self.fc3(x))
		action = F.tanh(self.fc4(x))

		# action = action * self.action_lim

		return action




This is the error I get:

  File "c:\Users\MY\Documents\Slicing\envs\model.py", line 102, in forward
    x = F.relu(self.fc1(obs))
  File "C:\Users\MY\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl     
    return forward_call(*input, **kwargs)
  File "C:\Users\MY\anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\Users\MY\anaconda3\lib\site-packages\torch\nn\functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x0 and 6x256)

It seems the input activation is empty as it has a shape of [1, 0] while the linear layer expects 6 input features. Check the shape of all intermediate activations and make sure they are not pooled or reduced to an empty tensor.
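One way to do that is to print the input shape of every linear layer via forward pre-hooks. A minimal sketch, assuming the Actor posted above (the action_dim and action_lim values are placeholders):

import torch
import torch.nn as nn

actor = Actor(state_dim=6, action_dim=2, action_lim=1.0)  # placeholder arguments

# print the input shape of each linear layer during the forward pass
for name, module in actor.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_pre_hook(
            lambda mod, inp, name=name: print(name, inp[0].shape))

obs = torch.randn(1, 6)  # expected shape: [n, state_dim]
actor(obs)               # fc1 torch.Size([1, 6]), fc2 torch.Size([1, 256]), ...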

What is meant by “input activation is empty”? I am new to this, so maybe I don’t understand some terms.
This is the observation space, which has a dimension of 6:

low = np.array([2, 0, 0, 0, 3, 2]).astype(np.float32)
high = np.array([3, 10, 10, 1, 30, 20]).astype(np.float32)
self.observation_space = spaces.Box(low, high)

The error message shows that the input activation is a tensor with a shape of [1, 0] which does not contain any values and is thus empty.
Here is a small code snippet to reproduce the error:

import torch
import torch.nn as nn

x = torch.randn(1, 0)
print(x.shape)
# torch.Size([1, 0])
print(x) # empty tensor, i.e. no values stored
# tensor([], size=(1, 0))

lin = nn.Linear(6, 256)

out = lin(x)
# RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x0 and 6x256)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (25x256 and 64x4)

Could anyone please help me with these errors?

Based on the error message I would guess that the first linear layer is raising it; it seems c might be set to 64 while the input has a shape of [batch_size=25, features=256].
Make sure the input has 64 features or set c to 256.
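Here is a minimal reproduction of that mismatch, assuming the first linear layer of the SE-style block is nn.Linear(c, c // r) with c=64 and r=16:

import torch
import torch.nn as nn

x = torch.randn(25, 256)         # flattened activation: [batch_size=25, features=256]

lin = nn.Linear(64, 4)           # created with c=64, r=16
# lin(x)  # RuntimeError: mat1 and mat2 shapes cannot be multiplied (25x256 and 64x4)

lin = nn.Linear(256, 256 // 16)  # setting c=256 makes the shapes match
print(lin(x).shape)              # torch.Size([25, 16])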

PS: you can post code snippets by wrapping them into three backticks ``` :wink:


Thanks a lot for the reply, but I am new to DL and having difficulties debugging the code. I am trying to build a few-shot algorithm using prototypical networks with a ResNet-50 backbone combined with SENet. I am attaching a few more code snippets. I would be greatly obliged if you could take a look and suggest some changes. Thanks

    "credits: https://github.com/moskomule/senet.pytorch/blob/master/senet/se_module.py#L4"
    def __init__(self, c, r=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excitation = nn.Sequential(
            nn.Linear(c, c // r, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(c // r, c, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        bs, c, _, _ = x.shape
        y = self.squeeze(x).view(bs, c)
        y = self.excitation(y).view(bs, c, 1, 1)
        return x * y.expand_as(x)

The code above is for the Squeeze-and-Excitation block.

```
class SEBottleneck(nn.Module):
    # Bottleneck in torchvision places the stride for downsampling at 3x3 convolution(self.conv2)
    # while original implementation places the stride at the first 1x1 convolution(self.conv1)
    # according to "Deep residual learning for image recognition"https://arxiv.org/abs/1512.03385.
    # This variant is also known as ResNet V1.5 and improves accuracy according to
    # https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch.

    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                 base_width=64, dilation=1, norm_layer=None, r=16):
        super(SEBottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride
        # Add SE block
        self.se = SE_Block(planes, r)

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)
        # Add SE operation
        out = self.se(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
```
This is the ResNet-50 bottleneck with one extra line of code, i.e. **out = self.se(out)**.
The code works fine without the Squeeze-and-Excitation operation, i.e. when I comment out the **out = self.se(out)** line.
But when it is included, it gives this error:
```RuntimeError: mat1 and mat2 shapes cannot be multiplied (25x256 and 64x4)```
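Judging from the posted code, a likely cause (sketch of a possible fix, not a confirmed one): self.conv3 outputs planes * self.expansion channels, but the SE block is created with c=planes, so its first linear layer expects 64 features while the pooled activation has 256. Creating the block with the expanded channel count should make the shapes match:

# in SEBottleneck.__init__, create the SE block with the channel count
# that self.conv3 actually outputs (planes * expansion = 64 * 4 = 256 here)
self.se = SE_Block(planes * self.expansion, r)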

I am also facing a similar issue:

import numpy as np
import torch as th
from torch import nn as nn
import torch.nn.functional as F
from torch import tensor
from stable_baselines3.common.vec_env import VecTransposeImage


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.features_extractor = nn.Flatten(start_dim=1, end_dim=-1)
        self.mlp_extractor = nn.Sequential(
            nn.Linear(in_features=9, out_features=64, bias=True),
            nn.Tanh(),
            nn.Linear(in_features=64, out_features=64, bias=True),
            nn.Tanh()
        )
        self.action_net = nn.Sequential(
            nn.Linear(in_features=64, out_features=9, bias=True),
        )
    
    def forward(self, x):
        x = self.features_extractor(x)
        x = self.mlp_extractor(x)
        x = self.action_net(x)
        x = x.argmax()

        return x


def getMove(obs):
    model = Net()
    model = model.float()
    model.load_state_dict(state_dict)
    model = model.to('cpu')
    model = model.eval()
    obs = th.as_tensor(obs).to('cpu').float()
    obs = obs.unsqueeze(1)
    action = model(obs)

    return action

and I am getting a shape mismatch error for the line x = self.mlp_extractor(x).
How can I fix it?

Your input seems to have a shape of [batch_size=9, features=1] while features=9 is expected in the first linear layer.
Either make sure the input has 9 features or set in_features of the first linear layer to 1.
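A minimal sketch of the difference, assuming obs is a flat observation with 9 entries as in the getMove function above:

import torch as th

obs = th.zeros(9)              # flat observation with 9 entries
print(obs.unsqueeze(1).shape)  # torch.Size([9, 1]) -> fails in nn.Linear(9, 64)
print(obs.unsqueeze(0).shape)  # torch.Size([1, 9]) -> matches in_features=9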

but my model policy is the model definition I shared above.

I am training a tic-tac-toe AI

Yes, I know as you’ve already shared the model definition.
In that case your input is wrong and I assume you might need to permute it.

How can I do that? I am a total beginner.

x = x.t() should work, assuming the dimensions are just permuted.
Since I’m not familiar with your use case, I would still recommend checking the root cause of this issue and narrowing down why the shape is wrong at all.

But during the training period it was the same, so what can be the cause?