During training, at the step where the inputs are passed to the model…
outputs = model(inputs)
… the forward pass fails with an "invalid shape for input of size …" error at the following line:
x, _ = self.lstm(x)
My forward code is:
def forward(self, x):
    # one Conv3d -> BatchNorm3d -> MaxPool3d block (applied for each of the 3 blocks)
    x = pool(F.relu(bn(conv(x))))
    x = x.view(x.size(0), -1, x.size(2) * x.size(3) * x.size(4))
    x, _ = self.lstm(x)
    for fc in self.fcs[:-1]:
        x = F.relu(fc(x))
    x = self.fcs[-1](x)
    return x
Each input sample is 4-dimensional, since it is an RGB video clip (channels, frames, height, width). Each label is 2-dimensional: one set of features per frame of the video.
Inputs shape: torch.Size([1, 3, 10, 256, 360])
Labels shape: torch.Size([1, 10, 3])
If I print the shape of x in forward right before and after the x.view(), I obtain:
Before reshape: torch.Size([1, 32, 1, 32, 45])
After reshape: torch.Size([1, 32, 1440]) # 1*32*1440 = 1*32*1*32*45
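The view itself is consistent, which can be checked stand-alone with the shapes hard-coded from the printout above (the element count is only regrouped, not changed):

```python
import math

# Shapes copied from the printed output; the view must preserve
# the total number of elements, and it does.
before = (1, 32, 1, 32, 45)   # shape right before the view
after = (1, 32, 32 * 45)      # shape right after the view
print(math.prod(before), math.prod(after))   # 46080 46080
```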
However, at the x, _ = self.lstm(x) step the error now reports a shape of 92160, while a size of 188743680 is expected.
More info:
batch_size = 1
image_height = 256
image_width = 360
n_frames = 10
My network consists of 3 repeated blocks of…
- Conv3d: kernel_size=3, stride=1, padding=1
- BatchNorm3d
- MaxPool3d: kernel_size=2, stride=2
… followed by one LSTM layer and 3 final Linear layers.
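For reference, here is a hedged, torch-free trace of the shapes this architecture implies. The 32 output channels are read off the printed "Before reshape" shape (the Conv3d channel counts are not shown in the post); a Conv3d with kernel_size=3, stride=1, padding=1 preserves (frames, H, W), so only the MaxPool3d layers shrink them:

```python
# Assumed per-block behavior: conv keeps (t, h, w) unchanged,
# MaxPool3d(kernel_size=2, stride=2) floor-halves each of them.
shape = (1, 3, 10, 256, 360)               # (batch, C, frames, H, W)
for _ in range(3):                         # Conv3d -> BatchNorm3d -> MaxPool3d, x3
    b, c, t, h, w = shape
    shape = (b, 32, t // 2, h // 2, w // 2)
print(shape)                                # (1, 32, 1, 32, 45)

flat = (shape[0], shape[1], shape[2] * shape[3] * shape[4])
print(flat)                                 # (1, 32, 1440) -- what the LSTM receives
```

Read this way, a batch_first LSTM would see a sequence of length 32 (the channel dimension) with 1440 features per step, so its input_size would need to be 1440 for the shapes to line up.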