Can anyone point out the dimension error in my permute call?

This is the model I'm referring to:

class CNN_spec(torch.nn.Module):
    def __init__(self, num_classes=7):
        super(CNN_spec, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=4, stride=4))
        self.layer3 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, stride=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=4, stride=4))
        self.layer4 = nn.Sequential(
            nn.Conv2d(128, 128, kernel_size=3, stride=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=4, stride=4))
        self.layer5 = nn.LSTM(128, 1000)
        self.emotion_layer = nn.Linear(2000, num_classes)

    def forward(self, inputs):
        out = self.layer1(inputs)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = out.permute(0, 2, 1)
        out, (final_hidden_state, final_cell_state) = self.layer5(out)

        # out = out[:, -1, :].reshape(out.shape[0], 1, out.shape[2])

        mean = torch.mean(out, 1)
        std = torch.std(out, 1)
        stat = torch.cat((mean, std), 1)
        pred_emo = self.emotion_layer(stat)
        return pred_emo

The error is: number of dims don't match in permute.
Any suggestions?

Hi,

The out from self.layer4 is a 4D tensor of shape [batch, channel, h, w], but your permute call specifies only three dimensions, as if it were a 3D tensor.

I'm not sure what you intend, but something like out.permute(0, 2, 3, 1) will work.

PS. It would be much easier to debug if you posted the whole stack trace of the error.
Also, the documentation is clear about most modules' behavior; it may be worth checking.

Bests
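To see the mismatch concretely, here's a quick check with a dummy tensor (the spatial sizes are just illustrative, not taken from your spectrograms):

```python
import torch

# Dummy output from layer4: [batch, channels, h, w]
out = torch.randn(8, 128, 4, 4)

# A 3-index permute on a 4D tensor raises the reported error
try:
    out.permute(0, 2, 1)
except RuntimeError as e:
    print(e)  # number of dims don't match in permute

# Supplying one index per dimension works
print(out.permute(0, 2, 3, 1).shape)  # torch.Size([8, 4, 4, 128])
```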

Ok let me try that.
Thank you

Actually, this dimension error comes from a size mismatch: an LSTM takes 3D input, but this 2D CNN outputs (batch_size, num_channels, m, n), where m×n is the spectrogram size, shrinking at each conv layer. If it were a 1D CNN, permute(0, 2, 1) would be fine. So I think I need to read the spectrogram row-wise to feed it into the LSTM layer. How do I reshape it like that?
[batch_size, num_channels, m×n] — am I doing it right?
Please guide.

It's done!
Just use
nn.Flatten(2, 3)
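For anyone finding this later, a minimal sketch of the fix: nn.Flatten(2, 3) merges the two spatial axes into one sequence axis, and permute(0, 2, 1) then puts the tensor in (batch, seq, features) order. The sizes below are illustrative, and batch_first=True is my addition so dim 0 stays the batch dimension for the LSTM:

```python
import torch
import torch.nn as nn

flatten = nn.Flatten(2, 3)                # merges h and w into one axis
lstm = nn.LSTM(128, 1000, batch_first=True)

out = torch.randn(8, 128, 4, 4)           # [batch, channels, h, w] from layer4
out = flatten(out)                        # [8, 128, 16]
out = out.permute(0, 2, 1)                # [8, 16, 128]: (batch, seq, features)
out, _ = lstm(out)
print(out.shape)                          # torch.Size([8, 16, 1000])
```

Without batch_first=True, the default nn.LSTM layout is (seq, batch, features), so the permuted tensor would be read with batch and sequence swapped.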