How can I use a linear layer with a bidirectional LSTM?

Hi,
Using a normal LSTM with a linear layer after it is trivial: you just reshape the LSTM output and feed it to the next linear layer.
However, when I set bidirectional=True, the number of output features doubles (the forward and backward hidden states are concatenated), and even if I double the in_features of my linear layer, it won't work and I get the following error:

RuntimeError                              Traceback (most recent call last)
e:\DeepLearning\Codes\Pytorch Basics\recurrent neural networks.py in <module>
     74 print(f'rnn type : {model.rnn_type}')
     75 print(f'our input(data).shape: {data.shape}')
---> 76 outputs, hiddenstates = model(data,None)
     77 print(f'model input size: {model.input_size}')
     78 print(f'model output size: {model.output_size}')

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

e:\DeepLearning\Codes\Pytorch Basics\recurrent neural networks.py in forward(self, input, hidden_states)
     50         outputs = rnn_outputs.reshape(-1, self.hidden_size)
     51         outputs = self.drp(outputs)
---> 52         outputs = self.fc(outputs)
     53         return outputs, hidden_states
     54 

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
     85 
     86     def forward(self, input):
---> 87         return F.linear(input, self.weight, self.bias)
     88 
     89     def extra_repr(self):

~\Anaconda3\lib\site-packages\torch\nn\functional.py in linear(input, weight, bias)
   1367     if input.dim() == 2 and bias is not None:
   1368         # fused op is marginally faster
-> 1369         ret = torch.addmm(bias, input, weight.t())
   1370     else:
   1371         output = input.matmul(weight.t())

RuntimeError: size mismatch, m1: [120 x 30], m2: [60 x 84] at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensorMath.cpp:752

What should I do?

You probably want to revisit your code, in particular both the __init__ (where the submodules are defined) and the forward (where they are called).

In general it helps to figure out which tensors are not matching; most likely these are fc.weight and the reshaped output of your RNN. Here the error message already points at it: m1 is [120 x 30], so your reshaped RNN output still has only hidden_size = 30 features, while m2 is [60 x 84], i.e. fc expects 60 = 2 * 30 input features.
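For example, printing the shapes just before the failing call makes the mismatch visible. A small debugging sketch (the names rnn_outputs, outputs, and fc follow the traceback; the exact placement inside your forward is an assumption):

# in forward(), right before outputs = self.fc(outputs):
print(f'rnn_outputs.shape: {rnn_outputs.shape}')   # (seq_len, batch, hidden_size * num_directions)
print(f'outputs.shape: {outputs.shape}')           # second dim must equal fc.in_features
print(f'fc.weight.shape: {self.fc.weight.shape}')  # (out_features, in_features)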

Best regards

Thomas


Hi, thanks a lot for the quick reply, I really appreciate it.
Got it :)
Is it correct now? Because I don't get any errors anymore:
outputs = rnn_outputs.reshape(-1, self.hidden_size * self.direction)

Yeah, you have got all the ingredients there. Compare your reshape with the RNN output and the linear layer's input size: because the bidirectional LSTM concatenates both directions, you need to multiply self.hidden_size by self.direction in the reshape, exactly as you did. A full sketch of how the pieces fit together follows below.
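Putting it together, a minimal self-contained sketch (the attribute names rnn, drp, fc, and direction follow the snippets above; input_size=10, the dropout probability, and the sequence/batch sizes are assumptions, while hidden_size=30 and out_features=84 match the sizes in your error message):

import torch
import torch.nn as nn

class BiLSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, bidirectional=True):
        super().__init__()
        self.hidden_size = hidden_size
        # 2 directions when bidirectional, 1 otherwise
        self.direction = 2 if bidirectional else 1
        self.rnn = nn.LSTM(input_size, hidden_size, bidirectional=bidirectional)
        self.drp = nn.Dropout(0.5)
        # the linear layer must accept hidden_size * num_directions features
        self.fc = nn.Linear(hidden_size * self.direction, output_size)

    def forward(self, input, hidden_states=None):
        rnn_outputs, hidden_states = self.rnn(input, hidden_states)
        # rnn_outputs: (seq_len, batch, hidden_size * num_directions)
        outputs = rnn_outputs.reshape(-1, self.hidden_size * self.direction)
        outputs = self.drp(outputs)
        outputs = self.fc(outputs)
        return outputs, hidden_states

# usage: 7 time steps, batch of 4, 10 input features
model = BiLSTMModel(input_size=10, hidden_size=30, output_size=84)
data = torch.randn(7, 4, 10)
outputs, hidden_states = model(data, None)
print(outputs.shape)  # torch.Size([28, 84])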


Hi all, I am Bedru. I am going to conduct my thesis with a BiLSTM, but how can I develop the code?