I noticed that it is not possible to feed
packed_sequences to things like activation functions or linear layers, which forces me to design models with a forward method like:

    def forward(self, input, lengths, hidden=None):
        input = nn.utils.rnn.pack_padded_sequence(input, lengths, batch_first=True, enforce_sorted=False)
        out, hidden = self.lstm(input, hidden)
        # pad_packed_sequence returns a (tensor, lengths) tuple
        out, _ = nn.utils.rnn.pad_packed_sequence(out, batch_first=True, padding_value=-100)
        out = self.drop_layer(self.sigmoid(out))
        out = self.softmax(self.linear_layer(out))
        return out, hidden
Here I have to unpack the packed_sequence directly after passing it through the LSTM, and later need to filter out the padded positions before the loss calculation. If I don't unpack it, I receive an error message:
TypeError: sigmoid(): argument 'input' (position 1) must be Tensor, not PackedSequence
This seems highly inefficient, since many of the calculations done by the activation functions and linear layers on the padded positions will have to be thrown away afterwards.
Why is that so? What is the reason preventing us from passing packed_sequences to activation functions or linear layers?
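One workaround I've considered (a minimal sketch with made-up layer sizes, relying on the documented PackedSequence fields): since a PackedSequence stores only the non-padded timesteps in a flat .data tensor, elementwise or per-timestep layers can be applied to that tensor directly and the result re-wrapped, so no computation is wasted on padding:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, PackedSequence

linear = nn.Linear(8, 4)  # hypothetical layer: 8 input features, 4 output

seqs = torch.randn(3, 5, 8)          # batch of 3, max length 5, 8 features
lengths = torch.tensor([5, 3, 2])
packed = pack_padded_sequence(seqs, lengths, batch_first=True,
                              enforce_sorted=False)

# packed.data has shape (sum(lengths), 8) and contains no padding at all,
# so the layer only ever sees real timesteps.
new_data = torch.sigmoid(linear(packed.data))

# Rebuild a PackedSequence, reusing the original bookkeeping tensors.
out = PackedSequence(new_data, packed.batch_sizes,
                     packed.sorted_indices, packed.unsorted_indices)
```

The resulting `out` can then go straight into another RNN layer or be unpacked with pad_packed_sequence only once, at the very end before the loss.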