I am passing the output of pack_padded_sequence (a PackedSequence) to an RNN and want to feed the mean of the RNN outputs over all time steps to a Linear layer. How can I do this so that the padded portions are not included in the mean and the gradients are computed correctly?
I have defined the pack_padded_sequence, RNN and Linear layer as follows:
self.rnn = torch.nn.RNN(input_size=feature_dim, hidden_size=self.hidden, num_layers=self.num_layers, batch_first=True)
...
self.fc = nn.Linear(self.hidden, self.num_classes)
...
packed = pack_padded_sequence(base_out, lengths, batch_first=True)
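For reference, here is a minimal sketch of one way the masked mean might be computed. It is not a definitive implementation. It assumes that lengths is a 1-D integer tensor of true sequence lengths and that base_out has shape (batch, max_len, feature_dim). The class name MeanPoolRNN and the use of enforce_sorted=False are my own assumptions for illustration. The idea is to unpack the RNN output with pad_packed_sequence, zero out the padded steps with a mask, sum over time, and divide by the true lengths instead of by max_len.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class MeanPoolRNN(nn.Module):
    """Hypothetical wrapper: RNN -> length-aware mean over time -> Linear."""

    def __init__(self, feature_dim, hidden, num_layers, num_classes):
        super().__init__()
        self.rnn = nn.RNN(input_size=feature_dim, hidden_size=hidden,
                          num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, base_out, lengths):
        # Assumption: sequences may be unsorted, hence enforce_sorted=False.
        # pack_padded_sequence expects the lengths on the CPU.
        packed = pack_padded_sequence(base_out, lengths.cpu(),
                                      batch_first=True, enforce_sorted=False)
        packed_out, _ = self.rnn(packed)

        # Back to a padded tensor of shape (batch, max_len, hidden).
        out, _ = pad_packed_sequence(packed_out, batch_first=True,
                                     total_length=base_out.size(1))

        # mask[b, t] is 1 for real time steps and 0 for padding.
        lengths = lengths.to(out.device)
        steps = torch.arange(out.size(1), device=out.device)
        mask = (steps.unsqueeze(0) < lengths.unsqueeze(1)).unsqueeze(-1)
        mask = mask.to(out.dtype)

        # Sum only the real steps, then divide by each true length.
        summed = (out * mask).sum(dim=1)
        mean = summed / lengths.to(out.dtype).unsqueeze(1)
        return self.fc(mean)
```

pad_packed_sequence already fills padded positions with zeros by default, so the explicit mask is mostly a safeguard. The important part is dividing by lengths, since a plain out.mean(dim=1) would divide by max_len. All operations are differentiable, so autograd should handle the gradients without extra work.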