Hello,

I am passing a packed sequence (created with `pack_padded_sequence`) to an `RNN` and want to feed the mean of the outputs over all time steps to a `Linear` layer. How can I do this so that the padded positions are not included in the mean and the gradients are computed correctly?

I have defined the `RNN`, the `Linear` layer, and the `pack_padded_sequence` call as follows:

```
self.rnn = torch.nn.RNN(input_size=feature_dim, hidden_size=self.hidden, num_layers=self.num_layers, batch_first=True)
...
self.fc = nn.Linear(self.hidden, self.num_classes)
...
packed = pack_padded_sequence(base_out, lengths, batch_first=True)
```
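To make the goal concrete, here is a minimal sketch of one possible approach (not necessarily the best one): unpack the RNN output with `pad_packed_sequence`, mask out the padded time steps, and divide the masked sum by each sequence's true length. The class name `MeanPoolRNN`, the `forward` signature, and the constructor arguments are hypothetical. The sketch assumes `lengths` is a 1-D CPU integer tensor sorted in descending order, which is what `pack_padded_sequence` expects by default.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class MeanPoolRNN(nn.Module):
    """Hypothetical module: RNN -> length-aware mean over time -> Linear."""

    def __init__(self, feature_dim, hidden, num_layers, num_classes):
        super().__init__()
        self.rnn = nn.RNN(input_size=feature_dim, hidden_size=hidden,
                          num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, base_out, lengths):
        # Assumption: lengths is a CPU int tensor sorted in descending order.
        packed = pack_padded_sequence(base_out, lengths, batch_first=True)
        packed_out, _ = self.rnn(packed)

        # Back to a padded tensor of shape (batch, max_len, hidden).
        out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

        # Boolean mask of shape (batch, max_len): True on real time steps.
        steps = torch.arange(out.size(1), device=out.device).unsqueeze(0)
        mask = steps < out_lengths.to(out.device).unsqueeze(1)

        # Zero the padded steps, sum over time, divide by the true lengths.
        mask_f = mask.unsqueeze(-1).to(out.dtype)
        summed = (out * mask_f).sum(dim=1)
        mean = summed / mask_f.sum(dim=1)

        return self.fc(mean)
```

Because the padded positions never enter the packed data and are multiplied by zero before the sum, they should contribute neither to the mean nor to the gradients.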

Thanks.