How do I do max pooling over the output when the sequence length is variable? Assume the output has dimension N x L x hiddenLayerDim, where N = batchSize and L = length of the longest sequence.

I would like to do max pooling along the length (L) dimension (dim=1).

But the sequence length differs across the slices of the batch. The values in a sequence may be negative, so I don’t want to pick up padded zeros as the max by simply applying maxpool1d.

You might want to consider creating batches where all sequences within one batch have the same length. No need for padding or packing; see this older post. It’s simply convenient, and I use it all the time.
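As a rough illustration of that idea, here is a minimal sketch that groups sequences by length so each batch can be stacked without padding, and then takes a plain max over dim=1. The helper names `equal_length_batches` and `max_pool_over_length` are made up for this example, and the sequences are assumed to be tensors of shape L x hiddenLayerDim; in a real model you would pool the network's output rather than the raw inputs.

```python
from collections import defaultdict

import torch


def equal_length_batches(sequences, batch_size):
    """Yield (indices, batch) pairs in which all sequences share one length.

    Hypothetical helper: `sequences` is a list of tensors of shape L_i x H.
    """
    buckets = defaultdict(list)
    for idx, seq in enumerate(sequences):
        buckets[seq.size(0)].append(idx)  # bucket by sequence length

    for indices in buckets.values():
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            # Same length within the chunk, so stacking needs no padding.
            yield chunk, torch.stack([sequences[i] for i in chunk])  # N x L x H


def max_pool_over_length(output):
    """Max over the length dimension of an N x L x H tensor, giving N x H."""
    return output.max(dim=1).values


# Toy data with all-negative values; there are no zeros to pick up by mistake.
sequences = [torch.randn(L, 4).abs().neg() for L in (3, 5, 3, 5, 5)]
for indices, batch in equal_length_batches(sequences, batch_size=2):
    pooled = max_pool_over_length(batch)
    print(indices, tuple(batch.shape), tuple(pooled.shape))
```

The returned indices let you map the pooled rows back to the original order of the samples. One trade-off of this approach is that batches can end up smaller than `batch_size` when few sequences share a given length.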