From RNN packed sequences to the rest of the network

I'm trying to model a batch whose elements have different sequence lengths, and I'm wondering whether this kind of processing is valid:

  1. Pack the padded sequence and feed it through the LSTM.
  2. Unpack the output back into a padded tensor.
  3. Fold the time dimension into the batch dimension to simulate TimeDistributed, and apply the dense layers.
  4. Reshape back into the original shape.
  5. Am I done with processing at this point? Part of me thinks I need to zero out everything past the sequence length for each element.
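
To make steps 1 to 4 concrete, here is a minimal sketch with made-up sizes. The names `lstm`, `dense`, and all dimensions are placeholders for illustration, not my actual model. It also checks one thing I believe is true: `nn.Linear` only acts on the last dimension, so the fold-and-reshape in steps 3 and 4 should give the same values as calling the layer directly on the `[steps, batch_size, features]` tensor.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
steps, batch_size, in_features, hidden, out_features = 5, 3, 4, 6, 2
lengths = torch.tensor([5, 3, 2])  # true lengths, sorted longest first

padded = torch.randn(steps, batch_size, in_features)  # [steps, batch, features]
lstm = nn.LSTM(in_features, hidden)
dense = nn.Linear(hidden, out_features)

# Step 1: pack and run the LSTM. Step 2: unpack to a padded tensor.
packed = pack_padded_sequence(padded, lengths)
packed_out, (h_n, c_n) = lstm(packed)
rnn_out, out_lengths = pad_packed_sequence(packed_out)  # [steps, batch, hidden]

# Step 3: fold time into the batch dimension and apply the layer.
flat = rnn_out.reshape(steps * batch_size, hidden)
# Step 4: reshape back to [steps, batch, out_features].
td_out = dense(flat).reshape(steps, batch_size, out_features)

# Applying the layer directly to the 3-D tensor, with no reshaping.
direct_out = dense(rnn_out)
```

If the two results match, the time-distributed wrapper is only needed for layers that do not already broadcast over leading dimensions.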

Does this make sense?

    import torch
    import torch.nn.functional as F
    from torch.nn.utils.rnn import pad_packed_sequence

    input_seq = early_rnn.pack_sequence(input_seq)

    # ----------------------------
    # run the packed batch through the LSTM; the output is still packed here
    io_rnn_output, (io_h_n, io_c_n) = self.io_rnn(input_seq, self.io_rnn_hidden)
    # unpack: io_rnn_output (RNN output at each timestep) = [steps, batch_size, x_features]
    io_rnn_output, seq_lengths = pad_packed_sequence(io_rnn_output)

    # dense output
    io_rnn_output = self.time_distributed(self.io_dense_1, io_rnn_output)
    io_rnn_output = self.time_distributed(F.relu, io_rnn_output)
    io_rnn_output = self.time_distributed(self.io_dense_2, io_rnn_output)
    io_rnn_output = self.time_distributed(torch.sigmoid, io_rnn_output)  # F.sigmoid is deprecated

    # What's next? Zero out the positions that were padding before packing, or just feed this into the rest of the net?
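
On the zeroing question, one thing worth noting: `pad_packed_sequence` fills the padded timesteps with zeros by default, but they do not stay zero. A dense layer adds its bias, and a sigmoid maps 0 to 0.5, so after the dense stack the padded positions hold non-zero values. Below is a sketch of how they could be zeroed again using `seq_lengths`. The helper name `mask_padded_steps` is made up for illustration, and the tensor here is a stand-in, not my real output.

```python
import torch


def mask_padded_steps(output, seq_lengths):
    """Zero every timestep at or beyond each sequence's true length.

    output:      [steps, batch_size, features]
    seq_lengths: [batch_size], as returned by pad_packed_sequence
    Returns the masked output and the boolean mask of shape [steps, batch_size].
    """
    steps = output.size(0)
    time_idx = torch.arange(steps, device=output.device).unsqueeze(1)  # [steps, 1]
    mask = time_idx < seq_lengths.to(output.device).unsqueeze(0)       # [steps, batch]
    return output * mask.unsqueeze(-1).to(output.dtype), mask


# Stand-in for the sigmoid output: sigmoid(0) is 0.5 at every position.
output = torch.sigmoid(torch.zeros(4, 2, 3))
seq_lengths = torch.tensor([4, 2])  # second sequence has 2 padded steps

masked, mask = mask_padded_steps(output, seq_lengths)
```

Whether this is needed probably depends on what comes next. If later layers or the loss mix values across timesteps (sums, means, pooling), the padded positions would leak in unless they are masked. The same `mask` could also be used to average a per-timestep loss over valid positions only, instead of zeroing the activations.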