Is it necessary to use packed sequences for variable-length RNN inputs when using mini-batch training?

Hey all

I'm currently looking at padding time series for an LSTM implementation of mine, and I'm trying to wrap my head around the use of packed sequences. Why shouldn't I just pad my sequences with float('nan'), then search for NaN in the output and use that to filter my loss function? This may be specific to my case, but I only care about the final output at the end of my time series. So, for example, in the code below, could I not just use the known lengths of my sequences to pull out the values just before the NaNs and use those to calculate my loss?

In [66]: x = [[3,5,6,4,float('nan')],
    ...:     [3,5,4,float('nan'),float('nan')],
    ...:     [4,5,6,7,5]]

In [70]: xT = torch.tensor(x).view(3,5,1)

In [73]: lstm = nn.LSTM(1,3,1,batch_first=True)

In [75]: o,_ = lstm(xT)

In [77]: lin = nn.Linear(3,1)

In [78]: lin(o)
Out[78]: (output truncated; the entries at the padded timesteps come out as nan)

My apologies if this is a stupid question; I'm still fairly new to PyTorch and would like to understand this.
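For what it's worth, here is a minimal sketch of the "use the known lengths" idea. It assumes zero-padding rather than NaN (a NaN input makes the hidden state NaN from that timestep onward, and zeros are just as easy to ignore when you only read the last valid output). The data and layer sizes are taken from the snippet above; the indexing trick is an assumption, not anything official.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Three sequences of true lengths 4, 3, 5, zero-padded to length 5.
x = torch.tensor([[3., 5., 6., 4., 0.],
                  [3., 5., 4., 0., 0.],
                  [4., 5., 6., 7., 5.]]).view(3, 5, 1)
lengths = torch.tensor([4, 3, 5])

lstm = nn.LSTM(input_size=1, hidden_size=3, num_layers=1, batch_first=True)
out, _ = lstm(x)                       # out: (batch=3, time=5, hidden=3)

# Pick the output at the last *valid* timestep of each sequence.
batch_idx = torch.arange(out.size(0))
last_out = out[batch_idx, lengths - 1]  # shape (3, 3)

lin = nn.Linear(3, 1)
pred = lin(last_out)                    # shape (3, 1), no NaNs to filter
```

Because an LSTM runs left to right, the output at timestep t only depends on inputs up to t, so the outputs at the last valid positions are unaffected by whatever padding follows them. The padded steps are still computed, though, which is the wasted work that packing avoids.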


Hey @Clint , did you find a solution ? I’m stuck on the same problem , I’m using a time series and cannot pad with 0.
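One option that sidesteps the padding value entirely is `pack_padded_sequence`: the LSTM then never sees the padded timesteps, so it doesn't matter whether you pad with 0, NaN, or anything else. A minimal sketch, assuming zero-padded input and that only the final hidden state per sequence is needed:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

# Same zero-padded data; the padding value is irrelevant here,
# since packing drops the padded timesteps before the LSTM runs.
x = torch.tensor([[3., 5., 6., 4., 0.],
                  [3., 5., 4., 0., 0.],
                  [4., 5., 6., 7., 5.]]).view(3, 5, 1)
lengths = torch.tensor([4, 3, 5])

lstm = nn.LSTM(1, 3, 1, batch_first=True)

packed = pack_padded_sequence(x, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, (h_n, c_n) = lstm(packed)

# h_n[-1] holds the top layer's hidden state at each sequence's
# true final timestep, i.e. the "value just before the padding".
lin = nn.Linear(3, 1)
pred = lin(h_n[-1])    # shape (3, 1)
```

With `enforce_sorted=False` you don't even need to sort the batch by length yourself; PyTorch handles the reordering internally.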