Why is the output of an RNN/LSTM/GRU run on a `pack_padded_sequence` input not copied through time?

The output of the shorter sequences is left as 0 at the time steps after they end, so one cannot simply take `output[-1]` to get the last valid output of every sequence.

Variable containing:
(0 ,.,.) = 
 -0.0501 -0.2952 -0.2333  0.4341  0.3493  0.2266
 -0.3599  0.1146  0.0839  0.3452 -0.2610  0.0438
  0.0732 -0.3825 -0.2358  0.3724 -0.0710 -0.0938

(1 ,.,.) = 
 -0.0679 -0.2363 -0.2831  0.4428 -0.1835 -0.4050
 -0.2480 -0.2260  0.0623 -0.0588 -0.0716 -0.0399
  0.0906 -0.4665 -0.2691  0.1649 -0.1746 -0.3439

(2 ,.,.) = 
 -0.0457  0.1439 -0.3324  0.4185 -0.0935 -0.4124
 -0.1829 -0.4376  0.0962 -0.2061  0.0495 -0.1017
  0.0053 -0.5100 -0.3125  0.2655 -0.2417 -0.3377

(3 ,.,.) = 
 -0.1166 -0.1815 -0.3255  0.4191  0.0036 -0.0437
 -0.1765 -0.1423  0.1529  0.0551  0.4874  0.0341
  0.1387 -0.6859 -0.4179  0.4549 -0.2319 -0.1515

(4 ,.,.) = 
 -0.1024 -0.4288 -0.6777 -0.1263 -0.0838 -0.5730
 -0.1947  0.0544 -0.0195  0.4354 -0.2913 -0.5107
 -0.0379 -0.3370 -0.3031  0.4669  0.3479  0.0385

(5 ,.,.) = 
 -0.1310 -0.0398 -0.3535  0.2669 -0.5321 -0.6508
 -0.1677  0.3583  0.0721  0.4400 -0.1823 -0.3988
  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000

(6 ,.,.) = 
  0.0290 -0.0032 -0.3978  0.5375  0.2659 -0.2442
  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000

(7 ,.,.) = 
  0.0236  0.2544 -0.3984  0.4688 -0.0030 -0.2773
  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
[torch.FloatTensor of size 8x3x6]

You need to use `pad_packed_sequence` to unpack the output. It also returns the length of each sequence, which you can use to pick out the valid outputs.
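Here is a minimal sketch of that idea, written against a recent PyTorch release (tensors rather than `Variable`). The sizes, the choice of `nn.GRU`, and the lengths `[8, 6, 5]` are assumptions picked only to mirror the 8x3x6 output in the question; they are not taken from the original code.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes, chosen to match the 8x3x6 output shown in the question.
max_len, batch, input_size, hidden_size = 8, 3, 4, 6
lengths = torch.tensor([8, 6, 5])  # sorted in decreasing order, kept on the CPU

padded_input = torch.randn(max_len, batch, input_size)  # (time, batch, feature)
rnn = nn.GRU(input_size, hidden_size)

packed = pack_padded_sequence(padded_input, lengths)
packed_output, h_n = rnn(packed)

# Unpack: positions past each sequence's length are filled with zeros.
output, out_lengths = pad_packed_sequence(packed_output)  # (time, batch, hidden)

# Select output[length - 1, b, :] for every batch element b.
batch_idx = torch.arange(batch)
last_valid = output[out_lengths - 1, batch_idx]  # (batch, hidden)

# For a single-layer, unidirectional RNN this matches the final hidden state.
assert torch.allclose(last_valid, h_n[-1])
```

If you only need the last valid output of a single-layer, unidirectional network, the returned hidden state `h_n` already holds it, so the indexing step can be skipped.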
