Pad_packed_sequence cannot unpack

I ran into a problem unpacking a packed padded sequence. It raised this error:
Traceback (most recent call last):
  File "", line 118, in <module>
  File "", line 115, in main
    logSM = dec(x_emb, h_0, c_0, enc_out, mask, True)
  File "/u/nieyifan/anaconda3/lib/python3.6/site-packages/torch/nn/modules/", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "", line 77, in forward
    h, c, mask)  # dec_out=(1, BS, hid)
  File "/u/nieyifan/anaconda3/lib/python3.6/site-packages/torch/nn/modules/", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/u/nieyifan/projects/EncDec/", line 79, in forward
    out, _ = torch.nn.utils.rnn.pad_packed_sequence(out)  # (seq_len, BS, hid_size)
  File "/u/nieyifan/anaconda3/lib/python3.6/site-packages/torch/nn/utils/", line 121, in pad_packed_sequence
    output[prev_i:, :prev_batch_size] = var_data[data_offset:data_offset + l]
  File "/u/nieyifan/anaconda3/lib/python3.6/site-packages/torch/autograd/", line 87, in __setitem__
    return SetItem.apply(self, key, value)
  File "/u/nieyifan/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/", line 117, in forward
    i._set_index(ctx.index, value)
RuntimeError: invalid argument 2: sizes do not match at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/

I have two classes. The low-level one, called LSTMlayer, is a wrapper around nn.LSTM that manages the padding:

def forward(self, x, h_0, c_0, mask):
    # sorted_len: per-sequence lengths (batch already sorted by decreasing length)
    packed_x = torch.nn.utils.rnn.pack_padded_sequence(
        x, sorted_len.long().data.tolist(), batch_first=False)
    # run the LSTM
    out, (h_n, c_n) = self.lstm(packed_x, (h_0, c_0))  # h_n: (n_layers*n_dir, BS, hid_size)
    # unpack out
    out, _ = torch.nn.utils.rnn.pad_packed_sequence(out)  # (seq_len, BS, hid_size)
    return out, h_n, c_n
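For context, sorted_len is computed elsewhere in my code; it comes from the padding mask, roughly along these lines (a hypothetical sketch — the mask shape (seq_len, BS) with 1.0 on real tokens and 0.0 on padding is an assumption for illustration):

```python
import torch

# hypothetical mask: (seq_len, BS), 1.0 = real token, 0.0 = padding;
# columns (sequences) sorted by decreasing length, as pack_padded_sequence requires
mask = torch.tensor([[1., 1., 1.],
                     [1., 1., 0.],
                     [1., 0., 0.]])

sorted_len = mask.sum(dim=0)          # per-sequence lengths: tensor([3., 2., 1.])
lengths = sorted_len.long().tolist()  # what pack_padded_sequence expects
print(lengths)  # [3, 2, 1]
```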

Then there's a Decoder class whose forward calls it:

def forward(self, ...):
    dec_out_i, h, c = self.rnn(
        torch.cat((..., context.permute(1, 0, 2)), dim=2),  # first operand of the cat omitted here
        h, c, mask)

self.rnn is the LSTMlayer instance, so this call goes through its forward above. The input to the low-level LSTM is a concatenated tensor of shape (seq_len=1, BS, 306): I run the decoder one time step at a time, so seq_len is always 1. pack_padded_sequence packs the concatenated tensor fine, but pad_packed_sequence fails to unpack it into dec_out_i.

I tested the low-level LSTMlayer on its own and it works with no problem, so I don't understand why it fails inside the Decoder's forward. Thanks!
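For reference, here is a minimal standalone sketch of the single-timestep pack/unpack round trip I'm attempting (the sizes are hypothetical stand-ins for my real ones, and this uses the plain nn.LSTM rather than my wrapper). Run on its own, this completes without error:

```python
import torch
import torch.nn as nn

# hypothetical sizes standing in for the real model's
seq_len, BS, in_size, hid_size = 1, 4, 306, 128

lstm = nn.LSTM(in_size, hid_size)
x = torch.randn(seq_len, BS, in_size)   # one decoder time step
lengths = [1] * BS                      # every sequence has length 1 at this step

# pack -> LSTM -> unpack, mirroring LSTMlayer.forward
packed = nn.utils.rnn.pack_padded_sequence(x, lengths, batch_first=False)
h_0 = torch.zeros(1, BS, hid_size)
c_0 = torch.zeros(1, BS, hid_size)
out, (h_n, c_n) = lstm(packed, (h_0, c_0))
out, out_lens = nn.utils.rnn.pad_packed_sequence(out)
print(out.shape)  # torch.Size([1, 4, 128])
```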