If we use pack_padded_sequence and ignore_index in F.cross_entropy, do we still need to set padding_idx in nn.Embedding? If so, why?
Yes, the same problem bothers me. Do you have any ideas now?
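One way to look into this is to check the gradient that reaches the pad row of the embedding. Below is a minimal sketch, not a definitive answer. The toy LSTM model, the sizes, and the pad id `PAD = 0` are all assumptions made up for illustration. It builds the embedding without padding_idx, packs the batch, uses ignore_index in the loss, and then inspects `emb.weight.grad[PAD]`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

PAD = 0  # hypothetical pad token id (assumption for this sketch)
vocab_size, emb_dim, hidden = 10, 4, 8  # toy sizes (assumption)

# Two sequences of lengths 3 and 2, padded with PAD to length 3.
tokens = torch.tensor([[5, 2, 7], [3, 4, PAD]])
targets = torch.tensor([[2, 7, 1], [4, 1, PAD]])
lengths = torch.tensor([3, 2])  # must live on the CPU, sorted descending

emb = nn.Embedding(vocab_size, emb_dim)  # note: no padding_idx here
rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
proj = nn.Linear(hidden, vocab_size)

# Packing drops the padded time steps before the LSTM sees them.
packed = pack_padded_sequence(emb(tokens), lengths, batch_first=True)
out, _ = rnn(packed)
out, _ = pad_packed_sequence(out, batch_first=True, total_length=tokens.size(1))

# ignore_index removes the padded targets from the loss.
logits = proj(out)
loss = F.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1), ignore_index=PAD
)
loss.backward()

# Total absolute gradient that reached the pad row of the embedding.
pad_grad = emb.weight.grad[PAD].abs().sum().item()
print("gradient reaching the pad row:", pad_grad)

# For comparison: with padding_idx the pad row starts as zeros
# and its gradient is always zero.
emb_pad = nn.Embedding(vocab_size, emb_dim, padding_idx=PAD)
print("pad row with padding_idx:", emb_pad.weight[PAD])
```

In this sketch no gradient reaches the pad row even without padding_idx, because the packed sequence never feeds the padded steps to the LSTM and the loss skips the padded targets. So for this particular setup padding_idx looks redundant as far as gradients go. It may still be worth setting, since it zero-initializes the pad row and keeps it frozen if some other code path (for example pooling or attention over the padded tensor without a mask) does touch the pad positions.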