Hi everyone,
Here is my question. We’re using the pad_sequence and pack_padded_sequence functions to align the instances in a batch (i.e. to make them equal in length). pad_sequence has an argument called padding_value, which is usually set to zero. Moreover, nn.Embedding has a padding_idx argument. My question is: how should I initialize my vocabulary and embedding layer so that they are consistent with pad_sequence? For example, which of the two options below is correct?
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

unique_words = ['word1', 'word2']
w2i = {unique_words[i]: i for i in range(len(unique_words))}  # word -> index
emb_layer = nn.Embedding(len(w2i), emb_size)
padded = pad_sequence(_some_input_here_, batch_first=True, padding_value=0)
or:
unique_words = ['word1', 'word2']
unique_words = ['<<PAD>>'] + unique_words  # I am adding a PAD token explicitly, at index 0 so it matches padding_value=0
w2i = {unique_words[i]: i for i in range(len(unique_words))}  # word -> index; '<<PAD>>' maps to 0
emb_layer = nn.Embedding(len(w2i), emb_size, padding_idx=0)  # Is padding_idx necessary here?
padded = pad_sequence(_some_input_here_, batch_first=True, padding_value=0)
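For context, here is a minimal runnable sketch of how I currently understand the second option. The emb_size value and the two toy sentences are just made up for illustration; the point is that the PAD token sits at index 0 so that padding_value, padding_idx, and the vocabulary all agree:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

emb_size = 4                                       # made-up embedding dimension
unique_words = ['<<PAD>>', 'word1', 'word2']       # PAD token at index 0
w2i = {w: i for i, w in enumerate(unique_words)}   # word -> index, '<<PAD>>' -> 0

# Two toy "sentences" of different lengths, already converted to index tensors
seqs = [torch.tensor([w2i['word1'], w2i['word2']]),
        torch.tensor([w2i['word2']])]

# Pad the shorter sequence with 0, i.e. with the index of '<<PAD>>'
padded = pad_sequence(seqs, batch_first=True, padding_value=w2i['<<PAD>>'])
# tensor([[1, 2],
#         [2, 0]])

emb_layer = nn.Embedding(len(w2i), emb_size, padding_idx=w2i['<<PAD>>'])
embedded = emb_layer(padded)   # shape (2, 2, emb_size); the row for the padded position is all zeros

# The original (unpadded) lengths can then be used with pack_padded_sequence
lengths = torch.tensor([2, 1])
packed = pack_padded_sequence(embedded, lengths, batch_first=True, enforce_sorted=False)

Is this the intended way to keep everything consistent, or is padding_idx redundant here?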
Could any of you help me with this issue? Thanks.