nn.Embedding - Batch first VS seq_len first


Regarding nn.Embedding, the docs say the input should be in the form
(N, W), where N = mini-batch size and W = number of indices to extract per mini-batch.

But I already have an implementation where my inputs are all shaped seq_len x batch_size.

I pass this as an input and there is no problem (seems logical since it's just a lookup table).

I'm wondering if it's also correct this way, or whether the parameters will not be updated correctly?


Yes, your way should work without issue. nn.Embedding is just a lookup table indexed by a 2D tensor of indices. If you really want to, you can think of your seq_len as batch_size and your batch_size as #embeddings_to_extract_per_batch.
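To illustrate: the lookup is applied elementwise over whatever index tensor you pass in, and the embedding dimension is simply appended as the last axis, so both layouts work. A minimal sketch (sizes are made up for the example):

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration
vocab_size, embed_dim = 10, 4
seq_len, batch_size = 5, 3

emb = nn.Embedding(vocab_size, embed_dim)

# (batch_size, seq_len), as described in the docs
idx_batch_first = torch.randint(vocab_size, (batch_size, seq_len))
out_bf = emb(idx_batch_first)
print(out_bf.shape)  # torch.Size([3, 5, 4])

# (seq_len, batch_size) works just as well: each index is looked up
# independently, and gradients flow to the same weight rows either way
idx_seq_first = torch.randint(vocab_size, (seq_len, batch_size))
out_sf = emb(idx_seq_first)
print(out_sf.shape)  # torch.Size([5, 3, 4])
```

Either way, backprop updates exactly the rows of `emb.weight` that were indexed, so parameter updates are unaffected by the layout.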
