I have a case where I get a smoothed one-hot (probability) distribution and I would like to do an average embedding lookup. This is what I am doing now:
When flag = True I pass the normal input sequence as indices of size (batch x max_seq_len), while in the other case I pass smoothed one-hot vectors of size (batch x max_seq_len x vocab_size).
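Roughly like this (a minimal sketch of the two-mode lookup described above; the class name and module layout are assumptions, since the original snippet isn't shown):

```python
import torch
import torch.nn as nn

class SoftEmbedding(nn.Module):
    """Embedding that accepts hard indices or a soft (probability) distribution."""
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, embed_dim)

    def forward(self, x, flag=True):
        if flag:
            # x: (batch, max_seq_len) integer indices -> ordinary lookup
            return self.emb(x)
        # x: (batch, max_seq_len, vocab_size) smoothed one-hot rows;
        # matmul computes a probability-weighted average of embedding rows
        return x @ self.emb.weight  # -> (batch, max_seq_len, embed_dim)
```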
If you are not using any of the fancier options to Embedding, i.e. you leave padding_idx=None, max_norm=None, norm_type=2, and scale_grad_by_freq=False at their defaults, then your code looks plausibly correct to me.
That said, I am not particularly familiar with the inner workings of Embedding, so I won't give a definitive answer on that point.
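One sanity check you can run yourself (a small self-contained snippet, not from the original post): with an exactly one-hot input, the matmul path should reproduce a plain index lookup, since multiplying by a one-hot row just selects the corresponding embedding row.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
emb = torch.nn.Embedding(10, 4)
idx = torch.randint(0, 10, (2, 5))               # (batch, max_seq_len)
onehot = F.one_hot(idx, num_classes=10).float()  # (batch, max_seq_len, vocab_size)

hard = emb(idx)             # ordinary index lookup
soft = onehot @ emb.weight  # matmul path
print(torch.allclose(hard, soft))  # True: one-hot rows select embedding rows
```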