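For others stumbling on this thread: be careful to use a non-negative index (e.g. 0 instead of -1) when padding sequences that are fed to an embedding layer. Passing padding_idx=-1 to the embedding constructor does not make -1 a valid input. The constructor treats a negative padding_idx as counting from the end of the table (here, row 19), while the lookup itself only accepts indices from 0 to num_embeddings - 1. Feeding -1 as input therefore still fails at runtime on both CPU and GPU: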
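import torch
import torch.nn as nn
# The constructor accepts padding_idx=-1, but normalizes it to the last row (index 19)
emb = nn.Embedding(20, 100, padding_idx=-1)
inp = torch.tensor([5, 2, 7, 12, 3])
bad_padding = torch.cat((inp, torch.tensor([-1] * 3)))  # -1 is not a valid lookup index
good_padding = torch.cat((inp, torch.tensor([0] * 3)))  # 0 is a valid lookup index
out = emb(good_padding)  # works; pass padding_idx=0 to the constructor to make row 0 the padding row
out = emb(bad_padding)  # index-out-of-range error (RuntimeError or IndexError, depending on the PyTorch version)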
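
If it helps, here is a minimal sketch of the safe pattern, assuming you reserve index 0 for padding. The name PAD and the two example sequences are my own illustration, not something from the posts above. It pads a batch with torch.nn.utils.rnn.pad_sequence, checks that the padded positions come back as zero vectors, and shows what a negative padding_idx is normalized to.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

PAD = 0  # assumption: index 0 is reserved for padding and never used for a real token

# Tell the embedding which row is padding; that row is initialized to zeros
# and receives no gradient updates.
emb = nn.Embedding(20, 100, padding_idx=PAD)

# Two hypothetical sequences of different lengths.
seqs = [torch.tensor([5, 2, 7, 12, 3]), torch.tensor([4, 9])]

# Pad to a common length with the same non-negative index.
batch = pad_sequence(seqs, batch_first=True, padding_value=PAD)  # shape (2, 5)

out = emb(batch)  # shape (2, 5, 100)

# Every padded position maps to the all-zero padding row.
pad_rows_are_zero = bool((out[batch == PAD] == 0).all())

# A negative padding_idx is stored as an offset from the end of the table.
neg = nn.Embedding(20, 100, padding_idx=-1)
normalized_idx = neg.padding_idx  # num_embeddings - 1
```

The upshot is that the padding value in your input tensors and the padding_idx given to the constructor should be the same non-negative number. If you would rather keep the padding row at the end of the table, padding with num_embeddings - 1 (19 here) instead of -1 should work the same way.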