[solved] Embedding is very slow for 10M+ words

Hi, I built a large embedding layer (10M-word vocabulary), as in the code below.
optimizer.step() is very slow (fewer than 100 samples/second).
I tried CPU, CPU with sparse gradients, and GPU (CUDA), but all of them are very slow. CPU non-sparse is the fastest.
What could be the reason? If I remove loss.backward() and optimizer.step(), it runs at 10000+ samples/second, so the data generator is not the bottleneck.

import torch.nn as nn
import torch.optim as optim

class Model(nn.Module):
    def __init__(self, n_words=10000000, dim_word=64):
        super().__init__()
        self.n_words = n_words
        self.dim_word = dim_word
        # sparse=False: the gradient of the embedding weight is a dense (n_words, dim_word) tensor
        self.embedding = nn.Embedding(self.n_words, self.dim_word, sparse=False)

    def forward(self, indices):
        # Look up one dim_word-sized vector per index
        return self.embedding(indices)

def train():
    model = Model(10000000, 64)
    criterion = nn.TripletMarginLoss()
    optimizer = optim.Adagrad(model.parameters(), lr=0.1)

Sorry, it seems to be due to my confusion.

Hey, can you explain how you fixed it?


It seems the time complexity of a training step for an Embedding of N words is not O(1); it may be O(N) or more.
I found that training a 5M-word Embedding is much slower than training a 1M-word Embedding. So I tried splitting it into smaller ones, which was better, although not perfect.
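
To put rough numbers on the O(N) guess: with sparse=False the gradient of the embedding weight has the same shape as the weight itself, and Adagrad keeps one accumulator of that shape too, so each optimizer.step() works over the whole table no matter how few rows the batch looked up. The snippet below is only a back-of-the-envelope estimate assuming float32 and three full-size tensors per step (weight, gradient, Adagrad state). `dense_step_cost` is a made-up helper name, and real timings will depend on hardware and PyTorch version.

```python
def dense_step_cost(n_words, dim_word, bytes_per_float=4):
    """Rough count of parameters and bytes touched by one dense Adagrad step.

    Assumption: the step reads/writes three full-size tensors
    (weight, dense gradient, Adagrad's squared-gradient accumulator).
    """
    n_params = n_words * dim_word
    tensors_touched = 3
    return n_params, n_params * bytes_per_float * tensors_touched


for n_words in (1_000_000, 5_000_000, 10_000_000):
    n_params, n_bytes = dense_step_cost(n_words, 64)
    print(f"{n_words:>12,} words: {n_params:,} params, "
          f"~{n_bytes / 1e9:.2f} GB touched per step")
```

The estimate grows linearly with the vocabulary size, which is consistent with the 5M-word table being much slower than the 1M-word one.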

What embedding size are you trying?

I hope this helps. It's just an implementation of my simple approach; it was OK for my purpose.
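
For readers who want something concrete, here is a minimal sketch of the splitting idea from this thread. It is not necessarily the code referred to above: the class name `SplitEmbedding`, the `n_chunks` parameter, and the routing by integer division are all assumptions made for illustration. The vocabulary is stored in several smaller nn.Embedding tables, and each index is routed to its table.

```python
import torch
import torch.nn as nn


class SplitEmbedding(nn.Module):
    """Hypothetical wrapper: one large vocabulary stored as several smaller tables."""

    def __init__(self, n_words, dim_word, n_chunks):
        super().__init__()
        self.dim_word = dim_word
        # Ceiling division so the chunks together cover all n_words rows.
        self.chunk_size = (n_words + n_chunks - 1) // n_chunks
        sizes = [min(self.chunk_size, n_words - start)
                 for start in range(0, n_words, self.chunk_size)]
        self.tables = nn.ModuleList([nn.Embedding(size, dim_word) for size in sizes])

    def forward(self, indices):
        flat = indices.reshape(-1)
        chunk_ids = torch.div(flat, self.chunk_size, rounding_mode="floor")
        local_ids = flat % self.chunk_size
        out = torch.empty(flat.numel(), self.dim_word,
                          dtype=self.tables[0].weight.dtype, device=flat.device)
        for k, table in enumerate(self.tables):
            mask = chunk_ids == k
            if mask.any():
                # Only tables that are actually looked up end up with a gradient.
                out[mask] = table(local_ids[mask])
        return out.reshape(*indices.shape, self.dim_word)


# Small usage example (tiny sizes so it runs quickly).
model = SplitEmbedding(n_words=1000, dim_word=8, n_chunks=4)
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.1)
indices = torch.tensor([[0, 1], [2, 3]])  # all four indices fall in the first chunk
loss = model(indices).sum()
loss.backward()
optimizer.step()  # tables with no gradient are left untouched
```

Whether this helps depends on how the indices are distributed. If a batch only touches a few chunks, the untouched tables have no gradient and are skipped by the optimizer; if every batch hits every chunk, each table still gets a full dense update, which may be why the improvement reported above was only partial.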
