Very slow training with nn.Embedding

I use nn.Embedding initialized with GloVe embeddings, and I want to further train the embeddings when I fine-tune on my task. The GloVe vocabulary is a bit more than 2.1M tokens, and the embedding dimension is 300. The training becomes extremely slow. I think this might be due to the size of the embedding table. I don’t know if this is related, but my sentences are relatively short (many of them might be less than 15 words). The weights of the Embedding change only for the words found in the batch, right? Is this very slow training normal?


Do you really have a vocabulary size of 2.1 million? Or do you load all pretrained vectors into your embedding layer?

You only need the vectors for the words in your vocabulary. So if you have, say, 100k words/tokens, the input size of the embedding layer only needs to be 100k.

Even if your whole vocabulary is 2.1 million tokens, try considering only the, say, 100k most frequent ones. Due to Zipf’s Law, the 100k most frequent tokens probably cover 98% of your dataset.
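A minimal sketch of this idea, assuming you have your corpus tokens as a list and the GloVe vectors as a dict mapping token to vector (both names here are hypothetical; adapt them to however you load GloVe):

```python
from collections import Counter

import torch
import torch.nn as nn


def build_embedding(corpus_tokens, glove, dim=300, max_vocab=100_000):
    """Build a trainable nn.Embedding covering only the most frequent tokens.

    corpus_tokens: list of tokens from your training data (hypothetical input).
    glove: dict mapping token -> list/array of `dim` floats (hypothetical input).
    """
    # Keep only the most frequent tokens; by Zipf's law they cover
    # the vast majority of the corpus.
    counts = Counter(corpus_tokens)
    vocab = [tok for tok, _ in counts.most_common(max_vocab)]
    stoi = {tok: i + 1 for i, tok in enumerate(vocab)}  # index 0 reserved for padding/<unk>

    weights = torch.zeros(len(vocab) + 1, dim)
    for tok, i in stoi.items():
        vec = glove.get(tok)
        if vec is not None:
            weights[i] = torch.tensor(vec)
        else:
            # Token in corpus but not in GloVe: small random init.
            weights[i] = torch.randn(dim) * 0.1

    # freeze=False keeps the embeddings trainable during fine-tuning.
    emb = nn.Embedding.from_pretrained(weights, freeze=False, padding_idx=0)
    return emb, stoi
```

The resulting table is `(max_vocab + 1) × dim` instead of `2.1M × 300`, which shrinks both the parameter count and the optimizer state that has to be updated each step.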


Thank you, you are right, that was it! Now the training is very fast.