Hi, I built a model that is just one big embedding layer (10M-word vocabulary), as in the code below.
optimizer.step() is very slow (fewer than 100 samples/second).
I tried CPU, CPU with sparse gradients, and GPU (CUDA), and all of them are very slow. CPU non-sparse is the fastest.
What is the reason? If I remove loss.backward() and optimizer.step(), I get 10,000+ samples/second, so the data generator is not the bottleneck.
import torch.nn as nn
import torch.optim as optim

class Model(nn.Module):
    def __init__(self, n_words=10000000, dim_word=64):
        super(Model, self).__init__()
        self.n_words = n_words
        self.dim_word = dim_word
        # One (n_words x dim_word) table; sparse=False gives dense gradients
        self.embedding = nn.Embedding(self.n_words, self.dim_word, sparse=False)

    def forward(self, indices):
        # Look up one embedding vector per index
        y = self.embedding(indices)
        return y

def train():
    model = Model(10000000, 64)
    criterion = nn.TripletMarginLoss()
    optimizer = optim.Adagrad(model.parameters(), lr=0.1)
    ...
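
For anyone who wants to reproduce the numbers, here is a minimal sketch of a timing loop. It is not my exact training script: the helper name samples_per_second, the random indices standing in for the data generator, the batch size, and the step count are all assumptions for illustration. It uses a much smaller table so it runs quickly, and it compares sparse=False with sparse=True, with and without the backward pass and optimizer step.

```python
import time

import torch
import torch.nn as nn
import torch.optim as optim


def samples_per_second(n_words, dim_word=64, sparse=False, batch_size=32,
                       n_steps=5, do_update=True):
    """Time n_steps steps on random triplets and return samples/second.

    Hypothetical helper: the batch size, step count, and random data are
    illustrative choices, not the settings from the original script.
    """
    embedding = nn.Embedding(n_words, dim_word, sparse=sparse)
    criterion = nn.TripletMarginLoss()
    optimizer = optim.Adagrad(embedding.parameters(), lr=0.1)

    start = time.perf_counter()
    for _ in range(n_steps):
        # Random indices stand in for the real data generator (assumption).
        anchor, positive, negative = torch.randint(0, n_words, (3, batch_size))
        loss = criterion(embedding(anchor), embedding(positive),
                         embedding(negative))
        if do_update:
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    elapsed = time.perf_counter() - start
    return n_steps * batch_size / elapsed


if __name__ == "__main__":
    # Small table so the sketch finishes quickly; raise n_words to approach
    # the 10M-row case from the question.
    for sparse in (False, True):
        for do_update in (False, True):
            rate = samples_per_second(100_000, sparse=sparse,
                                      do_update=do_update)
            print(f"sparse={sparse} update={do_update}: "
                  f"{rate:.0f} samples/second")
```

Passing do_update=False corresponds to removing loss.backward() and optimizer.step() as described above, so the gap between the two settings isolates the cost of the update.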