Reproducibility of nn.Embedding?

I am using the nn.Embedding class to generate some word embeddings, and I have tried two settings in my code:
1.

def __init__(self):
    self.embedding = nn.Embedding(emb_idim, emb_odim)

def forward(self, indexs):
    wemb = self.embedding(indexs)

2.

def __init__(self):
    self.embedding = nn.Embedding(emb_idim, emb_odim)

def forward(self, indexs):
    query_one_hot = F.one_hot(indexs, max_num_of_indexes)
    wemb = query_one_hot.float() @ self.embedding.weight
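
For reference, here is a self-contained sketch of how the two settings sit in my module (the class name, default dimensions, and the use_one_hot switch are placeholders for illustration, not my real model):

import torch
import torch.nn as nn
import torch.nn.functional as F

class WordEmbedder(nn.Module):
    def __init__(self, emb_idim=1000, emb_odim=300, use_one_hot=False):
        super().__init__()
        self.embedding = nn.Embedding(emb_idim, emb_odim)
        self.use_one_hot = use_one_hot
        self.emb_idim = emb_idim

    def forward(self, indexs):
        if self.use_one_hot:
            # setting 2: one-hot matrix multiplied with the embedding weight
            query_one_hot = F.one_hot(indexs, self.emb_idim)
            return query_one_hot.float() @ self.embedding.weight
        # setting 1: direct index lookup
        return self.embedding(indexs)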

It seems that the wemb results are equal in settings 1 and 2 after one run, but I get different testing performance after one epoch.

Could anyone tell me what's wrong with my code?
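
In case it matters, I would expect run-to-run variation to be controlled by seeding along these lines (a generic sketch, not my exact training script):

import random
import numpy as np
import torch

def seed_everything(seed=0):
    # fix all RNGs that a typical PyTorch training run touches
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # trade speed for deterministic cuDNN kernels
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False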

I cannot reproduce the issue and get a zero error:

import torch
import torch.nn as nn
import torch.nn.functional as F

device = 'cuda'

embedding1 = nn.Embedding(100, 100).to(device)
embedding2 = nn.Embedding(100, 100).to(device)
embedding2.load_state_dict(embedding1.state_dict())  # start both embeddings from identical weights

for _ in range(10):
    indices = torch.randint(0, 100, (32,)).to(device)
    out1 = embedding1(indices)
    out2 = F.one_hot(indices, 100).float() @ embedding2.weight  # same lookup expressed as a one-hot matmul
    
    print('out abs err {}'.format((out1 - out2).abs().max()))
    out1.mean().backward()
    out2.mean().backward()
    print('grad abs err {}'.format((embedding1.weight.grad - embedding2.weight.grad).abs().max()))
    
    embedding1.zero_grad()
    embedding2.zero_grad()
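
Extending the comparison with an optimizer step (plain SGD here, which is an assumption about your training setup; it reuses embedding1, embedding2, and device from the snippet above) also keeps the two weight matrices essentially identical for me; any tiny differences would come from floating point accumulation order in the backward pass:

opt1 = torch.optim.SGD(embedding1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(embedding2.parameters(), lr=0.1)

for _ in range(10):
    indices = torch.randint(0, 100, (32,)).to(device)
    opt1.zero_grad()
    opt2.zero_grad()
    # setting 1: direct lookup; setting 2: one-hot matmul
    embedding1(indices).mean().backward()
    (F.one_hot(indices, 100).float() @ embedding2.weight).mean().backward()
    opt1.step()
    opt2.step()
    print('weight abs err {}'.format((embedding1.weight - embedding2.weight).abs().max()))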