I am using a lot of zero-padding in my batch data, with padding index 0. I want my model to ignore the effect of the padding during training. Could someone check whether my code is correct?
optimizer.zero_grad()
loss.backward()
# Zero the gradient row for the padding index (0) before the update
for name, param in model.named_parameters():
    if param.grad is not None:
        if 'A.' in name or 'W.weight' in name:
            param.grad[0] = 0
optimizer.step()
If you don’t have a very large number of embedding layers, you could address the parameters directly, e.g. “model.emb.weight” if the embedding is named emb.
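As a minimal sketch of addressing the parameter directly (the names here are hypothetical; substitute your own embedding attribute), you could zero the padding row of the gradient without looping over `named_parameters()`:

```python
import torch
import torch.nn as nn

# Toy embedding standing in for model.emb (hypothetical setup)
emb = nn.Embedding(10, 4)

loss = emb(torch.tensor([0, 1, 2])).sum()
loss.backward()

# Zero only the gradient row of the padding index (0),
# so optimizer.step() leaves that embedding vector unchanged
emb.weight.grad[0].zero_()
```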
Also, I would probably follow Soumith’s suggestion to reset the padding embedding vector after the step instead of changing the grad before. Very likely both work, but if you are set to keep them fixed, resetting feels slightly more straightforward than setting the grad to zero in order to change the update to zero.
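A sketch of the resetting approach, again with hypothetical names: zero the padding row once at initialization, train normally, and re-zero that row after each `optimizer.step()`:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 4)  # stand-in for your model's embedding
optimizer = torch.optim.SGD(emb.parameters(), lr=0.1)

# Fix the padding vector (index 0) to zeros up front
with torch.no_grad():
    emb.weight[0].zero_()

loss = emb(torch.tensor([0, 1, 2])).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Reset the padding vector after the update
with torch.no_grad():
    emb.weight[0].zero_()
```

Note that `nn.Embedding` also accepts a `padding_idx` argument, which keeps the gradient for that index at zero automatically, so neither manual step is needed in that case.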
Thank you for the helpful information. So after updating the parameters, I just reset the embedding weight at the specific index. Is that the same as your suggestion? It looks much simpler.