Ignore a specific index of Embedding

I am using a lot of zero-padding in my batch data; the padding index is 0. I want my model to ignore the effect of the padding during training. Could someone check whether my code is correct?

 optimizer.zero_grad()
 loss.backward()

 # zero out the gradient of the padding row (index 0) before the update
 for name, param in model.named_parameters():
     if param.grad is not None:
         if 'A.' in name or 'W.weight' in name:
             param.grad.data[0] = 0

 optimizer.step()

If you don’t have a very large number of embedding layers, you could address the parameters directly, e.g. “model.emb.weight” if the embedding is named emb.
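For example, a minimal sketch of that idea, assuming the embedding layer is named emb and the padding index is 0:

 optimizer.zero_grad()
 loss.backward()
 # zero the gradient of the padding row directly on the named layer
 model.emb.weight.grad.data[0] = 0
 optimizer.step()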

Also, I would probably follow Soumith’s suggestion to reset the padding embedding vector after the step instead of changing the grad beforehand. Very likely both work, but if you want to keep the padding vector fixed, resetting it feels slightly more straightforward than zeroing the grad just to make the update zero.

Best regards

Thomas

Thank you for the helpful information. So, after updating the parameters, I just reset the embedding weight at the specific index. Is that the same as what you had in mind? It looks much simpler.

 optimizer.zero_grad()
 loss.backward()
 optimizer.step()

 # after the update, reset the padding row back to zero
 model.embd.weight.data[specific_index] = 0

That was what I had in mind. Does it work for your project?

Best regards

Thomas


It’s working. Thank you!