Update part of word embeddings during training


In my current project, I introduce auxiliary embeddings to distinguish functional words (e.g., conjunctions, determiners, pronouns and so on) from non-functional words. Specifically, I set the auxiliary embeddings of functional words to zero vectors and randomly initialize those of non-functional words. My goal is to fine-tune the latter during training and keep the former unchanged (i.e., always zeros).

Is this feasible with an nn.Embedding layer?

Two ways:

  1. This could probably be done by introducing an additional mask embedding with n_dims=1, where
    the entry for a particular word is ‘1.0’ if it should use the aux embedding and ‘0.0’ if not.
    Set requires_grad to False on the mask embedding and do an element-wise multiplication between the
    aux embedding and the mask embedding.

  2. Calculate a mask (Boolean tensor) that determines whether each particular word should use the aux embedding,
    and use it to mask the input before embedding.
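
A rough sketch of both options, in case it helps. The class names, `functional_ids`, and the sizes are made up for illustration and are not from the original post. For option 2, "mask the input" is read here as remapping functional-word ids to a `padding_idx` row, which `nn.Embedding` keeps at zero and does not update; that is only one possible interpretation.

```python
import torch
import torch.nn as nn


class MaskedAuxEmbedding(nn.Module):
    """Option 1 (sketch): multiply the aux embedding by a frozen 0/1 mask embedding."""

    def __init__(self, vocab_size, aux_dim, functional_ids):
        super().__init__()
        self.aux = nn.Embedding(vocab_size, aux_dim)  # trainable aux vectors
        mask = torch.ones(vocab_size, 1)              # 1.0 = use the aux embedding
        mask[torch.tensor(functional_ids)] = 0.0      # 0.0 = functional word
        # freeze=True sets requires_grad=False on the mask weights
        self.mask = nn.Embedding.from_pretrained(mask, freeze=True)

    def forward(self, token_ids):
        # (..., aux_dim) * (..., 1): functional words come out as exact zeros,
        # and their rows in self.aux receive zero gradient.
        return self.aux(token_ids) * self.mask(token_ids)


class InputMaskedAuxEmbedding(nn.Module):
    """Option 2 (one reading): remap functional ids to a padding row before lookup."""

    def __init__(self, vocab_size, aux_dim, functional_ids):
        super().__init__()
        self.pad_idx = vocab_size  # assumption: one extra row reserved as the zero row
        self.aux = nn.Embedding(vocab_size + 1, aux_dim, padding_idx=self.pad_idx)
        is_functional = torch.zeros(vocab_size, dtype=torch.bool)
        is_functional[torch.tensor(functional_ids)] = True
        self.register_buffer("is_functional", is_functional)

    def forward(self, token_ids):
        # Boolean mask over the input, True where the token is a functional word
        mask = self.is_functional[token_ids]
        return self.aux(token_ids.masked_fill(mask, self.pad_idx))


# Hypothetical toy setup: ids 0 and 1 are functional words.
ids = torch.tensor([[0, 3, 1, 5]])
opt1 = MaskedAuxEmbedding(vocab_size=6, aux_dim=4, functional_ids=[0, 1])
opt2 = InputMaskedAuxEmbedding(vocab_size=6, aux_dim=4, functional_ids=[0, 1])
out1, out2 = opt1(ids), opt2(ids)
```

In both variants the output for a functional word is zero no matter what the optimizer does to the other rows, so nothing needs to be reset after each update step.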

Hope this helps :)

You can keep one separate Embedding layer for functional words and another separate Embedding layer for non-functional words. Seems like a cleaner solution, no?
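
Something like the following might work as a minimal sketch of the two-layer idea. `SplitAuxEmbedding` and `functional_ids` are hypothetical names, and for simplicity both tables are indexed by the full vocabulary (wasteful, but it avoids a separate id remapping step).

```python
import torch
import torch.nn as nn


class SplitAuxEmbedding(nn.Module):
    """Sketch: a frozen all-zero table for functional words, a trainable one for the rest."""

    def __init__(self, vocab_size, aux_dim, functional_ids):
        super().__init__()
        is_functional = torch.zeros(vocab_size, dtype=torch.bool)
        is_functional[torch.tensor(functional_ids)] = True
        self.register_buffer("is_functional", is_functional)
        # Frozen zeros: never updated because freeze=True disables gradients
        self.functional = nn.Embedding.from_pretrained(
            torch.zeros(vocab_size, aux_dim), freeze=True
        )
        # Randomly initialized and fine-tuned during training
        self.non_functional = nn.Embedding(vocab_size, aux_dim)

    def forward(self, token_ids):
        # Shape (..., 1) so it broadcasts over the embedding dimension
        use_frozen = self.is_functional[token_ids].unsqueeze(-1)
        return torch.where(
            use_frozen, self.functional(token_ids), self.non_functional(token_ids)
        )


# Hypothetical toy setup: ids 0 and 1 are functional words.
split = SplitAuxEmbedding(vocab_size=6, aux_dim=4, functional_ids=[0, 1])
split_out = split(torch.tensor([[0, 3, 1, 5]]))
```

Only `non_functional.weight` shows up with `requires_grad=True`, so passing `split.parameters()` to an optimizer trains just the non-functional vectors.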

Thanks for your response!

In my previous implementation, I adopted your second strategy, but I am not sure whether it works.

Good idea!

I think it should work

Does it work? Can you share the code? Thanks a lot!