I have several word embeddings, and I want to take a weighted sum of them and use the result as a regular embedding. The catch is that I want to train those weights, with each word having its own set of weights.
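Concretely, for each word w I want the combined vector to be e_w = sum_k c_{w,k} * E_k[w], where E_k is the k-th pretrained embedding matrix and the c_{w,k} are the trainable per-word coefficients (my notation, just to make the goal precise).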
To do that I define a weight matrix and an embedding variable.
# stack of K pretrained embedding matrices, shape (max_features, embed_size, K)
self.embeds_stacked = torch.tensor(np.stack(embeddings, axis=2), dtype=torch.float).cuda()
# per-word mixing coefficients, initialised uniformly, shape (max_features, K)
self.embed_coeffs = torch.tensor(np.ones(shape=(max_features, len(embeddings))) / len(embeddings),
                                 dtype=torch.float).cuda()
self.embed_coeffs = nn.Parameter(self.embed_coeffs, requires_grad=True)
# the embedding table itself is frozen; only the coefficients should be trained
self.embedding = nn.Embedding(max_features, embed_size)
self.embedding.weight.requires_grad = False
Here "embeddings" is a list of embedding matrices and max_features is the number of words (all embeddings have the same dimensionality). embeds_stacked has shape (max_features, embed_size, K), where K is the number of embeddings.
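For reference, here is a minimal standalone sketch of how I understand the stacking and the shapes (toy sizes and random matrices, just for illustration, not my real data):

import numpy as np
import torch
import torch.nn as nn

max_features, embed_size, K = 5, 3, 2  # toy sizes, assumed for illustration
embeddings = [np.random.rand(max_features, embed_size).astype(np.float32) for _ in range(K)]
embeds_stacked = torch.tensor(np.stack(embeddings, axis=2), dtype=torch.float)
print(embeds_stacked.shape)  # torch.Size([5, 3, 2]) -> (max_features, embed_size, K)
embed_coeffs = nn.Parameter(torch.full((max_features, K), 1.0 / K))
print(embed_coeffs.shape)    # torch.Size([5, 2]) -> (max_features, K)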
During the forward step I just use einsum to compute the weighted sum over the last axis and assign the result to the embedding variable. After that I use it as a regular embedding.
embeds_weighed = torch.einsum('ijk,ik->ij', self.embeds_stacked, self.embed_coeffs)
self.embedding.weight.data = embeds_weighed
h_embedding = self.embedding(x)
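As a sanity check on what the einsum computes (toy tensors with the shapes above, not my actual model code): with uniform coefficients the weighted sum reduces to the plain mean over the K embeddings.

import torch

embeds_stacked = torch.randn(5, 3, 2)   # (max_features, embed_size, K)
embed_coeffs = torch.full((5, 2), 0.5)  # uniform 1/K coefficients
embeds_weighed = torch.einsum('ijk,ik->ij', embeds_stacked, embed_coeffs)
print(embeds_weighed.shape)                                        # torch.Size([5, 3])
print(torch.allclose(embeds_weighed, embeds_stacked.mean(dim=2)))  # True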
However, the embedding coefficients stay the same throughout training.
Is there any way to fix that?
Thanks in advance!