Change embedding's weights

Is there a way to change nn.Embedding weights and keep the gradient?
For example, passing the nn.Embedding weights through an MLP and keeping the gradient so the MLP is updated.
If I update nn.Embedding.weight.data, the gradient to the MLP is not kept.
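Roughly what I mean, as a made-up sketch (the shapes and the MLP here are just placeholders):

import torch
import torch.nn as nn

emb = nn.Embedding(10, 5)
# Placeholder MLP that transforms the embedding table row-wise.
mlp = nn.Sequential(nn.Linear(5, 5), nn.ReLU(), nn.Linear(5, 5))

new_weights = mlp(emb.weight)        # this stays in the autograd graph
emb.weight.data = new_weights.data   # but this assignment drops the graph,
                                     # so the MLP never gets a gradient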

Can you elaborate on what you want to do exactly? If you have an Embedding layer and you do a forward pass and then a backward pass, you'll get gradients for all parameters. You'd then step to update all the parameters with the gradients you computed, and the loss would ideally go down.
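In code, that standard flow is roughly this (the shapes, indices, and loss here are just placeholders):

import torch
import torch.nn as nn

emb = nn.Embedding(10, 5)
opt = torch.optim.SGD(emb.parameters(), lr=0.01)

idx = torch.tensor([1, 2, 3])
loss = emb(idx).pow(2).sum()   # some loss over the looked-up embeddings
loss.backward()                # fills emb.weight.grad
opt.step()                     # updates the embedding table with that gradient
opt.zero_grad()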

You want to get gradients, change the embedding data, and then update? I mean, then you can't guarantee that you make progress on the loss. Why can't you just do this manually?

old_grad = e.weight.grad.clone()   # keep the gradient from the backward pass
e.weight.data = new_data           # swap in the new embedding table
e.weight.data -= old_grad          # then apply the saved gradient

This is pseudocode. If you explain more, maybe I can comment more, but I'm unsure of the idea here.
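Spelled out with an actual nn.Embedding, the manual update above would look something like this (new_data is just a stand-in for whatever you want to swap in, and the loss is a placeholder):

import torch
import torch.nn as nn

e = nn.Embedding(10, 5)

# Some backward pass that fills e.weight.grad (placeholder loss).
e(torch.tensor([0, 1])).sum().backward()

old_grad = e.weight.grad.clone()   # save the computed gradient
new_data = torch.rand(10, 5)       # stand-in for the new embedding table
e.weight.data = new_data           # replace the weights outside autograd
e.weight.data -= old_grad          # and apply the saved gradient to them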

import torch
import torch.nn as nn

encoder = nn.Embedding(10, 5)            # e.g. 10 embeddings of size 5
W = nn.Parameter(torch.rand(10, 10))
# Once per epoch, I want to make the following update to the encoder:
encoder.weight.data = W.mm(encoder.weight)

The problem is that the gradient is lost and W is not updated.
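Here is a self-contained version of what I'm seeing (shapes made up); after the .data update, W.grad stays None:

import torch
import torch.nn as nn

encoder = nn.Embedding(10, 5)
W = nn.Parameter(torch.rand(10, 10))

# .data bypasses autograd, so this update is not recorded in any graph.
encoder.weight.data = W.mm(encoder.weight)

loss = encoder(torch.tensor([0, 1])).sum()
loss.backward()
print(W.grad)               # None: W never entered the graph that produced loss
print(encoder.weight.grad)  # the embedding itself still gets a gradient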

Wait, so the encoder is an embedding layer. I think what you want to do is grab the weight inside the encoder, multiply it by W, and then compute the loss. That way both W and the encoder's weight are in the computational graph, and you'll get gradients for both. This would work, but I'm still not sure if it's what you want to do.

import torch
import torch.nn as nn

e = nn.Embedding(10, 5)
w = nn.Parameter(torch.rand(5, 5))

p = (e.weight @ w).sum()    # both e.weight and w are in the graph
l = (1 - p) ** 2            # push the product-sum towards 1 (MSE-style)
l.backward()                # fills w.grad and e.weight.grad

w.data = w.data - 0.01 * w.grad                        # manual SGD step on w
e.weight.data = e.weight.data - 0.01 * e.weight.grad   # and on the embedding

Now you have gradients for w and e.weight, and you can update both. The goal in this toy example is for the product-sum of e.weight and w to be 1 under an MSE loss; you'll need to adapt it to what you want. The learning rate above is 0.01.
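If it's easier, the same thing can also be written with an optimizer instead of the manual .data updates (same math, just less bookkeeping):

import torch
import torch.nn as nn

e = nn.Embedding(10, 5)
w = nn.Parameter(torch.rand(5, 5))
opt = torch.optim.SGD([w, e.weight], lr=0.01)

for _ in range(100):
    opt.zero_grad()
    p = (e.weight @ w).sum()
    l = (1 - p) ** 2
    l.backward()
    opt.step()     # updates both w and e.weight from their gradients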