Hi there!
For some reason I need to compute the gradient of the loss with respect to the input data.
My problem is that my model starts with an embedding layer, which doesn't support propagating the gradient through it. Indeed, to set `requires_grad` to `True` on my input data, it has to be of a floating-point type. But the embedding module (`nn.Embedding`) only supports inputs of an integer type (`torch.long`).
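The restriction can be checked in isolation; this small sketch is not tied to any model, it just shows the dtype behaviour:

```python
import torch

# An integer (long) tensor cannot carry gradients, so requires_grad is rejected
idx = torch.tensor([0, 1, 4], dtype=torch.long)
try:
    idx.requires_grad_(True)
except RuntimeError as e:
    print("error:", e)

# A float tensor accepts it without complaint
x = torch.tensor([0.0, 1.0, 4.0], requires_grad=True)
print(x.requires_grad)  # True
```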
Is there anything I am missing, or does the embedding layer definitely stop the backpropagation? My idea to make it work is to replace the embedding layer, which performs a lookup, with a matrix multiplication. But I first want to be sure my understanding is correct.
Here is a working dummy example of the situation; my aim is to have `requires_grad` set to `True` on `data`:
import torch
import torch.nn as nn


class Seq(nn.Module):
    def __init__(self):
        super(Seq, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(10, 20),
            nn.ReLU(),
            nn.Linear(20, 3),
            nn.Sigmoid()
        )
        self.embed = nn.Embedding(5, 10)

    def forward(self, data):
        return self.model(self.embed(data))


model = Seq()

#####
# I want the `requires_grad` to be True!
#####
data = torch.tensor(
    [[0, 1, 4, 3, 1], [1, 0, 4, 3, 0], [4, 2, 3, 1, 4]],
    requires_grad=False, dtype=torch.long)

target = torch.rand([3, 1, 3])
output = model(data)
loss = torch.sum(torch.sqrt((output - target) ** 2))
loss.backward(retain_graph=True)
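For the matrix-multiplication idea, here is a minimal sketch of how it could look, assuming the one-hot encoding is built with `torch.nn.functional.one_hot`: the indices are turned into float one-hot vectors (which can carry a gradient), and the lookup becomes a matmul against the embedding weight matrix. The gradient then flows back to the one-hot input rather than to the raw indices, which stay non-differentiable by nature.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Embedding(5, 10)
data = torch.tensor([[0, 1, 4, 3, 1]], dtype=torch.long)

# One-hot encode the indices as floats so the tensor can require a gradient
one_hot = F.one_hot(data, num_classes=5).float()  # shape [1, 5, 5]
one_hot.requires_grad_(True)

# The lookup becomes a matmul: [1, 5, 5] @ [5, 10] -> [1, 5, 10]
embedded = one_hot @ embed.weight

# Same values as the plain embedding lookup
assert torch.allclose(embedded, embed(data))

loss = embedded.sum()
loss.backward()
print(one_hot.grad.shape)  # gradient w.r.t. the one-hot input
```

Note the gradient you get is with respect to the one-hot vectors, not the integer indices themselves; a discrete lookup has no meaningful derivative in its indices, so this is as close as it gets.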