I am trying to implement a matrix multiplication using embedding layers. In the forward function I would like to get:
a user factor with shape [1, N],
a time matrix with shape [N, N], and
an item factor with shape [N, 1].
Then I would like to compute user_factor * time_matrix * item_factor (a matrix product) and output a scalar value.
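To make the intended computation concrete, here is a minimal sketch of the product for a single example. The size N = 3 and the random factors are hypothetical placeholders, not values from my model.

```python
import torch

torch.manual_seed(0)
N = 3  # hypothetical number of latent factors

user_factor = torch.randn(1, N)  # shape [1, N]
time_matrix = torch.randn(N, N)  # shape [N, N]
item_factor = torch.randn(N, 1)  # shape [N, 1]

# [1, N] @ [N, N] @ [N, 1] gives a [1, 1] tensor
score = user_factor @ time_matrix @ item_factor
scalar = score.squeeze()  # 0-dim tensor holding the scalar value

# The same bilinear form written with einsum, as a cross-check
check = torch.einsum("i,ij,j->", user_factor[0], time_matrix, item_factor[:, 0])
```

Both `scalar` and `check` should hold the same value up to floating-point error.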
However, when I check time_factors during training, it does not appear to be updated at all. I am not sure whether the reshaping affects autograd, and I cannot tell which step is wrong. In contrast, user_factors.weight is updated over time.
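One way to test the reshaping concern in isolation is the sketch below. It assumes a small stand-alone embedding (the sizes and the random vectors `u` and `v` are hypothetical) and checks whether a gradient reaches the embedding weight after `reshape` and `matmul`.

```python
import torch
from torch import nn

torch.manual_seed(0)
n_attempts, n_factors = 4, 2  # hypothetical sizes

time_factors = nn.Embedding(n_attempts, n_factors * n_factors)
attempt = torch.tensor([1])

# Look up one row and view it as an [n_factors, n_factors] matrix
t_matrix = time_factors(attempt).reshape(-1, n_factors, n_factors)  # [1, 2, 2]

u = torch.randn(1, n_factors)  # stand-in for the user factor, [1, 2]
v = torch.randn(n_factors)     # stand-in for the item factor, [2]

tmp = torch.matmul(u, t_matrix).squeeze(dim=1)  # [1, 2]
out = torch.matmul(tmp, v)                      # [1]
out.sum().backward()

grad = time_factors.weight.grad  # dense [n_attempts, n_factors * n_factors]
```

In this sketch only the looked-up row (index 1) should receive a non-zero gradient; the rows that were not indexed stay at zero.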
The following is my implementation. Thank you for your help.
import torch
from torch import nn

class MF(torch.nn.Module):
    def __init__(self, n_users, n_attempts, n_items, n_factors=2, seed=1024):
        super().__init__()
        torch.random.manual_seed(seed)
        self.n_users = n_users
        self.n_items = n_items
        self.n_factors = n_factors
        self.user_factors = nn.Embedding(n_users, n_factors)
        # each row is a flattened [n_factors, n_factors] time matrix
        self.time_factors = nn.Embedding(n_attempts, n_factors * n_factors)
        self.item_factors = nn.Embedding(n_items, n_factors)
        self.stress_item_factor = nn.Embedding(1, n_factors)
        self.user_biases = nn.Embedding(n_users, 1)
        self.time_biases = nn.Embedding(n_attempts, 1)
        self.item_biases = nn.Embedding(n_items, 1)

    def forward(self, user, attempt, item):
        u_factor = self.user_factors(user)
        t_factor = self.time_factors(attempt)
        # reshape each flattened row back into a square matrix
        t_matrix = t_factor.reshape(-1, self.n_factors, self.n_factors)
        stress = self.user_biases(user) + self.time_biases(attempt)
        tmp = torch.matmul(u_factor, t_matrix).squeeze(dim=1)
        stress += torch.matmul(tmp, self.stress_item_factor(torch.tensor(0)))
        return stress.squeeze(dim=-1)
I train it in the following way:
for idx, (u, t, i, v) in enumerate(self.train_data):
    user = torch.Tensor([u]).long()
    attempt = torch.Tensor([t]).long()
    item = torch.Tensor([i]).long()
    value = torch.Tensor([v])
    self.optimizer.zero_grad()
    pred = self.model(user, attempt, item)
    loss = self.mse_loss(pred, value)
    loss.backward()
    self.optimizer.step()
When I print the values of self.model.time_factors.weight, they do not change over time. However, when I print the values of self.model.time_factors.weight.grad, they do change.
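A more precise check than printing is to snapshot the weights before an optimizer step and measure the difference afterwards. The sketch below is only an illustration: the embedding `emb`, the SGD optimizer, the learning rate, and the toy loss are hypothetical stand-ins, not my actual setup. It may be worth doing because PyTorch prints tensors with 4 decimal places by default, so small updates can be invisible in printed output, and because comparing against a reference to the same tensor (rather than a copy) would always show no change.

```python
import torch
from torch import nn

torch.manual_seed(0)
emb = nn.Embedding(5, 4)  # hypothetical stand-in for time_factors
optimizer = torch.optim.SGD(emb.parameters(), lr=0.01)  # hypothetical optimizer and learning rate

# Take a real copy; keeping a plain reference would track the live tensor
before = emb.weight.detach().clone()

optimizer.zero_grad()
loss = emb(torch.tensor([2])).pow(2).sum()  # toy loss that touches only row 2
loss.backward()
optimizer.step()

# Measure how much each entry moved during this step
delta = (emb.weight.detach() - before).abs()
max_change = delta.max().item()
changed_rows = delta.sum(dim=1).nonzero().flatten().tolist()
print(f"max change: {max_change:.3e}, rows changed: {changed_rows}")
```

Only the rows that were looked up in the forward pass should move, so in a training loop that feeds one sample at a time, most rows stay fixed on any given step.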
The weights staying fixed while the gradient changes seems very strange to me. I hope someone can help me with this. Thank you very much.