Loss not updating in PyTorch?

I’m trying to create a collaborative filtering model to generate recommendations for patient/psychologist pairs. Here’s my model so far:

class Recommender(nn.Module):
  def __init__(self, patients, psychologists):
    super().__init__()
    self.patient_params = nn.ParameterDict({patient.name: nn.Parameter(torch.randn(10, requires_grad=True, dtype=torch.float)) for patient in patients})
    self.psych_params = nn.ParameterDict({psych.name: nn.Parameter(torch.randn(10, requires_grad=True, dtype=torch.float)) for psych in psychologists})
  def forward(self, matches):
    output = []
    for psych_match in matches:
      patient = psych_match[0]
      psych = psych_match[1]
      patient_params = self.patient_params[patient.name]
      psych_params = self.psych_params[psych.name]
      output.append(patient_params @ psych_params.T)
    return torch.tensor(output, requires_grad=True)

I’m able to make predictions with this model, and run through the training loop, but my loss and weights don’t update.

model = Recommender(patients, psychologists)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01)
for epoch in range(50):
  model.train()
  y_preds = model(X)
  loss = loss_fn(y_preds, y)
  optimizer.zero_grad()
  loss.backward() 
  optimizer.step()
  if epoch % 10 == 0:
    print(f"Epoch: {epoch} | Loss: {loss}")

And the output is:

Epoch: 0 | Loss: 20.708234786987305
Epoch: 10 | Loss: 20.708234786987305
Epoch: 20 | Loss: 20.708234786987305
Epoch: 30 | Loss: 20.708234786987305
Epoch: 40 | Loss: 20.708234786987305

For simplicity’s sake, I’ve made the y values numerical, so I’d assume that MSELoss would work fine here.

Any idea what might be going wrong or how to fix it?

If you look at the documentation for torch.tensor:
https://pytorch.org/docs/stable/generated/torch.tensor.html

it says that torch.tensor copies the data and does not preserve autograd history, so no gradient can flow back through it. In the last step of your forward function you wrap the output in torch.tensor, which I think is what destroys the computation graph.

I also don’t think it’s necessary to convert to a tensor with requires_grad=True in that last step, since the output already requires grad because it comes from nn.Parameter.
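To make this concrete, here is a small standalone check (a minimal sketch, not your model; p and q just stand in for one patient/psychologist pair of parameter vectors):

import torch
import torch.nn as nn

p = nn.Parameter(torch.randn(10))
q = nn.Parameter(torch.randn(10))

score = p @ q           # stays connected to p and q
print(score.grad_fn)    # prints a backward function (e.g. <DotBackward0 ...>)

wrapped = torch.tensor([score], requires_grad=True)
print(wrapped.grad_fn)  # None - a brand new leaf tensor, detached from p and q

Calling backward on a loss built from wrapped would only populate wrapped.grad, never p.grad or q.grad, which is why your parameters never move.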


As @ConvolutionalAtom noted, your model isn’t maintaining the computation graph all the way through the forward pass: the torch.tensor call in the last line creates a new tensor that is detached from your parameters.

You could try something like:

  def forward(self, matches):
    # Accumulate predictions in a tensor instead of rebuilding one at the end,
    # so the result stays connected to the parameters
    output = torch.empty(0, 1)
    for psych_match in matches:
      patient = psych_match[0]
      psych = psych_match[1]
      patient_params = self.patient_params[patient.name]
      psych_params = self.psych_params[psych.name]
      # torch.cat preserves autograd history, unlike torch.tensor
      output = torch.cat([output, (patient_params @ psych_params.T).view(1, -1)])
    return output
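If you prefer to keep the list-building structure of your original forward, another option (a sketch along the same lines) is to append the per-match scores to a list and stack them at the end, which also preserves the graph:

  def forward(self, matches):
    output = []
    for psych_match in matches:
      patient = psych_match[0]
      psych = psych_match[1]
      patient_params = self.patient_params[patient.name]
      psych_params = self.psych_params[psych.name]
      output.append(patient_params @ psych_params.T)
    # torch.stack combines the per-match scalars into one tensor without
    # detaching them, unlike torch.tensor
    return torch.stack(output)

Note that this returns a 1-D tensor while the torch.cat version above returns shape (N, 1), so make sure your y targets match whichever shape you use before passing them to MSELoss.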

Perfect, now it’s working as expected. Thanks!
