I have written a custom loss function to compute my model's loss, and I suspect I have detached the loss Variable from the graph, because the model's weight parameters are not updating.
Here is my loss function:
import torch
from torch.autograd import Variable

def custom_loss(self, input_var, target_var):
    # Start from a scalar zero and accumulate the loss term by term.
    loss = Variable(torch.FloatTensor(1).zero_(), requires_grad=True)
    if self.config.cuda:
        loss = loss.cuda()
    # Sum exp(input) over the target indices, then take the negative log.
    for i in range(target_var.size(0)):
        loss = torch.add(loss, torch.exp(input_var[target_var.data[i]]))
    loss = torch.mul(torch.log(loss), -1)
    return loss
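For what it's worth, the loop can also be written in vectorized form. Below is a minimal sketch of an equivalent computation, assuming input_var is 1-D and target_var holds integer indices; no explicit leaf Variable is needed, since the result inherits requires_grad from input_var:

def custom_loss_vectorized(self, input_var, target_var):
    # Gather the entries of input_var selected by target_var, exponentiate,
    # sum them, and take the negative log -- the same quantity the loop computes.
    return -torch.log(torch.exp(input_var[target_var]).sum())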
Then I do the backward pass as follows:
softmax_prob = self.model(train_sentences)
loss = self.custom_loss(softmax_prob, train_labels)
loss.backward()
self.optimizer.step()
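For completeness, here is a sketch of the full step with the gradient buffers cleared first (optimizer.zero_grad() is standard practice so gradients don't accumulate across iterations; omitting it would not by itself produce a None gradient):

self.optimizer.zero_grad()   # clear gradients left over from the previous step
softmax_prob = self.model(train_sentences)
loss = self.custom_loss(softmax_prob, train_labels)
loss.backward()              # populates .grad on every parameter reachable from loss
self.optimizer.step()        # applies the parameter update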
I checked whether the model weights are updating, but they are not:
a = list(self.model.parameters())[0].clone()
self.train(train_batches, dev_batches, (epoch + 1))
b = list(self.model.parameters())[0].clone()
print(torch.equal(a.data, b.data)) # this prints True
print(list(self.model.parameters())[0].grad) # this prints None
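In case it helps others debug the same symptom, here is a sketch that checks every parameter instead of only the first one (same attribute names as above):

before = [p.data.clone() for p in self.model.parameters()]
self.train(train_batches, dev_batches, (epoch + 1))
for i, p in enumerate(self.model.parameters()):
    changed = not torch.equal(before[i], p.data)
    print(i, "changed:", changed, "| grad is None:", p.grad is None)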
What am I doing wrong in my loss function? Why is the model parameter's .grad None? If I don't use my custom loss function but another loss function, say NLLLoss, the model works fine.
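For reference, this is roughly what the working NLLLoss path looks like (a sketch reusing the names from the snippets above; NLLLoss expects log-probabilities, so if the model outputs softmax probabilities they need a log first):

import torch.nn as nn

criterion = nn.NLLLoss()
softmax_prob = self.model(train_sentences)               # assumed shape: (batch, num_classes)
loss = criterion(torch.log(softmax_prob), train_labels)  # NLLLoss expects log-probabilities
loss.backward()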
I also tried removing requires_grad=True from loss = Variable(torch.FloatTensor(1).zero_(), requires_grad=True) in my loss function, but the problem remains.
So I believe I am making a mistake somewhere in the loss function. Can anyone point me to it?
Update: I found out why None was printed for the first model parameter, and the code is actually working fine. There was no error; rather, I had misunderstood one thing, which is now sorted out.
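For readers who land here with the same symptom: one common cause of .grad printing None (not necessarily the one in my case) is inspecting it before the first backward() call, since PyTorch allocates gradient buffers lazily. A minimal illustration:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
print(list(model.parameters())[0].grad)  # None: no backward() has run yet

loss = model(torch.randn(3, 4)).sum()
loss.backward()
print(list(model.parameters())[0].grad)  # now a populated gradient tensor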