The network outputs a number of vectors, for each of which I compute a distance and derive a SoftMin probability distribution (a kind of clustering). This output is of the form (total_clustering_prob, target_classes):

[0.32, 0.33, 0.35] [1]
...
[0.31, 0.39, 0.30] [2]

I plug it into NLLLoss:

nllloss = NLLLoss(size_average=False, reduce=True)
total_clustering_loss = nllloss(total_clustering_prob, target_classes)
but the resulting loss has no grad_fn, so none of the previous layers compute gradients either. Why does this happen? Do I need to put all computations inside an nn.Module subclass?
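For reference, here is a minimal sketch of what I expected to happen (simplified; the tensor values and names are placeholders, and I use reduction='sum' in place of the deprecated size_average/reduce arguments):

```python
import torch
import torch.nn as nn

# Placeholder probabilities and targets standing in for my network output
probs = torch.tensor([[0.32, 0.33, 0.35],
                      [0.31, 0.39, 0.30]], requires_grad=True)
targets = torch.tensor([1, 2])

# NLLLoss expects log-probabilities, so take the log first
log_probs = torch.log(probs)

nllloss = nn.NLLLoss(reduction='sum')  # equivalent to size_average=False, reduce=True
loss = nllloss(log_probs, targets)

# In this standalone sketch the loss does carry a grad_fn;
# in my real code it is None
print(loss.grad_fn)
```

In my actual code the loss prints no grad_fn at all, even though the computation looks like the above.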