Check whether out_lsm.requires_grad is True. If it is False, there is probably some non-differentiable operation inside self.evaluate that cuts the tensor off from the autograd graph and causes requires_grad to be False.
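As a quick illustration, here is a minimal, self-contained sketch (not your actual self.evaluate, whose internals I can only guess at) contrasting a differentiable op, which propagates requires_grad, with a non-differentiable one, which does not:

```python
import torch

x = torch.randn(4, 3, requires_grad=True)

# Differentiable op: requires_grad is propagated through the graph.
out = torch.log_softmax(x, dim=-1)
print(out.requires_grad)        # True -> backward() can reach x.grad

# Non-differentiable op (argmax here; detach(), .item(), .numpy(),
# rounding, etc. behave similarly): the result is detached from autograd.
out_broken = x.argmax(dim=-1)
print(out_broken.requires_grad)  # False -> no gradient flows back to x
```

If out_lsm ends up on the second path, calling backward() through it will not populate gradients for the parameters upstream.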
You can refer to this FAQ to know more about None gradients: "Why are my tensor's gradients unexpectedly None or not None?"