As said above, predict doesn’t have .grad because it is not a parameter; it is only an intermediate result of the computation (a non-leaf tensor). The actual parameters of your model will have .grad populated once you call backward() on the loss.
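To make that concrete, here is a minimal sketch. The toy nn.Linear model, the random data, and the `predict * 2.0` step standing in for final_predict are all assumptions for illustration, not your actual code. It shows that a loss computed from an intermediate tensor still sends gradients back to the model's parameters, which are the only things the optimizer needs to see.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real model.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Made-up inputs and targets.
x = torch.randn(8, 4)
target = torch.randn(8, 1)

predict = model(x)             # intermediate (non-leaf) tensor
final_predict = predict * 2.0  # assumed post-processing, still differentiable

# The loss is computed directly from final_predict.
loss = nn.functional.mse_loss(final_predict, target)

weight_before = model.weight.detach().clone()

optimizer.zero_grad()
loss.backward()  # gradients flow through final_predict and predict to the parameters

param_has_grad = model.weight.grad is not None  # parameters receive .grad
predict_is_leaf = predict.is_leaf               # intermediate results are not leaves
predict_requires_grad = predict.requires_grad   # but they stay part of the graph

optimizer.step()  # updates only the parameters registered with the optimizer
weight_changed = not torch.equal(weight_before, model.weight.detach())

print(param_has_grad, predict_is_leaf, predict_requires_grad, weight_changed)
```

If you only want to inspect the gradient of an intermediate tensor, calling `predict.retain_grad()` before `loss.backward()` should keep it around. That is for debugging only and does not turn the tensor into something the optimizer updates.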
Obviously, the optimizer still cannot see final_predict. So how can I add final_predict to the existing optimizer, or convert it to an nn.Parameter? I need to use final_predict to calculate the loss in order to update my model.