Hi,
I am following the official tutorial “Learning PyTorch with Examples” and don’t understand the step from “Defining new autograd functions” to using the PyTorch nn module. From the description and the code, it seems like both versions do exactly the same thing. However, the learning rate has to be changed from 1e-6 to 1e-4 for the nn model to keep converging, so it appears that the two versions are not computing the same model and gradients.
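For reference, here is a condensed sketch of the two setups I am comparing. It is reconstructed from memory, not copied from the tutorial: the dimensions, the seed, and the use of `clamp(min=0)` in place of the tutorial's custom ReLU function are my assumptions.

```python
import torch

torch.manual_seed(0)

# Assumed sizes: batch, input, hidden, output
N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Version 1: hand-written weights, as in the autograd examples.
# clamp(min=0) stands in for the custom ReLU autograd function.
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)
y_pred_manual = x.mm(w1).clamp(min=0).mm(w2)
loss_manual = (y_pred_manual - y).pow(2).sum()

# Version 2: the same architecture built from nn modules.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction="sum")
loss_nn = loss_fn(model(x), y)

# Compare the losses before any training step is taken.
print("manual loss:", loss_manual.item())
print("nn loss:    ", loss_nn.item())
```

Printing the two initial losses is one way to check whether both versions really start from the same point.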
Can someone explain what the difference is?
Thanks for your help!