Linear layer gradients of PyTorch and LuaTorch differ?

cdluminate · July 28, 2017, 4:35pm

I tried to verify some equations with Torch and PyTorch, and accidentally found that the gradients differ.

iamalbert · July 29, 2017, 12:59am

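Because you compute and accumulate the gradients twice in Lua.
A backward call is already updateGradInput(input, gradOutput) + accGradParameters(input, gradOutput, scale), so any additional accGradParameters call adds the same parameter gradients a second time.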
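
To make the double accumulation concrete, here is a minimal pure-Python sketch. It is not the Lua Torch implementation: TinyLinear and its snake_case methods are hypothetical stand-ins that only mirror the updateGradInput / accGradParameters split described above, for a layer y = Wx + b.

```python
class TinyLinear:
    """Hypothetical stand-in for a linear layer y = W x + b (not the real nn.Linear)."""

    def __init__(self, weight, bias):
        self.weight = [row[:] for row in weight]
        self.bias = bias[:]
        # Gradient buffers start at zero and are only ever added to.
        self.grad_weight = [[0.0] * len(row) for row in weight]
        self.grad_bias = [0.0] * len(bias)

    def update_grad_input(self, x, grad_output):
        # Gradient with respect to the input: W^T * grad_output.
        n_in = len(self.weight[0])
        return [
            sum(self.weight[i][j] * g for i, g in enumerate(grad_output))
            for j in range(n_in)
        ]

    def acc_grad_parameters(self, x, grad_output, scale=1.0):
        # Accumulates (+=) into the buffers; it does not overwrite them.
        for i, g in enumerate(grad_output):
            for j, xj in enumerate(x):
                self.grad_weight[i][j] += scale * g * xj
            self.grad_bias[i] += scale * g

    def backward(self, x, grad_output, scale=1.0):
        # backward = update_grad_input + acc_grad_parameters.
        grad_input = self.update_grad_input(x, grad_output)
        self.acc_grad_parameters(x, grad_output, scale)
        return grad_input


def demo():
    x = [1.0, 2.0]
    grad_output = [1.0]
    layer = TinyLinear([[0.5, -0.5]], [0.0])

    layer.backward(x, grad_output)
    once = [row[:] for row in layer.grad_weight]

    # Redundant: backward already accumulated the parameter gradients.
    layer.acc_grad_parameters(x, grad_output)
    twice = [row[:] for row in layer.grad_weight]
    return once, twice


if __name__ == "__main__":
    once, twice = demo()
    print("after backward:", once)
    print("after extra acc_grad_parameters:", twice)
```

In this sketch the weight gradient after the extra call is exactly twice the value left by backward alone, which is the kind of factor-of-two mismatch you would then see against PyTorch.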

cdluminate · July 29, 2017, 4:54am

Got it. Thank you