Linear layer gradients of PyTorch and LuaTorch differ?

I tried to verify some equations with LuaTorch and PyTorch, and accidentally found that the gradients differ.

That's because you compute and accumulate the gradients twice in Lua. A `backward` call is `updateGradInput(input, gradOutput)` + `accGradParameters(input, gradOutput, scale)`, and `accGradParameters` *accumulates* into `gradWeight`/`gradBias` rather than overwriting them, so calling it again on top of `backward` doubles the parameter gradients.
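The same accumulation behavior exists in PyTorch: `.grad` is added to on every `backward()` call until it is zeroed. A minimal sketch (the layer sizes and loss are made up for illustration) showing how a second backward pass without zeroing yields exactly twice the gradient:

```python
import torch

torch.manual_seed(0)
lin = torch.nn.Linear(3, 2)
x = torch.randn(4, 3)

# First backward: .grad is populated from scratch (it starts out as None).
lin(x).sum().backward()
g1 = lin.weight.grad.clone()

# Second backward without zeroing: gradients are accumulated on top
# of the existing .grad, not overwritten.
lin(x).sum().backward()
g2 = lin.weight.grad.clone()

print(torch.allclose(g2, 2 * g1))  # doubled, just like the Lua case
```

Zeroing the gradients between calls (e.g. `lin.zero_grad()`) restores the expected single-pass values.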

Got it. Thank you :slight_smile: