I tried to verify some equations with Torch and PyTorch, and accidentally found that the gradients differ.
That's because you compute and accumulate the gradients twice in Lua.
A backward call is updateGradInput(input, gradOutput) followed by accGradParameters(input, gradOutput, scale), so backward already accumulates the parameter gradients on its own.
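To make the double counting concrete, here is a minimal pure-Python sketch. ToyLinear is a hypothetical scalar layer (y = w * x) that only borrows the Lua nn method names; it is not the real Torch implementation. It assumes the extra accumulation comes from calling accGradParameters in addition to backward, which may not be exactly what the original script did.

```python
class ToyLinear:
    """Hypothetical scalar layer y = w * x, mimicking the Lua nn method names."""

    def __init__(self, w):
        self.w = w
        self.grad_w = 0.0  # accumulated parameter gradient

    def updateGradInput(self, x, grad_output):
        # Gradient with respect to the input: dy/dx = w
        return grad_output * self.w

    def accGradParameters(self, x, grad_output, scale=1.0):
        # Adds to grad_w instead of overwriting it: dy/dw = x
        self.grad_w += scale * grad_output * x

    def backward(self, x, grad_output, scale=1.0):
        # backward is just the two steps above, run one after the other
        grad_input = self.updateGradInput(x, grad_output)
        self.accGradParameters(x, grad_output, scale)
        return grad_input


x, grad_output = 3.0, 1.0

# One backward call: the parameter gradient is accumulated once.
correct = ToyLinear(w=2.0)
correct.backward(x, grad_output)

# backward plus an explicit accGradParameters: accumulated twice.
doubled = ToyLinear(w=2.0)
doubled.backward(x, grad_output)
doubled.accGradParameters(x, grad_output)  # redundant second accumulation

print(correct.grad_w, doubled.grad_w)  # 3.0 6.0
```

In this sketch the second model ends up with exactly twice the parameter gradient, which is the kind of mismatch you would then see when comparing against a single backward pass in PyTorch.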
Got it. Thank you.