- If `loss = error.pow(2).sum() / 2.0`, then `dloss/derror = error`. If `loss = error.pow(2).mean()`, then `dloss/derror = 2 * error / batch_size` (your batch_size is 64 here). Because the numpy implementation uses `outerror = error`, you should use the first form of the loss.
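As a quick sanity check of the two gradient formulas above, here is a standalone numpy sketch (not from the original post; the `numerical_grad` helper and the batch size of 4 are mine, chosen for illustration):

```python
import numpy as np

np.random.seed(0)
batch_size = 4
error = np.random.randn(batch_size)

# Form 1: loss = sum(error**2) / 2   ->  dloss/derror = error
grad_sum_form = error

# Form 2: loss = mean(error**2)      ->  dloss/derror = 2 * error / batch_size
grad_mean_form = 2 * error / batch_size

def numerical_grad(loss_fn, x, eps=1e-6):
    """Central finite-difference gradient of a scalar loss_fn at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (loss_fn(x + d) - loss_fn(x - d)) / (2 * eps)
    return g

# Both analytic gradients match the finite-difference check.
assert np.allclose(numerical_grad(lambda e: (e ** 2).sum() / 2.0, error),
                   grad_sum_form)
assert np.allclose(numerical_grad(lambda e: (e ** 2).mean(), error),
                   grad_mean_form)
```

This makes the difference concrete: the two losses differ only by a constant factor `2 / batch_size` in their gradients, which is why the `sum()/2` form lines up with a manual backward pass that sets `outerror = error`.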
- I print `(error ** 2).mean().data[0]` because you are doing the same thing in numpy:

      loss = (error ** 2).mean()
      ...
      print(epoch, loss)
- They are the same, but the PyTorch version can backpropagate and compute the gradients automatically.
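To see autograd reproduce the manual gradient rule, here is a minimal sketch (my own example, not the original code; `pred` and `target` are placeholder tensors, with the batch size of 64 taken from the discussion above):

```python
import torch

torch.manual_seed(0)
batch_size = 64
pred = torch.randn(batch_size, requires_grad=True)
target = torch.randn(batch_size)

error = pred - target
loss = error.pow(2).sum() / 2.0  # the first form of the loss
loss.backward()

# Autograd recovers the manual rule dloss/dpred = error,
# so no hand-written backward pass (outerror = error) is needed.
assert torch.allclose(pred.grad, error.detach())
```

With the `mean()` form instead, `pred.grad` would come out as `2 * error / batch_size`, matching the scaling discussed above.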