Different behaviour in NumPy and PyTorch

  • If loss = error.pow(2).sum()/2.0, then dloss/derror = error.
    If loss = error.pow(2).mean(), then dloss/derror = 2*error/batch_size (your batch_size is 64 here).
    Because the NumPy implementation sets outerror = error, you should use the first form of the loss.

  • I print (error**2).mean().data[0] because that is what you are computing in NumPy:

loss = (error ** 2).mean()
...
...
print(epoch, loss)
  • They are the same value, but the PyTorch version can run backward and compute the gradients automatically.
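
The gradient difference above is easy to verify with autograd. A minimal sketch (tensor names and the batch size of 64 are just illustrative) that checks both loss forms:

```python
import torch

torch.manual_seed(0)
batch_size = 64
error = torch.randn(batch_size, requires_grad=True)

# Form 1: loss = error.pow(2).sum() / 2.0  ->  dloss/derror = error
loss1 = error.pow(2).sum() / 2.0
(g1,) = torch.autograd.grad(loss1, error)

# Form 2: loss = error.pow(2).mean()  ->  dloss/derror = 2 * error / batch_size
error2 = error.detach().clone().requires_grad_(True)
loss2 = error2.pow(2).mean()
(g2,) = torch.autograd.grad(loss2, error2)

print(torch.allclose(g1, error.detach()))                      # Form 1 gradient equals the error
print(torch.allclose(g2, 2 * error2.detach() / batch_size))    # Form 2 gradient is scaled by 2/N
```

So if the hand-written NumPy backward pass starts from outerror = error, only the sum()/2.0 loss is consistent with it; using mean() silently rescales every gradient by 2/batch_size.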