Why doesn't `loss.backward()` update the parameters' gradients?

`tensor.mean(1)` calculates the mean of the tensor along dim 1 (reducing that dimension). You could use this approach on an output activation.
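A minimal sketch of what `mean(1)` does, using a made-up activation tensor (the shape here is just for illustration):

```python
import torch

# Hypothetical output activation: batch of 4 samples, 3 values each
acts = torch.arange(12, dtype=torch.float32).reshape(4, 3)

# mean(1) averages along dim 1, collapsing each row to a single value
row_means = acts.mean(1)
print(row_means.shape)   # torch.Size([4])
print(row_means)         # tensor([ 1.,  4.,  7., 10.])
```

Note the result drops dim 1 entirely; pass `keepdim=True` if you need the reduced dimension kept as size 1.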