Confused about the MSE loss implementation

You don’t need the gradients in the usual use case, which is also why it wasn’t implemented in the backend in the first place.
However, there still seem to be some use cases, as shown e.g. here.
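
To make the distinction concrete, here is a minimal sketch, assuming "the gradients" above means the gradients with respect to the target tensor (the post doesn't spell that out). The `mse` helper is hypothetical and written with plain tensor ops instead of the built-in loss, so autograd tracks both arguments; the tensor values are made up for illustration.

```python
import torch

def mse(pred, target):
    # Hypothetical helper: MSE from plain tensor ops, so autograd
    # can propagate gradients to both pred and target.
    return ((pred - target) ** 2).mean()

# Usual case: the target is fixed data and does not require gradients.
pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
fixed_target = torch.tensor([0.0, 2.0, 5.0])
mse(pred, fixed_target).backward()
usual_pred_grad = pred.grad.clone()

# Less common case: the target is itself produced by something trainable,
# so it requires gradients as well.
pred2 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
learned_target = torch.tensor([0.0, 2.0, 5.0], requires_grad=True)
mse(pred2, learned_target).backward()

print(fixed_target.grad)    # no gradient is tracked for the fixed target
print(learned_target.grad)  # the negative of pred2.grad
```

In the second case the target gradient is just the negated input gradient, since the loss only depends on `pred - target`.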
