Why, after adding a second loss term and calling backward(), is the generated gradient the same as before?

I don't think histc is differentiable without an approximation, since the hard bin assignment produces zero gradients almost everywhere, so you could follow this topic for potential workarounds.
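As a rough sketch (not an official API, just one common approximation), you could replace the hard bin counts with a "soft" histogram built from Gaussian kernels, so each sample contributes smoothly to nearby bins and gradients can flow back to the input; the bin range, count, and sigma below are arbitrary example values:

```python
import torch

def soft_histogram(x, bins=10, vmin=0.0, vmax=1.0, sigma=0.05):
    # Evenly spaced bin centers over [vmin, vmax]
    centers = vmin + (vmax - vmin) * (torch.arange(bins, dtype=x.dtype) + 0.5) / bins
    # Gaussian kernel: each sample contributes softly to nearby bins,
    # which makes the result differentiable w.r.t. x (unlike torch.histc)
    diff = x.unsqueeze(-1) - centers          # shape: (N, bins)
    weights = torch.exp(-0.5 * (diff / sigma) ** 2)
    return weights.sum(dim=0)                  # shape: (bins,)

x = torch.rand(1000, requires_grad=True)
hist = soft_histogram(x)
loss = hist.var()          # any differentiable loss on the histogram
loss.backward()            # gradients now flow back to x
print(x.grad is not None)  # True
```

If the second loss is built on torch.histc instead, its contribution to the gradient will be zero, which would explain why the gradient looks the same after adding it.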