Thresholding the prediction image to binary before sending it to the loss function

If at all possible, could you try a straight-through estimator and see how it works in your case?
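For context, a hard threshold has zero gradient almost everywhere, so nothing useful flows back to the network. A straight-through estimator keeps the hard threshold in the forward pass but treats it as the identity in the backward pass. Here is a minimal, framework-free sketch of the idea. The function names, the 0.5 threshold, the MSE loss, and the sample values are all my own assumptions for illustration, not anything from your setup:

```python
def binarize(pred, threshold=0.5):
    """Forward pass: hard-threshold each prediction to 0.0 or 1.0."""
    return [1.0 if p > threshold else 0.0 for p in pred]


def mse_loss(binary, target):
    """Mean squared error between the binarized prediction and the target."""
    return sum((b - t) ** 2 for b, t in zip(binary, target)) / len(binary)


def ste_grad(pred, target, threshold=0.5):
    """Gradient of the loss w.r.t. pred under the straight-through estimator.

    The true derivative of the threshold is zero almost everywhere, so the
    backward pass pretends binarize() is the identity: d(binary)/d(pred) := 1.
    The result is simply d(loss)/d(binary), passed straight through.
    """
    binary = binarize(pred, threshold)
    n = len(pred)
    return [2.0 * (b - t) / n for b, t in zip(binary, target)]


# Hypothetical flattened "prediction image" and binary ground truth.
pred = [0.2, 0.7, 0.6, 0.4]
target = [0.0, 1.0, 0.0, 1.0]

binary = binarize(pred)         # what the loss actually sees
loss = mse_loss(binary, target)
grad = ste_grad(pred, target)   # nonzero only where the binarized pixel is wrong
```

In an autograd framework the same effect is usually obtained by writing the output as `soft + stop_gradient(hard - soft)`; in PyTorch that would be something like `soft + (hard - soft).detach()`, so the forward value is the hard mask while the gradient flows through `soft`.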