I assume you are using nn.BCELoss as your criterion.
If so, could you remove the last sigmoid and use nn.BCEWithLogitsLoss?
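In case it helps, here is a minimal sketch of what I mean. I don't know your actual architecture, so the model, layer sizes, and dummy data below are made up for illustration. The only relevant parts are that the last layer returns raw logits and that nn.BCEWithLogitsLoss applies the sigmoid internally, which is numerically more stable than a separate sigmoid followed by nn.BCELoss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical binary classifier (layer sizes are assumptions).
# Note that there is no nn.Sigmoid at the end: the model returns raw logits.
model = nn.Sequential(
    nn.Linear(10, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

# BCEWithLogitsLoss combines the sigmoid and the binary cross entropy.
criterion = nn.BCEWithLogitsLoss()

# Dummy data: targets are floats in {0., 1.} with the same shape as the output.
x = torch.randn(8, 10)
target = torch.randint(0, 2, (8, 1)).float()

logits = model(x)
loss = criterion(logits, target)
loss.backward()

# To get probabilities or hard predictions, apply the sigmoid explicitly.
probs = torch.sigmoid(logits.detach())
preds = (probs > 0.5).float()
```

During training you pass the logits directly to the criterion; the explicit torch.sigmoid is only needed when you want probabilities, e.g. for computing the accuracy.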
Let me know if this helps or if you are still seeing this behavior.