tzeny
(Andrei Tenescu)
June 1, 2019, 3:41pm
1
Hello, I’ve tried to write a custom loss function, a weighted binary cross entropy, as suggested by @miguelvr here:
Hey there,
I’m trying to increase the weight of an under-sampled class in a binary classification problem.
torch.nn.BCELoss has a weight attribute, but I don’t quite get it: this weight is a constructor parameter and is not updated depending on the batch of data being computed, so it doesn’t achieve what I need.
What is the correct way of simulating a class weight, similar to the way Keras does?
Cheers
I’ve tried to use the following function (wrapped inside a class):
def weighted_binary_cross_entropy(output, target, weights=None):
    if weights is not None:
        assert len(weights) == 2
        loss = weights[1] * (target * torch.log(output)) + \
               weights[0] * ((1 - target) * torch.log(1 - output))
    else:
        loss = target * torch.log(output) + (1 - target) * torch.log(1 - output)
    return torch.neg(torch.mean(loss))
The problem is that this sometimes outputs nan or -inf.
Tracing it back, I concluded that my model sometimes outputs large negative numbers, for example -136. This leads to:
torch.sigmoid(torch.tensor([-136.])) -> tensor([0.])
which leads to -inf in torch.log().
I am on PyTorch 1.0.1.
Is it OK to use torch.clamp(torch.sigmoid(torch.tensor([-136.])), min=1e-8, max=1 - 1e-8)?
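For anyone reproducing this, here is a minimal sketch of the underflow and of the clamp workaround. It assumes the default float32 dtype; the variable names are only illustrative. One caveat it shows: in float32, 1 - 1e-8 rounds to exactly 1.0, so the upper bound of that clamp does not protect torch.log(1 - output). A slightly larger epsilon such as 1e-7 (my choice, not from the thread) does.

```python
import torch

eps = 1e-8

# Lower side: sigmoid(-136) underflows to exactly 0 in float32.
low = torch.sigmoid(torch.tensor([-136.0]))
raw_log = torch.log(low)                                          # -inf
clamped_log = torch.log(torch.clamp(low, min=eps, max=1 - eps))   # finite

# Upper side: sigmoid(136) rounds to exactly 1 in float32.
high = torch.sigmoid(torch.tensor([136.0]))
high_clamped = torch.clamp(high, min=eps, max=1 - eps)
# 1 - 1e-8 is not representable in float32 and rounds to 1.0,
# so the clamp leaves the value at 1 and log(1 - 1) is still -inf.
still_inf = torch.log(1 - high_clamped)

# A wider margin (assumed value) survives float32 rounding.
wider = torch.clamp(high, min=1e-7, max=1 - 1e-7)
finite_log = torch.log(1 - wider)

print(raw_log, clamped_log, still_inf, finite_log)
```

Note also that torch.clamp passes no gradient to values that fall outside the clamped range, so saturated outputs stop contributing to training.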
miguelvr
(Miguel Varela Ramos)
June 1, 2019, 7:31pm
2
That function had that problem; in the end I used NLLLoss.
But it is indeed okay to clamp like that to fix the problem.
tzeny
(Andrei Tenescu)
June 4, 2019, 5:29pm
3
Thank you, I tried to use it like this, but it doesn’t work. The network does not train, even when the weights are (0.5, 0.5). Do you have any idea why?
KFrank
(K. Frank)
June 4, 2019, 7:54pm
4
Hello Andrei!
tzeny:
…
Tracing it back I reached the conclusion that sometimes my model outputs very small numbers, for example -136. This leads to:
torch.sigmoid(torch.tensor([-136.])) -> tensor([0.])
which leads to -inf in torch.log().
Is it ok to use torch.clamp(torch.sigmoid(torch.tensor([-136.])), min=1e-8, max=1 - 1e-8)?
The short (narrow) answer – use LogSigmoid.
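As a rough sketch of what that could look like (not necessarily what K. Frank had in mind): feed the raw logits to the loss instead of sigmoid outputs, and use the identities log(sigmoid(x)) = logsigmoid(x) and log(1 - sigmoid(x)) = logsigmoid(-x). The function name weighted_bce_from_logits and the example weights are made up for illustration.

```python
import torch
import torch.nn.functional as F


def weighted_bce_from_logits(logits, target, weights=None):
    # logits are the raw model outputs, *before* any sigmoid.
    log_p = F.logsigmoid(logits)        # log(sigmoid(x)), stable for very negative x
    log_not_p = F.logsigmoid(-logits)   # log(1 - sigmoid(x)), stable for very positive x
    if weights is not None:
        assert len(weights) == 2
        loss = weights[1] * target * log_p + weights[0] * (1 - target) * log_not_p
    else:
        loss = target * log_p + (1 - target) * log_not_p
    return torch.neg(torch.mean(loss))


# Extreme logits that break the sigmoid-then-log version.
logits = torch.tensor([-136.0, 136.0, 0.5])
target = torch.tensor([1.0, 0.0, 1.0])
loss = weighted_bce_from_logits(logits, target, weights=(0.3, 0.7))
print(loss)  # finite
```

Here weights[0] applies to class 0 and weights[1] to class 1, matching the original function.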
Please see my reply to your “Weighted Binary Crossentropy” post:
Hi Andrei!
First, to answer the question I think you’re asking:
You should be using (as in the comment in your code) BCEWithLogitsLoss.
BCEWithLogitsLoss supports sample weights, which you can use for class weights.
Let’s say you have class weight w_1 for class 1, and w_0 for class 0. Let w_n be the sample weight for sample n. Simply set w_n = w_1 if y_n = 1, and w_n = w_0 if y_n = 0. (This assumes that the y_n are either 0 or 1, as they should be if they are binary class labels.)
N…
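A small sketch of the recipe above, using the functional form so the sample weights can be rebuilt for each batch. The class weights and the toy batch are made-up values, and the model is assumed to output raw logits with no final sigmoid.

```python
import torch
import torch.nn.functional as F

w_0, w_1 = 0.3, 0.7  # hypothetical class weights for class 0 and class 1

# Toy batch: raw logits from the model and binary labels y_n.
logits = torch.tensor([-136.0, 2.0, -1.0, 0.5])
y = torch.tensor([0.0, 1.0, 1.0, 0.0])

# w_n = w_1 where y_n == 1, and w_n = w_0 where y_n == 0.
sample_weights = y * w_1 + (1 - y) * w_0

# The weight tensor is passed per call, so it follows the labels of each batch.
loss = F.binary_cross_entropy_with_logits(logits, y, weight=sample_weights)
print(sample_weights, loss)
```

Because the sigmoid and the log are fused inside the loss, a logit like -136 does not produce -inf here.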
Best.
K. Frank
tzeny
(Andrei Tenescu)
June 12, 2019, 11:05am
5
Thank you for your replies, I will mark this as solved.
One thing to note for anyone who might be thinking about implementing custom loss functions in the future: be careful about the numerical stability of your implementation.