Equivalent of TensorFlow's Sigmoid Cross Entropy With Logits in PyTorch

varunagrawal · April 18, 2017, 9:46pm

I am trying to find the equivalent of the sigmoid_cross_entropy_with_logits loss in PyTorch, but the closest thing I can find is MultiLabelSoftMarginLoss.

Can someone direct me to the equivalent loss? If it doesn’t exist, that information would be useful as well so I can submit a suitable PR.

Chun_Li · April 19, 2017, 1:09am

I think it’s class torch.nn.CrossEntropyLoss.

varunagrawal · April 19, 2017, 7:19pm

That doesn’t seem to be the case. As per the docs, CrossEntropyLoss only takes a single class index as the target for each sample. Please correct me if I am wrong.
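To make the distinction concrete, here is a minimal pure-Python sketch of softmax cross entropy against a single class index, which is the kind of target the post above describes. The function name and sample values are illustrative assumptions, not PyTorch API.

```python
import math

def softmax_cross_entropy(logits, class_index):
    """Cross entropy for one sample whose target is a single class index."""
    # Subtract the max before exponentiating so the log-sum-exp cannot overflow.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # Equivalent to -log(softmax(logits)[class_index]).
    return log_sum_exp - logits[class_index]

logits = [2.0, -1.0, 0.5]  # hypothetical scores for three classes
loss = softmax_cross_entropy(logits, 0)
```

Because the softmax couples all classes into one distribution and the target names exactly one of them, this form cannot mark several classes as present at once, which is what the sigmoid variant allows.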

jekbradbury · April 20, 2017, 2:18am

You’re looking for KLDivLoss, which takes log-probabilities as its input and probabilities as its target. If you have logits, you will need to apply F.log_softmax first.

varunagrawal · April 20, 2017, 3:16am

The objective function formulation is different from the Cross Entropy formulation given in TensorFlow. I don’t think this is the correct loss.

AjayTalati · April 20, 2017, 3:21am

Maybe the answer to this Stack Overflow question is helpful.

In mathematical terms, what exactly do you want to do? That might be easier for people to help you with than trying to port over a TF function.

If you want to do multi-label classification, so do I, but I haven’t figured out yet how to do it in PyTorch. So I’m also interested in your question.

Best,

Ajay

varunagrawal · April 21, 2017, 4:26pm

From the implementation details, it would seem that the MultiLabelSoftMarginLoss is indeed the equivalent of the sigmoid_cross_entropy_with_logits loss. Closing this!
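One caveat on "equivalent" is the reduction. TensorFlow's sigmoid_cross_entropy_with_logits returns one loss value per element, while MultiLabelSoftMarginLoss, per its documented formula, averages those values over the classes of each sample. The pure-Python sketch below illustrates that relationship; the function names and sample values are hypothetical and it does not call either library.

```python
import math

def sigmoid_cross_entropy(logit, label):
    """Elementwise loss: -[z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x))]."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def multilabel_soft_margin(logits, labels):
    """Mean over classes of the elementwise loss, for one sample."""
    losses = [sigmoid_cross_entropy(x, z) for x, z in zip(logits, labels)]
    return sum(losses) / len(losses)

logits = [1.5, -0.3, 0.0]  # hypothetical raw scores for three labels
labels = [1.0, 0.0, 1.0]   # multi-hot target: labels 0 and 2 are present

elementwise = [sigmoid_cross_entropy(x, z) for x, z in zip(logits, labels)]
per_sample = multilabel_soft_margin(logits, labels)
```

Under these assumptions the two differ only by the averaging step, so matching values between the frameworks means applying the same reduction on both sides.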

AjayTalati · April 21, 2017, 7:14pm

Hi @varunagrawal,

did you get MultiLabelSoftMarginLoss to work on a multi-label classification test problem?

I tried applying it to a multi-label MNIST test (each image is labeled by its original class and by class-1), but it didn’t work.

Misha_E · April 22, 2017, 4:22am

I believe you are talking about BCELoss
http://pytorch.org/docs/nn.html#torch.nn.BCELoss
or http://pytorch.org/docs/nn.html#binary-cross-entropy
but you’d have to apply sigmoid activation yourself before that

mratsim · April 28, 2017, 11:59am

@AjayTalati I managed to use BCELoss, binary_cross_entropy and MultiLabelSoftMarginLoss on a multi-label problem.

Here is the basic code

import torch.nn.functional as F

# model, train_loader and optimizer are assumed to be defined elsewhere.
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # data, target = data.cuda(non_blocking=True), target.cuda(non_blocking=True) # On GPU
        optimizer.zero_grad()
        # The model must end in a sigmoid so that output lies in [0, 1].
        output = model(data)
        # binary_cross_entropy expects a float target (no-op if already float).
        loss = F.binary_cross_entropy(output, target.float())
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

And the source is here.

For BCELoss you can use criterion = nn.BCELoss() and then loss = criterion(output, target), but as @Misha_E said, the network’s final layer must apply a sigmoid so that the output is a probability.

AjayTalati · April 29, 2017, 1:46am

Hi Mamy, @mratsim

thanks a lot for posting your code

All the best,

Aj

mattrobin · August 1, 2018, 9:18pm

Just for anyone else who finds this from Google (as I did): BCEWithLogitsLoss now does the equivalent of sigmoid_cross_entropy_with_logits from TensorFlow. It combines a sigmoid and binary cross entropy in a single, numerically stable operation.
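For readers wondering what "numerically stable" buys here, the sketch below compares the textbook formula with the rearranged form max(x, 0) - x*z + log(1 + exp(-|x|)) given in the TensorFlow documentation for sigmoid_cross_entropy_with_logits. It is plain Python with hypothetical function names, not the actual implementation in either library.

```python
import math

def naive_sigmoid_cross_entropy(x, z):
    """Textbook form: apply the sigmoid, then take logs."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -(z * math.log(p) + (1.0 - z) * math.log(1.0 - p))

def stable_sigmoid_cross_entropy(x, z):
    """Rearranged form that never takes the log of a rounded 0."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))

# A confident wrong prediction: large positive logit, label 0.
x, z = 40.0, 0.0
try:
    naive_loss = naive_sigmoid_cross_entropy(x, z)
except ValueError:
    # sigmoid(40) rounds to exactly 1.0 in float64, so log(1 - p) is log(0).
    naive_loss = None
stable_loss = stable_sigmoid_cross_entropy(x, z)
```

For moderate logits the two functions agree; the rearranged form matters only when the sigmoid saturates, which is exactly where applying a sigmoid and then a plain binary cross entropy can break down.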

moscow25 · February 24, 2020, 11:18pm

Worth noting that KLDivLoss still needs to be run with reduction='batchmean' to get the “soft cross_entropy” behavior that people are asking for. Surprised this isn’t more clearly documented…
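A small pure-Python sketch of why the reduction matters, assuming the documented behavior that 'batchmean' divides the summed divergence by the batch size while 'mean' divides by the total number of elements. All names and values below are illustrative and nothing here calls PyTorch.

```python
import math

def log_softmax(logits):
    """Log-probabilities from raw scores, with the max subtracted for stability."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def kl_sum(log_q, p):
    """sum_i p_i * (log p_i - log q_i), treating 0 * log 0 as 0."""
    return sum(pi * (math.log(pi) - lq) for lq, pi in zip(log_q, p) if pi > 0)

def soft_cross_entropy(log_q, p):
    """-sum_i p_i * log q_i for a soft (probability) target."""
    return -sum(pi * lq for lq, pi in zip(log_q, p))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]   # hypothetical model outputs
batch_targets = [[0.7, 0.2, 0.1], [0.0, 0.5, 0.5]]   # hypothetical soft labels

log_q = [log_softmax(row) for row in batch_logits]
n, c = len(batch_logits), len(batch_logits[0])

total_kl = sum(kl_sum(lq, p) for lq, p in zip(log_q, batch_targets))
kl_batchmean = total_kl / n        # divide by batch size
kl_mean = total_kl / (n * c)       # divide by every element

mean_ce = sum(soft_cross_entropy(lq, p) for lq, p in zip(log_q, batch_targets)) / n
mean_entropy = sum(entropy(p) for p in batch_targets) / n
```

In this sketch kl_batchmean equals mean_ce minus mean_entropy. The entropy term does not depend on the model, so the gradients match those of soft cross entropy, whereas kl_mean is smaller by a factor of the number of classes.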