What's the difference between Sigmoid+BCELoss and BCEWithLogitsLoss?

When I use nn.Sigmoid() + nn.BCELoss(size_average=False), everything works.

But when I use nn.BCEWithLogitsLoss(size_average=False), I get the following error:

    Traceback (most recent call last):
      File "C:\Program Files\JetBrains\PyCharm 2017.3.3\helpers\pydev\pydevd.py", line 1668, in <module>
      File "C:\Program Files\JetBrains\PyCharm 2017.3.3\helpers\pydev\pydevd.py", line 1662, in main
        globals = debugger.run(setup['file'], None, None, is_module)
      File "C:\Program Files\JetBrains\PyCharm 2017.3.3\helpers\pydev\pydevd.py", line 1072, in run
        pydev_imports.execfile(file, globals, locals) # execute the script
      File "C:\Program Files\JetBrains\PyCharm 2017.3.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "E:/pytorch_project/pytorch-cfdnet/main_cfd.py", line 349, in <module>
      File "E:/pytorch_project/pytorch-cfdnet/main_cfd.py", line 305, in main
        train_error = train.forward()
      File "E:/pytorch_project/pytorch-cfdnet\train.py", line 72, in forward
      File "C:\Users\18mat\Anaconda3\lib\site-packages\torch\autograd\variable.py", line 167, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
      File "C:\Users\18mat\Anaconda3\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
        variables, grad_variables, retain_graph)
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Process finished with exit code 1

My training code:

    for batch_idx, (x, yt) in enumerate(self.data_loader):
        x = x.cuda(async=True)
        yt = yt.cuda(async=True)
        input_var = Variable(x)
        target_var = Variable(yt)
        y = self.model(input_var)
        loss = self.criterion(y, target_var)

        if self.vis:
                    X=torch.ones((1, 1)).cpu() * self.iterations,
                    # Y=new_loss.cpu(),

        # measure accuracy and record loss
        total_loss += new_loss.cuda()
        # compute gradient and do SGD step


        if batch_idx % 10 == 0:
            if (batch_idx*len(x) + 10*len(x)) <= len(self.data_loader.dataset):
                pbar.update(10 * len(x))
            else:
                pbar.update(len(self.data_loader.dataset) - batch_idx*len(x))

        self.iterations += 1


Some code seems to be missing from your example. For instance, the variable new_loss is never defined in your snippet.

For a start, you can try removing all code that is not directly related to the forward/backward passes. That will definitely make debugging easier.
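
A stripped-down step along these lines might look like the sketch below. The model, tensor shapes, and optimizer are placeholders invented for illustration (the original model is not shown), and the code uses the current `reduction="sum"` argument in place of the older `size_average=False`. If a bare version like this runs but the full loop does not, the in-place modification is probably in the code that was removed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for self.model; it outputs raw logits (no final Sigmoid).
model = nn.Linear(4, 1)
criterion = nn.BCEWithLogitsLoss(reduction="sum")
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Made-up batch: 8 samples, 4 features, binary targets stored as floats.
x = torch.randn(8, 4)
yt = torch.randint(0, 2, (8, 1)).float()

total_loss = 0.0
for step in range(3):
    optimizer.zero_grad()
    y = model(x)             # forward pass
    loss = criterion(y, yt)  # loss computed directly on the logits
    loss.backward()          # backward pass
    optimizer.step()
    # .item() copies the value out as a Python float, so the running total
    # holds no reference to the autograd graph.
    total_loss += loss.item()
```

Accumulating the loss as a plain float is a design choice in this sketch, not something taken from the original code.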

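Applying a sigmoid layer and then BCELoss as two separate steps is numerically unstable, because the sigmoid can saturate to exactly 0 or 1 before the logarithm is taken. (by yzgao)
So BCEWithLogitsLoss was added to address this: it fuses the sigmoid and the binary cross-entropy into a single operation that uses the log-sum-exp trick, and it expects raw logits rather than probabilities. (by yzgao)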
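
To make the comparison concrete, here is a small sketch with made-up values. It checks that the two formulations agree on moderate logits and then shows where they diverge on an extreme one. It uses the current `reduction="sum"` argument rather than `size_average=False`.

```python
import torch
import torch.nn as nn

# Moderate logits: both formulations should agree closely.
logits = torch.tensor([-2.0, 0.5, 3.0])
targets = torch.tensor([0.0, 1.0, 1.0])

two_step = nn.BCELoss(reduction="sum")(torch.sigmoid(logits), targets)
fused = nn.BCEWithLogitsLoss(reduction="sum")(logits, targets)
agree = torch.allclose(two_step, fused, atol=1e-6)

# Extreme logit with target 1: sigmoid(-200) underflows to 0 in float32,
# so the two-step version takes the log of 0 and has to clamp the result.
big_logit = torch.tensor([-200.0])
target_one = torch.tensor([1.0])

two_step_big = nn.BCELoss(reduction="sum")(torch.sigmoid(big_logit), target_one)
fused_big = nn.BCEWithLogitsLoss(reduction="sum")(big_logit, target_one)

print("moderate logits agree:", agree)
print("two-step loss on extreme logit:", two_step_big.item())
print("fused loss on extreme logit:", fused_big.item())
```

The fused loss recovers the exact value of about 200 for the extreme logit, while the two-step loss is capped at a smaller number (current PyTorch documents clamping the log output at -100). Once the sigmoid has saturated, its gradient with respect to the logit is also zero, so the two-step version stops learning from exactly the examples it gets most wrong.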