Strange shape mismatch issue with torch.BCEWithLogitsLoss()

I am writing a relativistic GAN. I have the following generator update function

def optimize_generator(self, real_batch, fake_batch):
    """
    Relativistic generator update step.
    """
    valid = torch.ones(real_batch.size(0), device=self.device)
    fake = torch.zeros(fake_batch.size(0), device=self.device)

    y_pred = self.netD(real_batch)
    y_pred_fake = self.netD(fake_batch.detach())

    # debug prints: the two input shapes, the two target shapes, and the equality checks
    print("(y_pred - torch.mean(y_pred_fake)).squeeze().size() : ", (y_pred - torch.mean(y_pred_fake)).squeeze().size())
    print("(y_pred_fake - torch.mean(y_pred)).squeeze().size() : ", (y_pred_fake - torch.mean(y_pred)).squeeze().size())
    print("fake.size() : ", fake.size())
    print("valid.size() : ", valid.size())

    print((y_pred - torch.mean(y_pred_fake)).squeeze().size() == fake.size())
    print((y_pred_fake - torch.mean(y_pred)).squeeze().size() == valid.size())

    real_v_fake = self.loss((y_pred - torch.mean(y_pred_fake)).squeeze(), fake)
    fake_v_real = self.loss((y_pred_fake - torch.mean(y_pred)).squeeze(), valid)

    g_loss = (real_v_fake + fake_v_real) / 2

    # gradient update pass
    self.gen_optim.zero_grad()
    g_loss.backward()
    self.gen_optim.step()

    return g_loss

I have a very strange issue with the dimensions of the input and the target, shown in the following console output:


As you can see, the size of the input and the target is the same, torch.Size([64]), and a boolean equality check even succeeds, but BCEWithLogitsLoss keeps throwing this error. I had a look at the torch/nn/functional.py code, and I seem to be failing this exact check (line 2433):

if not (target.size() == input.size()):
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

I am at a bit of a loss, because I don’t see how that’s different from the boolean equality I tested before the call to BCEWithLogitsLoss. Any help would be much appreciated!
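For reference, the check in functional.py only trips when the two sizes really do differ; a minimal standalone illustration of what it catches (dummy tensors only, not the real discriminator outputs) would be a (64,) input against a (64, 1) target:

    import torch
    import torch.nn.functional as F

    logits = torch.randn(64)      # input of shape (64,)
    target = torch.zeros(64, 1)   # target of shape (64, 1)

    # raises: ValueError: Target size (torch.Size([64, 1])) must be the same
    # as input size (torch.Size([64]))
    F.binary_cross_entropy_with_logits(logits, target)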

The size check still fails even after a reshape:

    valid = torch.ones(real_batch.size(0), device=self.device)
    fake = torch.zeros(fake_batch.size(0), device=self.device)

    y_pred = self.netD(real_batch)
    y_pred_fake = self.netD(fake_batch.detach())

    real_v_fake = (y_pred - torch.mean(y_pred_fake))
    fake_v_real = (y_pred_fake - torch.mean(y_pred))

    real_v_fake = torch.reshape(real_v_fake, (real_batch.size(0), 1))
    fake_v_real = torch.reshape(fake_v_real, (real_batch.size(0), 1))
    valid = torch.reshape(valid, (real_batch.size(0), 1))
    fake = torch.reshape(fake, (real_batch.size(0), 1))

    real_v_fake = self.loss(real_v_fake, fake)
    fake_v_real = self.loss(fake_v_real, valid)

meaning torch.nn.BCEWithLogitsLoss still complains about different shapes even after PyTorch successfully reshapes everything to the exact same shape, (real_batch.size(0), 1).
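As a sanity check, with random tensors standing in for the discriminator outputs (dummy data, not the real model), the loss itself accepts two (64, 1) tensors without complaint, which suggests the failing call must be coming from somewhere else:

    import torch

    loss = torch.nn.BCEWithLogitsLoss()

    # dummy stand-ins for the reshaped logits and targets
    real_v_fake = torch.randn(64, 1)
    fake = torch.zeros(64, 1)

    print(real_v_fake.size() == fake.size())   # True
    print(loss(real_v_fake, fake))             # computes fine, no ValueError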

Could you post an executable code snippet to reproduce this error, please?

I also have a strange issue with BCEWithLogitsLoss, simply when computing the loss through

loss = criterion(outputs.squeeze(), labels)

The computation works just fine 99.9% of the time, but when I use this on my dataset with 7937 rows and a batch size of 256, I get the error

File "/nics/b/home/doursand/anaconda3/envs/limsnet/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([1])) must be the same as input size (torch.Size([]))

However, if I change the batch size value to 255 instead of 256, then everything works like a charm!

Now if you do the calculation,

7937 rows / 256 = 31.0039062

and

7937 rows / 255 = 31.1254902

Question: is it a rounding issue that makes the iteration with 256 images per batch fail (i.e. 31.0039062 being truncated to 31)?

I’m an idiot, the error is not happening in optimize_generator but in optimize_discriminator, which I called earlier. The shape equality check succeeded … just not in the function that was actually raising the error.

This is probably related to the last batch being incomplete: 7937 % 256 = 1, so the final batch holds a single sample, and outputs.squeeze() turns its output into a 0-dim tensor while labels keeps the shape [1], which is exactly the mismatch in the traceback. You can change the batch size as you did, or you can pass drop_last=True to the DataLoader to avoid this issue.
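For example, a rough sketch of what presumably happens on that last batch (dummy tensors and a dummy dataset standing in for the real model and data):

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    print(7937 % 256)                 # 1 -> the final batch holds a single sample

    outputs = torch.randn(1, 1)       # model output for that one-sample batch
    labels = torch.zeros(1)
    print(outputs.squeeze().shape)    # torch.Size([])  -> 0-dim after squeeze()
    print(labels.shape)               # torch.Size([1]) -> the reported mismatch

    # drop_last=True simply discards the incomplete final batch
    dataset = TensorDataset(torch.randn(7937, 10), torch.zeros(7937))
    loader = DataLoader(dataset, batch_size=256, shuffle=True, drop_last=True)
    print(len(loader))                # 31 full batches, no one-sample batch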
