Binary Labels and Loss Function

Hi there!!
I am struggling to use a CNN with BCELoss to classify images as 0 or 1. The loss function keeps complaining about the type of the labels, and I don't understand what I should change. I have a Dataset class that opens an image and, depending on whether the image has a certain feature, sets a label variable to int 1 or int 0. I then convert the label to a tensor using torch.Tensor(), and the class returns a sample with the image and the label. But when I try to train, the loss function always raises errors and I really don't understand why it doesn't work.

Can someone help me? Thank you in advance.

Could you post the error message here, so that we could have a look? :slight_smile:

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [0] at entry 0 and [1] at entry 1
import os
import cv2
import numpy as np
import torch
from PIL import Image
from skimage.transform import resize
from torchvision import transforms


class Dataset(object):
    #----Initializing the class----
    def __init__(self, train_names, masks_dir, img_dir, train, im_size, data_augm=None):
        self.train_names = train_names
        self.masks_dir = masks_dir
        self.img_dir = img_dir
        self.train = train
        self.noise = torch.zeros(299, 299)  # Variable is deprecated; a plain tensor works
        self.data_augm = data_augm
        self.im_size = im_size

    def __len__(self):
        return len(self.train_names)

    #----To get a dataset item----
    def __getitem__(self, idx):
        image = Image.open(self.img_dir + self.train_names[idx])

        if self.train:
            try:
                # a mask file exists only for images that contain the feature
                masks = Image.open(self.masks_dir + os.path.splitext(self.train_names[idx])[0])
                target = 1
            except (FileNotFoundError, OSError):
                target = 0
        else:
            target = 0

        image = np.array(image)
        if image.shape[2] == 4:
            # convert the image from RGBA to RGB
            image = cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
        image = resize(image, (self.im_size, self.im_size), anti_aliasing=True)
        image = transforms.ToTensor()(image)
        image = image.float()
        target = torch.Tensor(target)

        sample = {'image': image, 'target': target}

        return sample

And here is my CNN:

import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self, in_channels=3, num_classes=1):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=8, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(16 * 75 * 75, num_classes)  # a 300x300 input is pooled twice down to 75x75
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = x.reshape(x.shape[0], -1)  # flatten for the linear layer
        x = self.fc1(x)
        x = self.sigmoid(x)
        return x

Based on the error message it seems your data loading is creating an empty tensor and trying to stack it into a batch with the other tensors.
Could you use batch_size=1, iterate the DataLoader in isolation without any training and check the shapes for the data and target tensors?
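A minimal check could look like this (a sketch, assuming your Dataset instance is called dataset; num_workers=0 also makes errors surface directly instead of inside a worker process):

from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=1, shuffle=False, num_workers=0)
for i, sample in enumerate(loader):
    # print every batch's shapes to spot an offending sample
    print('image shape itr {}: {}'.format(i, sample['image'].shape))
    print('target shape itr {}: {}'.format(i, sample['target'].shape))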

image shape: torch.Size([1, 3, 300, 300])
target shape: torch.Size([1])

These are the shapes with batch_size = 1.

Are these shapes returned for all batches?
The error suggests that the batch size was larger than 1:

RuntimeError: stack expects each tensor to be equal size, but got [0] at entry 0 and [1] at entry 1

Before, the batch size was bigger than 1, yes!
I then changed it to batch_size = 1 as you suggested!

For example, with batch_size = 2, the shapes over several iterations of the DataLoader were as follows:

image shape:  torch.Size([2, 3, 300, 300])
target shape:  torch.Size([2])

Could you check the shapes for all batches returned by the DataLoader, as a certain batch might be creating the issue?

I just checked! Only the last batch has different shapes:

image shape itr 81:  torch.Size([1, 3, 300, 300])
target shape itr 81:  torch.Size([1])

What should I do? Should I force the training not to use that specific batch? Why does this happen? :scream:

You could try to drop the last (smaller) batch via drop_last=True in the DataLoader.
However, I’m not sure if this would solve the issue. Let me know if you are still running into the error.
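For example (a sketch, assuming the same dataset object as before):

from torch.utils.data import DataLoader

# drop_last=True discards the final incomplete batch
loader = DataLoader(dataset, batch_size=2, shuffle=True, drop_last=True)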

That error is gone!! Thank you so much!! However, now I have this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-58-3b352c3bae56> in <module>()
     38 
     39 
---> 40           loss = criterion(output, target) #+ 0.4*criterion2(output, mask)  + 0.2*criterion3(output, mask)
     41 
     42           if phase=='train':

2 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in binary_cross_entropy(input, target, weight, size_average, reduce, reduction)
   2482 
   2483     return torch._C._nn.binary_cross_entropy(
-> 2484         input, target, weight, reduction_enum)
   2485 
   2486 

RuntimeError: Found dtype Long but expected Float

I tried to convert the target to float but it didn’t work :confused:

Basically, now I am getting all kinds of errors regarding the data and target types. Maybe it has something to do with the way I convert to tensor; could someone check it out?

if self.train:
    try:
        # change the end of the filename depending on whether we want "beard", "hair" or "_eyelid"
        masks = Image.open(self.masks_dir + os.path.splitext(self.train_names[idx])[0] + 'beard.jpg')
        target = 1
    except (FileNotFoundError, OSError):
        target = 0
else:
    target = 0

image = np.array(image)
if image.shape[2] == 4:
    # convert the image from RGBA to RGB
    image = cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
image = resize(image, (self.im_size, self.im_size), anti_aliasing=True)
image = transforms.ToTensor()(image)
image = image.float()
#target = torch.Tensor(target)

target = transforms.ToTensor()(target).float()

sample = {'image': image, 'target': target}

When I try to iterate the DataLoader, this happens:


TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "<ipython-input-83-577e6d24b257>", line 50, in __getitem__
    target = transforms.ToTensor()(target).float()
  File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py", line 92, in __call__
    return F.to_tensor(pic)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py", line 46, in to_tensor
    raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
TypeError: pic should be PIL Image or ndarray. Got <class 'int'>

Assuming you are using nn.BCELoss, the model output and target should have the same shape ([batch_size, 1] for a binary classification use case) and both should be FloatTensors.
Note that removing the last sigmoid and using nn.BCEWithLogitsLoss would give you better numerical stability.
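For reference, torch.Tensor(0) interprets the int argument as a size and returns an empty tensor of shape [0], while torch.Tensor(1) returns an uninitialized tensor of shape [1], which explains the original stacking error; transforms.ToTensor() only accepts PIL Images or ndarrays, hence the later TypeError. A minimal sketch of the fix, keeping the names from the code above:

# inside __getitem__: build a FloatTensor of shape [1],
# so the DataLoader batches it to [batch_size, 1]
target = torch.tensor([target], dtype=torch.float32)

# in the training loop, with the final sigmoid removed from the model
# (assumes `batch` is the dict returned by the DataLoader)
criterion = nn.BCEWithLogitsLoss()
output = model(batch['image'])             # [batch_size, 1], raw logits
loss = criterion(output, batch['target'])  # [batch_size, 1], FloatTensor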