Size Mismatch Issue in BCE

Hi, I have tried most of the solutions in this forum and seem to be getting nowhere. This is the model I have:

import h5py
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils import data


def load_data(train_file, test_file):
    # Load the training data
    train_dataset = h5py.File(train_file, "r")
    
    # Separate features(x) and labels(y) for training set
    train_set_x_orig = np.array(train_dataset["train_set_x"])
    train_set_y_orig = np.array(train_dataset["train_set_y"])

    # Load the test data
    test_dataset = h5py.File(test_file, "r")

    # Separate features (x) and labels (y) for the test set
    test_set_x_orig = np.array(test_dataset["test_set_x"])
    test_set_y_orig = np.array(test_dataset["test_set_y"])

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes


train_file = "data/train_catvnoncat.h5"
test_file = "data/test_catvnoncat.h5"
train_x_orig, train_y, test_x_orig, test_y, classes = load_data(train_file, test_file)

# Explore your dataset 
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]
m_test = test_x_orig.shape[0]


train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T   # the "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T

train_x = np.transpose(train_x_flatten / 255.)
test_x = np.transpose(test_x_flatten / 255.)

train_x = torch.Tensor(train_x)
train_y = torch.Tensor(train_y.T)
my_dataset = data.TensorDataset(train_x, train_y)
trainloader = data.DataLoader(my_dataset, batch_size=2, shuffle=True)

I had to transpose my train_y because its shape was (1, m), while the DataLoader (via TensorDataset) requires it to be (m, 1): the first dimension has to match the number of samples in train_x, as in the sketch below.
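A minimal illustration of that constraint, with made-up shapes:

    import torch
    from torch.utils import data

    x = torch.randn(209, 12288)    # (m, features): one row per sample
    y = torch.zeros(209, 1)        # (m, 1): first dimension must match x's
    ds = data.TensorDataset(x, y)  # OK
    # data.TensorDataset(x, y.T)   # would raise: size mismatch between tensors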

class Net(nn.Module):
    def __init__(self, n_x, n_h, n_y):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(n_x, n_h)
        self.fc2 = nn.Linear(n_h, n_y)
        self.dropout = nn.Dropout(p=0.5)
        self.sigmoid = nn.Sigmoid()
    
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        x = torch.transpose(self.sigmoid(x), 0, 1)

        return x




n_x = 12288     # num_px * num_px * 3
n_h = 7
n_y = 1
lr = 0.001
net = Net(n_x, n_h, n_y)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=lr, momentum=0.9)


for epoch in range(100):   # loop over the dataset multiple times
    count = 0
    train_loss = []
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        print(outputs)
        print(labels)
        loss = criterion(outputs, labels)
        train_loss.append(loss.item())
        loss.backward()
        optimizer.step()
        preds = outputs > 0.5
        nb_correct = (preds == labels).sum()
        count += nb_correct.item()
    if epoch % 100 == 1:
        print("Epoch: {}, Training loss: {}, Accuracy %: {}".format(
            epoch, np.mean(train_loss), (count / train_x.shape[0]) * 100))

ValueError                                Traceback (most recent call last)
<ipython-input-161-9aea63858696> in <module>()
     12 
     13     print(labels)
---> 14     loss = criterion(outputs,labels)
     15     train_loss.append(loss.item())
     16     loss.backward()

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   1834     if input.size(0) != target.size(0):
   1835         raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 1836                          .format(input.size(0), target.size(0)))
   1837     if dim == 2:
   1838         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (1) to match target batch_size (2).

This is the error I am getting. I tried using squeeze, but it gave another error. Any suggestions are appreciated.

I have also used the squeeze trick many people here have endorsed, without any luck :frowning:

Hello Ali!

There are several things that are inconsistent in your post. Let
me propose a question that might be useful for you to ask, and
then try to answer it.

“How do I use BCELoss to perform a binary classification task?”

You say “BCE” in the title of your post, but use
criterion = nn.CrossEntropyLoss() in your code.
Let’s go with BCELoss.
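As a baseline, here is a minimal sketch of the inputs BCELoss expects (shapes and values are purely illustrative):

    import torch
    import torch.nn as nn

    criterion = nn.BCELoss()

    # outputs: predicted probabilities in [0, 1], shape (batch_size, 1)
    outputs = torch.sigmoid(torch.randn(4, 1))
    # labels: floats 0.0 or 1.0 with the same shape as outputs
    labels = torch.tensor([[0.], [1.], [1.], [0.]])

    loss = criterion(outputs, labels)   # works: both tensors are (4, 1)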

Further down, you have n_y = 1, so this becomes

        self.fc2 = nn.Linear(n_h, 1)

This is good. For a binary classification problem, you want your
network to output a single value (per sample in your batch).

The Sigmoid, on its own, is fine too: your network will output
the predicted probability of your sample being in class “1”.

(But, as an aside, you will be better off, for reasons of numerical
stability, getting rid of the Sigmoid and using BCEWithLogitsLoss
instead of BCELoss.)
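A quick sketch of that equivalence, with made-up values:

    import torch
    import torch.nn as nn

    logits = torch.randn(4, 1)   # raw scores, no Sigmoid applied
    labels = torch.ones(4, 1)

    loss_a = nn.BCEWithLogitsLoss()(logits, labels)        # numerically stable
    loss_b = nn.BCELoss()(torch.sigmoid(logits), labels)   # same math, less stable
    print(torch.allclose(loss_a, loss_b))                  # True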

The torch.transpose() in your forward(), however, is asking for trouble. Get rid of it.
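Without the transpose (and without the Sigmoid, if you switch to BCEWithLogitsLoss), your forward() could be as simple as this untested sketch:

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        return self.fc2(x)   # shape (batch_size, 1); batch dimension stays first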

For a binary classification problem, you should use BCELoss,
not CrossEntropyLoss. (Or, better, BCEWithLogitsLoss.)

Passing the labels from your DataLoader straight to the criterion
makes sense provided they are binary class labels, that is, a
single value of 0 or 1 (per sample in your batch).

What values do your labels take on?

As for the ValueError itself: torch.transpose() swaps your batch
dimension with another dimension, so it is to be expected that
your batch sizes no longer match.
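You can reproduce the mismatch in isolation: with batch_size = 2 and n_y = 1, the transpose turns a (2, 1) output into (1, 2), which is exactly what the ValueError reports:

    import torch

    out = torch.randn(2, 1)   # (batch_size, n_y) before the transpose
    print(out.t().shape)      # torch.Size([1, 2]): batch size now looks like 1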

What do you expect your batch_size to be?

In this code:

    outputs = net(inputs)
...
    loss = criterion(outputs,labels)

please print out .shape for inputs, outputs, and labels. Are the
shapes what you were expecting? Do they match what net and
criterion require for their arguments?
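For example, right before the loss call:

    print(inputs.shape)    # expect (batch_size, 12288)
    print(outputs.shape)   # should be (batch_size, 1); is it?
    print(labels.shape)    # expect (batch_size, 1)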

Good luck.

K. Frank
