ValueError: Expected input batch_size

I have some code:

from torchvision.datasets import ImageFolder
import torchvision.transforms as T
from torch.utils.data import DataLoader
import torch
import torch.nn as nn
import torch.nn.functional as f
import torch.optim as optim

path ="/Users/edenbrown/Downloads/kagglecatsanddogs_3367a/PetImages"
transform = T.Compose([T.Resize((50, 50)),T.ToTensor()])

dataset = ImageFolder(root=path, transform=transform)

dataloader = DataLoader(dataset, batch_size=10, shuffle=True)

for batch_number, (images, labels) in enumerate(dataloader):
    break

class Net(nn.Module):

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(50 * 50, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 2)

    def forward(self, x):
        x = f.relu(self.fc1(x))
        x = f.relu(self.fc2(x))
        x = f.relu(self.fc3(x))
        x = self.fc4(x)
        return f.log_softmax(x, dim = 1)

net = Net()

optimizer = optim.Adam(net.parameters(), lr = 0.001)

Epochs = 3

index = 0

for epoch in range(Epochs):
    a = labels[index]
    b = images[index]
    net.zero_grad()
    output = net(b.view(-1, 50 * 50))
    loss = f.nll_loss(output, a)
    loss.backward()
    optimizer.step()
    print(loss)
    index += 1

but I get this error:

Traceback (most recent call last):
  File "/Users/edenbrown/sample.ws45/new.py", line 50, in <module>
    loss = f.nll_loss(output, a)
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torch/nn/functional.py", line 2671, in nll_loss
    return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
ValueError: Expected input batch_size (3) to match target batch_size (0).

Is there a way to fix this error?

I took the liberty to change your code around.

You might want to look at this tutorial.

Also, you can put three backticks (```) at the beginning and end of your code when creating a post here, so it will be easier to debug.

from torchvision.datasets import ImageFolder
import torchvision.transforms as T
from torch.utils.data import DataLoader
import torch
import torch.nn as nn
import torch.nn.functional as f
import torch.optim as optim

path ="/Users/edenbrown/Downloads/kagglecatsanddogs_3367a/PetImages"
transform = T.Compose([T.Resize((50, 50)),T.ToTensor()])

dataset = ImageFolder(root=path, transform=transform)
dataloader = DataLoader(dataset, batch_size=10, shuffle=True)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # You also need to take into account the Channels RGB
        # Or you might consider something like a Convolutional Layer or something else
        self.fc1 = nn.Linear(3 * 50 * 50, 64) # RGB, that is why I added the 3
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 2)

    def forward(self, x):
        # Since I added the 3 to the channels, you need to flatten all of the Image
        # Original BxCxHxW - Batch x Channel x Height x Width - 10 x 3 x 50 x 50
        # Flatten B x C*H*W - 10 x 3*50*50
        x = x.flatten(start_dim=1)
        x = f.relu(self.fc1(x))
        x = f.relu(self.fc2(x))
        x = f.relu(self.fc3(x))
        x = self.fc4(x)
        return f.log_softmax(x, dim = 1)

net = Net()

optimizer = optim.Adam(net.parameters(), lr = 0.001)
criterion = nn.NLLLoss()

Epochs = 3

for epoch in range(Epochs):
    running_loss = 0.0
    for batch_number, (images, labels) in enumerate(dataloader):
        net.zero_grad()
        # You can pass the full batch directly to the model
        # There is no need to get one image individually
        # But if you really want to then you need to match the dimensions BxCxHxW
        # This means, if you have a single image CxHxW, you need to add the Batch dimension
        # You can do this with img.unsqueeze(0)
        # But this code should work
        output = net(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        if batch_number % 20 == 19:    # print every 20 mini-batches
            print(f'[{epoch + 1}, {batch_number + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')
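
For reference, the original error can be reproduced with dummy tensors standing in for one batch (the random values are an assumption, not the real images). A single image taken out of the batch is 3 x 50 x 50, so `view(-1, 50 * 50)` turns its three channels into three rows, while the single label has no batch dimension at all. A minimal sketch:

```python
import torch
import torch.nn.functional as f

# Dummy batch with the same shapes the DataLoader yields (random stand-in data)
images = torch.rand(10, 3, 50, 50)
labels = torch.randint(0, 2, (10,))

# What the original loop did: one image, with its channels read as a batch of 3
bad_input = images[0].view(-1, 50 * 50)
bad_target = labels[0]
print(bad_input.shape)   # torch.Size([3, 2500])
print(bad_target.shape)  # torch.Size([])

# Keeping a batch dimension of 1 on both sides makes the shapes agree
good_input = images[0].unsqueeze(0).flatten(start_dim=1)
good_target = labels[0].unsqueeze(0)
print(good_input.shape)   # torch.Size([1, 7500])
print(good_target.shape)  # torch.Size([1])

# Stand-in for the network output: log-probabilities over 2 classes
log_probs = f.log_softmax(torch.rand(1, 2), dim=1)
loss = f.nll_loss(log_probs, good_target)
```

That is where the "(3)" and "(0)" in the error message come from: three input rows against a target with no batch dimension.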

Hope this helps :smile:
Please let me know if something is not clear

I can not test this code right now but I will test it in 1 or 2 hours. I think this will probably work.

This has worked decently well, but I have noticed that it never stops training, and it sometimes randomly gives me this error:

  File "/Users/edenbrown/sample.ws45/new.py", line 40, in <module>
    for batch_number, (images, labels) in enumerate(dataloader):
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 570, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 230, in __getitem__
    sample = self.loader(path)
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 269, in default_loader
    return pil_loader(path)
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/torchvision/datasets/folder.py", line 248, in pil_loader
    img = Image.open(f)
  File "/Users/edenbrown/opt/anaconda3/envs/env_pytorch3/lib/python3.10/site-packages/PIL/Image.py", line 3123, in open
    raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file <_io.BufferedReader name='/Users/edenbrown/Downloads/kagglecatsanddogs_3367a/PetImages/Dog/11702.jpg'>

I have also noticed that the loss is always 0.007.

I am guessing the dataset is quite big. It is going to take some time to train.

If you have a GPU available, you can move your model to the GPU to make it faster with
net.cuda() (the model in the code above is named net; the input batches need to be moved to the GPU as well).

If you do not have a GPU with CUDA available, you might look into Google Colab or a similar tool that lets you train models online with a GPU.
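
A minimal sketch of the device handling, assuming a small stand-in model and a dummy batch in place of `Net` and the real `DataLoader`. It falls back to the CPU when CUDA is not available, so it runs either way:

```python
import torch
import torch.nn as nn

# Falls back to the CPU when no CUDA device is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Small stand-in model with the same input and output sizes as Net
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 50 * 50, 64), nn.ReLU(), nn.Linear(64, 2))
net.to(device)

# Dummy batch standing in for (images, labels) from the DataLoader
images = torch.rand(10, 3, 50, 50)
labels = torch.randint(0, 2, (10,))

# Each batch has to be moved to the same device as the model
images, labels = images.to(device), labels.to(device)
output = net(images)
print(output.shape)  # torch.Size([10, 2])
```

In the training loop, the two `.to(device)` calls would go at the top of the inner loop, before `net(images)`.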

Regarding the error, you might want to look at that specific image and check whether it is corrupt.
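
One way to check all files at once, sketched here with Pillow's `Image.open` and `Image.verify`. The helper name `find_unreadable_images` is made up for this example and is not part of torchvision or Pillow:

```python
import os
from PIL import Image

def find_unreadable_images(root):
    """Return paths under root that Pillow cannot open or verify."""
    bad_paths = []
    for folder, _, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(folder, name)
            try:
                with Image.open(file_path) as img:
                    img.verify()  # integrity check without fully decoding the image
            except (OSError, SyntaxError):
                # UnidentifiedImageError is a subclass of OSError
                bad_paths.append(file_path)
    return bad_paths
```

Calling `find_unreadable_images(path)` with the `path` from the code above should list files such as `Dog/11702.jpg`, which can then be deleted or moved out of the folder. It will also list any non-image files in the folders. `ImageFolder` also accepts an `is_valid_file` argument that could be used to skip such files instead.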

The dataset has around 25000 images, and I will check the images. Also, is the loss always being 0.007 a problem or not?

You can try with another loss function like CrossEntropy and see if it works better for you.

But it does seem weird that you always get the same loss and it does not change.

It is also very low, meaning your model should (in theory) already be good at predicting the images, or already be overfitted.

The next step would be to look into validation, so that your model does not overfit.

(Also, as already mentioned, you might want to look into other types of layers, such as Conv2d, that can help your model extract features from neighboring pixels and generalize better.)
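
On the CrossEntropy suggestion: `nn.CrossEntropyLoss` applied to raw scores computes the same value as `nn.NLLLoss` applied to `log_softmax` outputs, so swapping one for the other should not change the numbers by itself. A small check with random stand-in scores (not the real model outputs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as f

torch.manual_seed(0)
logits = torch.randn(10, 2)           # raw scores, as fc4 would produce them
labels = torch.randint(0, 2, (10,))   # dummy class indices

# Current setup: log_softmax in forward() followed by NLLLoss
nll = nn.NLLLoss()(f.log_softmax(logits, dim=1), labels)

# Alternative: CrossEntropyLoss applied to the raw scores
ce = nn.CrossEntropyLoss()(logits, labels)

print(torch.allclose(nll, ce))  # True
```

If you do switch to `nn.CrossEntropyLoss`, `forward` should return the output of `fc4` directly, without the `log_softmax`.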

So if the loss is low, should I use a different loss function or not? And what is overfitting? I have been thinking of making a convolutional neural network instead, but until the issues are fixed I am just going to use a normal neural network.

No, sorry, what I meant by using another loss is that when you are not sure if something is working correctly, you should change stuff around and see how it behaves. That way you will know what to expect.

Maybe there is a bug in the code and something weird is happening. Or it is simply working correctly and that is the expected behavior. So if you want, you can change stuff and see how this affects the results.

Overfitting means that your model is too good on the training data, but when it sees new data it does not work.

Basically, it remembers all the inputs and can tell whether it is a dog or a cat because it memorized each image.

This is a very basic explanation, so if you want to learn more, there are several blogs, books and tutorials that give a better explanation.
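
A common way to detect overfitting is to hold out part of the data and compare the loss on it with the training loss. A minimal sketch, assuming a random stand-in dataset and a small stand-in model in place of the `ImageFolder` dataset and `Net`; the helper `average_loss` is a name made up for this example:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Random stand-in for the ImageFolder dataset (100 fake 3x50x50 images)
dataset = TensorDataset(torch.rand(100, 3, 50, 50), torch.randint(0, 2, (100,)))

# Hold out 20% of the samples; the model never trains on these
train_set, val_set = random_split(dataset, [80, 20])
train_loader = DataLoader(train_set, batch_size=10, shuffle=True)
val_loader = DataLoader(val_set, batch_size=10)

# Small stand-in model that ends in log-probabilities, like Net
net = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 50 * 50, 64), nn.ReLU(),
    nn.Linear(64, 2), nn.LogSoftmax(dim=1),
)
criterion = nn.NLLLoss()

def average_loss(model, loader):
    """Mean per-batch loss over a loader, without updating the model."""
    model.eval()
    total = 0.0
    with torch.no_grad():
        for images, labels in loader:
            total += criterion(model(images), labels).item()
    model.train()
    return total / len(loader)

val_loss = average_loss(net, val_loader)
print(f"validation loss: {val_loss:.3f}")
```

If the training loss keeps falling while the validation loss starts to rise, that is the usual sign of overfitting.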

I will use the same loss function and research how to stop overfitting.

After I stop the overfitting I will save my model.
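
Saving usually goes through the model's `state_dict`. A minimal sketch, assuming a small stand-in model in place of `Net` and a made-up file name in a temporary directory:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Small stand-in for the Net class from this thread
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 50 * 50, 64), nn.ReLU(), nn.Linear(64, 2))

# Hypothetical file name; a temporary directory keeps the sketch self-contained
save_path = os.path.join(tempfile.mkdtemp(), "cats_vs_dogs.pt")

# Save only the learned parameters, not the whole Python object
torch.save(net.state_dict(), save_path)

# Later: rebuild the same architecture and load the parameters into it
restored = nn.Sequential(nn.Flatten(), nn.Linear(3 * 50 * 50, 64), nn.ReLU(), nn.Linear(64, 2))
restored.load_state_dict(torch.load(save_path))
restored.eval()  # switch to inference mode before testing
```

With the real code, `net = Net()` followed by `net.load_state_dict(torch.load(...))` would take the place of the second `nn.Sequential`.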

I do not think my model is overfitted, since it has a very large dataset, so I did not change anything for that. However, I have figured out a way to stop it from training forever by putting a break in my code. I have also chosen not to save it, because saving was messing up my code, so I am worried about attempting it. I will just test it in the same Python file by adding something like a train = True or False variable.

The code right now is:

from torchvision.datasets import ImageFolder
import torchvision.transforms as T
from torch.utils.data import DataLoader
import torch
import torch.nn as nn
import torch.nn.functional as f
import torch.optim as optim

path ="/Users/edenbrown/Downloads/kagglecatsanddogs_3367a/PetImages"
transform = T.Compose([T.Resize((50, 50)),T.ToTensor()])

dataset = ImageFolder(root=path, transform=transform)
dataloader = DataLoader(dataset, batch_size=10, shuffle=True)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3 * 50 * 50, 64) 
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 2)

    def forward(self, x):
        x = x.flatten(start_dim=1)
        x = f.relu(self.fc1(x))
        x = f.relu(self.fc2(x))
        x = f.relu(self.fc3(x))
        x = self.fc4(x)
        return f.log_softmax(x, dim = 1)

net = Net()

optimizer = optim.Adam(net.parameters(), lr = 0.001)
criterion = nn.NLLLoss()

Epochs = 3

for epoch in range(Epochs):
    running_loss = 0.0
    for batch_number, (images, labels) in enumerate(dataloader):
        net.zero_grad()
        output = net(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
        if batch_number % 20 == 19:   
            print(f'[{epoch + 1}, {batch_number + 1:5d}] loss: {running_loss / 200:.3f}')
            running_loss = 0.0
            break

print('Finished Training')

I can hopefully say the training model for the dogs or cats recogniser is done after 3 days.

Also I noticed that the running loss was divided by 2000 when it should be divided by 200.
The losses are now usually around 0.070. I have seen 0.072 and 0.071 though. These losses are still very low though.

Oh, my bad.

I took the example from the link I had posted previously and changed when the print statement was done, but forgot to change this value.

And yeah, as you mentioned it is definitely NOT overfitting, but it might be something that you want to look at in the future with a better model and after enough training epochs.

You can try printing the loss instead of the running loss to see what this value is.
You can also try changing several things, like the model, the optimizer, the loss, etc.
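
A note on the divisor, with the caveat that it relies on the default settings: `nn.NLLLoss()` uses `reduction='mean'` by default, so each `loss.item()` is already an average over the 10 images in the batch. Summing 20 of those and dividing by 20 gives the average loss for the window, while dividing by 200 or 2000 shrinks it by a factor of 10 or 100. The per-batch losses below are hypothetical values chosen only to illustrate the arithmetic:

```python
import math

batch_losses = [0.70] * 20   # hypothetical per-batch losses, for illustration only

running_loss = sum(batch_losses)
print(f"{running_loss / 20:.3f}")    # 0.700  (average over the 20 batches)
print(f"{running_loss / 200:.3f}")   # 0.070  (what the current code prints)
print(f"{running_loss / 2000:.3f}")  # 0.007  (what the first version printed)

# Loss of a two-class model that always predicts 50/50
chance_loss = -math.log(0.5)
print(f"{chance_loss:.3f}")          # 0.693
```

Read this way, the reported 0.007 and 0.070 would both correspond to a real loss of about 0.70, which is close to what a model guessing 50/50 between cat and dog would get. That would suggest the network has not learned much yet, and printing `loss.item()` directly is a quick way to confirm it.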

Thank you, I will use this advice.
