Wrong target format ImageFolder and DataLoader

Silviu-Alexandru_Din · November 10, 2021, 12:56pm

Hey guys,

I am new to working with images and PyTorch, and I have an issue with the data format.
I’m using the ImageFolder utility to load my images and classes, and I have the following folder structure:

I’m loading the images with:

import os
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
training_data = datasets.ImageFolder(root=os.path.join(curr, r'fer2013/train/'),
                                     transform=transforms.ToTensor())
training_set = DataLoader(training_data, batch_size=10, shuffle=True)

and then I am looping over it like so:

for epoch in range(5):  # 5 full passes over the data
    for idx, data in enumerate(training_set):  # `data` is a batch of data
        X, y = data  # X is the batch of features, y is the batch of targets.
        net.zero_grad()
        output = net(X)
        loss = F.nll_loss(output, y.unsqueeze(1))
        loss.backward() 
        optimizer.step() 
    print(loss)

where net is an instance of my Net() class.

Below are the shapes of torch.argmax(output, dim=1), output, y, and y.unsqueeze(1) in this exact order:
torch.Size([10, 48, 7]) torch.Size([10, 3, 48, 7]) torch.Size([10]) torch.Size([10, 1])

and I am getting this error on the F.nll_loss function call:
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 2

Is this something related to how the data is being loaded, or should I apply some operations to the target y?

Thank you for any help!

ptrblck · November 10, 2021, 9:25pm

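The model output shape doesn’t match the target shape. Your output is 4D ([10, 3, 48, 7]), so F.nll_loss treats it as a spatial prediction and expects a 3D target, which is why the error mentions 3D tensors.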
nn.NLLLoss expects a model output in the shape [batch_size, nb_classes, *] and a target in the shape [batch_size, *] containing class indices in the range [0, nb_classes-1].
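To make the shape rule concrete, here is a small sketch with random tensors. The sizes (10 samples, 7 classes, a 4x5 spatial grid) are made up for illustration, and F.log_softmax is applied first because nll_loss expects log-probabilities.

```python
import torch
import torch.nn.functional as F

batch_size, nb_classes = 10, 7  # illustrative sizes, not taken from a real model

# Plain classification: output [batch_size, nb_classes], target [batch_size].
log_probs = F.log_softmax(torch.randn(batch_size, nb_classes), dim=1)
target = torch.randint(0, nb_classes, (batch_size,))  # class indices in [0, nb_classes-1]
loss = F.nll_loss(log_probs, target)

# Spatial case (e.g. per-pixel labels): output [batch_size, nb_classes, H, W],
# target [batch_size, H, W].
spatial_log_probs = F.log_softmax(torch.randn(batch_size, nb_classes, 4, 5), dim=1)
spatial_target = torch.randint(0, nb_classes, (batch_size, 4, 5))
spatial_loss = F.nll_loss(spatial_log_probs, spatial_target)

# With the default reduction='mean', both losses are scalar tensors.
print(loss.shape, spatial_loss.shape)
```

In both cases the target has the output's shape with the class dimension (dim 1) removed.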
Since you are using ImageFolder I would assume you are working on a multi-class classification use case.
In that case the model output should have the shape [batch_size, nb_classes] and the target [batch_size] (note the missing additional dimensions).
Remove the .unsqueeze operation performed on y and make sure the model output matches the expected shape.
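
As a rough sketch of what the fixed loop body could look like: Net() isn't shown in the question, so TinyNet below is a hypothetical stand-in, and the random X and y only mimic one batch from the DataLoader. An output of [10, 3, 48, 7] would be consistent with nn.Linear layers being applied to the image tensor without flattening it first (nn.Linear only acts on the last dimension), but that is a guess. The sketch assumes [10, 3, 48, 48] inputs and 7 classes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical stand-in for the Net() class from the question."""

    def __init__(self, nb_classes=7):
        super().__init__()
        self.fc1 = nn.Linear(3 * 48 * 48, 64)
        self.fc2 = nn.Linear(64, nb_classes)

    def forward(self, x):
        # Flatten everything except the batch dimension:
        # [batch_size, 3, 48, 48] -> [batch_size, 6912]
        x = torch.flatten(x, start_dim=1)
        x = F.relu(self.fc1(x))
        # nll_loss expects log-probabilities, so end with log_softmax.
        return F.log_softmax(self.fc2(x), dim=1)


net = TinyNet()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

# Random stand-ins for one batch (assumed shapes, not real FER2013 data).
X = torch.rand(10, 3, 48, 48)
y = torch.randint(0, 7, (10,))

optimizer.zero_grad()
output = net(X)               # [10, 7]
loss = F.nll_loss(output, y)  # y stays [10]; no unsqueeze
loss.backward()
optimizer.step()
print(output.shape, y.shape)
```

If your model is convolutional instead, the same idea applies: flatten (or pool) before the final linear layer so the output ends up as [batch_size, nb_classes].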