Are mutually inclusive images a good dataset for image classification?

Hi all,

I recently used the CelebA dataset for image classification with transfer learning, but the loss became highly negative (-5341) and the accuracy was only around 6-7% after 100 epochs. NLL was used as the criterion. To check whether something was wrong with my code, I ran the same code on CIFAR-10, and it worked well with an accuracy of around 90%.

For the CelebA dataset, I used 40 of its attributes as classes and sorted 128 images per class into 40 different folders. I then noticed that most of the images are mutually inclusive between different classes, i.e. the same image belongs to several classes at once.

My conclusion is that the high loss was caused by this categorization: the 40 folders overlap, so the same image appears under multiple labels and the dataset is effectively mislabeled for single-label classification.
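Since the attributes overlap, each image really carries several labels at once, which suggests a multi-label formulation with BCEWithLogitsLoss rather than a 40-way single-label classifier. A minimal sketch (the feature size, batch size, and variable names below are illustrative, not from my actual code):

```python
import torch
import torch.nn as nn

# Multi-label setup: each image can have any subset of the 40 attributes,
# so the target is a 40-dim multi-hot vector, not a single class index.
model = nn.Linear(512, 40)                       # illustrative feature size
features = torch.randn(8, 512)                   # batch of 8 feature vectors
targets = torch.randint(0, 2, (8, 40)).float()   # multi-hot attribute labels

criterion = nn.BCEWithLogitsLoss()   # applies sigmoid internally, pass raw logits
logits = model(features)
loss = criterion(logits, targets)
print(loss.item())                   # a non-negative scalar
```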

Can someone please correct my deduction?

Thank you in advance.

Regards,
Amir

A negative loss sounds wrong. Could you explain what your model outputs contain? nn.NLLLoss expects log probabilities, so did you pass logits to the criterion instead?

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 10)
x = torch.randn(10, 10)
target = torch.randint(0, 10, (10,))
criterion = nn.NLLLoss()

# right: apply log_softmax before NLLLoss
output = model(x)
output = F.log_softmax(output, dim=1)
loss = criterion(output, target)
print(loss)
# tensor(2.3047, grad_fn=<NllLossBackward0>)

# wrong: raw logits passed to NLLLoss can yield a negative loss
output = model(x)
loss = criterion(output, target)
print(loss)
# tensor(-0.2415, grad_fn=<NllLossBackward0>)
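Alternatively, nn.CrossEntropyLoss combines log_softmax and NLLLoss in one step, so you can pass raw logits to it directly. A self-contained sketch showing the two are equivalent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 10)
x = torch.randn(10, 10)
target = torch.randint(0, 10, (10,))

# CrossEntropyLoss applies log_softmax internally, so it takes raw logits
ce = nn.CrossEntropyLoss()(model(x), target)

# Equivalent: explicit log_softmax followed by NLLLoss
nll = nn.NLLLoss()(F.log_softmax(model(x), dim=1), target)

print(torch.allclose(ce, nll))  # True
```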

Thank you for your help. Yes, I was passing logits to the criterion instead of log probabilities.