Getting zero accuracy

chetan06 · May 7, 2020, 3:30pm

I am working on a dog breed identification problem. During training I get zero accuracy even though my loss decreases after each epoch. I can’t figure out where I am going wrong, so please help. The images are loaded from a root directory by the following class. The labels passed to the class are a pandas DataFrame: the first column is the id, i.e., the file name used to load the image; the second is the breed_name (120 breeds in total); and the third is the label or target, from 0 to 119.

class DogsDataset(Dataset):
    def __init__(self, labels, root_dir, transform=None):
        self.labels = labels
        self.root_dir = root_dir
        self.transform = transform
    
    def __len__(self):
        return self.labels.shape[0]
    
    def __getitem__(self, idx):
        img_name = '{}.jpg'.format(self.labels.iloc[idx, 0])
        fullname = self.root_dir+ img_name
        image = Image.open(fullname)
        # Use the DataFrame stored on the instance, not a global named `labels`.
        label = torch.tensor(self.labels.iloc[idx, 2])
        if self.transform:
            image = self.transform(image)
        return [image, label]

I used SGD as the optimizer and cross-entropy loss as the loss function.

# Creating a model
use_gpu = torch.cuda.is_available()
model = models.resnet50(pretrained=True)
if use_gpu:     
    model = model.cuda()
for param in model.parameters():
    param.requires_grad = False
if use_gpu:
    model.fc = (nn.Linear(2048, num_classes)).cuda()
else:
    model.fc = (nn.Linear(2048, num_classes))


def train(model, criterion, validation_data, train_data,  optimizer, epochs = 30):
    use_gpu = torch.cuda.is_available()
    train_loss_history = []
    valid_loss_history = []
    train_loader = {
        'train' : train_data,
        'validation' : validation_data
    }
    best_acc = 0
    best_model = None
    for epoch in tqdm(range(epochs), desc = "Loading", position = 0):
        for dataset in ['train', 'validation']:
            is_train = dataset == 'train'
            # Batch-norm and dropout layers behave differently in train and eval mode.
            model.train(is_train)

            running_loss = 0
            running_corrects = 0
            for x, y in tqdm(train_loader[dataset]):
                if use_gpu:
                    x, y = x.cuda(), y.cuda()
                optimizer.zero_grad()
                # Track gradients and update weights only during the training phase,
                # so the validation data never influences the model.
                with torch.set_grad_enabled(is_train):
                    yhat = model(x)
                    _, preds = yhat.max(1)
                    loss = criterion(yhat, y)
                    if is_train:
                        loss.backward()
                        optimizer.step()
                running_loss += loss.item()*x.size(0)
                running_corrects += (torch.sum(preds == y))
                                
            if dataset == 'train':
                train_loss = running_loss / 8151
                train_acc = float(running_corrects / 8151)
                print()
                print("train_loss: ", train_loss)
                print("train_accuracy: ", train_acc)
                train_loss_history.append(train_loss)
            else:
                valid_loss = running_loss / 2071
                valid_acc = float(running_corrects / 2071)
                print("validation_ loss: " , valid_loss)
                print("validation_accuracy: " , valid_acc)
                valid_loss_history.append(valid_loss)
        if valid_acc > best_acc:
            best_acc = valid_acc
            best_model = model

    print()
    print("best accuracy : ", best_acc)
        
    return best_model

The transforms that I have used

normalize = transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)

train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(200),
    transforms.ColorJitter(brightness=[0.5, 1], contrast=[0.5, 1], saturation=[0.5, 1]),  
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    normalize
])

iamgroot42 · May 7, 2020, 4:33pm

What version of Python are you on? The code seems okay (at a quick glance). The only possible reason for it to be doing what it is doing is a potential error in division, e.g. in float(running_corrects / 8151).

Can you try printing the actual value of running_corrects and replace the above with running_corrects/8151.0, while you’re at it?
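The division issue raised here can be sketched without PyTorch. As far as I know, in PyTorch releases from around the time of this thread, `/` applied to an integer tensor rounded the result down, so a correct count smaller than the dataset size came out as 0 before `float()` was ever applied. The helper name `epoch_accuracy` and the count 68 below are illustrative assumptions, not part of the original code.

```python
def epoch_accuracy(num_correct: int, num_samples: int) -> float:
    """True (floating-point) accuracy computed from plain Python numbers."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return num_correct / num_samples


# In the posted loop this would be called roughly as
#   epoch_accuracy(running_corrects.item(), 8151)
# where .item() turns the 0-dim tensor into a Python int.

num_correct, num_samples = 68, 8151    # illustrative count, not a measured result
floored = num_correct // num_samples   # what rounding-down integer division yields
true_acc = epoch_accuracy(num_correct, num_samples)
print(floored, round(true_acc, 4))
```

Converting the count to a Python number before dividing keeps the result independent of how a given PyTorch version treats integer tensors.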

chetan06 · May 7, 2020, 5:17pm

I suspected the same at first, but the running_loss expression has the same form and works fine, since I get a loss value after each epoch. With that in mind, I had already wrapped the division in float().

chetan06 · May 7, 2020, 5:19pm

@ptrblck and others please help

iamgroot42 · May 7, 2020, 5:29pm

So are you saying running_corrects is exactly 0 when you print it? Even with a model that makes random predictions, this seems highly unlikely.
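A rough calculation supports this. It assumes predictions drawn uniformly at random over the 120 breeds and independently for each sample, and it uses the training-set size 8151 that is hard-coded in the posted loop.

```python
import math

num_classes = 120     # breeds, from the original post
num_samples = 8151    # training-set size hard-coded in the posted loop

p_correct = 1 / num_classes
expected_correct = num_samples * p_correct
# Probability that independent uniform guesses are wrong on every single sample.
p_zero_correct = (1 - p_correct) ** num_samples
# The base-10 logarithm is easier to read than a number this small.
log10_p_zero = num_samples * math.log10(1 - p_correct)

print(f"expected correct guesses: {expected_correct:.1f}")
print(f"P(zero correct) is about 10^{log10_p_zero:.0f}")
```

Under these assumptions a count of exactly zero is practically impossible, while a count of a few dozen divided by 8151 with rounding-down division still prints as 0.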

chetan06 · May 7, 2020, 5:31pm

If you want to see I can share my whole code here. Should I?

ATUL_YADAV1 · May 7, 2020, 6:22pm

I am also getting the same error: accuracy is 0. Did you find a solution?

chetan06 · May 7, 2020, 6:36pm

For the same problem?

chetan06 · May 7, 2020, 7:05pm

I applied your idea, increased the learning rate and the number of epochs, and I am now getting a nonzero accuracy. It is not high enough yet, but it is better than zero.

ATUL_YADAV1 · May 7, 2020, 7:34pm

No, mine is for detection, but I am also getting accuracy 0. I tried printing the number of correct predictions and it is greater than 0.

chetan06 · May 7, 2020, 8:44pm

Then you might be making the same mistake I did: using integer division, which always returns an integer.

iamgroot42 · May 7, 2020, 9:02pm

If it’s still low, I’d suggest toying around with hyperparameters and architectures. As a sanity check, you might want to:

Make sure you’ve defined data loaders properly (image-label mapping) by inspecting a few examples.
Start with a pre-trained feature extractor and see how a model on top of those features performs.
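The first check above (image-label mapping) could be sketched as follows. The function `check_label_rows` is a hypothetical helper, not something from the thread; it assumes rows shaped like the DataFrame described in the original post (id, breed name, integer label) and images stored as `<id>.jpg` under `root_dir`.

```python
import os


def check_label_rows(rows, root_dir, num_classes):
    """Return a list of human-readable problems found in (id, breed, label) rows."""
    problems = []
    breed_of = {}  # first breed name seen for each integer label
    for file_id, breed, label in rows:
        path = os.path.join(root_dir, f"{file_id}.jpg")
        if not os.path.isfile(path):
            problems.append(f"missing file: {path}")
        if not 0 <= label < num_classes:
            problems.append(f"label out of range for {file_id}: {label}")
        # Each integer label should always map to the same breed name.
        if breed_of.setdefault(label, breed) != breed:
            problems.append(
                f"label {label} maps to both {breed_of[label]} and {breed}"
            )
    return problems
```

With a pandas DataFrame, the rows can be produced by `labels.itertuples(index=False, name=None)`; an empty return list means none of these three checks failed, which does not rule out other data problems.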

ATUL_YADAV1 · May 8, 2020, 7:44am

I also applied your solution and now get accuracy > 0, but it is still very low. I am using resnet34 for my detection task. I checked a few samples to confirm that my data is correct, and there is no problem with it.

Here is the link to my post.

Any suggestions as to why this is happening?

chetan06 · May 8, 2020, 7:48am

Yeah, I saw the same problem with my model when I trained it today. I think we should look more into augmentation and a better hyperparameter search for better results.

ATUL_YADAV1 · May 8, 2020, 8:06am

I have implemented the classifier and the bounding box detection separately, and they both performed decently, but when I combine the two models the results are not good.

chetan06 · May 8, 2020, 9:28am

I don’t have any working experience with object detection algorithms and just know their concepts. But I have a source that you could refer to.

ATUL_YADAV1 · May 8, 2020, 2:54pm

Thanks, I will look into it.

Kenneth · May 28, 2020, 12:24am

I had a similar problem.
I had to convert the 1d tensors into int or float to compute the accuracy.
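One way to apply this conversion is to turn each batch's count into a Python int as soon as it is computed, and to divide by the number of samples actually seen rather than a hard-coded dataset size. This is a minimal sketch: `batch_correct` is a hypothetical helper, and the tensors are toy stand-ins for model outputs, not real data.

```python
import torch


def batch_correct(logits: torch.Tensor, targets: torch.Tensor) -> int:
    """Number of correct top-1 predictions in one batch, as a Python int."""
    preds = logits.argmax(dim=1)
    return int((preds == targets).sum().item())


# Two toy batches of (logits, targets) for a 3-class problem.
batches = [
    (torch.tensor([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]]), torch.tensor([0, 2])),
    (torch.tensor([[0.1, 0.2, 3.0], [1.0, 0.0, 0.0]]), torch.tensor([2, 0])),
]

correct, seen = 0, 0
for logits, targets in batches:
    correct += batch_correct(logits, targets)
    seen += targets.size(0)

accuracy = correct / seen  # plain Python division, always a float
print(correct, seen, accuracy)
```

Because `correct` and `seen` are ordinary Python ints, the final division does not depend on tensor dtype rules.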