Model accuracy stuck at ~0.5 and loss stuck at ~0.69

cagedgandalf · November 20, 2020, 3:10pm

Hi, I am new to PyTorch. I am trying to make a binary classifier for dogs and cats but the model does not seem to be learning. I suspect it is the way I calculate the accuracy and loss but I am not sure.
How I preprocessed the data:

import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms


class DataProcess:

    def __init__(self):
        self.batch_size = 32
        self.resolution = (64, 64)

    def dataload(self, path):
        mean, std = self.get_mean_std()
        train_t = transforms.Compose([transforms.Resize(self.resolution),
                                      transforms.RandomResizedCrop(self.resolution, ratio=(1.2, 1.2)),
                                      transforms.RandomHorizontalFlip(),
                                      transforms.ToTensor(),
                                      transforms.Normalize(mean, std)])
        test_t = transforms.Compose([transforms.Resize(self.resolution),
                                     transforms.ToTensor()])
        training_set = torchvision.datasets.ImageFolder(path + 'training_set',
                                                        transform=train_t)
        test_set = torchvision.datasets.ImageFolder(path + 'test_set',
                                                    transform=test_t)
        training_set = DataLoader(training_set, batch_size=self.batch_size, shuffle=True)
        test_set = DataLoader(test_set, batch_size=self.batch_size, shuffle=True)
        return training_set, test_set, self.batch_size

This is my code for the model:

import torch.nn as nn
import torch.nn.functional as f


class ConvNet(nn.Module):

    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, 7)
        self.max_pool1 = nn.MaxPool2d(3, 3)
        self.conv2 = nn.Conv2d(64, 32, 3)
        self.conv3 = nn.Conv2d(32, 64, 3)
        self.avg_pool1 = nn.AvgPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 7 * 7, 500)
        self.fc2 = nn.Linear(500, 1)

    def forward(self, x):
        x = self.max_pool1(f.relu(self.conv1(x)))
        x = f.relu(self.conv2(x))
        x = self.avg_pool1(f.relu(self.conv3(x)))
        x = x.view(-1, 64 * 7 * 7)
        x = f.relu(self.fc1(x))
        x = self.fc2(x)
        return x

Model training:

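import torch.optim as optim

# net, criterion, training_set and batch_size are defined elsewhere (not shown)
optimizer = optim.Adam(net.parameters(), lr=0.001)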
training_bar = ProgressBar(250)  # from my loading bar module
for epoch in range(2):
    running_loss = 0.
    correct = 0.
    net.train()
    for i, training_i in enumerate(training_set, 0):
        data, target = training_i
        target = target.float()
        optimizer.zero_grad()
        output = net(data)
        loss = criterion(output, target.unsqueeze(1))
        loss.backward()
        optimizer.step()
        correct += (output == target.unsqueeze(1)).float().sum()
        running_loss += loss.item()
        acc = correct / ((i + 1) * batch_size)
        training_bar.trainingbar(epoch + 1, 2, acc, loss=running_loss / (i + 1))  # helps me see the progress of the training
print('Finished Training')

Thanks for the help in advance!!

Abhilash_Srivastava · November 20, 2020, 7:02pm

Your training code looks fine to me. A few things to try out:

Check the quality and size of your training data. A larger, high-quality, unbiased dataset should give better performance.
Once you're happy with the initial model, try different hyperparameters (number of conv layers, hidden size, learning rate, etc.).
Train for more epochs. Calculate the average loss for each epoch (not just for each iteration).
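
One more thing worth checking is the accuracy calculation itself. In the posted loop, `output == target.unsqueeze(1)` compares raw model outputs with 0/1 labels, so it will almost never be true. A loss near 0.69 is about ln 2, which is what binary cross-entropy gives when every prediction sits at 0.5. Below is a minimal sketch of a per-epoch loop that thresholds the outputs and averages the loss over the whole epoch. The question does not show `criterion`, so `nn.BCEWithLogitsLoss` is an assumption here (it matches a model that returns raw logits), and the helper names `count_correct` and `train_one_epoch` are made up for illustration.

```python
import torch
import torch.nn as nn


def count_correct(logits, targets):
    """Count correct predictions given raw (pre-sigmoid) logits and 0/1 targets."""
    # sigmoid(logit) > 0.5 is equivalent to logit > 0
    preds = (logits > 0).float()
    return (preds == targets).sum().item()


def train_one_epoch(net, loader, criterion, optimizer):
    """Train for one epoch; return (average loss per sample, accuracy)."""
    net.train()
    total_loss, correct, seen = 0.0, 0, 0
    for data, target in loader:
        target = target.float().unsqueeze(1)  # shape (N, 1), same as the output
        optimizer.zero_grad()
        output = net(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        # loss.item() is the batch mean, so weight it by the batch size
        total_loss += loss.item() * data.size(0)
        correct += count_correct(output, target)
        seen += data.size(0)  # count real samples; the last batch may be smaller
    return total_loss / seen, correct / seen
```

With the `net`, `training_set` and `optimizer` from the question, and assuming `criterion = nn.BCEWithLogitsLoss()`, you could call `epoch_loss, epoch_acc = train_one_epoch(net, training_set, criterion, optimizer)` once per epoch and print both values to see whether they move across epochs.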