What can I do to get better accuracy?

I created a new dataset of 60000 8x8 color images in 10 classes, with 6000 images per class. The training set has 5400 images and the validation set has 600 images. The dataset was obtained by dividing each image of the DIBCO 2017 dataset into 8x8 blocks, which were then classified according to their characteristics by applying a transform in the frequency domain.
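For readers who want to reproduce the block extraction step, here is a minimal sketch of how an image tensor could be divided into non-overlapping 8x8 blocks. The helper name `split_into_blocks` and the synthetic input are assumptions made for illustration; the actual preprocessing pipeline and the frequency-domain labeling are not shown in this post.

```python
import torch


def split_into_blocks(img: torch.Tensor, block: int = 8) -> torch.Tensor:
    """Split a (C, H, W) image into non-overlapping (C, block, block) tiles."""
    c, h, w = img.shape
    # Crop so that height and width are multiples of the block size.
    h, w = h - h % block, w - w % block
    img = img[:, :h, :w]
    # Unfold along H, then along W: result has shape (C, H/b, W/b, b, b).
    tiles = img.unfold(1, block, block).unfold(2, block, block)
    # Put the tile grid first and flatten it: (N, C, b, b).
    return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, block, block)


# Synthetic stand-in for a real image (hypothetical size 20x17).
img = torch.arange(3 * 20 * 17, dtype=torch.float32).reshape(3, 20, 17)
blocks = split_into_blocks(img)
print(blocks.shape)
```

Each row of `blocks` can then be flattened to `3 * 8 * 8` values, which matches the input size of the first linear layer in the model.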

My code is the following:

from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from visualization.visdom import Visualizations

import numpy as np
from PIL import Image
import random
from io import BytesIO

# Initialize the visualization environment
vis = Visualizations()

train_loss_list = []
valid_loss_list = []

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(3 * 8 * 8, 385)
        self.fc2 = nn.Linear(385, 10)
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.dropout(x)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

def train(args, model, device, train_loader, optimizer, epoch):
    # Before training the model, it is imperative to call model.train()
    model.train()
    global_loss = 0.0
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        data = data.view(args.batch_size,  3 * 8 * 8)
        target = target.view(args.batch_size)
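        # reset gradients, run the forward pass, then backpropagate and update
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()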
        global_loss = (global_loss*(batch_idx) + loss.item())/(batch_idx+1)
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * args.batch_size, len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
    return global_loss

def test(args, model, device, test_loader, epoch):
    # You must call model.eval() before testing the model
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            data = data.view(args.test_batch_size,  3 * 8 * 8)
            target = target.view(args.test_batch_size)
            output = model(data)
            # sum up batch loss
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            # get the index of the max log-probability
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    acc = 100. * correct / len(test_loader.dataset)

    print(
        '\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset), acc))
    return {'test_loss': test_loss, 'acc':acc}

def randomJpegCompression(image):
    # Randomly re-encode the image as JPEG at one of three quality levels
    p = random.random()
    outputIoStream = BytesIO()
    if p > 0.9:
        return image
    elif p > 0.8:
        image.save(outputIoStream, "JPEG", quality=75, optimize=True)
    elif p > 0.45:
        image.save(outputIoStream, "JPEG", quality=50, optimize=True)
    else:
        # assumed branch: the original condition for quality=20 is missing
        image.save(outputIoStream, "JPEG", quality=20, optimize=True)
    outputIoStream.seek(0)
    return Image.open(outputIoStream)

def main():
    # Training settings
    parser = argparse.ArgumentParser(description='FNN')
    parser.add_argument(
        '--batch-size', type=int, default=100, metavar='N',
        help='input batch size for training (default: 100)')
    parser.add_argument(
        '--test-batch-size', type=int, default=100, metavar='N',
        help='input batch size for testing (default: 100)')
    parser.add_argument(
        '--epochs', type=int, default=600, metavar='N',
        help='number of epochs to train (default: 600)')
    parser.add_argument(
        '--lr', type=float, default=0.01, metavar='LR',
        help='learning rate (default: 0.01)')
    parser.add_argument(
        '--momentum', type=float, default=0.9, metavar='M',
        help='SGD momentum (default: 0.9)')
    parser.add_argument(
        '--no-cuda', action='store_true', default=False,
        help='disables CUDA training')
    parser.add_argument(
        '--seed', type=int, default=1, metavar='S',
        help='random seed (default: 1)')
    parser.add_argument(
        '--log-interval', type=int, default=100, metavar='N',
        help='how many batches to wait before logging training status')
    parser.add_argument(
        '--save-model', action='store_true', default=True,
        help='For Saving the current Model')
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()

    # use the --seed argument so that runs are reproducible
    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
    # Transforms
    # ToTensor is needed because Normalize works on tensors, not PIL images
    simple_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

    # Dataset
    train_dataset = datasets.ImageFolder('data/train/', simple_transform)
    valid_dataset = datasets.ImageFolder('data/valid/', simple_transform)

    # Data loader
    train_loader = torch.utils.data.DataLoader(
        dataset=train_dataset, batch_size=args.batch_size, shuffle=True, num_workers=2)

    test_loader = torch.utils.data.DataLoader(
        dataset=valid_dataset, batch_size=args.test_batch_size, shuffle=False, num_workers=2)

    model = Net().to(device)
    optimizer = optim.SGD(
        model.parameters(), lr=args.lr, momentum=args.momentum)

    for epoch in range(1, args.epochs + 1):
        global_loss_train = train(
            args, model, device, train_loader, optimizer, epoch)
        dic = test(args, model, device, test_loader, epoch)

        # Visualization data
        vis.plot_loss_train(global_loss_train, epoch)
        vis.plot_loss_valid(dic['test_loss'], epoch)
        vis.plot_acc(dic['acc'], epoch)

    if (args.save_model):
        torch.save(model.state_dict(), "fnnNew_600epoch_1xlayers385.pt")

if __name__ == '__main__':
    main()

These are the training and validation loss and accuracy values that I get.

These values are not good. The training loss does not get low enough, or at least does not decrease fast enough.

I have trained other models on a similar dataset, and in the first epochs both the training loss and the validation loss drop below 1.

What can I do to get better accuracy?

Since you are using the functional dropout, you should pass the self.training attribute to it.
Otherwise it will always be enabled (during both training and validation):

x = F.dropout(x, p=0.5, training=self.training)
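As a small sketch of what this changes, the snippet below rebuilds the model from the question with `training=self.training` passed as a keyword argument and checks that two forward passes agree in eval mode but differ in train mode. The random input is a stand-in for real data, and the seed is arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3 * 8 * 8, 385)
        self.fc2 = nn.Linear(385, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        # Dropout is applied only while the module is in training mode.
        x = F.dropout(x, p=0.5, training=self.training)
        return F.log_softmax(self.fc2(x), dim=1)


torch.manual_seed(0)
model = Net()
x = torch.randn(4, 3 * 8 * 8)  # stand-in for a batch of flattened 8x8 images

model.eval()  # sets self.training = False, so dropout is a no-op
with torch.no_grad():
    eval_is_deterministic = torch.equal(model(x), model(x))

model.train()  # sets self.training = True, so a new random mask is drawn per call
with torch.no_grad():
    train_is_deterministic = torch.equal(model(x), model(x))

print(eval_is_deterministic, train_is_deterministic)
```

An alternative with the same effect is to register an `nn.Dropout` module in `__init__`, which follows `model.train()` and `model.eval()` automatically.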

Also, have you tried to use a CNN, which might work better on image data?

Have you tried playing with the hyperparameters, e.g. reducing the learning rate etc.?
Did you use the same model and training parameters for the other dataset?

Hi, @ptrblck!
Thank you for your help.

I do not understand the reason for this:

x = F.dropout(x, p=0.5, training=self.training)

F.dropout has a training argument that is True by default.

Yes, the first thing I tried was a CNN, but it did not work out. My explanation is that CNNs give good results when the information to extract from the image lies in the spatial domain (something closer to how humans see). In this case, however, the classes are defined by frequency characteristics. Take the DCT, one of the most widely used transforms, as an example: imagine computing the 2D DCT of each image and then classifying the images based on the DCT information.
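To make the DCT example concrete, here is a minimal sketch of an orthonormal 2D DCT-II applied to a single 8x8 block. It is only an illustration of the kind of frequency representation described above; the post does not say which transform was actually used to label the dataset, and the helper names `dct_matrix` and `dct2` are made up for this example.

```python
import math

import torch


def dct_matrix(n: int = 8) -> torch.Tensor:
    """Return the n x n orthonormal DCT-II matrix."""
    k = torch.arange(n, dtype=torch.float64).unsqueeze(1)  # frequency index (rows)
    i = torch.arange(n, dtype=torch.float64).unsqueeze(0)  # sample index (columns)
    c = math.sqrt(2.0 / n) * torch.cos(math.pi * (2 * i + 1) * k / (2 * n))
    c[0] /= math.sqrt(2.0)  # extra scaling of the DC row makes the matrix orthonormal
    return c


def dct2(block: torch.Tensor) -> torch.Tensor:
    """2D DCT of a square block: transform the rows, then the columns."""
    c = dct_matrix(block.shape[-1])
    return c @ block @ c.T


# A constant block has no spatial variation, so only the DC coefficient is non-zero.
flat_block = torch.ones(8, 8, dtype=torch.float64)
coeffs = dct2(flat_block)
print(coeffs[0, 0].item())
```

For a color block, the same function can be applied to each of the three channels separately before flattening the coefficients into the network input.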

Yes, I read in other articles that when the images are small, as is the case here (8x8), I should decrease the learning rate. I also tried several values for the dropout probability (0.2, 0.3 and 0.5).

Yes, I tried the same model on a similar dataset and got acceptable results. This suggests that the problem may be in the dataset used.

By default F.dropout will be activated.
Since you most likely don’t want to use it during evaluation, you would have to disable it by passing the self.training attribute to it.

Could you use a small subset of your dataset (e.g. just 10 samples) and try to overfit your model on it?
If that doesn’t work, i.e. if the loss is not decreasing towards zero, you might have a bug in your code, so that we might need to dig a bit deeper.
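A minimal sketch of such an overfitting check is shown below. It uses 10 random tensors as a stand-in for 10 real samples (one per class) and a model of the same size as in the question but without dropout; both choices are assumptions made to keep the example self-contained. With real data, `torch.utils.data.Subset` could be used to pick a few samples from the existing dataset instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for 10 real samples: random flattened 8x8 RGB blocks, one per class.
data = torch.randn(10, 3 * 8 * 8)
target = torch.arange(10)

# Same layer sizes as the model in the question, without dropout.
model = nn.Sequential(nn.Linear(3 * 8 * 8, 385), nn.ReLU(), nn.Linear(385, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

losses = []
model.train()
for step in range(500):
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

model.eval()
with torch.no_grad():
    acc = (model(data).argmax(dim=1) == target).float().mean().item()

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.4f}, accuracy {acc:.2f}")
```

If the loss on the real 10-sample subset does not fall in a similar way, the training loop, the labels, or the preprocessing are the first places to look.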

The training loss over 100 epochs:


Another plot (with visdom):

Are these plots the results for just 10 samples?
If so, it looks a bit bad, as your model cannot overfit this tiny dataset in 100 epochs.
Let me know what kind of experiments you have tried and we’ll debug it further.

Thanks, @ptrblck!

Yes, these plots correspond to the experiment with only 10 images per class.
OK. I will try to summarize my experiments as best I can and share them.