Overfitting - validation accuracy does not increase

Hi! I am working on rock/scissors/paper classification in pytorch.

For some reason, my validation accuracy does not increase. The maximum I get on validation is 38%…

Can anyone help me increase the validation accuracy?

My dataset info: the training dataset is a generalized 300x300 rock/scissors/paper dataset from pytorch.

My validation dataset consists of images taken by some students (converted to 300x300).

Let me show my code first.

The data is converted to png first:

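import os
from PIL import Image

c = 0  # running index used to name the converted files

lst_fist01 = []
# os.listdir returns a snapshot, so the files saved below are not revisited
for i in os.listdir('./Dataset3/validation/paper'):
    print(i)
    if i.endswith('.jpg'):
        img = Image.open(r'./Dataset3/validation/paper/{}'.format(i))
        img = img.resize((300, 300), Image.LANCZOS)
        # The .png is written next to the original .jpg, which stays in the folder
        img.save(r'./Dataset3/validation/paper/{}.png'.format(c))
        c += 1
        print(img)
        lst_fist01.append(img)
    else:
        print('skipped (not a .jpg)')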

Custom data

import os
import numpy as np
import torch
import torch.nn as nn
import natsort

from skimage.transform import resize
from PIL import Image
from skimage.color import rgb2gray
import imageio

# Data Loader
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, data_dir, transform=None):
        # Label mapping: 0: Paper, 1: Rock, 2: Scissors
        self.paper_dir = os.path.join(data_dir,'paper/')
        self.rock_dir = os.path.join(data_dir,'rock/')
        self.scissors_dir = os.path.join(data_dir,'scissors/')
        
        self.transform = transform
    
        lst_paper = os.listdir(self.paper_dir)
        lst_rock = os.listdir(self.rock_dir)
        lst_scissors = os.listdir(self.scissors_dir)

        lst_paper = [f for f in lst_paper if f.endswith('.png')]
        
        lst_rock = [f for f in lst_rock if f.endswith('.png')]
        lst_scissors = [f for f in lst_scissors if f.endswith('.png')]
        

        self.lst_dir = [self.paper_dir] * len(lst_paper) + [self.rock_dir] * len(lst_rock) + [self.scissors_dir] * len(lst_scissors)
        self.lst_prs = natsort.natsorted(lst_paper) + natsort.natsorted(lst_rock) + natsort.natsorted(lst_scissors)
    
    def __len__(self):
        return len(self.lst_prs)
    
    def __getitem__(self, index):
        # Return only the path parts; images are loaded in custom_collate_fn
        return [self.lst_dir[index], self.lst_prs[index]]

    def custom_collate_fn(self, data):
        
        inputImages = []
        outputVectors = []

        for sample in data:
            prs_img = imageio.imread(os.path.join(sample[0], sample[1]))
            gray_img = rgb2gray(prs_img)

            if gray_img.ndim == 2:
                gray_img = gray_img[:, :, np.newaxis]
            
            inputImages.append(gray_img.reshape(300, 300, 1))
#             inputImages.append(resize(gray_img, (89, 100, 1)))
            
            
            # 0: Paper, 1: Rock, 2: Scissors
            dir_split = sample[0].split('/')
            if dir_split[-2] == 'paper':
                outputVectors.append(np.array(0))
            elif dir_split[-2] == 'rock':
                outputVectors.append(np.array(1))
            elif dir_split[-2] == 'scissors':
                outputVectors.append(np.array(2))

        data = {'input': inputImages, 'label': outputVectors}

        if self.transform:
            data = self.transform(data)

        return data


class ToTensor(object):
    def __call__(self, data):
        label, input = data['label'], data['input']
        
        input_tensor = torch.empty(len(input),300, 300)
        
        label_tensor = torch.empty(len(input))
        for i in range(len(input)):
            input[i] = input[i].transpose((2, 0, 1)).astype(np.float32)
            input_tensor[i] = torch.from_numpy(input[i])
            label_tensor[i] = torch.from_numpy(label[i])
        
        input_tensor = torch.unsqueeze(input_tensor, 1)
        data = {'label': label_tensor.long(), 'input': input_tensor}

        return data

Training code

import os
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import transforms, datasets
from copy import copy
import warnings
warnings.filterwarnings('ignore')

transform = transforms.Compose([ToTensor()])

dataset_train = CustomDataset("./Dataset3/train/", transform=transform)

loader_train = DataLoader(dataset_train, batch_size=64,
                          shuffle=True, collate_fn=dataset_train.custom_collate_fn, num_workers=1)

dataset_val = CustomDataset("./Dataset3/validation/", transform=transform)
# Validation order does not affect the metrics, so no shuffling is needed
loader_val = DataLoader(dataset_val, batch_size=64,
                        shuffle=False, collate_fn=dataset_val.custom_collate_fn, num_workers=1)

# Count only the samples the datasets actually serve (.png files).
# os.listdir would also count any other files in the class folders, such as
# original .jpg files left next to the converted .png files, and that would
# deflate the reported accuracy.
num_train = len(dataset_train)
num_val = len(dataset_val)

# print(len(dataset_train))
# print(len(dataset_val))
# print(len(loader_train))
# print(len(loader_val), loader_val, type(loader_val))
# print(type(dataset_val.custom_collate_fn), dataset_val.custom_collate_fn)
# Define Model
model = nn.Sequential(nn.Conv2d(1, 32, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(32, 64, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(64, 128, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(128, 256, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(256, 256, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(256, 128, 2, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=2),
                    nn.Conv2d(128, 64, 2, padding=0),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=1),
                    torch.nn.Flatten(),
                    nn.Linear(1024, 64, bias = True),
                    nn.Dropout(0.85),
                    nn.Linear(64, 3, bias = True),
                   )

soft = nn.Softmax(dim=1)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("Current device:", device)

model.to(device)

# Define the loss
criterion = nn.CrossEntropyLoss().to(device)

# Define the optimizer
optim = torch.optim.Adam(model.parameters(), lr = 0.001)

best_epoch = 0
accuracy_save = np.array(0)
epochs = 20

for epoch in range(epochs):
    model.train()
    train_loss = []
    correct_train = 0
    correct_val = 0
    correct_batch = 0

    for batch, data in enumerate(loader_train, 1):
        label = data['label'].to(device)
        inputs = data['input'].to(device)

        output = model(inputs)
        label_pred = soft(output).argmax(1)
        
        optim.zero_grad()
        
        loss = criterion(output, label)
        loss.backward()
        
        optim.step()
        
        correct_train += (label == label_pred).float().sum()
        
        train_loss += [loss.item()]

    accuracy_train = correct_train / num_train
  
    correct_val = 0
    accuracy_tmp = np.array(0)
    
    with torch.no_grad():
        model.eval() 
        val_loss = []
        for batch, data in enumerate(loader_val, 1):

            label_val = data['label'].to(device)
            input_val = data['input'].to(device)
  
            output_val = model(input_val)
  
            label_val_pred = soft(output_val).argmax(1)
  
            correct_val += (label_val == label_val_pred).float().sum()
  
            loss = criterion(output_val, label_val)
            val_loss += [loss.item()]
  
        accuracy_val = correct_val / num_val
  
        # Save the best model wrt val accuracy
        accuracy_tmp = accuracy_val.cpu().numpy()
        if accuracy_save < accuracy_tmp:
            best_epoch = epoch
            accuracy_save = accuracy_tmp.copy()
            torch.save(model.state_dict(), 'param.data')
            print(".......model updated (epoch = ", epoch+1, ")")
        
    print("epoch: %04d / %04d | train loss: %.5f | train accuracy: %.4f | validation loss: %.5f | validation accuracy: %.4f" %
          (epoch+1, epochs, np.mean(train_loss), accuracy_train, np.mean(val_loss), accuracy_val))

print("Model with the best validation accuracy is saved.")
print("Best epoch: ", best_epoch)
print("Best validation accuracy: ", accuracy_save)
print("Done.")

With this code, my result is:

Current device: cuda
…model updated (epoch = 1 )
epoch: 0001 / 0020 | train loss: 1.10044 | train accuracy: 0.3313 | validation loss: 1.09863 | validation accuracy: 0.3311
…model updated (epoch = 2 )
epoch: 0002 / 0020 | train loss: 1.09831 | train accuracy: 0.3433 | validation loss: 1.09812 | validation accuracy: 0.3412
…model updated (epoch = 3 )
epoch: 0003 / 0020 | train loss: 0.70037 | train accuracy: 0.6794 | validation loss: 1.72336 | validation accuracy: 0.3682
…model updated (epoch = 4 )
epoch: 0004 / 0020 | train loss: 0.13533 | train accuracy: 0.9544 | validation loss: 2.76044 | validation accuracy: 0.3868
epoch: 0005 / 0020 | train loss: 0.06367 | train accuracy: 0.9790 | validation loss: 5.63855 | validation accuracy: 0.3514
epoch: 0006 / 0020 | train loss: 0.01963 | train accuracy: 0.9933 | validation loss: 6.07453 | validation accuracy: 0.3784
epoch: 0007 / 0020 | train loss: 0.00843 | train accuracy: 0.9972 | validation loss: 5.76545 | validation accuracy: 0.3733
epoch: 0008 / 0020 | train loss: 0.00917 | train accuracy: 0.9968 | validation loss: 8.41605 | validation accuracy: 0.3547
epoch: 0009 / 0020 | train loss: 0.01139 | train accuracy: 0.9960 | validation loss: 11.67817 | validation accuracy: 0.3395
epoch: 0010 / 0020 | train loss: 0.07132 | train accuracy: 0.9774 | validation loss: 3.93959 | validation accuracy: 0.3598
…model updated (epoch = 11 )
epoch: 0011 / 0020 | train loss: 0.03753 | train accuracy: 0.9857 | validation loss: 3.51701 | validation accuracy: 0.3885
epoch: 0012 / 0020 | train loss: 0.01213 | train accuracy: 0.9948 | validation loss: 3.99175 | validation accuracy: 0.3818
epoch: 0013 / 0020 | train loss: 0.00558 | train accuracy: 0.9984 | validation loss: 4.74905 | validation accuracy: 0.3784
…model updated (epoch = 14 )
epoch: 0014 / 0020 | train loss: 0.00826 | train accuracy: 0.9964 | validation loss: 4.72692 | validation accuracy: 0.3936
epoch: 0015 / 0020 | train loss: 0.00351 | train accuracy: 0.9996 | validation loss: 5.66631 | validation accuracy: 0.3716

Can anyone give me some tips to increase validation accuracy?

Thanks a lot for reading.

Hi Dong!

I haven’t looked at your code, but here are a couple of comments:

You don’t say how large your training dataset is. If it’s rather small
it will be easy to overfit, and hard to train your network well enough
to perform well on your validation dataset.

If your training dataset is small, the best thing would be to train
on a larger dataset. If you can’t get a larger training dataset, you
might be able to get better results by augmenting your training
dataset by including things like flipped and shifted and rescaled
versions of your original images.

If your validation dataset is rather different in character from your
training dataset – and here it might be because it has been collected
differently from your training dataset – even a well-trained network
might perform poorly on your validation dataset.

In an ideal world, you would train your network on a training dataset
and then have it perform inference well on rather different images. But
this isn’t always realistic.

Consider training a cat-dog classifier only on images of short-haired
cats and long-haired dogs. If you then try to validate it on a dataset
containing only long-haired cats and short-haired dogs, you shouldn’t
be surprised if it does poorly, maybe even systematically mistaking
cats for dogs and vice versa.
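
One rough way to check whether two datasets differ in character is to compare simple pixel statistics. The sketch below is only that: the function name intensity_stats is made up, the arrays are synthetic stand-ins for grayscale images, and similar statistics would not prove that the datasets are alike. A large gap, though, would hint at a mismatch in brightness or background.

```python
import numpy as np


def intensity_stats(images):
    """Return (mean, std) of the pixel values over a list of 2-D grayscale arrays."""
    stacked = np.stack([np.asarray(img, dtype=np.float64) for img in images])
    return float(stacked.mean()), float(stacked.std())


# Synthetic stand-ins: bright "training" images and dark "validation" images
rng = np.random.default_rng(0)
train_like = [rng.uniform(0.6, 1.0, size=(300, 300)) for _ in range(8)]
val_like = [rng.uniform(0.0, 0.4, size=(300, 300)) for _ in range(8)]

train_mean, train_std = intensity_stats(train_like)
val_mean, val_std = intensity_stats(val_like)
print(f"train: mean={train_mean:.3f} std={train_std:.3f}")
print(f"val:   mean={val_mean:.3f} std={val_std:.3f}")
```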

I would suggest that you randomly split your original training dataset
into a smaller training dataset and a separate validation dataset.

You could also combine your original training and validation datasets
into one large dataset, and then randomly split that into a training
and a validation dataset.

Both of these approaches will ensure that your training and validation
datasets have the same character, avoiding the “short-haired cat,
long-haired dog” issue.
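
Both kinds of split can be sketched with torch.utils.data.random_split. In this minimal example the TensorDataset objects are small random stand-ins for the real datasets, and the 80/20 ratio and the seed are arbitrary assumptions.

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset, random_split

# Random stand-ins for the original training and validation datasets
original_train = TensorDataset(torch.rand(100, 1, 8, 8), torch.randint(0, 3, (100,)))
original_val = TensorDataset(torch.rand(40, 1, 8, 8), torch.randint(0, 3, (40,)))

# Fixed seed so that the split is reproducible
generator = torch.Generator().manual_seed(0)

# Option 1: hold out 20% of the original training dataset for validation
n_val_a = len(original_train) // 5
train_a, val_a = random_split(
    original_train, [len(original_train) - n_val_a, n_val_a], generator=generator)

# Option 2: merge both datasets, then split the merged dataset 80/20
combined = ConcatDataset([original_train, original_val])
n_val_b = len(combined) // 5
train_b, val_b = random_split(
    combined, [len(combined) - n_val_b, n_val_b], generator=generator)

print(len(train_a), len(val_a), len(train_b), len(val_b))
```

random_split returns Subset objects that fetch items from the underlying dataset, so they can be passed to a DataLoader in the same way as the original datasets.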

Best.

K. Frank

Thank you so much for your kind reply.

My training dataset has about 2,500 images (840 each for rock, scissors, and paper).

My validation dataset has about 600 images (200 each for rock, scissors, and paper).

So, do I need to augment only the “train data”, or both datasets? I am sorry to ask a dumb question, but I am a beginner at pytorch.

Thanks!