CNN does not predict properly / does not converge as expected

Hello world,

To get familiar with machine learning, I am implementing a CNN for the Malaria Cells Dataset from Kaggle. The CNN trains and produces predictions, but they are completely useless: the accuracy is terrible and I don't know why. I paid attention to the topology, used non-linear layers, and chose an appropriate loss function. I wonder why the target tensor is horizontal rather than vertical, but that does not seem to be the problem. The prediction vector [value for class zero, value for class one] is nearly the same for every input, which does not make sense to me. When I apply the softmax function at the end, the predictions are nearly [0.5 0.5] for every input after several epochs, yet the reported loss is below one thousandth. I know that these 'predictions' are made in training mode, but I guess they would be even worse in evaluation mode. Hence I have not run the evaluation method, and it is not even implemented correctly yet. I really appreciate your help, and thank you for your time.
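
For reference, here is a minimal sketch of what a cross-entropy loss computes for one sample, assuming (as with `torch.nn.CrossEntropyLoss`) that the loss receives raw network outputs and applies log-softmax itself. The helper `cross_entropy` is a hypothetical pure-Python stand-in for illustration, not the PyTorch implementation.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class (hypothetical helper)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Two equal outputs carry no class information, whatever their value.
loss_uniform = cross_entropy([0.5, 0.5], target=0)

# A strongly separated output for the correct class gives a tiny loss.
loss_confident = cross_entropy([10.0, 0.0], target=0)

print(loss_uniform, loss_confident)
```

Under this assumption, an output of [0.5 0.5] gives a loss of ln 2, roughly 0.693, for either class. A loss below one thousandth would need outputs that separate the classes clearly, so it is worth checking which tensor actually reaches the loss function.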

Edit: Increasing the learning rate to 0.1 causes an output of nearly [1 0] for every input.

Edit 2: What I have tried so far:
using torchvision normalization, using dropout, switching from SGD to the Adam optimizer, and changing the learning rate. Nothing helped.

Here is the code:

import torch
import torch.nn as nn
import torch.nn.functional
import torch.utils.data as d
import torchvision.datasets as tv
import torchvision.transforms as trans
import torch.nn.functional as F
import torch.optim as opt

class MalariaNetwork(nn.Module):
    def __init__(self):
        super(MalariaNetwork, self).__init__()
        self.conv1 = nn.Conv2d(1,8,kernel_size = (5,5)) 
        self.conv2 = nn.Conv2d(8,16, kernel_size = (3,3))
        self.conv3 = nn.Conv2d(16, 32, kernel_size = (3,3))
        self.conv4 = nn.Conv2d(32,64, kernel_size= (5,5))
        #Maxpool Layers
        self.maxp1 = nn.MaxPool2d((2,2))
        self.maxp2 = nn.MaxPool2d((2,2))
        self.maxp3 = nn.MaxPool2d((2,2))
        self.maxp4 = nn.MaxPool2d((2,2))
        #Linear Layers
        self.linear1 = nn.Linear(64*5*5, 600)
        self.linear2 = nn.Linear(600,100)
        self.linear3 = nn.Linear(100,2)
    def forward(self,x):
        x = F.relu(self.conv1(x))# In: 1*128*128, Out: 8x124x124
        x = self.maxp1(x)        # In: 8x124x124, Out: 8*62*62
        x = F.relu(self.conv2(x))# In: 8*62*62,   Out: 16*60*60
        x = self.maxp2(x)        # In: 16x60x60,   Out: 16*30*30
        x = F.relu(self.conv3(x))# In: 16*30*30,  Out: 32*28*28
        x = self.maxp3(x)        # In: 32*28*28,  Out: 32*14*14
        x = F.relu(self.conv4(x))# In: 32*14*14,  Out:64*10*10
        x = self.maxp4(x)        # In: 64*10*10,  Out:64*5*5
        x = x.reshape(-1,64*5*5)
        x = F.relu(self.linear1(x))
        x = F.relu(self.linear2(x))
        x =self.linear3(x)
        #x = F.softmax(x)
        return x

data_transformations = [trans.Grayscale(),
def load_data(data_dir, train_perc = 10, val_perc = 70, batch_size = 10):
    dataset = tv.ImageFolder(data_dir, transform = trans.Compose(data_transformations))
    data_size = len(dataset) # number of samples available to load
    train_size, val_size = train_perc*data_size//100, val_perc*data_size//100 
    #print(train_size, val_size)
    indcs = list(range(data_size))
    train_indcs, val_indcs = indcs[(data_size//2)-(train_size//2) : (data_size//2)+(train_size//2) ], indcs[train_size:val_size] 
    train_sampler, val_sampler = d.SubsetRandomSampler(train_indcs), d.SubsetRandomSampler(val_indcs)
    train_loader = d.DataLoader(dataset, batch_size = batch_size, sampler = train_sampler) 
    eval_loader = d.DataLoader(dataset, batch_size =  batch_size, sampler = val_sampler) 
    return train_loader, eval_loader
def train(network =  None, epochs = 0, train_loader =  None, learning_rate = 0.001):
    optimizer = opt.SGD(network.parameters(), lr=learning_rate)
    loss_fn = torch.nn.CrossEntropyLoss()
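    for epoch in range(epochs):
        for inputs, targets in train_loader:
            optimizer.zero_grad()  # clear gradients left over from the previous batch
            outputs = network(inputs)
            loss = loss_fn(outputs, targets)
            # Assumption: the posted loop never updated the weights; the two calls
            # below are the standard backward pass and optimizer step.
            loss.backward()
            optimizer.step()
            print(outputs, targets, epoch)
            #print("OP-Size: ",outputs.size(), "Target-Size: ", targets.size())
    return network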
def evaluate(trained_network =  None, eval_loader =  None):
    for inputs, targets in eval_loader:
        output = trained_network(inputs)
        out_idx = torch.argmax(output, dim = 1)
        print(out_idx, targets)
    print("I am an evaluating Method!")
nw = MalariaNetwork()
train_loader, eval_loader = load_data('dir')
trained_network = train(nw, epochs = 40, train_loader = train_loader, learning_rate= 0.01)
#evaluate(trained_network=trained_network, eval_loader = eval_loader)
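
The shape comments in `forward` can be checked without running the network. This is a minimal sketch, assuming 128x128 inputs (as the comments state), stride 1 and no padding for the convolutions, and 2x2 pooling with the default stride. `conv_out` and `pool_out` are hypothetical helpers, not PyTorch functions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (floor division, no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel):
    """Output side length of max pooling whose stride equals its kernel size."""
    return size // kernel

size = 128  # assumed input side length, taken from the comments in forward()
for kernel in (5, 3, 3, 5):  # kernel sizes of conv1 .. conv4
    size = pool_out(conv_out(size, kernel), 2)

flat_features = 64 * size * size  # 64 channels after conv4
print(size, flat_features)
```

With these assumptions the side length goes 128, 124, 62, 60, 30, 28, 14, 10, 5, so the flattened size is 64*5*5 = 1600, which matches the input size of `linear1`.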

I once had a similar problem where my predictions would just freak out. I lowered the learning rate and boom done, spent too much time debugging that…
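
To illustrate why the step size matters so much, here is a toy sketch: plain gradient descent on f(w) = w^2. It is only an analogy for the network above, and `gradient_descent` is a hypothetical helper written for this example.

```python
def gradient_descent(lr, steps=20, w=1.0):
    """Minimize f(w) = w**2 with a fixed learning rate; returns the final w."""
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w = w - lr * grad   # each step multiplies w by (1 - 2 * lr)
    return w

w_small_lr = gradient_descent(lr=0.1)  # factor 0.8 per step: shrinks toward 0
w_large_lr = gradient_descent(lr=1.1)  # factor -1.2 per step: oscillates and grows

print(w_small_lr, w_large_lr)
```

With the small rate, w moves steadily toward the minimum at 0. With the large rate, every step overshoots and the magnitude of w keeps growing, which is roughly what "freaking out" looks like on a one-dimensional problem.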

I suggest that you try to overfit a small subset of your dataset (like 100 images) with a lower learning rate (0.001). Report back ;)
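
A minimal sketch of that overfitting check follows. It makes several assumptions: random tensors stand in for the real images, the small model is hypothetical and not the `MalariaNetwork` from the question, and Adam is used as the optimizer (the suggestion above only fixes the learning rate).

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset, TensorDataset

torch.manual_seed(0)

# Stand-in data (assumption): 200 random 1x32x32 "images" with random binary labels.
full_dataset = TensorDataset(torch.randn(200, 1, 32, 32),
                             torch.randint(0, 2, (200,)))

# Keep only the first 100 samples for the overfitting test.
small_set = Subset(full_dataset, range(100))
loader = DataLoader(small_set, batch_size=10, shuffle=True)

# Hypothetical small model: 32 -> conv 3x3 -> 30 -> pool 2x2 -> 15.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 15 * 15, 64), nn.ReLU(),
    nn.Linear(64, 2),  # raw scores; CrossEntropyLoss applies log-softmax itself
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

def run_epoch():
    """One pass over the subset; returns the mean training loss."""
    total = 0.0
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        total += loss.item() * inputs.size(0)
    return total / len(small_set)

epoch_losses = [run_epoch() for _ in range(20)]
print(epoch_losses[0], epoch_losses[-1])
```

If the training loss on such a small subset does not fall clearly over the epochs, the problem is more likely in the data pipeline or the training loop than in the hyperparameters.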

Hi, thanks for your advice :) I started my PC today, ran the program with the same parameters, and for some reason it now performs quite decently. You might be right that changing the parameters is the way to go. Unfortunately, I cannot learn anything from this problem, because there is no specific method to solve it. It is more like trial and error.

Sure there is! It just takes a lot of time to learn the tricks. This AI dude just released his approach to training neural networks, to avoid such problems. Have a look.