How to improve my AI model (number of layers/neurons and layer types)

I’m studying AI in practice using PyTorch, so I would like some tips to improve the model I created.

The model’s inputs and targets follow this rule:

if input[0] + input[1] >= 0:
  target = 10
else:
  target = -10
inputs = torch.rand(1000000, 2).sub(.5).mul(200)

# tensor([[ 25.8606, -40.9372],
#        [ 79.3800,  44.6780],
#        [ 76.0107, -77.0862],
#        [-21.3406, -27.1793],
#        [ 71.2402, -81.8930],
#        [ 37.5089,  73.1345],
#        [ 73.6303,  69.5081],
#        [-52.1048, -70.0359],
#        [ 45.5813, -86.4757],
#        [ 96.3207,   5.4885]])
targets = (inputs.sum(1, keepdim=True) > 10).float().sub(.5).mul(20)

# tensor([[-10.],
#        [ 10.],
#        [-10.],
#        [-10.],
#        [-10.],
#        [ 10.],
#        [ 10.],
#        [-10.],
#        [-10.],
#        [ 10.]])

The model

import os
from pprint import pprint

import torch
from torch import nn


class Model2D(nn.Module):
    CHECKPOINT_DIR = './checkpoints'

    def __init__(self):
        super().__init__()
        # 2 inputs -> 4 -> 2 -> 1 output, all plain linear layers
        self.layer_1 = nn.Linear(in_features=2, out_features=4)
        self.layer_2 = nn.Linear(in_features=4, out_features=2)
        self.layer_3 = nn.Linear(in_features=2, out_features=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer_3(self.layer_2(self.layer_1(x)))

    def save(self):
        # make sure the checkpoint directory exists before saving
        os.makedirs(self.CHECKPOINT_DIR, exist_ok=True)
        torch.save(obj=self.state_dict(),
                   f=f"{self.CHECKPOINT_DIR}/Model2D.pth")

    def load(self):
        if os.path.isfile(f"{self.CHECKPOINT_DIR}/Model2D.pth"):
            self.load_state_dict(torch.load(f"{self.CHECKPOINT_DIR}/Model2D.pth"))
            print("The checkpoint was loaded with the following weights and biases: ")
            pprint(self.state_dict())

[Plot: train loss and test loss]

The problem is that my model stops improving (it does not learn any more). I think it is because of the number/type of layers.

Could somebody help me with some tips to improve my model?

I shared the code on Google Colab
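
Roughly, the training loop looks like this (the optimizer, learning rate, epoch count and train/test split below are just illustrative placeholders, not the exact Colab code; it reuses the inputs, targets and Model2D defined above):

model = Model2D()
criterion = nn.L1Loss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# simple train/test split of the generated data
train_x, test_x = inputs[:800000], inputs[800000:]
train_y, test_y = targets[:800000], targets[800000:]

for epoch in range(100):
    # full-batch updates just to keep the sketch short
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(train_x), train_y)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        test_loss = criterion(model(test_x), test_y)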

Use non-linear functions (activations) after your linear layers; torch.relu() could be a good start:

def forward(self, x: torch.Tensor) -> torch.Tensor:
    x = torch.relu(self.layer_1(x))
    # ... rest of layers
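
For the three layers in your model that would look roughly like this:

def forward(self, x: torch.Tensor) -> torch.Tensor:
    # non-linearity after each hidden linear layer
    x = torch.relu(self.layer_1(x))
    x = torch.relu(self.layer_2(x))
    # keep the output layer linear for this kind of target
    return self.layer_3(x)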


Maybe you can get better results if you use non-linearity; in this case, use an activation function between the layers. I would suggest using nn.ReLU() (ReLU — PyTorch 2.3 documentation). My suggestion is to change it to this:

def __init__(self):
    super().__init__()
    self.layer_1 = nn.Linear(in_features=2, out_features=4)
    self.relu_1 = nn.ReLU()
    self.layer_2 = nn.Linear(in_features=4, out_features=2)
    self.relu_2 = nn.ReLU()
    self.layer_3 = nn.Linear(in_features=2, out_features=1)
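
And in forward() you would call the activations between the layers, something along these lines:

def forward(self, x: torch.Tensor) -> torch.Tensor:
    x = self.relu_1(self.layer_1(x))
    x = self.relu_2(self.layer_2(x))
    return self.layer_3(x)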

Thanks for the suggestions.
I changed my model to include nn.ReLU activations.
I also changed my labels from (-10 or 10) to (0 or 1).
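
The label change is basically just dropping the scaling from the original target line, something like:

# targets are now 0.0 / 1.0 instead of -10.0 / 10.0
targets = (inputs.sum(1, keepdim=True) > 10).float()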

Now I’m getting better predictions.

My model changed:

    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(in_features=2, out_features=4)
        self.act_1 = nn.ReLU()
        self.layer_2 = nn.Linear(in_features=4, out_features=2)
        self.act_2 = nn.ReLU()
        self.layer_3 = nn.Linear(in_features=2, out_features=1)
        # self.act_3 = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act_1(self.layer_1(x))
        x = self.act_2(self.layer_2(x))
        # x = self.act_3(self.layer_3(x))
        x = self.layer_3(x)
        return x

I’m using nn.L1Loss() as the loss function. Do you suggest a more appropriate loss function for this kind of problem?


I think it is a good loss function for the problem you are trying to solve. But if you want to run some experiments to compare, I would also give the L2 loss (MSE loss) a try.
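
Swapping it is a one-line change if you want to compare, for example:

criterion = nn.L1Loss()     # current: mean absolute error
# criterion = nn.MSELoss()  # alternative: mean squared error (L2 loss)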
