Why does increasing the number of layers and neurons not affect my accuracy?

Hey Community!

I hope some of you can help me out with this one:

So, I have a network with 36 inputs. The full dataset has more than half a million records, but currently I'm working with a smaller subset (about 40,000 samples).

The network's aim is to predict the number of people in a room, depending on the environment. The environment is represented by the 36 input neurons.

At first, I tried using 4 layers, each with 36 inputs and outputs. The absolute difference between the predicted number of people and the number that should be there is about 160. This is quite a lot, given that the count only ranges between 0 and 2100. When I train on all the data it is about 200.

So now I have tried to increase the number of layers and neurons in my network. As you can see in my code, I have 10 layers, all of them with at least 10,000 output neurons. How can it be that my network is still that inaccurate? Does the change affect my network in any way? Can someone please tell me? I know the scheme behind neural networks, and increasing all these parameters should help… normally…

So this is my network class:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Network(nn.Module):
        def __init__(self):
            super(Network, self).__init__()
            self.linear1 = nn.Linear(36, 10000)  # 36 input neurons, 10000 output neurons, linear layer
            self.linear2 = nn.Linear(36, 10000)
            self.linear3 = nn.Linear(36, 10000)
            self.linear4 = nn.Linear(36, 10000)
            self.linear5 = nn.Linear(36, 20000)
            self.linear6 = nn.Linear(36, 20000)
            self.linear7 = nn.Linear(36, 10000)
            self.linear8 = nn.Linear(36, 10000)
            self.linear9 = nn.Linear(36, 10000)
            self.linear10 = nn.Linear(36, 1)


        def forward(self, x):
            pax_predict = F.torch.relu(self.linear1(x))
            pax_predict = F.torch.relu(self.linear2(x))
            pax_predict = F.torch.relu(self.linear3(x))
            pax_predict = F.torch.relu(self.linear4(x))
            pax_predict = F.torch.relu(self.linear5(x))
            pax_predict = F.torch.relu(self.linear6(x))
            pax_predict = F.torch.relu(self.linear7(x))
            pax_predict = F.torch.relu(self.linear8(x))
            pax_predict = F.torch.relu(self.linear9(x))
            pax_predict = self.linear10(x)
            return pax_predict

        def num_flat_features(self, pax_predict):
            size = pax_predict.size()[1:]
            num = 1
            for i in size:
                num *= i
            return num



    network = Network()
    print('Network:')
    print(network)
    print()

Followed by the training:

    import math
    import numpy as np
    import matplotlib.pyplot as plt

    optimizer = torch.optim.Adam(network.parameters(), lr=50)
    #optimizer = torch.optim.SGD(network.parameters(), lr=0.0022, momentum=0.8)


    switch = 1


    def training():

        target = y_train_tensor

        for epoch in range(200):

            input = x_train_tensor
            y_prediction = network(input)
            loss = criterion(y_prediction, target)  # criterion, x_train_tensor and y_train_tensor are defined elsewhere
            wurzel = math.sqrt(loss)
            loss_avg = loss / len(y_train_tensor)
            wurzel_avg = math.sqrt(loss_avg)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            print('Epoch:', epoch, ' Total Loss:', loss.data, 'Sqroot:', wurzel)
            print('Average Loss:', loss_avg, 'Avg. Sqrt:', wurzel_avg)
            print()

            plt.scatter(epoch, loss_avg.data, color='r', s=10, marker='.')


        plt.xlabel('Epochs')
        plt.ylabel('Loss')
        plt.savefig('./plot/loss/figure.png', dpi=300)


        ## Weight output ###########################################################################################
        ############################################################################################################


        weights_list = list(network.parameters())
        print('Weights list:')
        print(weights_list)

        weights_numpy = np.asarray(weights_list)

        print('Array:', weights_numpy)

I hope that one of you can help me.

Thank you so much in advance!

Kind Regards
Christian Richter

Before a professional comes here to explain:
I think you are misusing nn.Linear: the output size of each layer has to match the input size of the next one.
You can read the example below (and the sketch that follows it), or read this: How to use nn.Linear

    def __init__(self):
        super(Network, self).__init__()
        self.linear1 = nn.Linear(32, 64)  # 32 input neurons, 64 output neurons, linear layer
        self.linear2 = nn.Linear(64, 128) # 64 input neurons, 128 output neurons, linear layer
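
To make it concrete, here is a rough sketch of how a 36-input network could be chained so that each layer's output size matches the next layer's input size. The hidden sizes 128 and 64 are just made-up examples, not a recommendation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ChainedNetwork(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear1 = nn.Linear(36, 128)  # 36 inputs -> 128 hidden units
            self.linear2 = nn.Linear(128, 64)  # input size matches the previous output size
            self.linear3 = nn.Linear(64, 1)    # single regression output (number of people)

        def forward(self, x):
            x = F.relu(self.linear1(x))  # feed each result into the next layer
            x = F.relu(self.linear2(x))
            return self.linear3(x)

A quick shape check: `ChainedNetwork()(torch.randn(8, 36))` should return a tensor of shape `[8, 1]`.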

I just want to share what I learned. If you increase the number of layers and neurons, you are just increasing the number of parameters (features) the model can fit. But the problem is that your input data may not have that many features for you to extract. I am not really sure, but I remember this is also called the overfitting problem.

In addition, even if your input data does have that many features to extract, you may run into an exploding or vanishing gradient problem.
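
As a rough way to check for that (just an illustrative sketch), you could print the per-layer gradient norms right after `loss.backward()`:

    # after loss.backward(): inspect per-layer gradient norms
    for name, param in network.named_parameters():
        if param.grad is not None:
            # very large or near-zero norms hint at exploding or vanishing gradients
            print(name, param.grad.norm().item())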

If that is the case, we need something that can learn to make your model effectively smaller or bigger when it needs to:
It is called a ResBlock.

Then try to put those ResBlocks between your layers (see the placement sketch further below).
Reference:

    class ResBlock(nn.Module):
        def __init__(self, nf):
            super().__init__()
            # conv_layer is an external helper (e.g. fastai's conv + batch-norm + ReLU)
            self.linear1 = conv_layer(nf, nf)
            self.linear2 = conv_layer(nf, nf)

        def forward(self, x):
            # skip connection: add the input back onto the transformed output
            return x + self.linear2(self.linear1(x))

Concept of a ResBlock, and how to use it to improve the model by adding more layers:


Example code for using ResBlock: https://colab.research.google.com/drive/1NVxx0E6GoX9vzaigCs1urVN3529DsBtn#scrollTo=n3r8ZHuObvXu
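
If it helps, here is a very rough, self-contained sketch of what putting such blocks between fully connected layers could look like. It uses plain nn.Linear layers instead of conv_layer, and the layer sizes are made up:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LinearResBlock(nn.Module):
        """Residual block built from two linear layers (illustration only)."""
        def __init__(self, nf):
            super().__init__()
            self.linear1 = nn.Linear(nf, nf)
            self.linear2 = nn.Linear(nf, nf)

        def forward(self, x):
            # skip connection: the block can learn to stay close to the identity
            return x + self.linear2(F.relu(self.linear1(x)))

    model = nn.Sequential(
        nn.Linear(36, 128), nn.ReLU(),
        LinearResBlock(128),   # ResBlock between the layers
        nn.Linear(128, 64), nn.ReLU(),
        LinearResBlock(64),
        nn.Linear(64, 1),
    )

    out = model(torch.randn(8, 36))  # -> shape [8, 1]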

If you want, please share a Colab notebook or the dataset with me, and I will try to train it with basic linear layers, or possibly a ResBlock to improve it, if I have time.

I feel you don’t really understand the nn.Linear layer. Try reading this: https://stackoverflow.com/questions/54916135/what-is-the-class-definition-of-nn-linear-in-pytorch


Thank you very much for your answer!!
I will try this in the future.

I also found out that I made a mistake in my forward pass. It's a copy & paste mistake: I always used x in each line, instead of a "growing" variable (x1, x2, x3 and so on).
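
In other words, the corrected forward should pass each layer's output on to the next, roughly like this (assuming the layer sizes are also chained correctly):

    # before (copy & paste bug): every layer receives the original input x
    pax_predict = F.torch.relu(self.linear1(x))
    pax_predict = F.torch.relu(self.linear2(x))  # ignores the previous result

    # after: pass each layer's output on to the next layer
    x1 = F.torch.relu(self.linear1(x))
    x2 = F.torch.relu(self.linear2(x1))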

I don't think I have really answered anything. I hope someone who is a professional comes here and answers your question.

@JonathanSum is correct. Increasing the number of neurons/layers doesn't necessarily enhance performance. In fact, too many neurons can cause overfitting on small amounts of data. Increasing the number of layers can cause problems, too (you can read the ResNet paper). But in your case, I think the bad performance is likely due to the improper use of nn.Linear.


Thank you, @G.M, for your reply.
You were right: the mistake was the way I used the layers.
And thank you for the information about overfitting.