# Wrong Output: Average loss doesn't go below a certain number

Hey Community,
I'm very new to PyTorch and recently wrote my first working program.
I am doing this for my bachelor's degree and have problems with the output of my model.

I am using a dataset of the number of people in a room at the same time, depending on parameters such as the date, time, holidays, weather conditions and so on. The aim is to predict the number of people at a certain point in the future, e.g. half an hour, one hour or two hours ahead.

There are 13 input values and one output.

I think my biggest problem right now is configuring the parameters so that my loss drops further. Right now, the average loss always ends up at about 1.5.

I hope someone has some tips or tricks for handling my data correctly, since this is required for my degree.

I will leave you the code here.

Kind regards
Christian Richter

``````
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import pandas as pd
import matplotlib.pyplot as plt
from torch.autograd import Variable  # needed for the Variable(...) calls below

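### Dataset ###

# NOTE: `dataset` is assumed to be a pandas DataFrame loaded beforehand; the loading code is not shown.
x_temp = dataset.iloc[:, :-1].values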

print(x_temp)
print()
print(x_temp.size)
print()

y_temp = dataset.iloc[:, 13:].values

print(y_temp)
print()
print(y_temp.size)
print()

x_train_tensor = torch.FloatTensor(x_temp)
y_train_tensor = torch.FloatTensor(y_temp)

### Network Architecture ###

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.linear1 = nn.Linear(13, 13)  # 13 input neurons, 13 output neurons, linear layer
        self.linear2 = nn.Linear(13, 13)
        self.linear3 = nn.Linear(13, 13)
        self.linear4 = nn.Linear(13, 1600)

    def forward(self, x):
        pax_predict = F.torch.sigmoid(self.linear1(x))
        pax_predict = F.torch.sigmoid(self.linear2(x))
        pax_predict = F.torch.sigmoid(self.linear3(x))
        pax_predict = self.linear4(x)
        return pax_predict

    def num_flat_features(self, pax_predict):
        size = pax_predict.size()[1:]
        num = 1
        for i in size:
            num *= i
        return num

network = Network()
print(network)

## Loss-Functions ###

criterion = nn.MSELoss()

target = Variable(y_train_tensor)

### OPTIMIZER ###

#optimizer = torch.optim.SGD(network.parameters(), lr=0.00000001)       # epochs: 50-100
#optimizer = torch.optim.Adam(network.parameters(), lr=5)               # epochs: 50
optimizer = torch.optim.SGD(network.parameters(), lr=1, momentum=0.8)               # epochs: 50-100

### Training ###

for epoch in range(100):
    input = Variable(x_train_tensor)
    y_pred = network(input)

    loss = criterion(y_pred, target)

    loss_avg = loss / len(y_train_tensor)

    loss.backward()
    optimizer.step()

    print('Epoch:', epoch, ' Total Loss:', loss.data)
    print('Average Loss:', loss_avg)
    print()

    plt.scatter(epoch, loss_avg.data, color='r', s=10, marker='o')

#plt.show
plt.savefig('./plot/figure.png')

#test_exp = torch.Tensor([,, , , , , , , ])
test_exp = torch.Tensor([[231216,36,5,0,1,0,0,0,0,1,1,1,1]])

result = network(test_exp).data.item()

print('Predicted count: ', result)
``````

The code looks generally alright besides some minor issues:

• `Variables` are deprecated since PyTorch `0.4.0`, so you can just use tensors directly in newer versions
• Although this shouldn’t be an issue in your code, you shouldn’t use the `.data` attribute, as this might create silent errors in training code
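As a minimal sketch of both points: plain tensors can be passed to the model directly, `.item()` turns a one-element tensor into a Python float for printing or plotting, and `.detach()` gives a tensor that is cut off from the autograd graph. The random data and the single-layer model here are made-up stand-ins, not your dataset or network.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in data: 8 samples with 13 features and 1 target each.
x = torch.randn(8, 13)
y = torch.randn(8, 1)

model = nn.Linear(13, 1)
criterion = nn.MSELoss()

# Plain tensors go straight into the model; no Variable wrapper is needed.
loss = criterion(model(x), y)
loss.backward()

# .item() converts a one-element tensor to a Python float (for print or plt.scatter).
loss_value = loss.item()

# .detach() returns a tensor that shares the data but is cut off from the autograd graph.
predictions = model(x).detach()

print('Loss:', loss_value)
print('requires_grad:', predictions.requires_grad)
```

With this pattern, something like `plt.scatter(epoch, loss.item(), ...)` should work without touching `.data`.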

That being said, the learning rate (`lr=1` for SGD) seems to be quite high.
Have you tried other non-linear activations, e.g. relu?
I would recommend scaling down your model (and data) a bit and making it work with a simplified version first.
Once your model trains successfully, you could try to scale it up again.
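A scaled-down version might look like the sketch below. It runs on synthetic data that I made up, so treat it as a starting point under those assumptions and not as a fix for your dataset. It also differs from the posted code in a few ways that may or may not explain the plateau: the posted `forward` passes `x` into every layer instead of the previous layer's output, the last layer has 1600 outputs for a single target column, the loop never calls `optimizer.zero_grad()`, and `nn.MSELoss` already averages by default, so dividing by `len(y_train_tensor)` averages a second time.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical synthetic data standing in for the real dataset:
# 256 samples, 13 features on very different scales, 1 target column.
x = torch.randn(256, 13) * torch.linspace(1.0, 1000.0, 13)
true_w = torch.randn(13, 1)
y = (x / x.std(dim=0)) @ true_w + 0.1 * torch.randn(256, 1)

# Standardize every input column to zero mean and unit variance.
x_scaled = (x - x.mean(dim=0)) / x.std(dim=0)

# Small model: each layer consumes the output of the previous one.
model = nn.Sequential(
    nn.Linear(13, 8),
    nn.ReLU(),
    nn.Linear(8, 1),  # one output to match the single target column
)

criterion = nn.MSELoss()  # reduction='mean' by default, so the loss is already an average
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.8)

losses = []
for epoch in range(200):
    optimizer.zero_grad()  # clear gradients accumulated in the previous step
    loss = criterion(model(x_scaled), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print('First loss:', losses[0])
print('Last loss:', losses[-1])
```

If the loss falls on a small setup like this, you can swap in your real data and grow the model step by step, checking after each change that it still trains.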


Thank you very much for your suggestions!
I removed the `Variables` and now use the tensors directly.
When I remove the `.data` attribute from the result and from the `plt.scatter` call, I get an error. How do I get around this? Removing it from `print('Epoch:', epoch, ' Total Loss:', loss.data)` worked without problems.

I tried a lot of learning rates. Most of them converge to a loss of 1.5.

I also tried relu in my forward pass, but for thesis reasons I use sigmoid.
Or do you mean the “linear layers”? If so, could you tell me how to use a different type there? I wasn't able to find a working alternative.

I will try downsizing my model a little and see how the output changes.
Thank you so much so far!

Regards
Christian

EDIT:
I tried using a smaller data model. This delivered less accurate values.
The average loss now converges to 2.

I tried using ReLU in my `__init__` method, but this error message comes up:

``````
TypeError: __init__() takes from 1 to 2 positional arguments but 3 were given
``````

I don't know why linear works while ReLU or Sigmoid won't.
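The failing call isn't shown, so this is a guess: the message is what a call like `nn.ReLU(13, 13)` produces. `nn.Linear` has weights and therefore takes `(in_features, out_features)`, while activation modules such as `nn.ReLU` and `nn.Sigmoid` have no weights and take no sizes. A minimal sketch, using a hypothetical `SmallNetwork` that is not the model from the thread:

```python
import torch
import torch.nn as nn


class SmallNetwork(nn.Module):
    """Hypothetical example network, not the original poster's model."""

    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(13, 13)  # layers with weights take (in_features, out_features)
        self.activation = nn.Sigmoid()    # activations have no weights and take no sizes
        self.linear2 = nn.Linear(13, 1)

    def forward(self, x):
        # The activation is applied to the output of the linear layer.
        hidden = self.activation(self.linear1(x))
        return self.linear2(hidden)


network = SmallNetwork()
output = network(torch.randn(4, 13))
print(output.shape)

# Passing layer sizes to an activation module raises a TypeError.
try:
    nn.ReLU(13, 13)
    raised = False
except TypeError:
    raised = True
print('TypeError raised:', raised)
```

Activations therefore sit between the linear layers and do not replace them; switching `nn.Sigmoid()` for `nn.ReLU()` in `__init__` is enough to change the non-linearity.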