# My basic model doesn't learn, can someone help me?

The code does some basic backprop, then calculates the average loss for the first and second half of training. The loss stays about the same… I think I messed up the backpropagation somehow?

``````
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

device = "gpu" if torch.cuda.is_available() else "cpu"
print(f"device: {device}")

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        # layers:
        self.L1 = nn.Linear(1, 4)
        self.L2 = nn.Linear(4, 8)
        self.L3 = nn.Linear(8, 10)
        self.L4 = nn.Linear(10, 10)
        self.L5 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.sigmoid(self.L1(x))
        x = torch.sigmoid(self.L2(x))
        x = torch.sigmoid(self.L3(x))
        x = torch.sigmoid(self.L4(x))
        x = self.L5(x)
        x = F.softmax(x, dim=0)
        return x

firstHalf = 0
secondHalf = 0
net = Network().to(device)
optimizer = optim.SGD(net.parameters(), lr=0.01)
criterion = nn.MSELoss()
epochs = 500

while epochs > 0:
    input = torch.rand(1)

    # determine target according to the input: 1 for >0.5, 0 for <0.5
    if input[0] < 0.5:
        target = torch.zeros(1)  # tensor([0.])
    else:
        target = torch.ones(1)  # tensor([1.])

    out = net(input)
    loss = criterion(out, target)
    loss.backward()
    optimizer.step()  # Does the update

    # sum the loss for each half:
    if epochs < 251:
        firstHalf += loss
    else:
        secondHalf += loss

    epochs -= 1

print(f"first half: {firstHalf/250:.4f}\n"
      f"second half: {secondHalf/250:.4f}")
``````

The problem is the softmax layer at the end of your model. First of all, I don’t think a softmax layer makes sense here at all, but besides that you are computing the softmax over one single value, which always evaluates to 1:

``````
softmax(x)_i = e^x_i / SUM_j e^x_j
# if x is a scalar: x = x_i, leading to
softmax(x)_i = e^x_i / e^x_i = 1
``````

Therefore your model can’t predict anything other than `1`. If you add a print statement for your output, you will see that it is always equal to 1. Try removing the softmax layer; that should help.
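To see the effect in isolation, here is a minimal sketch of the formula above in plain Python. The `softmax` helper is written just for this illustration (it is not the PyTorch function), but `F.softmax(x, dim=0)` on a tensor with a single element follows the same arithmetic.

```python
import math

def softmax(values):
    """Softmax of a list of floats, following e^x_i / SUM_j e^x_j."""
    # Subtracting the maximum does not change the result; it only avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# One element: numerator and denominator are the same number, so the
# result is 1 no matter what the input is.
for x in (-5.0, 0.0, 3.7):
    print(x, softmax([x]))

# Two elements: now the outputs depend on the inputs (and sum to 1).
print(softmax([0.0, 1.0]))
```

With only one output unit, the gradient through the softmax is zero as well, so the loss cannot change the weights.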

Moreover, there are a few other issues with your code:

• `epochs` counts down from 500, so the first half of training is `epochs > 250`; you should change `if epochs < 251` to `if epochs > 250`
• consider using a for loop instead of a while loop
• try to calculate input and label as follows:
``````
input = torch.rand(size=(64, 1))  # this gives you a mini-batch of 64 samples
target = (input >= 0.5).float()   # 1 for input >= 0.5, 0 otherwise
``````
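Because `epochs` counts down from 500 in the original loop, the first 250 iterations are the ones with the larger counter values. A for loop that counts up makes the per-half bookkeeping easier to check. The sketch below uses a made-up, steadily decreasing list of numbers in place of real training losses (the names `losses`, `first_half` and `second_half` are only for illustration):

```python
# Stand-in losses: a made-up, steadily decreasing sequence instead of real
# training losses, so the bookkeeping can be checked by hand.
epochs = 500
losses = [1.0 - step / epochs for step in range(epochs)]

first_half = 0.0
second_half = 0.0
for step, loss in enumerate(losses):
    # steps 0..249 belong to the first half, steps 250..499 to the second
    if step < epochs // 2:
        first_half += loss
    else:
        second_half += loss

first_avg = first_half / (epochs // 2)
second_avg = second_half / (epochs // 2)
print(f"first half: {first_avg:.4f}\nsecond half: {second_avg:.4f}")
```

In the real training loop, `loss.item()` would take the place of `loss` here, so that a plain Python float is accumulated instead of a tensor that still carries its computation graph.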

A cleaner version of your code could look like this:

``````
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# PyTorch names the GPU device "cuda", not "gpu"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"device: {device}")

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.L1 = nn.Linear(1, 4)
        self.L2 = nn.Linear(4, 8)
        self.L3 = nn.Linear(8, 10)
        self.L4 = nn.Linear(10, 10)
        self.L5 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.sigmoid(self.L1(x))
        x = torch.sigmoid(self.L2(x))
        x = torch.sigmoid(self.L3(x))
        x = torch.sigmoid(self.L4(x))
        # no softmax: return the raw output of the last layer
        x = self.L5(x)
        return x

net = Network().to(device)
optimizer = optim.SGD(net.parameters(), lr=0.01)
criterion = nn.MSELoss()
epochs = 500

loss_hist = []
net.train()
for epoch in range(1, epochs + 1):
    # mini-batch of 64 samples, created on the same device as the model
    input = torch.rand(size=(64, 1), device=device)
    # target: 1 for input >= 0.5, 0 otherwise
    target = (input >= 0.5).float()

    optimizer.zero_grad()  # clear the gradients left over from the previous step
    out = net(input)
    loss = criterion(out, target)
    loss.backward()
    optimizer.step()
    loss_hist.append(loss.item())

plt.plot(loss_hist)
plt.show()
``````

Thanks, it works! Just one question (if you don’t mind):
what does `net.train()` do? I thought we’re training it in the `for epoch` loop?

`net.train()` puts all layers of the network into training mode, as opposed to `net.eval()`, which puts them into evaluation mode. The reason is that some layers behave differently during training and evaluation, for example batch normalization layers. It is important to note that `net.train()` does not train your model; in some sense it only prepares your model for training.

I don’t think that `net.train()` is even necessary here, since you are only using linear and sigmoid layers, which (as far as I know) work in exactly the same way during training and evaluation. However, I think it is a good thing to always add it to your code, otherwise you might forget it one day when it is actually necessary.
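A small sketch of what the mode switch does in practice. The toy `model` below is hypothetical and not the network from this thread; it contains an `nn.Dropout` layer, which is another layer type that behaves differently in the two modes, purely to make the difference visible:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model; the Dropout layer is only here so that the two
# modes produce visibly different behaviour.
model = nn.Sequential(nn.Linear(1, 4), nn.Dropout(p=0.5), nn.Linear(4, 1))
x = torch.ones(8, 1)

model.train()                      # sets the `training` flag on every submodule
mode_after_train = model.training  # True
train_a = model(x)
train_b = model(x)                 # dropout draws a new random mask on each call

model.eval()                       # clears the flag again
mode_after_eval = model.training   # False
with torch.no_grad():
    eval_a = model(x)
    eval_b = model(x)              # dropout is switched off, so this is deterministic

print("train mode, outputs identical:", torch.equal(train_a, train_b))
print("eval mode, outputs identical: ", torch.equal(eval_a, eval_b))
```

For a network made only of `nn.Linear` layers and sigmoids, as in this thread, both modes give the same outputs, which is why the call is optional there.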