I have an image dataset with 40 classes, and I am trying to classify them using a neural network.
The shape of my input tensor is (1, 1, 64, 64), so the batch size is 1 and the images are grayscale.
I ran the following code, but I am getting an initial loss of around 3.4, which is more than 1. What could be the reason that my model is not learning?
class neural_net(nn.Module):
    def __init__(self):
        super().__init__()
        self.input = nn.Linear(64 * 64, 256)
        self.hidden1 = nn.Linear(256, 256)
        self.output = nn.Linear(256, 40)

    def forward(self, x):
        x = F.relu(self.input(x))
        x = F.relu(self.hidden1(x))
        x = self.output(x)
        return F.log_softmax(x, dim=1)
network1 = neural_net()
optimizer = optim.Adam(network1.parameters(), lr = 0.001)
for epoch in range(10):
    for data in train_dl:
        X, y = data
        network1.zero_grad()
        output = network1(X.view(-1, 4096))
        loss = F.cross_entropy(output, y)
        loss.backward()
        optimizer.step()
    print(loss)
Also, there is another issue I found: I am using cross_entropy to calculate the loss, but I am seeing NLLLossBackward in the output. What could be the reason for that?
During the first forward pass, the loss is expected to be around log(num_classes), i.e. log(40) ≈ 3.689.
Adding to your point about loss > 1: it is not necessary for the loss to always be < 1.
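You can verify that expectation with a quick calculation (a hypothetical one-liner, not from your code):

import math
# With 40 classes and roughly uniform initial predictions,
# the expected cross-entropy loss is -log(1/40) = log(40).
print(math.log(40))  # ≈ 3.689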
nn.CrossEntropyLoss() is actually a combination of nn.LogSoftmax() and nn.NLLLoss(). Refer to this doc.
That's why, when you print the loss, it shows grad_fn=<NLLLossBackward>.
Your code uses LogSoftmax() in the last layer.
In that case, the loss function you should use is F.nll_loss(output, y) instead of F.cross_entropy(output, y).
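Here is a minimal sketch of that equivalence, using random logits and labels (the tensor names are illustrative, not from your code):

import torch
import torch.nn.functional as F

logits = torch.randn(8, 40)            # hypothetical batch of raw scores for 40 classes
targets = torch.randint(0, 40, (8,))   # hypothetical integer labels

# cross_entropy on raw logits equals nll_loss on log_softmax outputs
loss_ce = F.cross_entropy(logits, targets)
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss_ce, loss_nll))  # True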
In the training loop, you might also want to zero out the optimizer's gradients instead of the network's.
So, try:
for epoch in range(10):
    for data in train_dl:
        X, y = data
        optimizer.zero_grad()
        output = network1(X.view(-1, 4096))
        # the model already applies log_softmax, so use nll_loss here
        loss = F.nll_loss(output, y)
        loss.backward()
        optimizer.step()
    print(loss)
Okay, then you can try increasing the number of layers in the neural network.
Also, just take one mini-batch and keep training on that same mini-batch, and see if the loss really goes to zero or the accuracy approaches 100%.
That’s one way to debug your network.
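A minimal sketch of that single-mini-batch check, reusing network1, optimizer, and train_dl from the code above (the step count and print interval are arbitrary choices):

X, y = next(iter(train_dl))              # grab one mini-batch and reuse it every step
for step in range(500):
    optimizer.zero_grad()
    output = network1(X.view(-1, 4096))
    loss = F.nll_loss(output, y)         # model outputs log-probabilities
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(step, loss.item())
# If the model and training loop are correct, this loss should approach zero.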