PyTorch Constant Loss

I am trying to do multi-class classification using a simple ANN.
I don't know why my loss is not decreasing; it remains constant.

Code:

import torch
import torch.nn as nn
import torch.optim as optim

device = "cuda" if torch.cuda.is_available() else "cpu"

class NET(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(6, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 17),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        return self.model(x)

net = NET().to(device)

opt = optim.Adam(net.parameters(), lr=0.01)
loss_fn = nn.L1Loss()


%%time
epochs = 10
loss_arr = []

for epoch in range(epochs):
    opt.zero_grad()

    outputs = net(train_data)
    loss = loss_fn(outputs, train_out)
    loss.backward()

    opt.step()
    loss_arr.append(loss.item())
    print("Epochs : %d/%d ,loss :%f" % (epoch, epochs, loss))

I want the mean absolute error, which is why I am using nn.L1Loss().

Output:

Epochs : 0/10 ,loss :10.401250
Epochs : 1/10 ,loss :10.401250
Epochs : 2/10 ,loss :10.401250
Epochs : 3/10 ,loss :10.401250
Epochs : 4/10 ,loss :10.401250
Epochs : 5/10 ,loss :10.401250
Epochs : 6/10 ,loss :10.401250
Epochs : 7/10 ,loss :10.401250
Epochs : 8/10 ,loss :10.401250
Epochs : 9/10 ,loss :10.401250
Wall time: 1.62 s

The loss changes for random input data using your code snippet:

train_data = torch.randn(64, 6)
train_out = torch.empty(64, 17).uniform_(0, 1)

so I would recommend playing around with some hyperparameters, such as the learning rate.
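
For completeness, this is roughly what that check looks like end to end (a minimal sketch; it reuses the NET class from your post together with the random tensors above, so the printed values are only illustrative):

import torch
import torch.nn as nn
import torch.optim as optim

train_data = torch.randn(64, 6)                 # random inputs with 6 features
train_out = torch.empty(64, 17).uniform_(0, 1)  # random targets in [0, 1]

net = NET()                                     # the model class posted above
opt = optim.Adam(net.parameters(), lr=0.01)
loss_fn = nn.L1Loss()

for epoch in range(5):
    opt.zero_grad()
    loss = loss_fn(net(train_data), train_out)
    loss.backward()
    opt.step()
    print(loss.item())                          # the value should change between iterations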

That being said, I’m not familiar with your use case, but feeding a softmax output into L1Loss doesn’t seem like the usual setup.
If you are dealing with a multi-class classification use case, I would recommend trying out nn.CrossEntropyLoss and passing the raw logits to it.
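
Roughly, that setup looks like this (a minimal sketch with the layer sizes from your model and made-up dummy tensors; nn.CrossEntropyLoss expects raw logits and integer class indices, since it applies log-softmax internally):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(6, 512),
    nn.ReLU(),
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 17),              # raw logits for 17 classes, no nn.Softmax
)
criterion = nn.CrossEntropyLoss()

x = torch.randn(64, 6)                # dummy input batch
target = torch.randint(0, 17, (64,))  # dummy class indices in [0, 17)

loss = criterion(model(x), target)    # logits [64, 17] vs. targets [64]
loss.backward()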


Thanks!! Removing the softmax layer worked.

Sorry to trouble you again, but please help me with this.
I am facing the same issue with another model: I am building a simple digit classifier using an ANN.

Code:

class NET(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 10),  # raw logits, no softmax
        )

    def forward(self, x):
        return self.model(x)

net = NET().to(device)  # instantiated as in the first model


loss_fn = nn.CrossEntropyLoss()
opt = optim.Adam(net.parameters(), lr=0.01)

loss_epoch = []
epochs = 5

for i in range(epochs):
    opt.zero_grad()
    output = net(x_train)
    loss = loss_fn(output, y_train)
    loss.backward()
    opt.step()
    loss_epoch.append(loss.item())

    print("Epochs: {}/{} , Loss:{}".format(i, epochs, loss))

Output:

Epochs: 0/5, Loss:2.301159143447876
Epochs: 1/5, Loss:2.301161289215088
Epochs: 2/5, Loss:2.3011562824249268
Epochs: 3/5, Loss:2.3011701107025146
Epochs: 4/5, Loss:2.3011698722839355

Data: MNIST digit data

Again, I would recommend playing around with hyperparameters, such as the learning rate.
Since the loss changes and no dropout (or other “random” operations) is used, the parameters do get some updates, but the overall training isn’t able to reduce the loss.
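
One quick way to confirm that the parameters are indeed being updated is to compare them before and after a single step (a sketch that reuses the net, opt, loss_fn, x_train, and y_train from your post):

import torch

# snapshot the parameters, take one optimizer step, then compare
before = [p.detach().clone() for p in net.parameters()]

opt.zero_grad()
loss = loss_fn(net(x_train), y_train)
loss.backward()
opt.step()

updated = any(not torch.equal(b, p.detach()) for b, p in zip(before, net.parameters()))
print("parameters updated:", updated)  # expected: True, even though the loss barely moves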

Start by using a lower learning rate (e.g. 1e-3) and try to overfit the current data sample.
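
For example, something along these lines (a sketch; the subset size of 64, the 200 steps, and lr=1e-3 are just starting points, assuming the x_train/y_train from your post):

import torch.optim as optim

# try to overfit a small, fixed subset with a lower learning rate;
# if the loss doesn't approach zero here, something else in the setup is off
x_small, y_small = x_train[:64], y_train[:64]
opt = optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    opt.zero_grad()
    loss = loss_fn(net(x_small), y_small)
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(step, loss.item())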


Thank you very much! It was due to the high LR.
The loss starts decreasing when the LR is 1e-7.