Negative NLLLoss

When I use NLLLoss as the criterion for my CNN model, I get a negative loss, as shown below.
Is that fine? Does it say something about model performance?

[screenshot: training log showing negative loss values]

Is that possible? Also, when I use CrossEntropyLoss I get positive values like:
Epoch: 1 Training Loss: 0.985429 Validation Loss: 0.980953
Validation loss decreased (inf → 0.980953). Saving model …
Epoch: 2 Training Loss: 0.979373 Validation Loss: 0.978553
Validation loss decreased (0.980953 → 0.978553). Saving model …
Epoch: 3 Training Loss: 0.973842 Validation Loss: 0.977175
Validation loss decreased (0.978553 → 0.977175). Saving model …
Epoch: 4 Training Loss: 0.967670 Validation Loss: 0.973433
Validation loss decreased (0.977175 → 0.973433). Saving model …
Epoch: 5 Training Loss: 0.963760 Validation Loss: 0.971874
Validation loss decreased (0.973433 → 0.971874). Saving model …

Also, can you please tell me whether these loss values indicate how well my model is doing? I know it depends on other parameters as well, but judging from the loss alone, how good is the model?
I am pretty new to all this stuff. Please help!

Hi hs!

NLLLoss requires its input to be log-probabilities. To be valid
log-probabilities, these values must be non-positive. You are passing
NLLLoss positive input values so it is returning negative (and
meaningless) values for the loss.

If you pass the output of your final Linear layer through a LogSoftmax
layer, the logits produced by the Linear layer (which can be positive)
will be converted to valid (and non-positive) log-probabilities.
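
For example, here is a minimal sketch of the relationship (the tensors are just
made-up scores):

import torch
import torch.nn as nn

logits = torch.rand(4, 5) + 0.5           # all-positive scores, like raw Linear outputs
targets = torch.tensor([0, 2, 1, 4])

# NLLLoss on raw positive scores: invalid input, the loss comes out negative
print(nn.NLLLoss()(logits, targets))      # negative (meaningless) value

# NLLLoss on log-probabilities: valid, non-negative
log_probs = torch.log_softmax(logits, dim=1)
print(nn.NLLLoss()(log_probs, targets))

# CrossEntropyLoss applies log_softmax internally, so it takes the raw
# scores directly and gives the same value as the line above
print(nn.CrossEntropyLoss()(logits, targets))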

Best.

K. Frank


@KFrank

I have done that as well. I create the model with:

model_1 = CNN(num_classes=5, hid_size=128).to(device)

The model code is:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Swish(nn.Module):
    # Swish was not shown in the post; the standard x * sigmoid(x) is assumed
    def forward(self, x):
        return x * torch.sigmoid(x)


class ConvNormPool(nn.Module):
    """Conv skip-connection module."""
    def __init__(
        self,
        input_size,
        hidden_size,
        kernel_size,
        norm_type='batchnorm'
    ):
        super().__init__()

        self.kernel_size = kernel_size
        self.conv_1 = nn.Conv1d(
            in_channels=input_size,
            out_channels=hidden_size,
            kernel_size=kernel_size
        )
        self.conv_2 = nn.Conv1d(
            in_channels=hidden_size,
            out_channels=hidden_size,
            kernel_size=kernel_size
        )
        self.conv_3 = nn.Conv1d(
            in_channels=hidden_size,
            out_channels=hidden_size,
            kernel_size=kernel_size
        )
        self.swish_1 = Swish()
        self.swish_2 = Swish()
        self.swish_3 = Swish()
        if norm_type == 'group':
            self.normalization_1 = nn.GroupNorm(
                num_groups=8,
                num_channels=hidden_size
            )
            self.normalization_2 = nn.GroupNorm(
                num_groups=8,
                num_channels=hidden_size
            )
            self.normalization_3 = nn.GroupNorm(
                num_groups=8,
                num_channels=hidden_size
            )
        else:
            self.normalization_1 = nn.BatchNorm1d(num_features=hidden_size)
            self.normalization_2 = nn.BatchNorm1d(num_features=hidden_size)
            self.normalization_3 = nn.BatchNorm1d(num_features=hidden_size)

        self.pool = nn.MaxPool1d(kernel_size=2)

    def forward(self, input):
        conv1 = self.conv_1(input)
        x = self.normalization_1(conv1)
        x = self.swish_1(x)
        x = F.pad(x, pad=(self.kernel_size - 1, 0))

        x = self.conv_2(x)
        x = self.normalization_2(x)
        x = self.swish_2(x)
        x = F.pad(x, pad=(self.kernel_size - 1, 0))

        conv3 = self.conv_3(x)
        x = self.normalization_3(conv1 + conv3)   # skip connection
        x = self.swish_3(x)
        x = F.pad(x, pad=(self.kernel_size - 1, 0))

        x = self.pool(x)
        return x


class CNN(nn.Module):
    def __init__(
        self,
        input_size=1,
        hid_size=256,
        kernel_size=5,
        num_classes=5,
    ):
        super().__init__()

        self.conv1 = ConvNormPool(
            input_size=input_size,
            hidden_size=hid_size,
            kernel_size=kernel_size,
        )
        self.conv2 = ConvNormPool(
            input_size=hid_size,
            hidden_size=hid_size // 2,
            kernel_size=kernel_size,
        )
        self.conv3 = ConvNormPool(
            input_size=hid_size // 2,
            hidden_size=hid_size // 4,
            kernel_size=kernel_size,
        )
        self.avgpool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(in_features=hid_size // 4, out_features=num_classes)

    def forward(self, input):
        x = self.conv1(input)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.avgpool(x)
        x = x.view(-1, x.size(1) * x.size(2))   # flatten to (batch, channels)
        x = F.softmax(self.fc(x), dim=1)        # <-- the line in question
        return x

Following your suggestion, I made the change in the last part of the CNN class: I changed x = F.softmax(self.fc(x), dim=1) to
x = F.log_softmax(self.fc(x), dim=1)
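
To double-check the change, a quick smoke test can confirm the outputs are now
valid log-probabilities (the input shape here, a batch of 8 single-channel
signals of length 187, is just an assumption; substitute your real data shape):

import torch
import torch.nn as nn

model_1 = CNN(num_classes=5, hid_size=128)
x = torch.randn(8, 1, 187)            # assumed shape: (batch, channels, length)
out = model_1(x)                      # forward now ends in log_softmax

print(out.shape)                      # torch.Size([8, 5])
print(out.exp().sum(dim=1))           # each row sums to ~1: valid log-probabilities
targets = torch.randint(0, 5, (8,))
print(nn.NLLLoss()(out, targets))     # now non-negative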
Now I get the NLLLoss values as:
[screenshot: training log showing the new loss values]

Will this log_softmax layer also work properly with CrossEntropyLoss, or do I need to change it back to plain softmax?

Please tell me if I am wrong, and also where and what code to insert to correct it. I am pretty new to PyTorch, so please help!
I would be very grateful 🙂

Could you please help me figure out how to go about it? I have put my model code in the reply.