# How to write custom CrossEntropyLoss

I am learning logistic regression in PyTorch, and to understand it better I am defining a custom CrossEntropyLoss as below:

``````
def softmax(x):
    exp_x = torch.exp(x)
    sum_x = torch.sum(exp_x, dim=1, keepdim=True)

    return exp_x/sum_x

def log_softmax(x):
    ...  # body not shown

def CrossEntropyLoss(outputs, targets):
    num_examples = targets.shape[0]
    batch_size = outputs.shape[0]
    outputs = log_softmax(outputs)
    outputs = outputs[range(batch_size), targets]

    return - torch.sum(outputs)/num_examples
``````

I also wrote my own logistic regression model (to classify FashionMNIST) as below:

``````
input_dim = 784 # 28x28 FashionMNIST data
output_dim = 10

# weights as a float32 tensor so they can be multiplied with the input batch
w_init = torch.tensor(np.random.normal(scale=0.05, size=(input_dim, output_dim)), dtype=torch.float32)
b = torch.zeros(output_dim)

def my_model(x):
    bs = x.shape[0]
    return x.reshape(bs, input_dim) @ w_init + b
``````

To validate my custom CrossEntropyLoss, I compared it with `nn.CrossEntropyLoss` from PyTorch by applying both to FashionMNIST data as below:

``````
criterion = nn.CrossEntropyLoss()

for X, y in trn_fashion_dl:
    outputs = my_model(X)
    my_outputs = softmax(outputs)

    my_ce = CrossEntropyLoss(my_outputs, y)
    pytorch_ce = criterion(outputs, y)

    print(f'my custom cross entropy: {my_ce.item()}\npytorch cross entroopy: {pytorch_ce.item()}')
    break
``````

My question is about the results: my_ce (my cross entropy) and pytorch_ce (PyTorch cross entropy) are different:

``````
my custom cross entropy: 9.956839561462402
pytorch cross entroopy: 2.378990888595581
``````

@alie There are two mistakes here.

1. You apply softmax twice: once before calling your custom loss function and again inside it.
2. You are not applying log to the softmax output. Inside `log_softmax()`, use:
`return torch.log(torch.exp(x) / torch.sum(torch.exp(x), dim=1, keepdim=True))`
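As a rough sketch of the second point (not code from the thread), the snippet below compares the log-of-softmax expression with an equivalent form, `x - logsumexp(x)`, and checks both against `torch.log_softmax`. The function names `log_softmax_naive` and `log_softmax_stable`, and the random `logits` batch, are assumptions made purely for illustration.

```python
import torch

def log_softmax_naive(x):
    # Log of the softmax, written out directly; exp() can overflow for large inputs.
    return torch.log(torch.exp(x) / torch.sum(torch.exp(x), dim=1, keepdim=True))

def log_softmax_stable(x):
    # Same quantity rewritten as x - log(sum(exp(x))), using torch.logsumexp.
    return x - torch.logsumexp(x, dim=1, keepdim=True)

torch.manual_seed(0)
logits = torch.randn(4, 10)  # hypothetical batch: 4 examples, 10 classes

reference = torch.log_softmax(logits, dim=1)
naive_matches = torch.allclose(log_softmax_naive(logits), reference, atol=1e-6)
stable_matches = torch.allclose(log_softmax_stable(logits), reference, atol=1e-6)

# With a very large logit, exp() overflows to inf in float32 and the naive form breaks.
big = torch.tensor([[1000.0, 0.0]])
naive_overflows = torch.isnan(log_softmax_naive(big)).any().item()
stable_is_finite = torch.isfinite(log_softmax_stable(big)).all().item()

print(naive_matches, stable_matches, naive_overflows, stable_is_finite)
```

For moderately sized logits both forms should agree; the `logsumexp` form is generally the safer choice when logits can become large.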

@mailcorahul Thanks; after replacing the log_softmax() function with yours, the two cross entropies became closer, but they are still not exactly the same. Is this expected, or is there a mistake somewhere else?

my custom cross entropy: 2.319404125213623
pytorch cross entroopy: 2.6645867824554443

Those two were the mistakes in the code. Can you post your full code again here?

Thanks, sure here it is:

``````
import numpy as np
import torch
import torchvision
from torchvision import transforms, datasets
import torch.nn as nn

def softmax(x):
    exp_x = torch.exp(x)
    sum_x = torch.sum(exp_x, dim=1, keepdim=True)

    return exp_x/sum_x

def log_softmax(x):
    return x - torch.logsumexp(x, dim=1, keepdim=True)

def CrossEntropyLoss(outputs, targets):
    num_examples = targets.shape[0]
    batch_size = outputs.shape[0]
    outputs = log_softmax(outputs)
    outputs = outputs[range(batch_size), targets]

    return - torch.sum(outputs)/num_examples

def my_model(x):
    bs = x.shape[0]
    return x.reshape(bs, input_dim) @ w_init + b

# FashionMNIST datasets for training/test
trans = transforms.Compose([transforms.ToTensor()])
# trn_dl is the training DataLoader over FashionMNIST (its construction is not shown)

# parameter initialization
input_dim = 784 # 28x28 FashionMNIST data
output_dim = 10
# weights as a float32 tensor so they can be multiplied with the input batch
w_init = torch.tensor(np.random.normal(scale=0.05, size=(input_dim, output_dim)), dtype=torch.float32)
b = torch.zeros(output_dim)

# PyTorch CrossEntropyLoss
criterion = nn.CrossEntropyLoss()

for X, y in trn_dl:
    outputs = my_model(X)
    my_outputs = softmax(outputs)

    my_ce = CrossEntropyLoss(my_outputs, y)
    pytorch_ce = criterion(outputs, y)

    print(f'my custom cross entropy: {my_ce.item()}\npytorch cross entroopy: {pytorch_ce.item()}')

    break
``````

@alie I am finding it difficult to follow the code because of its formatting, but I can see you're still applying softmax twice. You can either remove the line `my_outputs = softmax(outputs)` and pass `outputs` directly to your loss, or replace the 3rd line in `CrossEntropyLoss()` with `outputs = torch.log(outputs)`.
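The two options can be sketched as follows, using random tensors in place of a FashionMNIST batch. This is an illustration under assumptions, not the original poster's code: the names `ce_from_logits` and `ce_from_probs`, the batch size of 64, and the random `logits` and `targets` are all made up for the example.

```python
import math
import torch
import torch.nn as nn

def softmax(x):
    exp_x = torch.exp(x)
    return exp_x / torch.sum(exp_x, dim=1, keepdim=True)

def log_softmax(x):
    return x - torch.logsumexp(x, dim=1, keepdim=True)

def ce_from_logits(logits, targets):
    # Option 1: the loss receives raw model outputs and applies log_softmax itself.
    n = targets.shape[0]
    log_probs = log_softmax(logits)
    return -torch.sum(log_probs[torch.arange(n), targets]) / n

def ce_from_probs(probs, targets):
    # Option 2: the caller has already applied softmax; the loss only takes the log.
    n = targets.shape[0]
    log_probs = torch.log(probs)
    return -torch.sum(log_probs[torch.arange(n), targets]) / n

torch.manual_seed(0)
logits = torch.randn(64, 10)            # stand-in for my_model(X)
targets = torch.randint(0, 10, (64,))   # stand-in for the labels y

reference = nn.CrossEntropyLoss()(logits, targets)
fix_1 = ce_from_logits(logits, targets)
fix_2 = ce_from_probs(softmax(logits), targets)

# The bug discussed above: softmax outside the loss AND log_softmax inside it.
double = ce_from_logits(softmax(logits), targets)

# With 10 classes, softmax applied to probabilities is nearly uniform, so the
# buggy loss is confined to the interval [log(9 + e) - 1, log(9 + e)].
upper = math.log(9 + math.e)
lower = upper - 1.0

print(f'reference: {reference.item()}')
print(f'fix 1:     {fix_1.item()}')
print(f'fix 2:     {fix_2.item()}')
print(f'double:    {double.item()} (bounded by [{lower:.3f}, {upper:.3f}])')
```

Both fixes should agree with `nn.CrossEntropyLoss` up to floating-point error. The doubly softmaxed loss stays within roughly 1.46 to 2.46 for 10 classes regardless of the model outputs, which is consistent with the 2.319 reported earlier and explains why it looked close to the PyTorch value without matching it.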

Thanks again, removing the following line solves the problem; now both cross entropies return exactly the same value:

`my_outputs = softmax(outputs)`
