# I’m confused about the way that I calculate my loss function

I’m confused about how I calculate my loss.

Here is the function:

``````
def test_epoch(iterator, model, criterion):
    train_loss = 0
    all_y = []
    all_y_hat = []
    model.eval()
    for batch in iterator:
        y = torch.stack([batch.toxic,
                         batch.severe_toxic,
                         batch.obscene,
                         batch.threat,
                         batch.insult,
                         batch.identity_hate], dim=1).float().to(device)
        text, length = batch.comment_text
        length = length.to('cpu')
        y_hat = model(text, length)
        loss = criterion(y_hat, y)
        train_loss += loss.item()
        all_y.append(y)
        all_y_hat.append(y_hat)
    y = torch.vstack(all_y)
    y_hat = torch.vstack(all_y_hat)
    roc = roc_auc_score(y.cpu(), y_hat.round().detach().cpu())
    return train_loss / len(y), roc

``````

The way I calculate the loss in the function above is:

``````
train_loss = 0
...
loss = criterion(y_hat, y)
...
train_loss += loss.item()
...
return train_loss / len(y), roc

``````

and in the first epoch it gives

`Loss: 0.0148(valid) | roc: 0.547727 (valid)`

but when I calculate the loss in this way:

``````
all_loss = []
...
loss = criterion(y_hat, y)
...
all_loss.append(loss.item())
...
return np.mean(all_loss), roc

``````

in the first epoch it gives

`Loss: 0.7691(valid) | roc: 0.548824 (valid)`

Why is the loss from the first approach so different from the loss from the second? Which one should I use or rely on?

Thanks!

Could you provide an executable code snippet using random data to reproduce the different results, please?

Most likely because in the first case:

1. You accumulate `loss.item()`, which by default is already averaged over the batch
2. and later you divide that sum by the number of samples in the full epoch, `len(y)`

and in the second case you:

1. append every `loss.item()` to the list
2. average it over the number of items in the list, `np.mean(all_loss)`

The second way, I suppose, gives the correct result. To correct the first case, you need to divide by the number of batches instead: `train_loss / len(iterator)`. That way the results should be equal, I guess.

To summarize: `len(y)` in the first case is not equal to `len(all_loss)` in the second case.
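The mismatch can be reproduced without a model. Here is a minimal sketch in plain Python, assuming made-up per-sample losses and a hypothetical batch size of 50 (none of these names or numbers come from the original code). Each batch mean stands in for what `loss.item()` returns with the default mean reduction:

```python
import random

random.seed(0)

# Hypothetical setup: 1000 per-sample losses, split into equal batches of 50.
n_samples, batch_size = 1000, 50
sample_losses = [random.uniform(0.5, 1.0) for _ in range(n_samples)]
batches = [sample_losses[i:i + batch_size]
           for i in range(0, n_samples, batch_size)]

# With mean reduction, the criterion returns one averaged value per batch.
batch_means = [sum(b) / len(b) for b in batches]

# What `train_loss += loss.item()` accumulates over the epoch.
running = sum(batch_means)

first_way = running / n_samples                    # like train_loss / len(y)
second_way = sum(batch_means) / len(batch_means)   # like np.mean(all_loss)
fixed_first_way = running / len(batches)           # like train_loss / len(iterator)
true_mean = sum(sample_losses) / n_samples         # per-sample average

print(f"first way:       {first_way:.4f}")
print(f"second way:      {second_way:.4f}")
print(f"fixed first way: {fixed_first_way:.4f}")
print(f"true mean:       {true_mean:.4f}")
```

With equal-sized batches, the first way comes out smaller than the others by a factor equal to the batch size, because an already averaged value gets divided by the sample count a second time.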


In case you want a code example (sorry, it is lengthy):

``````
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
# download=True and batch_size=64 are assumptions; 64 matches the 157 batches
# over the 10000 test images reported in the outputs below.
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64,
                                         shuffle=False, num_workers=2)
net = torchvision.models.resnet18()
net.fc = nn.Linear(512, 10)  # for CIFAR10
net.to(device)
criterion = nn.CrossEntropyLoss()
``````

Case 1:

``````
def test_epoch_case1(iterator, model, criterion):

    train_loss = 0
    all_y = []

    model.eval()

    for batch in iterator:
        X, y = batch  # each batch is an (images, labels) pair
        X, y = X.to(device), y.to(device)

        y_hat = model(X)

        loss = criterion(y_hat, y)  # already averaged over the batch
        train_loss += loss.item()

        all_y.append(y)

    y = torch.cat(all_y)
    print(f'This function returns collected train_loss: {train_loss} averaged by number of samples in y: {len(y)}')

    # Divides a sum of per-batch means by the number of samples.
    return train_loss / len(y)

``````

prints:

``````
This function returns collected train_loss: 362.682909488678 averaged by number of samples in y: 10000
0.0362682909488678
``````

Case 2:

``````
def test_epoch_case2(iterator, model, criterion):

    all_loss = []
    model.eval()

    for batch in iterator:
        X, y = batch  # each batch is an (images, labels) pair
        X, y = X.to(device), y.to(device)

        y_hat = model(X)

        loss = criterion(y_hat, y)  # already averaged over the batch
        all_loss.append(loss.item())

    print(f'This function returns collected train_loss: {np.sum(all_loss)} averaged by number of batches in dataloader: {len(iterator)} = {np.sum(all_loss) / len(iterator)}')

    # Averages the per-batch means over the number of batches.
    return np.mean(all_loss)

``````

prints:

``````
This function returns collected train_loss: 362.682909488678 averaged by number of batches in dataloader: 157 = 2.310082226042535
2.310082226042535
``````
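One caveat on averaging per batch: 10000 samples in 157 batches means the last batch is smaller than the rest (16 samples if the batch size is 64), so the mean of the batch means is only approximately the per-sample mean. If the exact per-sample average matters, each batch mean can be weighted by its batch size. A rough sketch in plain Python with made-up per-sample losses (all names here are hypothetical, and each batch mean stands in for `loss.item()`):

```python
import random

random.seed(0)

# Same shape as the run above: 10000 samples in batches of 64
# gives 156 full batches plus one final batch of 16.
n_samples, batch_size = 10_000, 64
sample_losses = [random.expovariate(1.0) for _ in range(n_samples)]
batches = [sample_losses[i:i + batch_size]
           for i in range(0, n_samples, batch_size)]

weighted_sum = 0.0   # sum of batch_mean * batch_size
seen = 0             # number of samples processed so far
batch_means = []

for b in batches:
    batch_mean = sum(b) / len(b)         # stands in for loss.item()
    batch_means.append(batch_mean)
    weighted_sum += batch_mean * len(b)  # undo the per-batch averaging
    seen += len(b)

mean_of_batch_means = sum(batch_means) / len(batch_means)
per_sample_mean = weighted_sum / seen

print(f"batches: {len(batches)}, last batch size: {len(batches[-1])}")
print(f"mean of batch means: {mean_of_batch_means:.6f}")
print(f"per-sample mean:     {per_sample_mean:.6f}")
```

In the PyTorch loop this would correspond to accumulating `loss.item() * y.size(0)` and dividing by the total number of samples at the end, assuming the criterion uses the default mean reduction. For a loader with a single short final batch the difference is usually small.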

Case 1 (corrected):

``````
def test_epoch_case1_correct(iterator, model, criterion):

    train_loss = 0
    all_y = []

    model.eval()

    for batch in iterator:
        X, y = batch  # each batch is an (images, labels) pair
        X, y = X.to(device), y.to(device)

        y_hat = model(X)

        loss = criterion(y_hat, y)  # already averaged over the batch
        train_loss += loss.item()

        all_y.append(y)

    y = torch.cat(all_y)
    print(f'This function returns collected train_loss: {train_loss} averaged by number of batches in dataloader: {len(iterator)}')

    # Divides the sum of per-batch means by the number of batches.
    return train_loss / len(iterator)
``````

prints:

``````
This function returns collected train_loss: 362.682909488678 averaged by number of batches in dataloader: 157
``````

Hope it helps