I’m confused about the way I calculate my loss. Here is the function:
```python
def test_epoch(iterator, model, criterion):
    train_loss = 0
    all_y = []
    all_y_hat = []
    model.eval()
    for batch in iterator:
        y = torch.stack([batch.toxic,
                         batch.severe_toxic,
                         batch.obscene,
                         batch.threat,
                         batch.insult,
                         batch.identity_hate], dim=1).float().to(device)
        text, length = batch.comment_text
        length = length.to('cpu')
        with torch.no_grad():
            y_hat = model(text, length)
        loss = criterion(y_hat, y)
        train_loss += loss.item()
        all_y.append(y)
        all_y_hat.append(y_hat)
    y = torch.vstack(all_y)
    y_hat = torch.vstack(all_y_hat)
    roc = roc_auc_score(y.cpu(), y_hat.round().detach().cpu())
    return train_loss / len(y), roc
```
The way I calculated the loss in the function above is:
```python
train_loss = 0
...
loss = criterion(y_hat, y)
...
train_loss += loss.item()
...
return train_loss / len(y), roc
```
and at the first epoch it gives:

```
Loss: 0.0148(valid) | roc: 0.547727 (valid)
```
But when I calculate the loss this way:
```python
all_loss = []
...
loss = criterion(y_hat, y)
...
all_loss.append(loss.item())
...
return np.mean(all_loss), roc
```
it gives at the first epoch:

```
Loss: 0.7691(valid) | roc: 0.548824 (valid)
```
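To illustrate the discrepancy with dummy numbers: a minimal sketch (assuming the criterion uses `reduction='mean'`, the PyTorch default, so each `loss.item()` is already a per-batch *mean*) showing how the two aggregations diverge by roughly a factor of the batch size:

```python
import numpy as np

# Hypothetical per-sample losses for 3 batches of 4 samples each.
batch_losses = [np.array([0.8, 0.7, 0.9, 0.6]),
                np.array([0.5, 0.6, 0.7, 0.8]),
                np.array([0.9, 0.8, 0.7, 0.6])]

# Method 1: sum the per-batch MEANS, then divide by the total sample count.
total = sum(b.mean() for b in batch_losses)    # sum of 3 batch means
n_samples = sum(len(b) for b in batch_losses)  # 12 samples in total
method1 = total / n_samples

# Method 2: average the per-batch means directly.
method2 = np.mean([b.mean() for b in batch_losses])

# method1 is about batch_size (4x) smaller than method2,
# because the batch means were divided by the sample count again.
print(method1, method2)
```

So the two numbers measure different things: method 1 double-divides by the batch size, while method 2 is the mean of batch means.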
Why is the loss from the first method so different from the loss from the second, and which one should I use or rely on?

Thanks!