LOSS value interpretation and result

Hi everybody. I’m a little dubious about the results I got.

  1. Is it reasonable to have these figures for the first epoch?
  2. Which loss should I use to draw the loss-per-epoch graph: Loss_sum(total), loss(mean), or epoch_loss?
total_loss = total_loss2 = 0.0
for i, (xA, xB, score) in enumerate(train_loader, 1):
    # ... forward pass ...
    loss = criterion(...)
    total_loss += loss.item()                   # running sum of batch losses
    total_loss2 += loss.item() * len(xA)        # running sum weighted by batch size
    if i % 5 == 0:
        Loss_sum = (total_loss / i) * len(xA)   # "Loss_sum(total)" in the output below
        mean_loss = total_loss / i              # "loss(mean)" in the output below
        # print(...)
    # ...

epoch_loss = total_loss2 / len(train_loader.dataset)

# ...
# evaluation loop
total_loss_eval = cum_loss = 0.0
for i, (xA, xB, score) in enumerate(eval_loader2, 1):
    # ... forward pass ...
    loss = criterion(...)
    total_loss_eval += loss.item()
    cum_loss += loss.item() * len(xA)
    if i % 3 == 0:
        print(f'total_loss_eval {(total_loss_eval / i) * len(xA)}')

LOSS_eval = cum_loss / len(eval_loader2.dataset)   ## "EVAl Epoch" line in the output below
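
For reference, here is a toy check of how the three quantities relate (a minimal sketch: it assumes the criterion returns a per-batch mean loss, and the batch size of 50 is only inferred from the 250-samples-per-5-printed-batches steps in the output below):

# Toy numbers only: hypothetical per-batch losses and an assumed constant batch size.
batch_means = [0.12, 0.11, 0.13, 0.10, 0.115]   # what criterion(...) might return per batch
B = 50                                          # assumed batch size

total_loss  = sum(batch_means)                  # running sum of batch losses
total_loss2 = sum(m * B for m in batch_means)   # running sum weighted by batch size
i = len(batch_means)

mean_loss  = total_loss / i                     # "loss(mean)"
Loss_sum   = mean_loss * B                      # "Loss_sum(total)" = mean loss * batch size
epoch_loss = total_loss2 / (i * B)              # per-sample average over everything seen so far

print(mean_loss, Loss_sum, epoch_loss)          # -> 0.115, 5.75, 0.115 (approximately)

So with a constant batch size, Loss_sum is just loss(mean) scaled by the batch size, and epoch_loss works out to the same value as the final loss(mean).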

Here is the output of the first epoch.

[0m 15s] Train Epoch: 1 [250/4500 (6%)]	 Loss_sum(total): 5.72 loss(mean) 0.11438
[0m 30s] Train Epoch: 1 [500/4500 (11%)]	 Loss_sum(total): 6.32 loss(mean) 0.12634
[0m 46s] Train Epoch: 1 [750/4500 (17%)]	 Loss_sum(total): 6.49 loss(mean) 0.12977
begin validation ...
total_loss_eval   6.013869825336668
EVAl Epoch: 1 [500]	Loss: 0.12
save currently the best model to [/content/model.pt]

[1m 3s] Train Epoch: 1 [1000/4500 (22%)]	 Loss_sum(total): 6.67 loss(mean) 0.13343
[1m 19s] Train Epoch: 1 [1250/4500 (28%)]	 Loss_sum(total): 6.29 loss(mean) 0.12571
.
.
total_loss_eval  2.934242847065131
EVAl Epoch: 1 [500]	Loss: 0.06
.
.

[4m 43s] Train Epoch: 1 [4250/4500 (94%)]	 Loss_sum(total): 3.65 loss(mean) 0.07292
[4m 58s] Train Epoch: 1 [4500/4500 (100%)]	 Loss_sum(total): 3.56 loss(mean) 0.07124
begin validation ...
total_loss_eval  2.032051028476821
 EVAl Epoch: 1 [500]	Loss: 0.04

epoch_loss  0.0712374652425448
..........

It is totally reasonable to get loss values like these.
As for which loss to use: you could plot all three, and theoretically they should show similar patterns, but if you only want one, then use epoch_loss (a plotting sketch is below).
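
For the graph itself, a minimal sketch (assuming matplotlib; epoch_losses and eval_losses are hypothetical lists that you append epoch_loss and LOSS_eval to at the end of each epoch):

import matplotlib.pyplot as plt

# Hypothetical lists, filled once per epoch with epoch_loss and LOSS_eval.
epoch_losses = []
eval_losses = []

def plot_loss_curves(epoch_losses, eval_losses):
    epochs = range(1, len(epoch_losses) + 1)
    plt.plot(epochs, epoch_losses, label='train (epoch_loss)')
    plt.plot(epochs, eval_losses, label='eval (LOSS_eval)')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend()
    plt.show()

# plot_loss_curves(epoch_losses, eval_losses)  # call once training is finished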

First of all, thanks a million for your response. However, I assumed that for the first few epochs the loss is usually greater than 2 or 3 (I have seen that on various pages and …); that’s why I was doubtful. Again, epoch_loss was 0.071 for the first epoch. In epoch 9 I got this:

[42m 12s] Train Epoch: 9 [2500/4500 (56%)]	 Loss_sum(total): 2.06 loss(mean) 0.04125
[42m 28s] Train Epoch: 9 [2750/4500 (61%)]	 Loss_sum(total): 2.06 loss(mean) 0.04128
[42m 44s] Train Epoch: 9 [3000/4500 (67%)]	 Loss_sum(total): 2.06 loss(mean) 0.04121
begin validation ...

total_loss_eval 2.040340803149674
 EVAl Epoch: 9 [500]	Loss: 0.04

Also, epoch_loss is 0.0409. Somewhere I have seen that the train loss should be lower than the eval loss, but that isn’t the case for me. If I’m wrong, please correct me.

Sometimes your training loss can be higher than your eval loss, and other times it may not be (speaking from a theoretical standpoint). It may be a coincidence, or it may be that your model simply did not overfit (which is a good thing).

Also, as for your question of why the loss in the first epoch is so small: can you tell me how you normalized your data (if you did)?
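
The scale of the loss depends directly on the scale of the targets. A minimal sketch of the idea, assuming an MSE criterion and scores already rescaled to [0, 1] (neither of which is stated in the thread):

import torch
import torch.nn as nn

# Illustration only: with targets normalized to [0, 1], even a trivial predictor
# that always outputs the mean gets a small MSE, so a first-epoch loss around 0.1
# is not surprising on that scale.
torch.manual_seed(0)
scores = torch.rand(1000)                        # hypothetical scores scaled to [0, 1]
trivial_pred = torch.full_like(scores, scores.mean().item())

mse = nn.MSELoss()
print(mse(trivial_pred, scores).item())          # roughly the variance of the scores, ~0.08

If the scores were instead on, say, a 0–5 scale, the same relative errors would give an MSE about 25 times larger, so rules of thumb like "loss is usually more than 2 or 3 at first" only make sense relative to the target scale and the criterion.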