I'm training my model, and the validation loss seems to be much smaller than the training loss. Yet at test time the model does not perform well. I'm not sure why that is. Any suggestions, please?
Below is the image of both losses. The top is the training loss while the bottom is the validation loss.
There could be any number of reasons.
Perhaps you’re computing your loss with
sum() (instead of mean()) and
your validation batch size is smaller than your training batch size, so summing
over fewer batch elements gives you a smaller loss.
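To make the sum-versus-mean point concrete, here is a minimal plain-Python sketch. The batch sizes (64 and 16) and the per-sample loss of 0.5 are made-up numbers chosen for illustration, and batch_loss is a hypothetical helper, not a library function.

```python
# Hypothetical per-sample losses: every sample has the same loss of 0.5,
# so the two sets are equally "hard" by construction.
train_batch = [0.5] * 64   # assumed training batch size of 64
val_batch = [0.5] * 16     # assumed validation batch size of 16


def batch_loss(per_sample, reduction):
    """Reduce a list of per-sample losses with either "sum" or "mean"."""
    total = sum(per_sample)
    return total if reduction == "sum" else total / len(per_sample)


# With "sum", the smaller validation batch produces a smaller number
# even though the per-sample loss is identical.
train_sum = batch_loss(train_batch, "sum")
val_sum = batch_loss(val_batch, "sum")

# With "mean", the batch size cancels out and the two values agree.
train_mean = batch_loss(train_batch, "mean")
val_mean = batch_loss(val_batch, "mean")

print(train_sum, val_sum, train_mean, val_mean)
```

If your own curves look like the "sum" case, dividing each batch loss by the number of samples in that batch (or using a mean reduction) should put the two curves on the same scale.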
Perhaps you’re using
model.train() for training and model.eval() for validation and, for example,
turning on such things as
Dropout layers makes the performance of your
model worse in training mode.
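One way to see the effect of the mode switch is to compute the loss on the same data in both modes. The sketch below assumes PyTorch is installed; the toy model, the random data, and the helper loss_in_mode are all made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model; the Dropout layer is the part that matters here.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(32, 1),
)
loss_fn = nn.MSELoss()

# Random stand-in data (an assumption, not the poster's dataset).
x = torch.randn(256, 10)
y = torch.randn(256, 1)


def loss_in_mode(training: bool) -> float:
    """Compute the loss on (x, y) with the model in train or eval mode."""
    model.train(training)  # train(False) is equivalent to eval()
    with torch.no_grad():
        return loss_fn(model(x), y).item()


# In eval mode Dropout is a no-op, so repeated evaluations agree.
eval_a = loss_in_mode(False)
eval_b = loss_in_mode(False)

# In train mode Dropout zeroes random activations, so the loss is noisy
# and generally differs from the eval-mode loss on the very same data.
train_a = loss_in_mode(True)
train_b = loss_in_mode(True)

print(eval_a, eval_b, train_a, train_b)
```

The training loss you log during training is computed in train mode, so it includes this Dropout noise, while the validation loss does not.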
Perhaps the character of your evaluation dataset is different than that of
your training set. If your evaluation dataset happened to contain a lot more
“easy” data samples than your training dataset, those easy samples could
lead to a lower loss.
Or perhaps you just have a bug in your code somewhere.
What happens if you pretend that your training dataset is your validation
dataset and use it to compute your “validation” loss? What size loss values
do you get then?
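A minimal sketch of that experiment, assuming PyTorch: run the exact same evaluation routine on the training data. The helper mean_loss, the toy model, and the random dataset are hypothetical stand-ins for your own code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def mean_loss(model, loader, loss_fn):
    """Per-sample mean loss over a loader, in eval mode and without gradients."""
    was_training = model.training
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            # Assumes loss_fn returns the mean over the batch, so we
            # re-weight by batch size to handle a smaller last batch.
            batch_loss = loss_fn(model(inputs), targets)
            total += batch_loss.item() * inputs.size(0)
            count += inputs.size(0)
    model.train(was_training)  # restore whatever mode the model was in
    return total / count


torch.manual_seed(0)

# Random stand-in for the training dataset (an assumption).
train_ds = TensorDataset(torch.randn(100, 10), torch.randn(100, 1))
model = nn.Sequential(
    nn.Linear(10, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 1)
)
loss_fn = nn.MSELoss()

# Feed the training data through the validation code path. Because the
# loss is averaged per sample, the batch size should not change the result.
loss_bs32 = mean_loss(model, DataLoader(train_ds, batch_size=32), loss_fn)
loss_bs7 = mean_loss(model, DataLoader(train_ds, batch_size=7), loss_fn)

print(loss_bs32, loss_bs7)
```

If this number is close to your real validation loss, the gap probably comes from how the training loss is computed (train mode, reduction). If it is close to your logged training loss instead, the two datasets themselves likely differ.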
You want loss to be lower. That’s a sign the model is learning. So it might not be a bad thing.
Here are some additional points you can check, in addition to what @KFrank mentioned.
- Are you using dropout layers? If so, perhaps try lowering your p value to something in the range of 0.1 to 0.3.
- Are you using data augmentations in your preprocessing? Perhaps try lowering their thresholds.
- Do your training and validation sets have an equal representation of classes? If not, you might try adding positional weights to your loss function.
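For the class-balance check in the last point, a quick plain-Python sketch is below. The label lists are invented, the helper names are hypothetical, and inverse-frequency weighting is just one common choice, not necessarily what was meant by "positional weights".

```python
from collections import Counter

# Hypothetical integer class labels for each split.
train_labels = [0] * 80 + [1] * 20
val_labels = [0] * 50 + [1] * 50


def class_fractions(labels):
    """Fraction of samples belonging to each class."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in sorted(counts)}


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}


train_fracs = class_fractions(train_labels)
val_fracs = class_fractions(val_labels)
weights = inverse_frequency_weights(train_labels)

print("train:", train_fracs)
print("val:  ", val_fracs)
print("weights from the training split:", weights)
```

In PyTorch, per-class weights like these can be passed as a tensor to the weight argument of nn.CrossEntropyLoss. For binary problems, nn.BCEWithLogitsLoss has a pos_weight argument that serves a similar purpose.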