How to choose the DL model parameters?

Hi, I’m comparing two semantic segmentation models, FCN and DeepLabV3, on the same dataset with the same seed and the same loss function (cross entropy). But I’m confused about how to pick the right model weights.

  1. For each model, should I pick the model weights with the smallest test loss or the highest evaluation metrics (mean IoU, pixel accuracy)? I remember that, to prevent overfitting, we should choose the smallest test loss. However, in this tutorial, the author picks the highest eval accuracy instead of the smallest test loss.
# deep copy the model
if phase == 'val' and epoch_acc > best_acc:
    best_acc = epoch_acc
    best_model_wts = copy.deepcopy(model.state_dict())
  2. The FCN model has better training loss, training accuracy, training IoU, test accuracy, and test IoU, but a larger test loss than the DeepLabV3 model across all epochs. In that case, can I say the FCN model beats the DeepLabV3 model?

In the common use case you won’t see the test loss until you have finalized the model training, at which point you calculate the test loss/accuracy once. If you use the test loss for, e.g., early stopping, this would be considered a data leak, and the model would be expected to perform worse on new data during deployment than its test metrics suggest.
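To make the split concrete, here is a minimal sketch of a three-way split, where a separate validation part is used for early stopping or checkpoint selection and the test part is touched only once at the end. The function name `split_indices`, the 10% fractions, and the sample count are assumptions for illustration, not something from your setup.

```python
import random


def split_indices(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle sample indices once and cut them into train/val/test parts."""
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)

    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)

    test_idx = indices[:n_test]                # evaluated once, after training
    val_idx = indices[n_test:n_test + n_val]   # used for model selection
    train_idx = indices[n_test + n_val:]       # used for gradient updates
    return train_idx, val_idx, test_idx


# Hypothetical dataset size, chosen only for the example.
train_idx, val_idx, test_idx = split_indices(1000)
```

The index lists could then be wrapped with `torch.utils.data.Subset` to build the three datasets. Using the same seed for both models keeps the comparison on identical splits.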

Thanks for your explanation. Sorry, my question may have been misleading.
For example, I train a model for 100 epochs. For each epoch, I have a training loss and a test loss (I’ve done a train/test split), plus other eval metrics. Now I find that epoch 50 has the smallest test loss, while epoch 90 has the highest test mean IoU. Which model weights should I use: those from epoch 50 or those from epoch 90?
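One way to avoid committing to a single criterion during training is to keep both candidate checkpoints and compare them afterwards. Below is a minimal sketch of that idea, extending the `best_acc` pattern from the tutorial snippet above. The class `BestTracker`, the metric names, and the per-epoch numbers are made up for illustration, and the metrics are assumed to come from a held-out validation split rather than the final test set.

```python
import copy


class BestTracker:
    """Remember the best value of one metric and the state saved with it."""

    def __init__(self, mode):
        assert mode in ("min", "max")
        self.mode = mode
        self.best_value = None
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, value, state):
        """Store a deep copy of `state` if `value` improves on the best so far."""
        improved = (
            self.best_value is None
            or (self.mode == "min" and value < self.best_value)
            or (self.mode == "max" and value > self.best_value)
        )
        if improved:
            self.best_value = value
            self.best_epoch = epoch
            self.best_state = copy.deepcopy(state)
        return improved


# Made-up per-epoch validation metrics, for illustration only.
history = [
    {"epoch": 1, "val_loss": 0.90, "val_miou": 0.40},
    {"epoch": 2, "val_loss": 0.60, "val_miou": 0.55},
    {"epoch": 3, "val_loss": 0.70, "val_miou": 0.58},
]

by_loss = BestTracker("min")   # lowest validation loss
by_miou = BestTracker("max")   # highest validation mean IoU

for record in history:
    # Stand-in for model.state_dict() in a real training loop.
    state = {"epoch": record["epoch"]}
    by_loss.update(record["epoch"], record["val_loss"], state)
    by_miou.update(record["epoch"], record["val_miou"], state)

print("best by loss:", by_loss.best_epoch, "best by mIoU:", by_miou.best_epoch)
```

In a real loop, `state` would be `model.state_dict()`, as in the tutorial code. Both saved candidates can then be inspected side by side before deciding which criterion matters for the task.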