Validation Losses with Dropout/BN

I’m training a large CNN with dropout and batchnorm layers and getting some funky validation loss numbers. I’ve read here that this is common when there are dropout layers in the NN. This might be a silly question, but will the validation/training loss curves eventually converge? Or is there something else I have to do to account for the large discrepancy between the values?

Dropout might cause your training loss to be higher than your validation loss.
You could call model.eval() after finishing a training epoch and then recompute the training loss alongside the validation loss. This would give you the current training loss, instead of the running average over all training batches seen during the epoch, and would disable the effects of e.g. dropout and use the batchnorm running statistics.
However, this adds extra computation to each training epoch, which is usually not needed.
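A minimal sketch of this re-evaluation pass, using a small hypothetical model and dummy data in place of your actual CNN and DataLoader (the model, loss, and data here are illustrative assumptions, not your setup):

```python
import torch
import torch.nn as nn

# Hypothetical small CNN with batchnorm and dropout, standing in for your model.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Flatten(),
    nn.Linear(8 * 8 * 8, 10),
)
criterion = nn.CrossEntropyLoss()

def evaluate_loss(model, loader):
    """Average loss over the loader with dropout disabled and
    batchnorm using its running statistics (eval mode)."""
    model.eval()
    total_loss, n_samples = 0.0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            loss = criterion(model(inputs), targets)
            total_loss += loss.item() * inputs.size(0)
            n_samples += inputs.size(0)
    model.train()  # switch back before the next training epoch
    return total_loss / n_samples

# Dummy batches standing in for the real training set.
data = [(torch.randn(4, 1, 8, 8), torch.randint(0, 10, (4,)))
        for _ in range(3)]
train_loss = evaluate_loss(model, data)
```

Running `evaluate_loss` on both the training and validation loaders after an epoch gives you loss values computed under the same (eval-mode) conditions, so the remaining gap reflects the data rather than dropout or batchnorm behavior.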

You could run it once in a while to verify that the gap comes from the dropout layers.