Dropout- changing between training mode and eval mode

Hi!

So to my understanding, if I want to change the mode of operation of the dropout units, all I need to do is call net.eval() when testing/validating and net.train(True) when training. Is that right?
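To make sure I understand the train/eval switch, here is a minimal, framework-agnostic sketch of a dropout layer with such a flag (this is a hypothetical toy class for illustration, not the actual torch.nn.Dropout implementation):

```python
import random

class Dropout:
    """Toy sketch of a dropout layer with a train/eval flag
    (illustrative only; not the real torch.nn.Dropout)."""
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # mirrors net.train(True) / net.eval()

    def train(self, mode=True):
        self.training = mode

    def eval(self):
        self.train(False)

    def __call__(self, x):
        if not self.training:
            return list(x)  # eval mode: identity, all units kept
        # train mode: drop each unit with prob p,
        # scale survivors by 1/(1-p) (inverted dropout)
        return [0.0 if random.random() < self.p else v / (1 - self.p)
                for v in x]

drop = Dropout(p=0.5)
drop.eval()
print(drop([1.0, 2.0, 3.0]))  # eval mode leaves inputs untouched: [1.0, 2.0, 3.0]
```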

If so, I am super confused: after adding dropout layers, the loss on the training set is consistently HIGHER than the loss on the validation set (per example), and it seems to be higher by a factor of 2 (I used p=0.5 at the dropout layers).

Before adding the dropout layers my net overfitted the data, but in the first epochs the two losses were more or less the same.

(If that makes any difference, the criterion is cross-entropy, and I didn't specify a softmax layer, since to my understanding it isn't needed: cross-entropy's inputs should be the raw scores (logits), not probabilities.)
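For reference, this is why no explicit softmax is needed: a cross-entropy loss of the kind described here fuses log-softmax and negative log-likelihood, taking raw scores directly. A small pure-Python sketch (the function name is mine, for illustration):

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy from raw scores: log-softmax + negative log-likelihood,
    fused into one computation. Loss functions that take logits (such as
    torch.nn.CrossEntropyLoss) work this way, which is why no explicit
    Softmax layer should be placed before them."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]  # equals -log p(target)

print(cross_entropy([2.0, 1.0, 0.1], target=0))
```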

an example output:
[14, 6000] loss: 0.084
valid loss is 0.05202618276541386 and percents 0.9190970103721782

The first line is the training loss (per sample) and the second line is the validation loss…

I’m super confused, help will be appreciated!

If so, I am super confused: after adding dropout layers, the loss on the training is consistently HIGHER than the loss on the validation set

If you’re randomly dropping out units, wouldn’t you expect the loss to be higher, due to uncertainty that the network is operating under?

Well, honestly, it's the first time I'm using dropout, but there should be a correcting factor during evaluation (when we use all the neurons' outputs) so that the actual output is on the same scale as the output during training.

Is the correcting factor missing, or should I review dropout again?!

Also, was I correct about the API? (Is this how it was meant to be used?)

The correcting factor does exist, but it is applied at training time, not at evaluation: the kept activations are scaled up by 1/(1-p) during training (inverted dropout), so no rescaling is needed in eval mode. It is a scaling correction on the activations, and it is not always linearly correlated with the loss (especially not after going through a softmax).
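A quick numerical check of that training-time scaling, as a pure-Python sketch (the helper name is mine): with inverted dropout, the expected activation during training matches the untouched eval-mode activation, so nothing needs correcting at eval.

```python
import random

def inverted_dropout(x, p, training):
    """Inverted dropout: at training time, drop with prob p and scale the
    survivors by 1/(1-p); at eval time, pass the input through unchanged."""
    if not training:
        return x  # eval mode: identity
    return 0.0 if random.random() < p else x / (1 - p)

random.seed(0)
n, p, x = 200_000, 0.5, 3.0
mean_train = sum(inverted_dropout(x, p, True) for _ in range(n)) / n
print(mean_train)                     # close to 3.0: expectation is preserved
print(inverted_dropout(x, p, False))  # exactly 3.0 at eval, no extra factor
```

The loss itself still differs between modes: the training forward pass is noisy (half the units are zeroed on each step), and that noise passes through the softmax nonlinearly, so the train/validation loss gap is not simply a factor of 1/(1-p).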