At test time, we use the “mean network” that contains all of the hidden units but with their outgoing weights halved to compensate for the fact that twice as many of them are active.
Hi,
I’m going to assume that when you say ‘perform the half operation’, you mean scaling the activations by 0.5 at test time. The main idea is that the output of the layer at test time should be its expected value under the dropout probability p.
In the case of nn.functional.dropout, you have to call it like this: nn.functional.dropout(input, p=0.5, training=False)
If you are using the module version, nn.Dropout, you can call eval() on the module, or on the parent module which contains the dropout module.
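A short sketch of both options, assuming PyTorch is installed. In both inference modes dropout becomes the identity, so the output equals the input exactly:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 8)

# Functional form: training=False disables dropout entirely,
# so the input passes through unchanged.
out_f = F.dropout(x, p=0.5, training=False)

# Module form: calling .eval() on the module (or on any parent
# module that contains it) puts dropout into inference mode.
drop = nn.Dropout(p=0.5)
drop.eval()
out_m = drop(x)

print(torch.equal(out_f, x), torch.equal(out_m, x))
```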
I’ve got the idea now. The ‘perform the half operation’ comes from “inverted dropout”. I got the answer from here; I think PyTorch’s implementation of dropout uses this method.
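For reference, inverted dropout can be sketched in a few lines of plain Python (an illustrative sketch, not PyTorch's actual implementation): the scaling by 1/(1-p) happens at train time, so the expected activation already matches the input and test time needs no rescaling at all:

```python
import random

random.seed(0)

def inverted_dropout(x, p=0.5):
    """Inverted dropout: drop each value with probability p and scale
    the survivors by 1/(1-p) at *train* time, so that
    E[out] = (1-p) * x/(1-p) + p * 0 = x."""
    return [xi / (1 - p) if random.random() >= p else 0.0 for xi in x]

# The expectation of each activation is preserved, so the test-time
# forward pass is simply the identity (no halving needed).
x = [2.0]
avg = sum(inverted_dropout(x)[0] for _ in range(200_000)) / 200_000
print(round(avg, 2))  # ≈ 2.0
```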