Variance or Confidence Interval for outputs

This question might be trivial, but I am wondering whether there is any variance associated with the logits in the final fully connected layer. In particular, is there a confidence interval for each logit in the final layer? The documentation says that the weights are initialised from a uniform distribution, U(-\sqrt{k},\sqrt{k}), so would the variance be that of the uniform distribution?



Typically you only use one copy of the network, and its weights are fixed once training is done, so the variance of the initialization doesn’t give you much information about the uncertainty of the logits.
One approach is to run the same input through the network several times while dropout is enabled and look at the spread of the outputs.
A classic in this regard is Gal and Ghahramani: Dropout as a Bayesian Approximation.
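
A minimal sketch of that idea, in case it helps. The toy model, the input sizes, and the helper name `mc_dropout_logits` are made up for illustration; the only real requirement is that your network contains `nn.Dropout` layers. The intervals it produces describe the spread of the logits under dropout sampling, which is not necessarily a calibrated confidence interval.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier; any model with nn.Dropout layers would do.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

def mc_dropout_logits(model, x, n_samples=100):
    """Run x through the model n_samples times with dropout active."""
    model.eval()
    # Switch only the dropout layers back to training mode, so that
    # layers such as batch norm keep their eval behaviour.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    model.eval()
    return samples  # shape: (n_samples, batch, n_logits)

x = torch.randn(4, 20)                 # made-up batch of 4 inputs
samples = mc_dropout_logits(model, x)

mean = samples.mean(dim=0)             # per-logit mean
std = samples.std(dim=0)               # per-logit standard deviation
# Empirical 95% interval per logit, taken from the sample quantiles
lower = torch.quantile(samples, 0.025, dim=0)
upper = torch.quantile(samples, 0.975, dim=0)
```

Here `mean`, `std`, `lower` and `upper` all have shape `(batch, n_logits)`. More samples give smoother estimates at the cost of more forward passes.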

Best regards