What would be an efficient way to calculate the standard deviation (or a similar metric) to see how spread out a network's outputs are over a validation dataset, running the network approx. 10,000 times? Each output is a 1x10 tensor. I am new to this field and was wondering whether there is an efficient algorithm I don't know about for measuring how "spread out" these approx. 100,000 values are.
One way I was thinking of doing this is keeping a running average of the values output by the network, then running an MSELoss against a 1x10 tensor filled with that average. This should give something close to the variance (the squared standard deviation), but the problem is that the same average value won't have been used against all of the outputs.
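For reference, the running-average idea can be made exact with Welford's online algorithm, which updates the mean and the sum of squared deviations together in a single pass, so no average ever has to be fixed in advance. Below is a minimal sketch in plain Python with synthetic random numbers standing in for one unit of the network's 1x10 output; the same element-wise update works on a tensor of running means and deviations.

```python
import math
import random

random.seed(0)

# Welford's online algorithm: one pass over the data, O(1) memory
# per tracked value. `samples` is synthetic stand-in data, not real
# network outputs.
count = 0
mean = 0.0
m2 = 0.0  # running sum of squared deviations from the current mean

samples = [random.gauss(0.0, 1.0) for _ in range(10000)]
for x in samples:
    count += 1
    delta = x - mean
    mean += delta / count
    m2 += delta * (x - mean)  # second factor uses the *updated* mean

variance = m2 / count  # population variance
std = math.sqrt(variance)

# Cross-check against the naive two-pass definition
true_mean = sum(samples) / len(samples)
true_var = sum((x - true_mean) ** 2 for x in samples) / len(samples)
print(abs(variance - true_var) < 1e-9)  # the two agree
```

To track all 10 output units at once, `mean` and `m2` can simply be 1x10 tensors and the same three update lines applied element-wise on each forward pass.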