Suppose you train a standard CNN classifier, where each image in the training set either shows a cat or a dog. Such a model would ideally evaluate an image with neither animal as 50-50, and the same for an image with both a cat and a dog in it.

Can these cases be reliably distinguished by inspecting the pre-softmax outputs? One might expect two low activations in the former case and two high ones in the latter.

Sorry if this is well known. I’m sure it must have been studied, but a quick Google search didn’t help.

I think this is not guaranteed: for K classes, the softmax output is determined by K-1 ratios (equivalently, K-1 differences between logits), so one dimension is essentially arbitrary (sometimes one logit is fixed at zero as a “calibrating” dimension). So, while it is intuitive to think of K positive scores, nothing forces the sum of these scores to be meaningful (you may contrast this with Dirichlet parameters, whose sum does carry information).
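To make the "one arbitrary dimension" point concrete, here is a minimal sketch in plain Python. The logit values are made up for illustration and do not come from any trained model. It shows that "two low" and "two high" activations give the same 50-50 output, and that adding any constant to all logits leaves the probabilities unchanged, so the loss alone never has to distinguish these cases.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-softmax outputs for a [cat, dog] classifier.
low = [-5.0, -5.0]   # "two low activations" (neither animal?)
high = [5.0, 5.0]    # "two high activations" (both animals?)

p_low = softmax(low)
p_high = softmax(high)

# Shifting every logit by the same constant does not change the output.
base = [2.0, -1.0]
shifted = [z + 7.0 for z in base]
p_base = softmax(base)
p_shifted = softmax(shifted)

print(p_low, p_high)
print(p_base, p_shifted)
```

Whether a trained network actually ends up using the overall level of the logits in the hoped-for way is an empirical question; the sketch only shows that softmax itself does not require it.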

I agree. Still, there may be something to the intuition that the raw output of the “cat node” should correlate with the presence of cats, even if you have K-1 other animal categories. It may of course be that the “cat node” really represents the signal “lack of non-cat animal”.

I’m not hoping for a mathematical theorem (although that would be nice), but I wonder whether it has been studied empirically. Since it’s such an obvious idea, I guess something would have been published on this.

I think negative numbers prevent that: in your example, the “cat detector” can either increase the cat score or decrease all the other scores. You could perhaps achieve the correlation by constraining the final segment of the network (weights and features) to be non-negative, but then the exponentiation part of softmax would not be necessary. And the sum of the scores would still be undetermined, and might actually grow without bound.
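A small sketch of why the training signal does not pin down the overall level of the scores, assuming the usual softmax cross-entropy loss (the thread does not state which loss is used). The gradient of that loss with respect to the logits is softmax(logits) minus the one-hot target, so on a cat image it pushes the cat score up and every other score down at the same time, and its components sum to zero. The three logit values and the [cat, dog, bird] labelling are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(logits, target):
    """Gradient of -log(softmax(logits)[target]) with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

# Hypothetical [cat, dog, bird] scores for an image labelled "cat" (index 0).
logits = [0.3, -0.2, 1.1]
grad = cross_entropy_grad(logits, target=0)

# A negative entry means gradient descent raises that logit;
# a positive entry means gradient descent lowers it.
grad_sum = sum(grad)

print(grad)
print(grad_sum)
```

Because the components sum to zero, the loss is flat along the direction that shifts all logits together. Where the common offset ends up depends on initialization, architecture, and regularization rather than on the labels.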

Again, I agree, and by “lack of non-cat animal” I meant just that: instead of increasing the “cat output” on cat images, the network may just as well decrease the other outputs (making them more negative). Still, it might be worth testing, and my question is whether anyone knows of a systematic empirical test of this (mathematically naive) hypothesis.