I want to do binary classification on a particular object, of which I have n images taken at different angles. I want to know whether the object contains a certain feature, and the presence of this feature dominates the classification: if the feature is visible in at least one of the n views, the complete model should classify the input as “containing the feature”. I have a single label for all n images (angles of view). The images respect a certain symmetry, such that the views can be interchanged without affecting the output. For example, my object could be a cylinder and the views are obtained by rotating around its central axis, so all n views look similar.
To classify, I want to do the following:
- Use the same model for each view (shared weights),
- and finish with a special vote over the n outputs of that model.
Step 1 seems reasonable, since all views of my objects are similar.
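To make the setup concrete, here is a minimal sketch of the two steps in plain Python. Everything in it is illustrative: `score_view` is a hypothetical stand-in for the shared model (a single linear unit instead of a real network), `WEIGHTS` and `BIAS` are made-up parameters, and each "view" is just a short list of numbers instead of an image. Taking the max over the views is only one possible choice of vote.

```python
from typing import Sequence

# Hypothetical shared parameters: one set of weights used for every view.
WEIGHTS = [0.5, -1.0, 2.0]
BIAS = -0.25


def score_view(view: Sequence[float]) -> float:
    """Stand-in for the shared model: returns one logit for one view."""
    return sum(w * x for w, x in zip(WEIGHTS, view)) + BIAS


def score_object(views: Sequence[Sequence[float]]) -> float:
    """Apply the same model to every view, then keep the largest logit."""
    return max(score_view(v) for v in views)


# Three toy "views" of one object (assumed data, for illustration only).
views = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.2, 0.2, 0.2]]
object_logit = score_object(views)

# Reordering the views does not change the result, which matches the symmetry.
assert score_object(list(reversed(views))) == object_logit
```

Because the same function is applied to every view and the max does not depend on order, the views can be swapped freely, which is the property I want.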
Now, I don’t know the most efficient way to do the voting part. I guess I should not learn it, since I already know the voting rule. But is it better to output a hard label from the vote, or is it possible to output a value that can later be fed to a binary cross-entropy loss?
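The options I am weighing can be sketched as follows, again in plain Python with hypothetical names (`hard_vote`, `max_pool_prob`, `noisy_or_prob`, `bce`) and made-up per-view logits. The hard vote returns a label directly, while the other two return a probability that can go into binary cross-entropy. The noisy-OR variant additionally assumes that the views are independent, which may not hold for my data.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def hard_vote(logits: Sequence[float], threshold: float = 0.0) -> int:
    """Fixed rule: 1 if any view exceeds the threshold. Not differentiable."""
    return int(any(z > threshold for z in logits))


def max_pool_prob(logits: Sequence[float]) -> float:
    """Soft version of 'at least one view': probability of the best view."""
    return sigmoid(max(logits))


def noisy_or_prob(logits: Sequence[float]) -> float:
    """P(feature in at least one view), assuming independent views."""
    p_none = 1.0
    for z in logits:
        p_none *= 1.0 - sigmoid(z)
    return 1.0 - p_none


def bce(prob: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one object-level probability and label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Made-up logits for n = 3 views of one object; the single label is 1.
logits = [-2.0, 3.0, -1.0]
label = 1
loss_max = bce(max_pool_prob(logits), label)
loss_noisy_or = bce(noisy_or_prob(logits), label)
```

With max pooling only the highest-scoring view drives the loss for a given object, whereas noisy-OR lets every view contribute; neither has parameters, so nothing about the vote itself is learned.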
Thank you in advance for your help.