I am working on an anomaly detection task and had the following idea.
Assume I have 3 classes, and each of them could contain anomalous samples. I would then train a classifier with 4 output classes, with the last one reserved for anomalies. However, my training data would never actually contain any samples labeled as class 4.
During inference, I would classify each sample; if it is an anomaly, it should get low scores for classes 1 to 3 and thus be labeled as class 4.
Would this approach even compile? That is, would a model that expects 4 class weights accept training labels that only cover 3 classes?
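
To make the setup concrete, here is a minimal sketch of what I mean. It is only an illustration under my own assumptions: the 2-D toy data, the blob centers, the far-away test point, and all names (`NUM_SEEN`, `NUM_OUTPUTS`, `predict_proba`) are made up, and a plain-Python softmax regression stands in for whatever model and framework would actually be used.

```python
import math
import random

random.seed(0)

NUM_SEEN = 3      # classes that actually occur in the training labels
NUM_OUTPUTS = 4   # model outputs; index 3 is the reserved "anomaly" class

# Toy 2-D data (made up for illustration): one Gaussian blob per seen class.
centers = [(-3.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
X, y = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(30):
        X.append((random.gauss(cx, 0.5), random.gauss(cy, 0.5)))
        y.append(label)  # labels are only ever 0, 1 or 2, never 3


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict_proba(W, b, x):
    logits = [W[k][0] * x[0] + W[k][1] * x[1] + b[k] for k in range(NUM_OUTPUTS)]
    return softmax(logits)


# Softmax regression with 4 outputs, trained by full-batch gradient descent
# on the cross-entropy loss.
W = [[0.0, 0.0] for _ in range(NUM_OUTPUTS)]
b = [0.0] * NUM_OUTPUTS
lr = 0.1
for _ in range(300):
    gW = [[0.0, 0.0] for _ in range(NUM_OUTPUTS)]
    gb = [0.0] * NUM_OUTPUTS
    for x, label in zip(X, y):
        p = predict_proba(W, b, x)
        for k in range(NUM_OUTPUTS):
            # d(loss)/d(logit_k) = p_k - 1[k == label]. For k == 3 this is
            # p_3 > 0 on every sample, so the loss only asks for a lower
            # class-4 score and never for a higher one.
            err = p[k] - (1.0 if k == label else 0.0)
            gW[k][0] += err * x[0]
            gW[k][1] += err * x[1]
            gb[k] += err
    for k in range(NUM_OUTPUTS):
        W[k][0] -= lr * gW[k][0] / len(X)
        W[k][1] -= lr * gW[k][1] / len(X)
        b[k] -= lr * gb[k] / len(X)

train_probs = [predict_proba(W, b, x) for x in X]
train_preds = [p.index(max(p)) for p in train_probs]
mean_p_anomaly = sum(p[3] for p in train_probs) / len(train_probs)

# A point far away from all three blobs, standing in for an anomaly.
outlier_probs = predict_proba(W, b, (0.0, -8.0))

print("mean P(class 4) on training data:", round(mean_p_anomaly, 4))
print("class scores for the far-away point:", [round(p, 4) for p in outlier_probs])
```

In this sketch, nothing errors out because of the unused fourth output: the labels are just indices, and index 3 simply never appears. What I am unsure about is whether the class 4 score can ever become the largest one at inference time, since on the training samples the loss only ever pushes it down.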