Previously, I implemented an anomaly detection model based on the DenseNet architecture and achieved good results on road-scene (nuScenes) data.
Now I am trying to create a bounding-box version using a U-Net decoder and an SSD head, without any anchor generator.
I am a bit confused about the number of classes here. In pixel-wise segmentation, binary classification is a well-known problem; in detection, however, how is the number of classes handled?
For example, we have bbox labels for the anomalies but none for the background, so how should I modify the loss function? I am not looking for complete code; conceptual answers would be most helpful to clarify my understanding.
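To make the question concrete, here is a minimal sketch of one way the background could be treated implicitly in an anchor-free setup. It is only an illustration under assumptions, not my actual model: the helper names (`objectness_targets`, `binary_cross_entropy`), the `stride` parameter, and the rule "a cell is positive if its centre falls inside any anomaly box" are all hypothetical. The idea is a single "anomaly" output channel with a sigmoid, where every cell not covered by a box counts as background.

```python
import math


def objectness_targets(boxes, grid_h, grid_w, stride):
    """Build a grid_h x grid_w target map (hypothetical assignment rule).

    A cell is 1.0 if its centre, in image coordinates, lies inside any
    (x1, y1, x2, y2) box; every other cell is 0.0, i.e. implicit background.
    """
    targets = [[0.0] * grid_w for _ in range(grid_h)]
    for row in range(grid_h):
        for col in range(grid_w):
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for x1, y1, x2, y2 in boxes:
                if x1 <= cx <= x2 and y1 <= cy <= y2:
                    targets[row][col] = 1.0
                    break
    return targets


def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy over all cells, positives and background alike."""
    total, count = 0.0, 0
    for logit_row, target_row in zip(logits, targets):
        for z, t in zip(logit_row, target_row):
            # Numerically stable form of -[t*log(s) + (1-t)*log(1-s)], s = sigmoid(z)
            total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


# One anomaly box on a 32x32 image, predicted on a 4x4 grid (stride 8).
targets = objectness_targets([(8, 8, 24, 24)], grid_h=4, grid_w=4, stride=8)
loss = binary_cross_entropy([[0.0] * 4 for _ in range(4)], targets)
```

In this sketch the classification loss runs over every cell, so the background is learned from the unlabelled cells, while a box-regression loss would presumably be computed only on cells whose target is 1. Is this the right way to think about it, or should background be an explicit extra class with a softmax?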
Let me know if you require more details.