I have trained a face detection model and am ready to use it to predict faces and their corresponding bounding boxes. I have some questions about post-processing the predictions. Here is what I do now:
- Each prediction is a vector of 6 values: `[top, left, bottom, right, background_score, face_score]`.
- Get the confidence by applying softmax to `background_score` and `face_score`; the confidence is the softmax output for the face score.
- Sort all predictions by confidence and keep only the top 1000.
- Filter those 1000 predictions by a confidence threshold of 0.9 or higher.
- Apply non-maximum suppression to get the final bounding boxes.
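
To make the steps above concrete, here is a minimal NumPy sketch of the pipeline as I understand it. The function names (`softmax_face_confidence`, `iou_one_to_many`, `nms`, `postprocess`) are hypothetical, and the `iou_threshold` of 0.5 is an assumption that is not part of my current setup. It also assumes every box has positive area.

```python
import numpy as np

def softmax_face_confidence(preds):
    """Softmax over (background_score, face_score); return the face probability."""
    logits = preds[:, 4:6]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp[:, 1] / exp.sum(axis=1)

def iou_one_to_many(box, boxes):
    """IoU between one [top, left, bottom, right] box and an (N, 4) array."""
    top = np.maximum(box[0], boxes[:, 0])
    left = np.maximum(box[1], boxes[:, 1])
    bottom = np.minimum(box[2], boxes[:, 2])
    right = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(bottom - top, 0, None) * np.clip(right - left, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]  # drop boxes overlapping the kept one
    return np.array(keep, dtype=int)

def postprocess(preds, top_k=1000, conf_threshold=0.9, iou_threshold=0.5):
    """Softmax -> top-k -> confidence filter -> NMS, in that order."""
    preds = np.asarray(preds, dtype=float)
    conf = softmax_face_confidence(preds)
    order = np.argsort(-conf)[:top_k]             # keep the top_k most confident
    order = order[conf[order] >= conf_threshold]  # then apply the threshold
    boxes, scores = preds[order, :4], conf[order]
    keep = nms(boxes, scores, iou_threshold)
    return boxes[keep], scores[keep]
```

Calling `postprocess(preds)` on an `(N, 6)` array returns the surviving boxes and their confidences, ordered from most to least confident.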
I wonder whether this is the right way to do it. If it is, I find the two values, 1000 and 0.9, very hard to pick. Is there a better way to choose them?