Image detection: how should I filter the predicted bounding boxes?

I have trained a face detection model and am ready to use it to predict objects and their corresponding bounding boxes. I have some questions about post-processing the predictions. Here is what I do now:

  1. Each prediction is a vector of 6 values: [top, left, bottom, right, background_score, face_score].
  2. Compute the confidence by applying softmax to background_score and face_score; the confidence is the softmax output for face_score.
  3. Sort all predictions by confidence and keep only the top 1000.
  4. Filter those 1000 predictions by a confidence threshold, keeping those at 0.9 or higher.
  5. Apply non-maximum suppression to get the final bounding boxes.
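
To make the steps above concrete, here is a minimal NumPy sketch of that pipeline. It is only an illustration, not my actual code: the function names, the greedy NMS variant, and the IoU threshold of 0.5 are assumptions I picked for the example, and the two values in question (1000 and 0.9) appear as the `top_k` and `conf_threshold` parameters.

```python
import numpy as np

def softmax_face_confidence(preds):
    """Softmax over (background_score, face_score); return the face probability."""
    logits = preds[:, 4:6]
    logits = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(logits)
    return exp[:, 1] / exp.sum(axis=1)

def iou_one_to_many(box, boxes):
    """IoU of one [top, left, bottom, right] box against an (N, 4) array of boxes."""
    top = np.maximum(box[0], boxes[:, 0])
    left = np.maximum(box[1], boxes[:, 1])
    bottom = np.minimum(box[2], boxes[:, 2])
    right = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(bottom - top, 0, None) * np.clip(right - left, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def filter_predictions(preds, top_k=1000, conf_threshold=0.9, iou_threshold=0.5):
    """Steps 2-5: softmax, top-k, confidence threshold, greedy NMS.

    iou_threshold=0.5 is an assumed value, not something from my setup.
    """
    preds = np.asarray(preds, dtype=np.float64)
    conf = softmax_face_confidence(preds)

    # Step 3: sort by confidence (descending) and keep the top_k.
    order = np.argsort(-conf)[:top_k]
    # Step 4: drop anything below the confidence threshold.
    order = order[conf[order] >= conf_threshold]
    boxes, scores = preds[order, :4], conf[order]

    # Step 5: greedy NMS. Boxes are already sorted by score, so repeatedly
    # keep the best remaining box and discard boxes that overlap it too much.
    keep = []
    idx = np.arange(len(boxes))
    while idx.size > 0:
        best, rest = idx[0], idx[1:]
        keep.append(best)
        ious = iou_one_to_many(boxes[best], boxes[rest])
        idx = rest[ious <= iou_threshold]

    keep = np.array(keep, dtype=int)
    return boxes[keep], scores[keep]
```

Calling `filter_predictions(preds)` on an `(N, 6)` array returns the surviving boxes and their confidences, ordered from most to least confident.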

I wonder whether this is the right way to do it. If it is, I find the two parameters, 1000 (the top-k cutoff) and 0.9 (the confidence threshold), very hard to pick. Is there a better way to choose these two values?

Any ideas are appreciated. :thinking: