What do I have to label for object detection with Faster R-CNN?

Sorry, I'm a little confused.
Do you mean a classifier model by that, or something else?

I have a Faster R-CNN model, and in the training forward pass I pass the image together with the ground-truth boxes for all objects of every class present in the image.
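To make that concrete, here is a rough sketch of the kind of per-image ground truth I mean. It assumes a torchvision-style target layout (a dict with `boxes` as `[x1, y1, x2, y2]` and `labels` as integer class ids, with 0 reserved for background). The class names, the `CLASS_TO_ID` mapping, and the `make_target` helper are made up for illustration and are not part of any library.

```python
# Hypothetical class mapping for illustration; id 0 is left for background.
CLASS_TO_ID = {"person": 1, "dog": 2}


def make_target(annotations):
    """Build one per-image target from (class_name, (x1, y1, x2, y2)) pairs.

    Every labelled object in the image contributes one box and one label.
    """
    boxes, labels = [], []
    for class_name, (x1, y1, x2, y2) in annotations:
        # A box needs positive width and height to be usable as ground truth.
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"degenerate box: {(x1, y1, x2, y2)}")
        boxes.append([float(x1), float(y1), float(x2), float(y2)])
        labels.append(CLASS_TO_ID[class_name])
    return {"boxes": boxes, "labels": labels}


# One image containing a person and two dogs: three boxes, three labels.
target = make_target([
    ("person", (10, 20, 110, 220)),
    ("dog", (150, 80, 260, 200)),
    ("dog", (30, 40, 90, 120)),
])
```

If the model is torchvision's Faster R-CNN, I would expect these lists to be converted to tensors (float32 for `boxes`, int64 for `labels`) and passed as `model(images, targets)` in training mode, with one such dict per image.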

Could you please explain what exactly you are trying to achieve by using a single crop as input?