So I have the following dataset
I’m trying to create a model that draws a bounding box around the birds. It was suggested that I use a U-Net for image segmentation. Would the idea be to feed the bird image to the U-Net, have it output a segmentation map of the image, and then feed that segmentation map into a second deep neural network that outputs the 4 numbers representing my bounding box?
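To make the second stage concrete, here is a minimal sketch of what I mean by going from a segmentation map to 4 numbers. It assumes the segmentation map is a binary NumPy array (1 = bird, 0 = background), and as a stand-in for the second network it simply takes the min/max coordinates of the bird pixels. The function name `mask_to_bbox` and the toy mask are made up for illustration.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if there are none."""
    mask = np.asarray(mask)
    ys, xs = np.nonzero(mask)  # row and column indices of all "bird" pixels
    if ys.size == 0:
        return None  # no bird pixels, so there is no box to report
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy 6x8 "segmentation map" (hypothetical data): 1 = bird, 0 = background
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1  # bird occupies rows 2..4 and columns 3..6

bbox = mask_to_bbox(mask)
print(bbox)  # (3, 2, 6, 4)
```

If something like this is good enough, the only learned part would be the U-Net itself, but I’m not sure whether that holds up on noisy predicted masks.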
The issue I’m having is that I now need a separate dataset to train the U-Net for segmentation. How should that data be chosen? In my bird dataset, for example, a pixel is either bird or not bird. So do I need to find a segmentation dataset whose segmentation maps contain only 2 types of pixels? And does the segmentation data also need to be of birds?
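To illustrate what I mean by "2 types of pixels": if a segmentation dataset had more than two classes, I imagine its label maps could be collapsed to bird / not bird roughly like this. This is only a sketch under assumptions: it presumes the label maps are integer arrays with one class id per pixel, and `BIRD_CLASS_ID` and `to_binary_mask` are hypothetical names, not taken from any particular dataset.

```python
import numpy as np

BIRD_CLASS_ID = 3  # hypothetical class id; the real value depends on the dataset


def to_binary_mask(label_map, target_id=BIRD_CLASS_ID):
    """Collapse a multi-class label map into a 0/1 mask for a single class."""
    label_map = np.asarray(label_map)
    return (label_map == target_id).astype(np.uint8)


# Toy label map (hypothetical data): 0 = background, 3 = bird, 5 = some other class
labels = np.array([
    [0, 0, 3, 3],
    [0, 5, 3, 3],
    [5, 5, 0, 0],
])

binary = to_binary_mask(labels)
print(binary)
```

Every pixel in `binary` is then either 1 (bird) or 0 (everything else), which is the kind of target I assume the U-Net would be trained on.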
I’d appreciate any insight and advice, regards