How to select pixels of ROI from feature map

I’m not sure, but take a look at this approach and see if you could reuse it.
If I understand correctly, your targets are masks but your model outputs only coordinates, so you would want to create a mask from these coordinates?
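
If that is the case, here is a minimal sketch of how it could look. It assumes PyTorch, boxes given as `(x1, y1, x2, y2)` in feature-map pixel coordinates with an exclusive right/bottom edge, and a feature map of shape `(C, H, W)`. The helper name `boxes_to_mask` is made up for illustration and not part of any library.

```python
import torch


def boxes_to_mask(boxes, height, width):
    """Rasterize (x1, y1, x2, y2) boxes into one boolean mask of shape (height, width).

    Hypothetical helper: assumes coordinates are already in feature-map
    pixels and that x2 / y2 are exclusive.
    """
    mask = torch.zeros(height, width, dtype=torch.bool)
    for x1, y1, x2, y2 in boxes.round().long().tolist():
        # Clamp to the valid range so out-of-bounds boxes do not wrap around.
        x1, x2 = min(max(x1, 0), width), min(max(x2, 0), width)
        y1, y2 = min(max(y1, 0), height), min(max(y2, 0), height)
        mask[y1:y2, x1:x2] = True
    return mask


# Toy feature map with 2 channels on a 4x5 grid.
feature_map = torch.arange(2 * 4 * 5, dtype=torch.float32).reshape(2, 4, 5)
boxes = torch.tensor([[1.0, 1.0, 3.0, 3.0]])

mask = boxes_to_mask(boxes, height=4, width=5)

# Boolean indexing over the spatial dims keeps the channel dim:
# the result has shape (C, number_of_selected_pixels).
roi_pixels = feature_map[:, mask]
print(mask.int())
print(roi_pixels.shape)
```

The same mask could also be compared directly against a mask target (e.g. in a loss), but note that indexing with hard coordinates like this is not differentiable with respect to the predicted coordinates.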
