Question about object localization

I am building a model to detect a specific molecule in scanning electron microscopy images. This molecule is the only one I’m interested in, and there could be multiple instances of it in a single image. I initially planned to train a YOLOv8 model using images I labeled, but I’m concerned that, due to the difficulty in identifying the molecule, I may not be able to label all instances accurately, potentially introducing label noise. How much should I be worried about this, and are there strategies to address the issue? For example, if I were to train the model using only image patches where I am 100% certain the molecule is either present or absent, the training process would be easier. However, I’m unsure how to then use this model to localize the molecule in full images. I would appreciate any suggestions.

Whatever strategy you use, you need correct labels to gauge how well your model is doing.
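To make that concrete, here is a minimal sketch of how predictions could be scored against a small, carefully labelled hold-out set. The box format `(x0, y0, x1, y1)`, the helper names, and the 0.5 IoU threshold are my assumptions, not something from the thread, and the greedy matching is a simplification of what detection benchmarks do.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to at most one unmatched true box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:  # ideally sorted by confidence, highest first
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy example: one prediction overlaps a true box, one is a false alarm.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
precision, recall = precision_recall(preds, truth)
```

If the hold-out labels themselves miss instances, correct detections get counted as false positives, so precision will look worse than it really is.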

Sure, but is there a strategy to train a network on a classification task (molecule present/absent) on a high quality dataset, then use this network for the more challenging localization task only for inference?

Assuming your patch classifier is 100% correct, you can use those weights for your detection/localisation backbone.

You would probably need to retrain this network, or maybe freeze most of it and retrain only the last few layers.
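A rough PyTorch sketch of the freezing idea, in case it helps. The tiny `backbone` below is only a hypothetical stand-in for the trained patch classifier's feature extractor, and the 1x1 convolution `head` producing a per-pixel score map is one possible localisation head that I am assuming for illustration. It is not how YOLOv8 is wired internally.

```python
import torch
from torch import nn

# Hypothetical stand-in for the feature extractor of the patch classifier.
backbone = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
)

# Freeze the pretrained weights so they are not updated during training.
for param in backbone.parameters():
    param.requires_grad = False

# New, trainable head: one score per spatial location (assumed design).
head = nn.Conv2d(16, 1, kernel_size=1)
model = nn.Sequential(backbone, head)

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-2)

# One dummy training step on random data, just to show the mechanics.
x = torch.randn(2, 1, 64, 64)
target = torch.zeros(2, 1, 64, 64)
loss = nn.functional.binary_cross_entropy_with_logits(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After `loss.backward()`, only the head receives gradients. To unfreeze a few backbone layers later, set `requires_grad = True` on them and rebuild the optimizer.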

Thanks for the reply. How would you handle the fact that the extracted patch and the original image are of different dimensions?

You’re going to feed the detector only the patch, not the original image.
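One way to read this is as a sliding window: cut the full image into patch-sized tiles, score each tile, and map the hits back to image coordinates. Below is a minimal pure-Python sketch of that reading. `classify_patch` is a placeholder for whatever trained model you have, and the patch size, stride, and threshold are assumptions.

```python
def tile_origins(length, patch, stride):
    """Start offsets of patches of size `patch` that cover [0, length)."""
    if length <= patch:
        return [0]  # image smaller than a patch: a single (smaller) crop
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] != length - patch:
        origins.append(length - patch)  # last tile sits flush with the border
    return origins


def localize(image, classify_patch, patch, stride, threshold=0.5):
    """Return [(box, score)] for every tile scored at or above threshold.

    `image` is a 2D list of pixel values; boxes are (x0, y0, x1, y1).
    """
    height, width = len(image), len(image[0])
    hits = []
    for y0 in tile_origins(height, patch, stride):
        for x0 in tile_origins(width, patch, stride):
            crop = [row[x0:x0 + patch] for row in image[y0:y0 + patch]]
            score = classify_patch(crop)
            if score >= threshold:
                hits.append(((x0, y0, x0 + patch, y0 + patch), score))
    return hits


# Toy example: an 8x8 image with a bright 2x2 blob, and a fake classifier
# that fires whenever a patch contains any non-zero pixel.
image = [[0] * 8 for _ in range(8)]
for y in (4, 5):
    for x in (4, 5):
        image[y][x] = 1


def toy_classifier(crop):
    return 1.0 if any(v > 0 for row in crop for v in row) else 0.0


hits = localize(image, toy_classifier, patch=4, stride=4)
```

A stride smaller than the patch size gives overlapping tiles and finer localisation, at the cost of more forward passes and the need to merge overlapping hits.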

Plus, there are ways to handle a difference in input size anyway; see the thread "How to Change Input Size from Pretrained Model?"
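For what it's worth, one common way to make a classifier independent of input size is to put an adaptive pooling layer before the linear layer. I am not claiming this is what the linked thread recommends. The small network below is a made-up example, not your model.

```python
import torch
from torch import nn

# Made-up classifier: conv features, then global pooling to a fixed size.
net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # always outputs 8 x 1 x 1, whatever the input size
    nn.Flatten(),
    nn.Linear(8, 2),          # present / absent logits
)
net.eval()

with torch.no_grad():
    patch_logits = net(torch.randn(1, 1, 64, 64))    # patch-sized input
    image_logits = net(torch.randn(1, 1, 512, 384))  # larger, non-square input
```

Both calls return a tensor of shape `(1, 2)`. Whether the predictions stay accurate at a very different scale is a separate question that you would have to check on your data.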