I need advice for an OCR project

Hello, I am working on an important OCR (Optical Character Recognition) project in which I need to read labels in an industrial setting. Here is an example:

This is what we have

To read the labels on this package, we first need to locate the region of interest (ROI), then locate where the text is, and finally apply OCR. The steps and results can be described as follows:

1. ROI detection


2. Text Location


3. Text Recognition/OCR
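
To make the three steps concrete, here is a minimal sketch of how they could be chained. It is only an illustration: `detect_rois`, `locate_text` and `recognize` are hypothetical callables standing in for whatever models end up being used, and the image is a plain nested list instead of a real tensor. The one detail worth noticing is that text boxes found inside an ROI crop have to be shifted back into full-image coordinates.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2), with x2/y2 exclusive
Image = Sequence[Sequence[int]]   # rows of grayscale pixels (stand-in for an array)


@dataclass
class TextResult:
    box: Box    # text box expressed in full-image coordinates
    text: str   # string returned by the recognizer


def crop(image: Image, box: Box) -> List[List[int]]:
    """Return the pixels inside `box` as a new nested list."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def offset(box: Box, dx: int, dy: int) -> Box:
    """Shift a box by (dx, dy), e.g. from ROI-local to full-image coordinates."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def read_labels(
    image: Image,
    detect_rois: Callable[[Image], List[Box]],   # step 1 (hypothetical model wrapper)
    locate_text: Callable[[Image], List[Box]],   # step 2 (hypothetical model wrapper)
    recognize: Callable[[Image], str],           # step 3 (hypothetical model wrapper)
) -> List[TextResult]:
    results = []
    for roi in detect_rois(image):
        roi_image = crop(image, roi)
        for local_box in locate_text(roi_image):
            text = recognize(crop(roi_image, local_box))
            # Map the ROI-local box back onto the original image.
            results.append(TextResult(offset(local_box, roi[0], roi[1]), text))
    return results
```

Keeping each stage behind a plain callable means any of the three models can be swapped out later without touching the other two.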

I have been working on text recognition (step 3), with good results using this PyTorch network

Okay, this is a good starting point for all of you guys ;D

The main question of this topic is, what is your advice for steps 1 and 2?

I thought of training a Faster R-CNN (ResNet+FPN) to detect barcodes. I could generate blank white synthetic labels with just a barcode (I found some websites with barcode fonts to download), but I am open to your suggestions :slight_smile:
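
As a rough idea of what the synthetic data could look like, here is a sketch that draws a barcode-like stripe pattern on a blank white label and returns the bounding box to use as the detection target. Everything in it is an assumption made for illustration: the stripes are random and not a valid barcode symbology (a real barcode font or generator would replace that part), the function names are made up, and it uses only the standard library so it runs without any image package.

```python
import random
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), with x2/y2 exclusive


def make_synthetic_label(
    width: int = 320,
    height: int = 200,
    rng: Optional[random.Random] = None,
) -> Tuple[List[bytearray], Box]:
    """Return a white grayscale label with one barcode-like patch and its box."""
    if rng is None:
        rng = random.Random()
    image = [bytearray([255] * width) for _ in range(height)]  # all-white label

    module_w = 2                               # width of one bar, in pixels
    n_modules = rng.randint(40, 80)            # number of bars across the patch
    bar_w = n_modules * module_w
    bar_h = rng.randint(height // 6, height // 3)
    x0 = rng.randint(0, width - bar_w)         # random placement on the label
    y0 = rng.randint(0, height - bar_h)

    for m in range(n_modules):
        # First and last bars are always black so the box is tight;
        # the rest are random stripes (NOT a valid barcode encoding).
        if m in (0, n_modules - 1) or rng.random() < 0.5:
            for y in range(y0, y0 + bar_h):
                for x in range(x0 + m * module_w, x0 + (m + 1) * module_w):
                    image[y][x] = 0
    return image, (x0, y0, x0 + bar_w, y0 + bar_h)


def to_pgm(image: List[bytearray]) -> bytes:
    """Encode the image as a binary PGM file, viewable in most image tools."""
    height, width = len(image), len(image[0])
    return b"P5\n%d %d\n255\n" % (width, height) + b"".join(bytes(r) for r in image)
```

Passing a seeded `random.Random` makes the dataset reproducible, and the returned box is already in the `(x1, y1, x2, y2)` layout that most detection training code expects.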

That looks pretty good. One recommendation: you might want to check out detectron2. It is a recent state-of-the-art object detection library built on PyTorch. You can find the GitHub repository here

@Dwight_Foster yeah, thank you, I knew about detectron2 :). I had planned to use it for Faster R-CNN.

Another option could be segmentation with Mask R-CNN…

Yes, you could do that. I think detectron2 can do masks too. You could also try Faster R-CNN with a MobileNet backbone.