Hi everybody! I'm working on a project where I need to extract date codes from images snapped on a smartphone. I've run similar images through traditional approaches like Tesseract and the Google Vision API with little luck and no consistency. I'm assuming I'll need to train a model from scratch or use transfer learning, but I need help finding the right approach. So far I've tried manually splitting the image on white space and classifying each individual character cropped from the original image, but this makes it very hard to get users of the app to take their photos consistently enough.
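For context, the whitespace splitting I tried is roughly the following idea (a minimal numpy sketch on a toy binarized image; the function name and toy data are just illustrative, not my actual pipeline):

```python
import numpy as np

def split_on_whitespace(binary_img):
    """Split a binarized image (0 = background, 1 = ink) into
    per-character crops by finding runs of non-empty columns."""
    in_char = binary_img.sum(axis=0) > 0  # True where a column has ink
    crops, start = [], None
    for x, flag in enumerate(in_char):
        if flag and start is None:
            start = x                      # character run begins
        elif not flag and start is not None:
            crops.append(binary_img[:, start:x])  # run ended, crop it
            start = None
    if start is not None:                  # character touching right edge
        crops.append(binary_img[:, start:])
    return crops

# Toy image: two "characters" separated by blank columns.
img = np.zeros((5, 10), dtype=int)
img[:, 1:3] = 1
img[:, 6:9] = 1
print(len(split_on_whitespace(img)))  # 2 crops
```

This works fine on clean scans, but on phone photos the gaps between characters vary with angle and lighting, which is exactly where it fell apart for us.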
I have also tried the TensorFlow implementation of YOLO called darkflow. After painstakingly massaging our custom image data into the correct format and a slow training run, we ended up with a 200 MB model that could barely identify a couple of the characters, and it still isn't a complete solution, since we also need to put the characters in order and return the string to the user.
I have seen the convolutional recurrent neural network (CRNN) approach with CTC loss used for handwriting recognition, but when I tried it, it didn't seem to work for this problem. I can try again if you think I should!
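To be clear about what I mean by the CTC approach: the network emits a label per timestep (including a blank), and decoding collapses repeats and drops blanks. A minimal greedy-decode sketch, with a made-up alphabet and blank at index 0 (an assumption, not any particular library's convention):

```python
import numpy as np

BLANK = 0  # assumed CTC blank index

def ctc_greedy_decode(logits, alphabet):
    """Best-path CTC decode: argmax each timestep, collapse
    repeated labels, then drop the blank token."""
    path = logits.argmax(axis=1)
    decoded, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            decoded.append(alphabet[label - 1])  # shift past blank
        prev = label
    return "".join(decoded)

# Toy example: 6 timesteps over blank + "JUL" alphabet,
# emitting the path J J _ U L L.
alphabet = "JUL"
logits = np.eye(4)[[1, 1, 0, 2, 3, 3]]
print(ctc_greedy_decode(logits, alphabet))  # "JUL"
```

The appeal for my problem is that this reads the whole line in order, so the character-ordering issue I hit with YOLO goes away.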
Am I wrong to assume this is an object detection problem? Is there another family of solutions more specific to OCR?
There are plenty of streamlined object detection services through Azure, AWS, and Supervise.ly, so I'm not worried if I need to go that route, but I want to make sure I'm heading in the right direction first.
For data, we have gigabytes of photos of actual date codes. To label them, we are looking into MTurk or Supervise.ly.
Needs to be transcribed to ==> line 1: "JUL2219", line 2: "F04111444 31554"