At MedApp there is always interest in making the app easier to use. These improvements can be made on several fronts, one of which is reducing the number of steps a user has to take to register a new medicine. Currently, a user has to provide the name of the medicine, which in most cases is then recognized, so that some information about the medicine can be auto-filled from our available data. However, this does not work when the medicine is not in our data, and patient-specific information like intake frequency cannot be auto-filled this way either.
A potential solution lies in the pharmacy label on the medicine box. Each pharmacist prints a label per patient, which is then stuck on the box. This label contains patient information, pharmacy information, medicine information, intake instructions, and additional instructions such as safety warnings and storage guidelines. Entering all of this manually on a tiny phone keyboard can be a pain, so real ease-of-use gains can be made by auto-filling this information.
It sounds easy enough: the patient takes a picture of the label and we process the information on it. While this alone is already harder than it seems, we also have to deal with many different label formats from different pharmacies, users taking partial or blurred pictures, an endless variety of intake instructions, and many more small variations per label.
Luckily we were able to overcome these hurdles: after a short proof-of-concept phase we were able to process almost all of the information on over 80% of the gathered label pictures. Once this is implemented in the app, we can use the extra data and user feedback to further improve the algorithm so that, in the future, no user will have to fill in medicine information by hand.
Recognizing text in an image is a common problem that has been researched for as long as computers and optical scanners have existed. Historically this was done with a lot of linear algebra and visual analysis; these days, Deep Learning and Recurrent Neural Networks are taking over the scene. Recognizing the words on the medicine label correctly is, of course, essential for this project. However, building your own OCR algorithm can take a lot of time and, in some cases, requires a lot of data. Luckily, since it is a generic problem, there are multiple off-the-shelf OCR solutions. For local computation, Tesseract is a popular choice, but these days tech giants like Google and Microsoft offer APIs for this purpose that work far better. They do come at a small cost, but initially you are provided with enough free credit to finance the proof-of-concept phase. For this project, I implemented Tesseract, Google's API, and Microsoft's API. Tesseract lost the battle quickly: not only was its accuracy relatively low, it also doesn't provide bounding boxes for the recognized text, something the other two solutions do offer and which turned out to be very useful. Eventually, Google came out on top thanks to consistently slightly better accuracy than Microsoft's solution.
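To compare providers side by side, it helps to normalize each response into a common (word, bounding box) shape. A minimal sketch of that idea in Python, assuming a Vision-style response where each annotation carries a `description` and a `boundingPoly` with four `vertices` (the exact schema should be checked against the provider's documentation; the helper names and the sample data are hypothetical):

```python
def to_box(vertices):
    """Collapse a 4-point polygon into an axis-aligned (x, y, w, h) box."""
    xs = [v["x"] for v in vertices]
    ys = [v["y"] for v in vertices]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))

def normalize_annotations(annotations):
    """Turn Vision-style text annotations into (word, box) pairs."""
    return [(a["description"], to_box(a["boundingPoly"]["vertices"]))
            for a in annotations]

# Fabricated response fragment, purely for illustration:
sample = [{"description": "Paracetamol",
           "boundingPoly": {"vertices": [{"x": 10, "y": 5}, {"x": 90, "y": 5},
                                         {"x": 90, "y": 20}, {"x": 10, "y": 20}]}}]
print(normalize_annotations(sample))  # [('Paracetamol', (10, 5, 80, 15))]
```

Having every provider's output in this one shape made accuracy comparisons straightforward, and the boxes themselves are what let later steps group words into label fields such as dosage and intake instructions.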