Community mailing list archives

Re: OCR of incoming invoices

- 06/19/2015 04:52:10
Hi Torvald,

My question would be: do you really need OCR for that? OCR stands for Optical Character Recognition, which is used to turn (usually scanned) images into text.
Now, from what you say, the information is already being extracted from PDFs in the current system, which to me means these PDFs arrive in digital form rather than on paper to be scanned and run through OCR software later. Furthermore, if you would ideally like to extract the info from incoming mail, that clearly tells me you just need information extraction from a digital (but non-image) format. This could be achieved with the right libraries (through some integration approach, or probably directly in Odoo). However, you have to make sure the information is actually extractable, meaning the PDF embeds the invoice as text and not as an image; you can test this easily by searching the document for some text you know is there. Note that invoicing systems usually produce extractable PDFs. I did something similar with the integration approach, and it is a workable way.
If, however, you have to deal with images, you can still go with the integration approach, though things will be much more complicated, or you can choose specialised software such as Ephesoft (I have never tried it, I am only aware of its existence), which also has a free community edition, plus integration with Odoo, of course.
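
For the digital (non-image) case, here is a rough sketch in Python of what the integration approach could look like: pull the PDF attachments out of a raw email, check whether a known string can be found in the extracted text, and try to pick out the invoice number. The assumptions are mine and not part of any existing setup: the third-party pypdf package for text extraction, the function names, and above all the invoice number pattern, which is purely hypothetical and would have to be adapted per supplier.

```python
import re
from email import message_from_bytes, policy
from io import BytesIO


def pdf_attachments(raw_email: bytes):
    """Yield (filename, content bytes) for each PDF attached to a raw email."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            yield part.get_filename(), part.get_content()


def page_texts(pdf_bytes: bytes):
    """Return the extracted text of each page (assumes pypdf is installed)."""
    # Imported here so the helpers below can be used without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(BytesIO(pdf_bytes))
    return [page.extract_text() or "" for page in reader.pages]


def has_text_layer(texts, probe: str) -> bool:
    """True if a string known to be on the invoice shows up in the text.

    Case and whitespace differences are ignored. A False result suggests
    the invoice is embedded as an image and would need real OCR.
    """
    needle = " ".join(probe.lower().split())
    return any(needle in " ".join(t.lower().split()) for t in texts)


# Hypothetical pattern: real invoice layouts differ from supplier to supplier.
INVOICE_NO = re.compile(
    r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9][A-Z0-9/-]*)",
    re.IGNORECASE,
)


def find_invoice_number(texts):
    """Return the first invoice number matching the pattern, or None."""
    for text in texts:
        match = INVOICE_NO.search(text)
        if match:
            return match.group(1)
    return None
```

You would call page_texts() on each attachment returned by pdf_attachments(), run has_text_layer() with the supplier name or some other text you are sure is on the invoice, and only then bother with find_invoice_number(). If the check fails, you are in the image case and need OCR after all.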


Zoltan Gabor
IT Consultant
Mobile: +40 741 224622

On 19.06.2015 10:42, Torvald Baade Bringsvor wrote:
<blockquote cite="" type="cite">
Hello Community,

Is there anybody out there who does OCR of incoming invoices? Our customer has this functionality in their current solution, so that the supplier name and address and the invoice number are recognized from incoming PDFs.

Ideally I'd like this integrated when an invoice arrives by email...


Torvald Baade Bringsvor
Bringsvor Consulting AS - Odoo (formerly OpenERP) implementation partner
