OCR Invoices data extraction and analysis

Hello everyone,

I have a set of invoices (with different formats).
I want the robot to recognize, no matter the format, the quantity and price included in the invoice.
How can I do that? Does it need an intelligent OCR?


Hi @Youssef,
If every invoice has a different layout, it might be difficult. Maybe try experimenting with the CV (Computer Vision) package and the PDF-related activities.

@Youssef Yes, you can do this with the Read PDF activities, but only if the PDF is not a scanned image… Also, as @Pablito mentioned, you can use the CV activities to scrape the data from a scanned PDF. Give it a try.


Hi @Youssef,
You can use Read PDF With OCR to get accurate results when extracting the values from a PDF.

Hey @Youssef

If the PDF always contains the words "quantity" and "price", you can use Read PDF and perform string manipulation on the result. The advantages of this technique are that it is fast and it works in the background. If the PDF is a scanned image, the only option is Read PDF With OCR, but that works only in the foreground.
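
To illustrate the string manipulation idea, here is a rough sketch in Python rather than an actual workflow. It assumes the text has already been read from the PDF and that the labels are literally "Quantity" and "Price" followed by a number. The sample text and the extract_field helper are made up for the example; real invoices will probably need a different pattern per format.

```python
import re

# Hypothetical sample of text, as it might come out of a PDF text-extraction step.
SAMPLE_TEXT = """Invoice No: 1001
Quantity: 12
Price: 1,250.50
"""


def extract_field(text, label):
    """Return the first number that follows `label`, or None if it is absent."""
    # Label, optional colon, then a number with optional thousands commas and decimals.
    pattern = rf"{re.escape(label)}\s*:?\s*(\d[\d,]*(?:\.\d+)?)"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


quantity = extract_field(SAMPLE_TEXT, "Quantity")
price = extract_field(SAMPLE_TEXT, "Price")
print(quantity, price)
```

The same regular expression idea should carry over to a regex match step inside the robot, as long as the labels stay the same across the invoices.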

Goutham Vijay