Read different scanned invoices

mohammedamaan · March 20, 2019, 2:11pm

Hi. I have a scenario where I need to read different invoices. The invoices are scanned and available in a folder. The structure differs from invoice to invoice, but I need to capture the same data from each one: invoice number, date, amount, etc. How can I achieve this? A step-by-step explanation would be appreciated.

anil5 · March 20, 2019, 2:20pm

Hi @mohammedamaan,

You can use the Read PDF With OCR activity and check whether the output is readable. Because the invoices are scanned, the extracted data may not be clean.

If the data is clean, you can use regular expressions to extract the fields.
If the data is not clean, use the Computer Vision activities.
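
To illustrate the regular-expression route, here is a minimal Python sketch that pulls the invoice number, date, and amount out of OCR text. The patterns, the `FIELD_PATTERNS` table, and the `extract_fields` helper are illustrative assumptions, not part of any UiPath activity; real invoices will likely need extra label variants per vendor.

```python
import re

# Hypothetical label patterns; extend them for the layouts you actually receive.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-/]*)",
        re.IGNORECASE,
    ),
    "date": re.compile(
        r"date\s*[:\-]?\s*(\d{1,2}[/.\-]\d{1,2}[/.\-]\d{2,4})",
        re.IGNORECASE,
    ),
    "amount": re.compile(
        r"(?:total|amount\s+due|amount)\s*[:\-]?\s*[$€£]?\s*(\d[\d,]*\.\d{2})",
        re.IGNORECASE,
    ),
}


def extract_fields(text):
    """Return the first match for each field, or None when a label is not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    sample = "ACME Ltd\nInvoice No: INV-1042\nDate: 20/03/2019\nTotal: $1,250.00\n"
    print(extract_fields(sample))
```

A field that comes back as `None` is a useful signal that the layout is not covered yet, so the file can be routed to manual review or to the Computer Vision approach instead.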

mohammedamaan · March 20, 2019, 3:23pm

Hi @anil5,
As you said, the data is not clean, and I have no idea how to use Computer Vision. Can you give me a short description of how to use it?

anil5 · March 20, 2019, 3:27pm

Hi @mohammedamaan,

Go through the provided link and install the activities.

After installation, use CV Screen Scope first and indicate on screen the area from which you want to extract the values.

Inside the screen scope, use CV Get Text to extract the values. The data extracted with the CV activities is accurate.

To learn more about the CV activities, go through the activities guide for Computer Vision.

mohammedamaan · March 20, 2019, 3:40pm

Ok. Thanks mate.

mohammedamaan · April 1, 2019, 8:44am

Hi @anil5,
I tried the CV automation and it seems to work well. The issue I am facing now is how to read data from scanned invoices whose formats are not the same. Also, is there any way to read the data without opening the PDF file?

mohammedamaan · April 7, 2019, 10:45am

The accuracy is not 100% when reading the PDF with Microsoft OCR. How can I make it more accurate?

Jeven_Delacruz · May 30, 2019, 7:29pm

If you're dealing with images, try converting the scanned PDF documents to image(s) and using an OCR service through API integration.

You can try Microsoft Azure Computer Vision or Google Vision.
For Microsoft, use Microsoft Vision Activities > Handwritten Text (with Mode set to Printed).
Microsoft Vision requires the following:

Service URL “https://[region].api.cognitive.microsoft.com”

Subscription Key
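
As a rough sketch of how those two values are used outside the UiPath activity, the Python snippet below builds (but does not send) an HTTP request for Azure's Read OCR endpoint. The `/vision/v3.2/read/analyze` path is an assumption about the API version, and `build_read_request` is a hypothetical helper; check the version your resource supports before relying on it.

```python
import urllib.request


def build_read_request(region, subscription_key, image_bytes):
    """Build a POST request for the Azure Computer Vision Read API.

    The API version in the path (v3.2) is an assumption; adjust it to
    match your resource. The request is only constructed, not sent.
    """
    url = f"https://{region}.api.cognitive.microsoft.com/vision/v3.2/read/analyze"
    headers = {
        # The subscription key is passed in this header.
        "Ocp-Apim-Subscription-Key": subscription_key,
        # Raw image bytes are sent as a binary body.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder values; substitute your own region, key, and image file.
    req = build_read_request("westeurope", "YOUR_KEY", b"placeholder image bytes")
    print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` starts an asynchronous operation; the response carries an `Operation-Location` header whose URL is polled for the recognized text.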

Hope it helps

Regards,
Jeven

pranaysai46 · March 31, 2020, 4:11pm

Hello @mohammedamaan,

I am a learner. Can you please attach the workflow showing how to extract the invoice data from different images?

Thanks.
