Unable to capture PDF Invoice information using OCR

@Tom1989 The attached workflow will work only if the PDF output matches what you described. If the output is different, it will not work as-is and you may need to modify it.

new.xaml (16.4 KB)

Hi @Tom1989
new.xaml (20.7 KB)
Untitled.xls (289 Bytes)
I have used the free OCR API from https://ocr.space/ocrapi. Before going through my workflow, install the UiPath.Web.Activities package. The API is free and can process 25,000 pages a month; you can check the other terms and conditions at https://ocr.space/ocrapi. Right now the workflow uses my API key. Once you register there, you can use your own key.
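For readers who want to see the request outside UiPath, here is a minimal Python sketch of the kind of call the workflow makes. The endpoint path and the parameter names (`apikey`, `url`, `language`) are assumptions recalled from the OCR.space documentation, not taken from the attached workflow, so check them against the API page before relying on them. `YOUR_API_KEY` is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed GET endpoint; verify against https://ocr.space/ocrapi before use.
OCR_SPACE_GET_ENDPOINT = "https://api.ocr.space/parse/imageurl"


def build_ocr_request_url(api_key, file_url, language="eng"):
    """Build a GET URL asking the OCR service to process a file hosted at file_url."""
    query = urlencode({"apikey": api_key, "url": file_url, "language": language})
    return f"{OCR_SPACE_GET_ENDPOINT}?{query}"


def fetch_ocr_json(request_url, timeout=60.0):
    """Send the request and return the decoded JSON response as a dict."""
    with urlopen(request_url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder key and file URL; replace both with your own values.
    print(build_ocr_request_url("YOUR_API_KEY", "http://example.com/invoice.pdf"))
```

In the UiPath workflow the same two inputs (the API key and the file location) are what you would change to run it against your own account and document.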

Hi @Manjuts90, Thank you very much for your help. It indeed works. But I might face a new challenge if the invoice is not a standard document. It will all be clarified during the next meeting with my client.

Hi @Rishi1, your logic is too complex, but I am able to get the desired output. I appreciate your help. Thank you.

Hi @Tom1989
Thanks, glad you finally liked the solution. It is not complex; it is quite simple: call the API, the API returns its response as JSON, then parse that JSON and write the data to Excel. That's it. Let me know if you need any further assistance with this or any other issue.
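The parse-and-write step described here can be sketched in a few lines of Python. This is only an illustration, not the logic inside the attached .xaml. The response field names (`IsErroredOnProcessing`, `ErrorMessage`, `ParsedResults`, `ParsedText`) are assumptions based on the OCR.space response format as I recall it, and the sample response is made up. CSV is used here as a stand-in for Excel because Excel can open it directly.

```python
import csv
import io
import json


def extract_parsed_text(response):
    """Join the recognised text of every page; raise if the API reported an error."""
    if response.get("IsErroredOnProcessing"):
        raise RuntimeError(f"OCR failed: {response.get('ErrorMessage')}")
    pages = response.get("ParsedResults") or []
    return "\n".join(page.get("ParsedText", "") for page in pages)


def text_to_rows(text):
    """Turn each non-empty line of OCR text into a one-cell row."""
    return [[line.strip()] for line in text.splitlines() if line.strip()]


def rows_to_csv(rows):
    """Serialise rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample response, for illustration only.
    sample = json.loads(
        '{"IsErroredOnProcessing": false, '
        '"ParsedResults": [{"ParsedText": "Invoice No 123\\r\\nTotal 45.00\\r\\n"}]}'
    )
    print(rows_to_csv(text_to_rows(extract_parsed_text(sample))))
```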

Hi @Rishi1

If I capture similar information from a website instead of an invoice, can you tell me whether I can use the same workflow (.xaml file) you shared with me with little modifications? (Can parsing be done on a website?)

If not, can I capture those details and build a DataTable in UiPath that holds the labels (to be used as headers in the Excel sheet, e.g. ‘Production Country’, ‘Colour’, ‘Model No’) and the values that belong to them (to be placed under those headers, e.g. ‘Australia’, ‘Blue’, ‘A13H17EWV’)?

Hi @Tom1989
My workflow works on images and PDFs. We can pass the file either from the local system or from a URL (if it is hosted on a server). For text extraction from a website you have to use other activities, such as Get Text or Data Scraping.
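For the header/value layout asked about above, the idea is the same whichever activity collects the text: keep each label with its value, then put the labels in the first row and the values in the second. A minimal Python sketch of that step follows. The function name is hypothetical, and the pairs are the examples from the question, not real scraped data.

```python
def pairs_to_table(pairs):
    """Convert (label, value) pairs into a header row followed by a value row."""
    headers = [label for label, _ in pairs]
    values = [value for _, value in pairs]
    return [headers, values]


if __name__ == "__main__":
    # Example pairs taken from the question in this thread.
    scraped = [
        ("Production Country", "Australia"),
        ("Colour", "Blue"),
        ("Model No", "A13H17EWV"),
    ]
    for row in pairs_to_table(scraped):
        print(row)
```

In UiPath the equivalent would be to build a DataTable whose columns are the labels and add one data row per scraped item.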

Okay.

Hey @Rishi1 @Manjuts90

I made the necessary changes to my workflow design to extract the details using the OCR and JSON packages in UiPath, and the task had executed flawlessly until now.

But today, I am unable to get the same output as the robot enters the data in the search window instead of product name.

The product column is not properly extracted and filled in the excel.

Attached below is the Workflow design along with the PDF required to start the task.

Can you please help me fix this problem?

Solution_Goodman_POC.xaml (65.7 KB)

Zalora Sample Invoice_Goodman Project.pdf (50.4 KB)

Hi @Tom1989
The free OCR API returns different results for the same PDF, which is why you are getting different data in Excel. I have already emailed them about this. If they keep changing the OCR output every 10-15 days, we won't be able to use that API.
They have also added a table OCR feature, which may be why the output changed. I have now used their table OCR feature.
Have a look at the latest workflow attached. It is working fine now. Once I get a response from them, I will reply to you.

Solution_Goodman_POC.xaml (66.0 KB)
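To illustrate why the table OCR feature helps with a product column, here is a small Python sketch of splitting line-by-line OCR text into cells. It assumes that table mode returns one text line per table row with tab characters between columns. That is my recollection of the OCR.space behaviour, not something confirmed in this thread, so verify it against the actual response. The sample text is invented.

```python
def split_table_text(parsed_text, delimiter="\t"):
    """Split line-by-line OCR text into rows of cells, skipping blank lines."""
    rows = []
    for line in parsed_text.splitlines():
        cells = [cell.strip() for cell in line.split(delimiter)]
        if any(cells):  # ignore lines that contain no text at all
            rows.append(cells)
    return rows


def column(rows, header_name):
    """Return the values under header_name, using the first row as the header."""
    index = rows[0].index(header_name)  # raises ValueError if the header moved or vanished
    return [row[index] for row in rows[1:] if index < len(row)]


if __name__ == "__main__":
    # Invented sample, assuming tab-separated columns.
    sample_text = "Product\tQty\tPrice\r\nShirt\t2\t19.90\r\n"
    print(column(split_table_text(sample_text), "Product"))
```

Looking the column up by header name rather than by fixed position means a change in the OCR output fails loudly instead of silently writing the wrong data to Excel.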

Hi @Rishi1,

I am facing the same problem again.

Please, take a look at the attached XAML file:

Solution_Goodman_POC.xaml (66.0 KB)

Hi @Tom1989
I checked; it is giving the same data in Excel as before. If you still feel the output has changed, you will have to use the paid version of this OCR API.

It works intermittently. Did you check with them?

Give me your email ID and I will forward the reply that I got from them.

tanmay.bhure@jos.com

Hi @Rishi1

I ran your workflow with a different file path, and it gives me an error.

Where do we have to pass the file path and the API key?

Can you please guide me? I am very new to this.

I got an error. Please give suggestions.

Please give me suggestions. @Tom1989 @Rishi1

Hi @Jayesh_678
Sorry for the late response; I was not following the threads here.
Are you still struggling with the error, or is it resolved?

@Rishi1 I tried to download the XAML file, but it says activities are missing. Can you please help me with the updated file? I am testing OCR.space as well.