MultiPage Issue with Document Understanding

Hello Everyone,

I receive PDFs daily; some have 1 page and some have 2. I need to extract the total amount from each PDF. If the PDF has a long list of orders, the total amount falls on the second page, but if the list is short, the total amount is on the first page.

Extraction works when there are many orders and the total amount is on the second page, but when I debugged with fewer orders, the total amount was not captured.

Please suggest how to tackle this issue.

Hi @8SEVEN ,
I think we can read the whole PDF to get all the text, then search that text for the ‘total amount’, e.g. with a regex.
To go into more detail, can you share a sample file?
regards,
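To illustrate the regex idea, here is a minimal sketch in Python (not a UiPath workflow). It assumes the text of each page has already been read into a list of strings, that the label reads "Total Amount", and that the value looks like 1,234.56 with an optional currency symbol. The pattern and the name find_total_amount are hypothetical and would need adjusting to the real documents.

```python
import re
from typing import Optional

# Hypothetical pattern: assumes the label is "Total Amount" and the value
# is a number such as 1,234.56, optionally preceded by a currency symbol.
TOTAL_PATTERN = re.compile(
    r"total\s+amount\s*[:\-]?\s*[$€£]?\s*(\d[\d,]*(?:\.\d+)?)",
    re.IGNORECASE,
)


def find_total_amount(page_texts: list[str]) -> Optional[str]:
    """Return the last 'Total Amount' value found across all pages, or None."""
    # Join every page so it no longer matters which page holds the total.
    full_text = "\n".join(page_texts)
    matches = TOTAL_PATTERN.findall(full_text)
    # Take the last match, since the total normally follows the order lines.
    return matches[-1] if matches else None
```

Because the pages are joined before searching, a 1-page and a 2-page PDF go through exactly the same path.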

Set up a Document Understanding project in UiPath and use the Form Extractor or the Intelligent Form Extractor inside a Data Extraction Scope, depending on the complexity of your documents.

You can achieve this with document classification, like this:

PDF Classification
– Classify Document (Classify as “First Page” or “Second Page”)

Data Extraction
– If Classified as “First Page”:
    – Extract Total Amount from First Page
– If Classified as “Second Page”:
    – Go to Second Page using “Next Page” activity
    – Extract Total Amount from Second Page
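The first-page/second-page branching above can also be sketched as a single loop over the pages. This is a rough Python illustration, not UiPath code: it assumes the text of each page is available as a string, and the name locate_total and the label pattern are hypothetical.

```python
import re
from typing import Optional

# Hypothetical label pattern: "Total Amount", then any non-digit characters
# on the same line (colon, spaces, currency symbol), then the number.
LABEL = re.compile(r"total\s+amount[^\d\n]*(\d[\d,]*(?:\.\d+)?)", re.IGNORECASE)


def locate_total(page_texts: list[str]) -> Optional[tuple[int, str]]:
    """Return (1-based page number, amount) for the page holding the total."""
    # Walk from the last page to the first, because the total sits after
    # the order lines and therefore tends to be on the final page.
    for page_number in range(len(page_texts), 0, -1):
        match = LABEL.search(page_texts[page_number - 1])
        if match:
            return page_number, match.group(1)
    return None
```

With this shape the page count does not need to be known in advance; the returned page number only reports where the total was found.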

Define your document types and train the model to understand the structure and layout of your PDFs. Make sure the model recognizes key fields like “Total Amount.”

Cheers @8SEVEN

Hello @Palaniyappan

Thank you for the help.

But now when I debug, it always chooses the first document type. I have followed the steps, and all the documents in the image below are separate document types with the same format; the only difference is that some have 1 entry and others have 2, 3, 4, and so on.

When I debug, it still chooses the first document type rather than checking all the templates I have selected for these document types.

In Data Extraction, if I enter a specific document type ID, it works. But when I delete that and enter the classification result as classifyoutput(0), it chooses the first document type. Please advise on how to handle this scenario.
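One thing worth checking here: index 0 is simply the first item in the result list, which is not necessarily the best match. The sketch below, in Python rather than UiPath, assumes the classifier returns several candidates, each with a document type ID and a confidence score. The Candidate class, its field names, and best_candidate are hypothetical and are not UiPath's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical stand-in for one classification result.
    document_type_id: str
    confidence: float


def best_candidate(
    candidates: list[Candidate], min_confidence: float = 0.0
) -> Optional[Candidate]:
    """Return the most confident candidate, or None if none is good enough."""
    if not candidates:
        return None
    # Compare by confidence instead of relying on list order.
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= min_confidence else None
```

If every candidate comes back with a similar score, that would suggest documents of the same format are hard to tell apart by type, and a single document type may be the simpler setup.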

Thanks

Hi @8SEVEN ,
If you are working with invoices, purchase orders, or receipts, you can directly use the public endpoints listed here in your DU project. Otherwise, you can train a custom model through AI Center with an adequate number of sample documents. That way you don't need to worry about whether the total appears on the first or second page.