How can I read text accurately from an unstructured PDF?

I read an invoice PDF with the Read PDF Text activity, and this is the text I get.
Text data:

Item Description Qty HSN Base Price Central Tax (0%) State Tax (0%) Integrated Tax Total
(18.00%)
Network 1.0 990000 2,329.79 0.00 0.00 419.36 2,749.15

Total Charges 2,329.79 0.00 0.00 419.36 2,749.15

Actual data in the PDF:

Item Description Qty HSN Base Price Central Tax (0%) State Tax (0%) Integrated Tax(18.00%) Total

Network 1.0 990000 2,329.79 0.00 0.00 419.36 2,749.15

                Total Charges     2,329.79            0.00                  0.00                  419.36        2,749.15

The data I want:
Total = 2,749.15
Base Price = 2,329.79
Integrated Tax (18.00%) = 419.36

Thanks in advance

Read PDF Text alone will not be sufficient in this scenario. Table rows in particular are tricky to decipher with conventional RPA means. You might get lucky with some regex magic, but it will most likely be brittle: a small change in the invoice layout can break the pattern.

There are tools and integrations that can help with this, known as IDP (Intelligent Document Processing). IDP is an AI/ML approach to extracting data from semi-structured documents such as invoices. It relies on models that need to be trained to reach proper accuracy, but many pre-trained models exist for the more common document types, especially invoices, which are pretty much ‘the’ example for any hyperautomation tool.

UiPath's offering in this family is Document Understanding, but other products on the market might fit your needs better. Note that using IDP may involve additional license costs.

Thanks for your quick response.
I have installed DocumentUnderstanding.ML.Activities, but I can't figure out how to use it.
Can you please elaborate on how to use it for this scenario?

Hi,

You can use a regular expression (RegEx) to retrieve the data.

Thanks for your response.
Can you please suggest an exact regular expression for the text above?
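
For the sample text above, one option is to anchor the regex on the "Total Charges" row, since that row carries all the requested values in a fixed order and is not affected by the header wrapping onto a second line. Here is a minimal sketch in Python rather than in UiPath itself. The names `AMOUNT`, `TOTALS_ROW` and `parse_totals` are made up for illustration, and the column order (base price, central tax, state tax, integrated tax, total) is assumed from the header row of the sample. The pattern uses only basic constructs, so it should carry over to the .NET regex engine that UiPath uses, but treat that as an assumption to verify against your own invoices.

```python
import re
from decimal import Decimal

# Text as returned by the PDF read step in the question (header is split over two lines).
SAMPLE_TEXT = """Item Description Qty HSN Base Price Central Tax (0%) State Tax (0%) Integrated Tax Total
(18.00%)
Network 1.0 990000 2,329.79 0.00 0.00 419.36 2,749.15

Total Charges 2,329.79 0.00 0.00 419.36 2,749.15
"""

# One amount: digits with optional thousands separators and exactly two decimals.
AMOUNT = r"(\d[\d,]*\.\d{2})"

# "Total Charges" followed by five amounts separated by any amount of whitespace.
TOTALS_ROW = re.compile(r"Total\s+Charges\s+" + r"\s+".join([AMOUNT] * 5))


def parse_totals(text):
    """Return the five amounts of the 'Total Charges' row as Decimals."""
    match = TOTALS_ROW.search(text)
    if match is None:
        raise ValueError("no 'Total Charges' row with five amounts found")
    # Assumed column order, taken from the header row of the sample invoice.
    keys = ["base_price", "central_tax", "state_tax", "integrated_tax", "total"]
    values = [Decimal(group.replace(",", "")) for group in match.groups()]
    return dict(zip(keys, values))


if __name__ == "__main__":
    totals = parse_totals(SAMPLE_TEXT)
    for key in ("total", "base_price", "integrated_tax"):
        print(key, "=", totals[key])
```

Because the separators are matched with `\s+`, the same pattern also matches the widely spaced row as it appears in the PDF layout. A cheap sanity check is to confirm that base price plus the three tax amounts equals the total before trusting the result. This stays brittle in the way described earlier in the thread: if the invoice gains or loses a column, the pattern needs to change.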

It takes a wee bit more than a forum post to explain how to use Document Understanding.
My advice is to take the UiPath Academy training on this subject.

Can you please let me know why the error below occurs?
UiPath.smartData.OutOfProcessUi.Host.exe
"To run this application, you must install the missing .NET Framework."

UiPath.intelligentOCR.Activities-6.14.1
UiPath.DocumentUnderstanding.activities-1.24.0