How to extract data from PDF tables

Hi everyone,

I'm very new to UiPath and to automation in general. Can you please help me fetch values from a PDF and write them into an Excel sheet row by row?

I'm trying to use screen scraping, but I'm not getting the desired output (I'm getting wrong values).

@NEWBEEIE What kind of PDF are you using: a native PDF or an image-formatted (scanned) PDF?

sampledata.pdf (110.8 KB)

It's a native PDF.

I'm trying to pass the rows on the second page to an Excel sheet.

Thank you for your response

Use the Read PDF Text activity to read the PDF file. The output is of datatype String.
Use string manipulation or apply a regex to that result, and add the extracted content to a DataTable.
Once all records are inserted into the DataTable, use the Write Range activity to write it to the Excel file.
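To make the three steps above a bit more concrete, here is a rough sketch in Python rather than UiPath (in Studio the same logic would be built from activities and VB.NET expressions). The sample text, the row pattern, and the helper names `extract_rows` and `rows_to_csv` are all made up for illustration; the real text returned for sampledata.pdf will look different, so the pattern would need adjusting.

```python
import csv
import io
import re

# Hypothetical text standing in for the String returned by the PDF read step.
pdf_text = """Example 2:
Number of Coils Number of Paperclips
5 3, 5, 4
10 7, 8, 6
15 11, 10, 12
Example 3:
"""

# Assumed row shape: an integer, some whitespace, then the rest of the line.
ROW_PATTERN = re.compile(r"^(\d+)\s+(.+)$")


def extract_rows(text):
    """Return one [first_column, second_column] list per line that looks like a table row."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            rows.append([match.group(1), match.group(2)])
    return rows


def rows_to_csv(rows, header):
    """Write the header and rows as CSV text (a stand-in for Write Range)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(pdf_text)
csv_text = rows_to_csv(rows, ["Number of Coils", "Number of Paperclips"])
print(csv_text)
```

The list of rows plays the role of the DataTable here: each matched line becomes one row, and everything is written out in a single step at the end.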

I'm trying to pass a condition to an If activity, but I'm getting this validation error. Can you please help?

Also, can you please give me a brief introduction to regex?

Check the datatype of item. It should be String. Otherwise, use item.ToString.Contains and check again.

Hi @Madhavi,

The datatype is string.

@NEWBEEIE The datatype of the variable item is Object. You can see that in the Properties panel → TypeArgument. Change it to String.


Hi, can anyone please help me with a XAML file?

Hello @NEWBEEIE, in the XAML file, go ahead and change the value marked below from Object to String.

TypeArgument is simply the data type of the values stored inside the IEnumerable. Since you have an array of type String here (you are splitting a string), the TypeArgument should be String.

@RishiVC1,

I have modified the flow, but my problem now is extracting the values row by row and writing them to the Excel sheet. Can you please help with that?

Please share the PDF and the workflow you created.

PDF_2.xaml (46.8 KB)
sampledata.pdf (110.8 KB)

These are the files

Which table are you trying to get here?

I'm trying to get the table on the second page (page 2).

NATIONAL PARTNERSHIP FOR QUALITY AFTERSCHOOL LEARNING from page 1

Please find the attached zip. The project was created with the details below.
UiPath Community Studio version: 18.4.0.6

Activity details:
"UiPath.UIAutomation.Activities": "[18.3.6897.22543]",
"UiPath.System.Activities": "[18.3.6897.22524]",
"UiPath.Excel.Activities": "[2.4.6884.25683]",
"UiPath.Mail.Activities": "[1.2.6863.29868]",
"UiPath.PDF.Activities": "[1.2.6863.34697]"

PDF extraction.zip (118.5 KB)


Hi @RishiVC1,

Can you please explain this expression to me: varStr_Read_Data.Split({"Number of Coils Number of Paperclips","Example 3:"},StringSplitOptions.None)

I tried to fetch the table on the second page. Can you please help me resolve this error?

Hi @RishiVC1,

This is the modified workflow for fetching the table values from the second page.

PDF_Data_Extraction.xaml (15.8 KB)

Please go through the documentation here: String.Split Method

I would also recommend going through the Foundation training from UiPath Academy.

varStr_Read_Data.Split({"Number of Coils Number of Paperclips","Example 3:"},StringSplitOptions.None)

Here we are splitting the string variable varStr_Read_Data into parts. Since each of the two separator strings occurs only once in the text, the string is always divided into 3 parts.

Part 1: the part of the string before "Number of Coils Number of Paperclips".
Part 2: the part of the string between "Number of Coils Number of Paperclips" and "Example 3:".
Part 3: the part of the string after "Example 3:".
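The three-part split described above can be tried out with a small Python analogue (not the VB.NET expression itself). The sample text and the helper name `split_on_separators` are hypothetical; only the two separator strings come from the workflow.

```python
import re


def split_on_separators(text, separators):
    """Split text on any of the given literal separator strings."""
    # Escape each separator so it is matched literally, then join with "|".
    pattern = "|".join(re.escape(sep) for sep in separators)
    return re.split(pattern, text)


# Hypothetical page text; each separator occurs exactly once.
sample_text = (
    "Intro text before the table. "
    "Number of Coils Number of Paperclips"
    " 5 3 10 7 15 11 "
    "Example 3:"
    " Text after the table."
)

parts = split_on_separators(
    sample_text,
    ["Number of Coils Number of Paperclips", "Example 3:"],
)

# Index 1 holds the text between the two separators, i.e. the table body.
table_text = parts[1].strip()
print(len(parts), repr(table_text))
```

Because each separator appears once, the result has exactly three elements, and the middle one (index 1) is the piece that contains the table rows.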

Okay, but what about StringSplitOptions.None?
Answer: see the String.Split Method documentation.
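In short, StringSplitOptions.None keeps empty entries in the result, while StringSplitOptions.RemoveEmptyEntries drops them. The difference only shows up when a separator sits at the very start or end of the text, or when two separators are adjacent. Below is a small Python analogue of the two behaviours; the function names and the sample text are made up for illustration.

```python
import re


def split_keep_empty(text, separators):
    """Roughly what StringSplitOptions.None does: empty parts are kept."""
    pattern = "|".join(re.escape(sep) for sep in separators)
    return re.split(pattern, text)


def split_remove_empty(text, separators):
    """Roughly what StringSplitOptions.RemoveEmptyEntries does: empty parts are dropped."""
    return [part for part in split_keep_empty(text, separators) if part != ""]


# Hypothetical text that starts with the separator, so the first part is empty.
text = "Example 3:rest of the page"

kept = split_keep_empty(text, ["Example 3:"])
dropped = split_remove_empty(text, ["Example 3:"])
print(kept)
print(dropped)
```

With None, the position of each part stays predictable (the part before the first separator is always at index 0, even if it is empty), which is why it is a reasonable choice when you pick a part by index.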

Have fun learning :)
