How can I pull a table with a specific name and a variable page number from a PDF file?

Huseyin_Kizil · August 6, 2023, 9:25pm

I have a PDF file that I received by email. The document contains multiple tables, and the pages they appear on vary. I want to read a table with a specific name, for example “Income Statement”. How can I do this? I have no idea where to start.

Also, how can I write a regex query for this?

Can you help with this automation, please?

I would be very happy if you could help me in detail, either through the UiPath Forum or my personal accounts.

LinkedIn: Hüseyin KIZIL
E-mail: kizilhuseyin670@gmail.com

Nguyen_Van_Luong1 · August 7, 2023, 12:32am

Hi @Huseyin_Kizil ,

I think you can try the following:
Step 1. Read the PDF file with a PDF read activity (for example, Read PDF Text) and save the output string; you will spot a pattern in it.
Step 2. Apply some string manipulation so that the output from Step 1 looks like CSV (for example, replacing double spaces with “,”).
Step 3. UiPath can now save that string directly as a temporary CSV file, but I see that you want the result in Excel, which takes two more steps.
Step 4. Read the temporary CSV file and save its content to a DataTable, say Extracted. To keep things clean, you can then delete the temporary CSV.
Step 5. Finally, use an Excel activity to write the Extracted DataTable to an Excel file.
For more detail, can you share your PDF?
Regards,
LNV
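
As a rough illustration of Step 2, here is a minimal Python sketch of the string manipulation, written outside UiPath purely to show the logic. The function names `text_to_rows` and `rows_to_csv` and the sample text are hypothetical, and the sketch assumes that columns in the extracted text are separated by two or more spaces.

```python
import csv
import io
import re


def text_to_rows(raw_text):
    """Split column-aligned text into rows, treating runs of 2+ spaces as separators."""
    rows = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(re.split(r" {2,}", line))
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text; the csv module quotes cells that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical text, standing in for the output of a PDF read activity.
sample = "Item  2022  2023\nRevenue  1,200  1,500\nNet income  300  420\n"
rows = text_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Going through a CSV writer rather than a plain replace matters when values contain thousands separators such as `1,200`: a direct replacement of double spaces with “,” would split those values into extra columns.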

Parvathy · August 7, 2023, 12:49am

Hi @Huseyin_Kizil
=> Use the “Get PDF Page Count” activity and store the output in a variable, say PageCount.
=> Initialize a Count variable to 1.
=> Use a While loop with the condition:
Count <= PageCount
=> Inside the loop, use Read PDF Text or Read PDF with OCR with the Range property set to Count.ToString, so that only the current page is read, and store the output in a variable, say str_text.
=> Use an If activity to check whether str_text contains the table name, with a condition like:
str_text.Contains("Income Statement")
=> If the condition is true, extract the table from that page and exit the loop.
=> If the condition is false, increment Count by 1.

Hope it helps!!
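
The page loop above, together with a regex for the table itself, can be sketched in Python as follows. This is only an illustration of the logic, not UiPath code: the `pages` list is hypothetical and stands in for the text read from each page, the helper names are made up, and the regex assumes that the table title sits on its own line and that the table ends at the first blank line.

```python
import re


def find_table_page(pages, table_name):
    """Return the 1-based number of the first page containing table_name, or None."""
    pattern = re.compile(re.escape(table_name), re.IGNORECASE)
    for page_number, text in enumerate(pages, start=1):
        if pattern.search(text):
            return page_number  # stop at the first match, like exiting the loop
    return None


def extract_table_block(page_text, table_name):
    """Return the lines after the title line, up to the first blank line (assumed layout)."""
    pattern = re.compile(
        # Title, rest of its line, then lazily capture until a blank line or end of text.
        re.escape(table_name) + r"[^\n]*\n(?P<body>.*?)(?=\n\s*\n|\Z)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(page_text)
    return match.group("body").splitlines() if match else []


# Hypothetical per-page text, standing in for one PDF read per page.
pages = [
    "Cover page\nAnnual report 2023",
    "Balance Sheet\nAssets  900\nLiabilities  400\n\nNotes follow.",
    "Income Statement\nRevenue  1,500\nNet income  420\n\nSee note 4.",
]
page = find_table_page(pages, "Income Statement")
table_lines = extract_table_block(pages[page - 1], "Income Statement")
print(page, table_lines)
```

The same pattern idea should carry over to a .NET regex in UiPath with minor syntax adjustments, but the end-of-table condition depends entirely on how the real PDF text is laid out, so it would need to be adapted once the actual output is known.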
