How to extract blocks of tabular data that span multiple lines from a PDF

KL_Low · June 6, 2021, 2:42am

Hi all,

What is the best way to extract blocks of tabular data that span multiple lines from a PDF?
A sample of what the data looks like in the PDF is attached below.
Please note that the “blocks” of data can also span multiple pages.

I need to extract the data into a DataTable with columns such as:
“Seq”, “Serial Number”, “Reading 1 Start”, “Reading 1 End”, “Reading 2 Start”, “Reading 2 End”…

[Attached image: 2021-06-06_10h36_13]

Thanks!

kumar.varun2 · June 6, 2021, 2:50am

Hello KL Low

Welcome to the UiPath Community.

Try the Read PDF Text activity to convert the PDF into a string, and then use regex to extract the required data. The activity is in the UiPath.PDF.Activities package, which you can install from Manage Packages. Please share the PDF file or the extracted text if you need further help.
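A rough sketch of the regex step is below, in Python only to keep it short. The sample text, the line layout, and the names `SAMPLE_TEXT`, `BLOCK_START`, `CONTINUATION` and `parse_blocks` are all assumptions made for illustration, since the real PDF text has not been shared. It assumes each block starts with a line holding the sequence number, the serial number and the first start/end pair, and that the following lines hold one further pair each.

```python
import re

# Hypothetical text standing in for the output of Read PDF Text.
# The real layout is unknown; adjust the patterns to the actual file.
SAMPLE_TEXT = """\
Seq Serial Number Start End
1 SN-1001 10.5 12.0
11.0 13.5
2 SN-1002 20.0 21.5
Page 2
22.0 23.0
"""

# Assumed block start: sequence number, serial number, first start/end pair.
BLOCK_START = re.compile(r"^(\d+)\s+(\S+)\s+([\d.]+)\s+([\d.]+)$")
# Assumed continuation line: one further start/end pair and nothing else.
CONTINUATION = re.compile(r"^([\d.]+)\s+([\d.]+)$")


def parse_blocks(text):
    """Return one dict per block, with numbered reading columns."""
    rows = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        start = BLOCK_START.match(line)
        if start:
            # A new block begins: open a new row.
            current = {"Seq": start.group(1), "Serial Number": start.group(2)}
            rows.append(current)
            pair = start.groups()[2:]
        else:
            cont = CONTINUATION.match(line)
            if cont is None or current is None:
                continue  # header, page marker, blank line, etc.
            pair = cont.groups()
        # Two fixed columns, then two columns per reading already stored.
        n = (len(current) - 2) // 2 + 1
        current[f"Reading {n} Start"], current[f"Reading {n} End"] = pair
    return rows


for row in parse_blocks(SAMPLE_TEXT):
    print(row)
```

Because lines that match neither pattern are skipped and the current row stays open, a block that continues after a page break is still collected into the same row. In a workflow, the same line-by-line idea can be written with System.Text.RegularExpressions.Regex and the rows added to a DataTable.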

KL_Low · June 6, 2021, 1:40pm

Hi Varun Kumar,

Thanks for the guidance. We will try to make it work and post an update here.

Regards,
KL Low

Cristian_Negulescu · June 18, 2021, 8:48am

Hello Rana,
In this video, I show 17 examples, with code, of how to extract data from PDF (try some VB.NET logic on your PDF):

Thanks,
Cristian
