Extract table from PDF using Regex

Hi Team,

I would like to extract details from a PDF using regex. Is there any way to extract an entire table from a PDF using regex and write it to a DataTable? I hope you can help me with this. Please find the sample PDF attached: EDUCATION DETAILS.pdf (399.6 KB)

Hi @Muthulakshmi_Thangamuthu!

Please follow this guide to see if it helps:

Best Regards

Hi @Muthulakshmi_Thangamuthu, are you allowed to use CV activities?

If yes, you can use the CV Extract Table activity inside a CV Screen Scope, which extracts the table details into a DataTable. Then you can write the DataTable to Excel. Job done.

Please check this post; the idea is similar.

Hello Padman,
In this video, I extract tables from a PDF and write the data to Excel:

0:25 Install PDF Activities
1:10 Read PDF text, Get PDF page count, Extract PDF
5:40 Read PDF with OCR
6:55 Join PDF and Manage PDF passwords
9:30 Extract Images From PDF and Export PDF as Image
12:00 Extract table from PDF, use case 1: replace some spaces with | (one column has multiple words)
24:00 Run the robot to see the result
25:40 Extract table from another PDF, use case 2: the delimiter is two spaces ("  "), so a simple split works
31:50 Extract table from a complex PDF, use case 3: unstructured data, where the logic is based on IsUpper and IsLower
40:25 Extract the price value from PDF

Cristian Negulescu
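
As a rough illustration of use case 2 in the list above (columns separated by two or more spaces), here is a minimal Python sketch of the regex split. It is not the workflow from the video, and it assumes the PDF text has already been read into a string. The function name `parse_table` and the sample text are hypothetical and only stand in for real extracted text.

```python
import re


def parse_table(text, min_gap=2):
    """Split each non-empty line into cells on runs of at least min_gap spaces."""
    splitter = re.compile(r" {%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between table rows
        rows.append(splitter.split(line))
    return rows


# Hypothetical sample text, standing in for the output of a PDF text-reading step.
# Single spaces stay inside a cell; two or more spaces separate columns.
sample = (
    "Degree  University Name  Year\n"
    "Bachelor  Example University A  2015\n"
    "\n"
    "Master  Example University B  2017\n"
)

header, *records = parse_table(sample)
# One dict per row, keyed by the header cells (a stand-in for a DataTable)
table = [dict(zip(header, row)) for row in records]
for row in table:
    print(row)
```

This only works when every column gap is at least two spaces and no cell contains a double space. If a multi-word column is separated from its neighbours by a single space, something like use case 1 (inserting an explicit | delimiter first) would be needed instead.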