Extracting table from PDF and splitting row by column

supermanPunch · April 20, 2022, 8:26am

We would need to identify the different PDF formats from the start, if at all possible. That means gathering all the formats and samples of each, then testing whether extraction with simple methods such as Regex or string manipulation is possible.

Once we have gathered all the PDF formats that can appear as inputs, we should be able to categorise them based on their keywords, so that a particular Regex pattern or string manipulation technique can be used to extract data from each PDF format.
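To illustrate the categorise-then-extract idea, here is a minimal sketch in Python (a UiPath workflow would do the same with If/Switch and Matches activities). The format names, keywords and Regex patterns are all hypothetical; you would replace them with the ones found in your own sample PDFs. It assumes the PDF text has already been read into a string with the column spacing preserved.

```python
import re

# Hypothetical signatures: keywords that must all appear in a given PDF format.
FORMAT_KEYWORDS = {
    "invoice_a": ["Invoice No", "Unit Price"],
    "statement_b": ["Account Statement", "Closing Balance"],
}

# Hypothetical row patterns, one per format. Columns are assumed to be
# separated by two or more spaces or tabs in the extracted text.
ROW_PATTERNS = {
    "invoice_a": re.compile(
        r"^(?P<item>\S.*?)[ \t]{2,}(?P<qty>\d+)[ \t]{2,}(?P<price>\d+\.\d{2})$",
        re.MULTILINE,
    ),
    "statement_b": re.compile(
        r"^(?P<date>\d{2}/\d{2}/\d{4})[ \t]{2,}(?P<desc>\S.*?)[ \t]{2,}(?P<amount>-?\d+\.\d{2})$",
        re.MULTILINE,
    ),
}


def classify(text):
    """Return the first format whose keywords are all present, else None."""
    for name, keywords in FORMAT_KEYWORDS.items():
        if all(keyword in text for keyword in keywords):
            return name
    return None


def extract_rows(text):
    """Pick the pattern for the detected format and return its rows as dicts."""
    fmt = classify(text)
    if fmt is None:
        raise ValueError("Unknown PDF format, no extraction pattern available")
    return fmt, [m.groupdict() for m in ROW_PATTERNS[fmt].finditer(text)]
```

A PDF that matches no known format raises an error here, which is the point at which a new format would be added or a third-party tool considered.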

If the above is not an option, then you might need to use third-party tools for extraction. These can be used to extract any table format in a PDF, but may require you to purchase a license or may only be available as a trial version.

One such component is provided in the post below, which looks very similar to your case:

You could browse more such components in the UiPath Marketplace:

However, if you currently need the extraction for the PDF type provided, you can check the workflow below, which uses Regex patterns:
Extract_TableData_Regex.xaml (9.2 KB)
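For readers who cannot open the attachment, the general idea of splitting a row by column with Regex can be sketched as below. This is not the content of the attached workflow, only an illustration in Python. It assumes the text read from the PDF keeps a gap of at least two spaces or tabs between columns, while values inside a column contain single spaces at most. The function names are made up for the example.

```python
import re

# Assumption: a run of two or more spaces/tabs marks a column boundary.
COLUMN_GAP = re.compile(r"[ \t]{2,}")


def split_row(line):
    """Split one table row into its column values."""
    return COLUMN_GAP.split(line.strip())


def split_table(text):
    """Split every non-empty line of the table text into columns."""
    return [split_row(line) for line in text.splitlines() if line.strip()]
```

If a cell can be empty or a value can itself contain wide gaps, this simple split will misalign columns, and a per-format pattern with explicit groups is the safer choice.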

Let us know your decision and thoughts.
