Table extraction pdf

devendra_bhardwaj2 · June 20, 2023, 9:40am

How can I extract tabular data from a PDF that contains both free text and structured data, where the page number is not fixed either…

arjunshenoy · June 20, 2023, 9:42am

Hi @devendra_bhardwaj2, welcome to the Community.

If the data in the file is not available in a fixed position or if the file is of semi-structured/unstructured format, then you could consider the option of implementing Document Understanding to extract such data.

Using DU, you can build & retrain ML models which will help you extract the data from files like this.

Hope this helps,
Best Regards.

Rohit_Jere · June 20, 2023, 11:24am

Hello @devendra_bhardwaj2 ,
Is the PDF structure fixed?
If it is fixed, you can simply use regex along with the String.indexOf function to get the contents from the PDF.
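
As a rough illustration of the regex plus index-of idea above, here is a minimal Python sketch. It assumes the PDF text has already been read into a string, and that the table sits between two known markers. The marker strings ("Item Qty Price" and "Total"), the three-column row layout, and the function name extract_table are all hypothetical and would need to be adapted to the real document.

```python
import re

def extract_table(text, start_marker, end_marker):
    """Return table rows found between two marker strings in plain text."""
    # Locate the table with an index-of style search (str.find returns -1 if absent).
    start = text.find(start_marker)
    if start == -1:
        return []
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        end = len(text)
    block = text[start:end]

    # Assumed row layout: a name, an integer quantity, and a decimal price.
    row_pattern = re.compile(
        r"^[ \t]*(\S.*?)[ \t]+(\d+)[ \t]+(\d+(?:\.\d+)?)[ \t]*$",
        re.MULTILINE,
    )
    return [(name, int(qty), float(price))
            for name, qty, price in row_pattern.findall(block)]

# Hypothetical text, standing in for the output of a PDF text-reading step.
sample = """Invoice 1001
Some introductory paragraph.
Item Qty Price
Widget 2 9.50
Large gasket 10 1.25
Total 31.50
Thank you."""

rows = extract_table(sample, "Item Qty Price", "Total")
```

This only works while the markers and the row layout stay stable. If they change from file to file, the pattern has to change with them.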

devendra_bhardwaj2 · June 20, 2023, 11:38am

No, the data is dynamic…

