Extracting tables from PDF documents dynamically

Dear Team,
Can we extract multiple rows of a table from a PDF?
I have a table with 5 rows in the 1st PDF and a table with 8 rows in the 2nd PDF. Is it possible to extract the tables dynamically?

@Ram_Shiva_Reddy

You might need to integrate with Python.
There is a library that can extract tables from a PDF file.
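
As a rough sketch of what that could look like: pdfplumber is one Python library that can pull tables out of digital PDFs (it is my assumption, the library meant above is not named here). The function names `extract_tables` and `table_to_records` are made up for illustration, and the code assumes the first row of each detected table is the header. The row count is never hard-coded, so a 5-row and an 8-row table go through the same code.

```python
from typing import Dict, List, Optional

Row = List[Optional[str]]


def extract_tables(pdf_path: str) -> List[List[Row]]:
    """Return every table found in the PDF, one list of rows per table."""
    # Third-party dependency (assumption): pip install pdfplumber
    import pdfplumber

    tables: List[List[Row]] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_tables() returns a list of tables;
            # each table is a list of rows, each row a list of cell strings or None.
            tables.extend(page.extract_tables())
    return tables


def table_to_records(table: List[Row]) -> List[Dict[str, str]]:
    """Treat the first row as the header and map every later row to a dict."""
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records: List[Dict[str, str]] = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        if not any(cells):
            continue  # skip rows that are completely empty
        records.append(dict(zip(header, cells)))
    return records
```

You would call `extract_tables` once per file and pass each returned table to `table_to_records`. Whether the table is detected cleanly depends on how the PDF draws it, so test it on your own samples first.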

Check the video below for your reference.

Hope this may help you

Thanks,
Srini

Hello sir,
Thank you for your response.
Is it possible to extract tables with a varying number of rows through Document Understanding?

Hi @Ram_Shiva_Reddy ,

Although we could perform the task using Document Understanding, we would first like to understand the quality of the documents at hand. Will there be both digital and scanned documents, or only digital documents/PDFs?

Let us know more about the document samples, their types, and the number of templates you will be receiving. Then we should be able to suggest the appropriate steps.

Please find the attached documents for your reference.
I need to extract the tables from both PDFs.
Test - 1.pdf (45.3 KB)
Test - 2.pdf (50.5 KB)

@Ram_Shiva_Reddy

Yes, you can use Document Understanding. As @supermanPunch said, you first have to understand your PDF documents (their quality etc.), and then you can train on the documents to extract the required fields.

Hope this may help you

Thanks,
Srini

Sir,
It’s a structured-format PDF. Every file has the same keys and the same format; only the number of table rows varies. Once I give the path, the bot has to extract the table from all the files, irrespective of the number of rows.

@Ram_Shiva_Reddy

Check the video below for your reference.

Hope this may help you

Thanks,
Srini

Sir,
This is only partially useful, because in the video the PDFs all have the same number of rows. If the number of rows varies between PDFs, Document Understanding does not extract the table correctly. If I indicate only 4 rows, the extra rows in the next PDF, which contains more rows, are not extracted. If I use the PDF with the highest number of rows as the default, whatever text is present below the table is also extracted as part of the table. So I need to extract only the table, irrespective of the number of rows.

@Ram_Shiva_Reddy

I think the post below relates to your question.

Hope this may help you

Thanks,
Srini

@Ram_Shiva_Reddy ,

Since the PDFs shared are digital documents, you could try the workflow provided in the post below:

I tested it with your PDFs, and it does seem to extract the data properly. Do let us know if it is still not working.
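
To illustrate the general idea for digital PDFs (this is only a sketch in Python, not the linked workflow itself): once the text has been read from the PDF, the table can be taken as everything between the header line and the first line that no longer looks like a table row. That way 4 rows or 8 rows both work, and text below the table is left out. The column names and the patterns `HEADER_PATTERN` and `ROW_PATTERN` are hypothetical; they would need to be adapted to the real layout.

```python
import re
from typing import Iterable, List

# Hypothetical layout: a header "Item Qty Price" followed by rows such as "Pen 2 1.50".
HEADER_PATTERN = r"^Item\s+Qty\s+Price$"
ROW_PATTERN = r"^(\S.*?)\s+(\d+)\s+(\d+\.\d{2})$"


def extract_table_rows(
    lines: Iterable[str],
    header_pattern: str = HEADER_PATTERN,
    row_pattern: str = ROW_PATTERN,
) -> List[List[str]]:
    """Collect the rows after the header until the first non-row line."""
    header_re = re.compile(header_pattern)
    row_re = re.compile(row_pattern)
    rows: List[List[str]] = []
    in_table = False
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines everywhere
        if not in_table:
            # Everything before the header is skipped.
            in_table = header_re.search(text) is not None
            continue
        match = row_re.match(text)
        if match is None:
            break  # first line that is not a row: the table has ended
        rows.append(list(match.groups()))
    return rows
```

Passing the text lines of each PDF to `extract_table_rows` returns one list of cell values per row, however many rows that file contains. A row pattern that is too loose would also match footer text, so it should be as specific as the table allows.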
