Dear Team,
Can we extract multiple rows of a table from a PDF?
The table has 5 rows in the first PDF and 8 rows in the second PDF. Is it possible to extract the tables dynamically?
You might need to integrate with Python.
There is a library that can extract tables from PDF files.
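The post does not name the library, so the following is only a sketch that assumes pdfplumber as one possible choice. `pdfplumber.open`, `pdf.pages`, and `page.extract_tables()` are existing pdfplumber calls; the helper names `clean_table` and `tables_from_pdf` are hypothetical. Nothing here fixes the number of rows in advance, so a 5-row and an 8-row table would go through the same code.

```python
def clean_table(table):
    """Strip cell text and drop rows that are entirely empty.

    `table` is a list of rows, each a list of cell strings or None,
    which is the shape pdfplumber returns for an extracted table.
    """
    cleaned = []
    for row in table:
        cells = [(cell or "").strip() for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned


def tables_from_pdf(path):
    """Return every table found in the PDF at `path`, whatever its row count."""
    # Assumption: pdfplumber is installed (pip install pdfplumber).
    import pdfplumber

    tables = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                tables.append(clean_table(table))
    return tables
```

Calling `tables_from_pdf("Test - 1.pdf")` would return a list of tables, each a list of rows. Whether the default table detection suits a particular layout has to be checked against the actual files.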
Check the video below for your reference.
Hope this helps.
Thanks,
Srini
Hello sir,
Thank you for your response.
Is it possible to extract tables with a varying number of rows through Document Understanding?
Hi @Ram_Shiva_Reddy,
Although the task could be done with Document Understanding, we would first like to understand the quality of the documents at hand. Will there be both digital and scanned documents, or only digital documents/PDFs?
Let us know more about the document samples, their types, and the number of templates you would be receiving. We should then be able to suggest the appropriate steps.
Please find the attached documents for your reference.
I need to extract the tables from both PDFs.
Test - 1.pdf (45.3 KB)
Test - 2.pdf (50.5 KB)
Yes, you can use Document Understanding. As @supermanPunch said, you first need to understand the quality of your PDF documents, and then you can train on the documents to extract the required fields.
Hope this may help you
Thanks,
Srini
Sir,
It’s a structured-format PDF. Every file has the same keys and the same format; only the number of table rows varies. Once I provide the path, the bot has to extract the tables from all files, irrespective of the number of rows.
Sir,
This is only partially useful, because the PDFs in the video all have the same number of rows. When the number of rows varies, Document Understanding does not extract the table correctly. If I indicate only 4 rows, the additional rows in the next PDF are not extracted. If I use the PDF with the most rows as the default, the text below the table is also extracted as part of the table. So I need to extract only the table, irrespective of the number of rows.
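For digital PDFs, one way around a fixed row count is to work on the extracted text and keep collecting rows until the first line that no longer looks like a table row. The sketch below assumes the text has already been read into lines. The header keyword, the row layout (serial number, description, quantity, amount), and the sample lines are all hypothetical and are not taken from Test - 1.pdf or Test - 2.pdf.

```python
import re

# Hypothetical row layout: serial number, description, quantity, amount.
ROW_PATTERN = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def extract_table_rows(lines, header_keyword):
    """Return the table rows that follow the header line.

    Collection starts after the first line containing header_keyword and
    stops at the first line that does not match ROW_PATTERN, so text below
    the table is never included, however many rows the table has.
    """
    rows = []
    in_table = False
    for line in lines:
        if not in_table:
            if header_keyword in line:
                in_table = True
            continue
        match = ROW_PATTERN.match(line)
        if match is None:
            break  # first non-row line marks the end of the table
        rows.append(match.groups())
    return rows


# Hypothetical sample text, standing in for text read from a digital PDF.
sample_lines = [
    "Invoice No: 1001",
    "S.No Description Qty Amount",
    "1 Keyboard 2 40.00",
    "2 Mouse 1 15.50",
    "3 USB Cable 4 12.00",
    "Total 67.50",
    "Thank you for your business",
]

rows = extract_table_rows(sample_lines, "Description")
print(len(rows))
```

The same function handles a file with 5 rows or 8 rows, because the end of the table is detected from the content rather than from a preset count. A blank line or a wrapped description inside the table would also end the collection, so the row pattern needs adjusting to the real layout.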
Since the PDFs you shared are digital documents, you could try the workflow provided in the post below:
I tested it with your PDFs, and it seems to extract the data properly. Let us know if it still does not work.