Computer Vision - Extract table from a PDF

Hello,

I am stuck on a problem with the latest Computer Vision activity. Can anyone tell me whether it is possible to fetch a column from a table in a PDF by looping over the table? If it is possible, please share the solution.

I want to extract the table in the PDF image attached.

Thanks in advance

Hi @Vyshakh_Jain
First, extract the table from the PDF using an OCR activity. If the OCR output is correct, you are done with the extraction step. If it is not, try another OCR engine available on the market, such as freeocrapi, or use a Python script to extract the table from the PDF. Once the data has been extracted correctly, you can process it however you need.

Hi @Rishi1

Thanks for the reply. Appreciate it.

However, is it possible to extract the table data row by row using the Computer Vision activity?

Hi @Vyshakh_Jain
You can do it using the Computer Vision activity, but I would say it is not 100% accurate. All the OCR activities come under Computer Vision, and they try to recognize characters algorithmically.
If you want to extract the table that way, your PDF has to be clean: there should be clear separation between rows and columns.
You can try it on your PDF.
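
To illustrate why the row and column separation matters, here is a minimal Python sketch of how OCR output could be turned into rows and then looped column by column. It assumes the OCR step gives you each word with an x position (left edge) and a y position (vertical center); the `Word` class, the helper names, the tolerance value, and the sample data are all hypothetical and not part of any Computer Vision or OCR activity.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR word. Field names are assumptions, not a real OCR schema."""
    text: str
    x: float  # left edge of the word's bounding box
    y: float  # vertical center of the word's bounding box


def group_rows(words, y_tol=5.0):
    """Group words into rows: words within y_tol of a row's first word share that row."""
    rows = []
    for w in sorted(words, key=lambda w: w.y):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    # Order the words of each row from left to right.
    return [sorted(r, key=lambda w: w.x) for r in rows]


def to_table(words, y_tol=5.0):
    """Build a list of rows, using the header words' x positions as column anchors."""
    rows = group_rows(words, y_tol)
    anchors = [w.x for w in rows[0]]  # assumes the first row is the header
    table = []
    for row in rows:
        cells = [""] * len(anchors)
        for w in row:
            # Put each word in the column whose anchor is closest.
            idx = min(range(len(anchors)), key=lambda i: abs(anchors[i] - w.x))
            cells[idx] = (cells[idx] + " " + w.text).strip()
        table.append(cells)
    return table


def column(table, name):
    """Return every value under the header called name, top to bottom."""
    idx = table[0].index(name)
    return [row[idx] for row in table[1:]]


# Made-up sample standing in for OCR output of a small three-column table.
words = [
    Word("Item", 10, 20), Word("Qty", 120, 21), Word("Price", 200, 19),
    Word("Bolt", 11, 50), Word("4", 122, 51), Word("1.50", 201, 49),
    Word("Nut", 10, 80), Word("10", 121, 80), Word("0.25", 202, 81),
]

table = to_table(words)
for row in table[1:]:          # loop the table row by row
    print(row)
print(column(table, "Qty"))    # ['4', '10']
```

If the rows in the PDF are too close together, or the columns are not aligned under their headers, the grouping above puts words in the wrong cells, which is why a cleanly separated table gives much better results.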

How about extracting a table from an application using CV? (We have to use CV since this is a Citrix application.)