Can we read a PDF table without using OCR?

Hi,

I want to read a whole PDF table without using OCR and save the table to Excel.

Any ideas?

Thanks

Hi @Snehamayi_Sneha,
Yes, we can read it using the Read Full Text activity, but that does not give accurate results, so we go with OCR.
There are also external packages like ABBYY for performing tasks on PDFs.
Cheers
Vashisht.

Hi @Snehamayi_Sneha
You can use the Read PDF Text activity, but for that the data in the PDF has to be actual text (string type). Even then, I don't think the PDF table will be extracted as it is. You can give it a try, but I am sure it won't work perfectly, so you will have to use OCR anyhow.

Just try ABBYY FlexiCapture; it is helpful for saving the table to Excel by exporting the document.

Yeah, thanks… But we don't want to use any OCR to read it. We are looking for any other options.

If it is not a scanned file, you can extract the whole PDF with the Read PDF activity; you then have to split the table out of the text using some regular expressions.
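
A rough sketch of the regular-expression idea above, written in Python rather than as a UiPath workflow. It assumes the extracted text keeps each table row on its own line with columns separated by two or more spaces (or a tab); real PDF output may not be that tidy. `SAMPLE_TEXT`, `split_table` and `rows_to_csv` are made-up names for illustration, and the result is CSV text, which Excel can open.

```python
import csv
import io
import re

# Hypothetical text, as a PDF text-extraction step might return it for a
# digitally generated (not scanned) table; real output will differ.
SAMPLE_TEXT = """Item      Qty   Price
Widget A  2     10.50
Widget B  10    3.25
"""


def split_table(text):
    """Split each non-empty line into cells on a tab or a run of 2+ whitespace characters."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            # A single space stays inside a cell ("Widget A"); wider gaps separate columns.
            rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = split_table(SAMPLE_TEXT)
csv_text = rows_to_csv(rows)
print(csv_text)
```

If the cells themselves contain wide gaps, or the columns are separated by only one space, this split will break and the pattern has to be adapted to the actual document.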

Hey…
Using Read PDF Text, I am not getting any output.

Is it a scanned file?

It won't extract as tables; to get the tables you have to segregate the text yourself after using the Read Full PDF activity. A better way to do it is the OCR activity.

There is only Read PDF Text, right? I didn't find any Read Full Text activity. I tried using Read PDF Text column-wise, but the message box is still empty.

Just install the PDF packages from Manage Packages; you can find the Read Full PDF activity there.

Hello @Snehamayi_Sneha

@priyankavivek is right.
For reference:

Thanks

Can you share the PDF with me?

Is the variable named text in scope?

Yes, it is the output of Read PDF.

For me it's working.
Just update the UiPath.PDF.Activities package, then try again.

Please share the PDF file with me. I will try it.

Thanks


Can you take any PDF and give me a sample?

Hello @Snehamayi_Sneha

I attached a sample that reads a PDF and prints the text in the Output panel: pdf.zip (2.5 MB)

Thanks

Hey Sandeep…

Thanks, let me try.

No, I still didn't get any output, because the PDF contains an image of the table, so the activity is not able to read it.
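
For anyone who lands here with the same problem, a quick check along these lines can tell you early whether a PDF is image-only. This is only a sketch in Python, not a UiPath activity: it assumes you already have the text returned for each page by whatever extraction step you use. `looks_image_only` and the `min_chars` threshold are hypothetical and would need tuning for real documents.

```python
def looks_image_only(page_texts, min_chars=20):
    """Return True when the extracted text holds fewer than min_chars letters/digits.

    page_texts: list of strings, one per page, from a text-extraction step.
    min_chars:  arbitrary threshold (an assumption, not a standard value).
    """
    total = sum(1 for text in page_texts for ch in text if ch.isalnum())
    return total < min_chars


# Hypothetical extraction results for illustration.
scanned_pages = ["", " \n"]  # image-only PDF: nothing but whitespace comes back
digital_pages = ["Item  Qty  Price\nWidget A  2  10.50\n"]  # PDF with a real text layer

print(looks_image_only(scanned_pages))
print(looks_image_only(digital_pages))
```

When the check comes back true, there is no text layer to read, so plain text extraction cannot return the table.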