How to extract data from a digitized PDF

Hi Team,

I want to extract data from a digitized PDF. Which activity should I use to extract the data correctly?
I have already tried Tesseract OCR, OmniPage OCR, and the Read PDF Text activity, but none of them extracts the data properly.
Please let me know which other activity I can use to extract the data properly.

Hi @Smitesh_Aher2
The approach you are using should technically have worked.
Could you also try just Read PDF Text, without any OCR engine?
See if that helps.
After the extraction, you can probably use regex to pull out the values you need.
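
To illustrate the regex step, here is a minimal sketch in Python. It assumes the extracted text is already available as a plain string; the sample text, the field labels, and the `extract_fields` helper are all hypothetical and only stand in for whatever your PDF actually contains.

```python
import re

# Hypothetical text, standing in for the string returned by the PDF extraction step.
SAMPLE_TEXT = """Invoice Number: INV-2041
Invoice Date: 12/03/2021
Total Amount: 1,250.75
"""

# One pattern per field. The labels are assumptions about the document layout;
# adjust them to match the labels in your own PDF.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice Number:\s*(\S+)",
    "invoice_date": r"Invoice Date:\s*(\d{2}/\d{2}/\d{4})",
    "total_amount": r"Total Amount:\s*([\d,]+\.\d{2})",
}


def extract_fields(text, patterns=FIELD_PATTERNS):
    """Return a dict of field name -> captured value (None when not found)."""
    result = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        result[name] = match.group(1) if match else None
    return result


fields = extract_fields(SAMPLE_TEXT)
print(fields)
```

The same patterns should carry over to the regex activities in the workflow (for example Matches), since the idea is the same: one capture group per value you want to pull out.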

If this still does not help, please attach an informative screenshot so that we can help you further.

Hope this helps.
Please mark it as a solution if it resolves your issue.
Happy Learning :)

Hi @Smitesh_Aher2

Please use Document Understanding:

Please follow the link,

If you found this helpful, please mark it as a solution.
Happy Automation

Hey @Smitesh_Aher2, try the Document Understanding method. Can you also share a sample document so that we can help you further?

Hi @Smitesh_Aher2,

If Read PDF Text on a native (digital) document doesn’t give you proper output, you can also try UI-based automation on the PDF.

Refer to the following YouTube Video - https://www.youtube.com/watch?v=AetgInrwM1s

Cheers!