Dynamic OCR data Extraction from PDF

strqsr · October 15, 2019, 9:53am

I’m testing out a use case where we extract data from images. The data is unstructured, so I’ve been manually scraping the position of the data in each image, but I was wondering if anyone could help me think of a way to loop over the images effectively?

I thought of using anchor base and find OCR text, but the only thing the images have in common is the @ in the email address. If anyone has a suggestion, please do let me know! Thanks.

apurba2samanta · October 15, 2019, 10:43am

Hi @strqsr,

Can you share a screenshot of the data?

Thanks & Regards,
Apurba

strqsr · October 16, 2019, 9:09am

It’s just a bunch of random images that each contain an email address. I’m trying to see if it’s possible to extract the details they have in common, e.g. email address and phone number.
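
One possible way to avoid per-image positions is to OCR the whole image to plain text and then pattern-match on that text, since an email address or phone number has a recognizable shape wherever it sits on the page. Below is a minimal Python sketch of that idea, not a tested UiPath workflow. The function name `extract_contacts`, both patterns, and the sample strings are assumptions made up for illustration; the patterns are deliberately loose and would need tuning against real OCR output.

```python
import re

# Loose, illustrative patterns (assumptions, not a validated standard).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_contacts(ocr_text: str) -> dict:
    """Return every email address and phone-like number found in OCR text."""
    return {
        "emails": EMAIL_RE.findall(ocr_text),
        "phones": [match.strip() for match in PHONE_RE.findall(ocr_text)],
    }


# Hypothetical OCR output from two images with different layouts.
samples = [
    "John Doe\nSales Manager\njohn.doe@example.com\nTel: +65 6123 4567",
    "Contact us at 555-123-4567 or support@example.org today",
]

# Loop over the OCR text of each image; no positions are needed.
results = [extract_contacts(text) for text in samples]
for result in results:
    print(result)
```

Because the match runs on the full text, the same loop should work regardless of where the details appear in each image. The trade-off is that OCR mistakes (for example, a misread @) will cause misses, so it is worth checking the results on a sample of real images first.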

