How to get specific data from a PDF using OCR

studio

#1

How can I get specific data from a PDF that contains an image?

1> Here I need to get only the date and the total amount from this bill.

How can I do this? Can anyone please help me with this?


#2

Hi,

Refer to the above links.


#3

Yes, I can read all the text, but how can I read only a particular piece of text, for example by matching a pattern?


#4

Use the Get OCR Text activity and do screen scraping.


#5

Hi, yes, I got it.
Let's say there are several bills. If we use screen scraping, we can only get the data at a particular position. Then how can we do this?


#6

What if we get the data dynamically?


#7

Hi @akshaygm12,

OK, scrape the data from the particular position in the PDF.


#8

Hi @akshaygm12

If it is real text, you should be able to use activities such as Get Full Text or Get Visible Text and then filter the result using regex matches or the .split method.
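
To illustrate the regex and split idea, here is a minimal sketch in Python (not a UiPath workflow). The sample text, the labels "Date" and "Total Amount", and the helper names extract_fields and extract_total_by_split are all assumptions made up for this example; the real OCR output from your bill will look different, so the patterns would need adjusting.

```python
import re

# Hypothetical OCR output for one bill; real text will differ per document.
ocr_text = """ACME Stores
Invoice No: 10234
Date: 12/03/2019
Item A    2    150.00
Item B    1     75.50
Total Amount: 225.50
"""

# Assumed label and value formats; adjust these to match the actual bills.
DATE_RE = re.compile(
    r"Date\s*[:\-]?\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})", re.IGNORECASE
)
TOTAL_RE = re.compile(
    r"Total(?:\s+Amount)?\s*[:\-]?\s*([\d,]+\.\d{2})", re.IGNORECASE
)


def extract_fields(text):
    """Return (date, total) as strings; a field that is not found is None."""
    date_match = DATE_RE.search(text)
    total_match = TOTAL_RE.search(text)
    date = date_match.group(1) if date_match else None
    total = total_match.group(1) if total_match else None
    return date, total


def extract_total_by_split(text):
    """Split-based variant: take whatever follows the colon on the total line."""
    for line in text.splitlines():
        if line.lower().startswith("total amount") and ":" in line:
            return line.split(":", 1)[1].strip()
    return None


print(extract_fields(ocr_text))
print(extract_total_by_split(ocr_text))
```

Because the match is anchored on the label rather than on a screen position, this kind of filter should still work when the values move around between bills, as long as the labels stay the same. The same patterns could be reused in whatever regex step your workflow provides.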


#9

OK, I'll try this.


#10

Hey @akshaygm12

If the positions are going to remain constant (even though the data changes each time), you'll have no issues with screen scraping.
The best way to go about it is to use screen scraping; it'll get you the best result and you won't have to use any extra functions either.

Regards,
Rishabh.