Hi guys, I need help reading a section of a PDF

Peter_Jenkinson · December 4, 2020, 9:57am

I am using Google OCR to read my PDFs, but it reads the whole page of each one.

What I want to do is read only a certain section, a rectangle or something.
The PDFs all have the same layout; they just have different info.

What are the options?
Can the screen scraper read multiple PDFs?

Or can I just set a region or rectangle or something before I run the Google OCR?
Or maybe there is an easier way?

Thanks in advance, guys and gals

Peter_Jenkinson · December 4, 2020, 10:15am

Here is the section I want to read on all the PDFs

Peter_Jenkinson · December 6, 2020, 8:19am

Any help out there?

Any links I can follow, please?

jeevith · December 6, 2020, 8:36am

Hey @Peter_Jenkinson,

OCR is great for continuous, homogeneous rich text, but when you have tables it is dicey and does not perform so well. This is mostly caused by the text resolution, contrast, and fonts embedded in the PDF.

Here is an alternative approach you could try:

Step 1. Open the PDF in Word manually. Yes, Word supports the PDF format, but on occasion it cannot render the formatting correctly. If you are not interested in the formatting anyway, this should be a good starting point.
You can use the Word activities to open the PDF in UiPath, read its content, and save it to a variable.

This method depends on the type of PDF. If the PDF was saved with an image as its content, this method will not work, because the text is then part of an image rather than rich text. Word will only show an image, and the content read from Word (the variable) will be empty.

Step 2. Then use either standard string manipulation or regular expressions (Regex) to extract the text you are interested in.
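
To give a rough idea of what Step 2 could look like, here is a minimal sketch in Python rather than a UiPath workflow, purely for illustration. It assumes the text read from the document contains labelled lines such as `Invoice No: ...`; the sample text, the labels, and the helper name `extract_field` are all made up for this example and would need adapting to your actual PDF layout.

```python
import re

# Hypothetical text, standing in for the content read from the document.
sample_text = """Customer: ACME Ltd
Invoice No: INV-00123
Total: 450.00
"""

def extract_field(text, label):
    """Return the value that follows 'label:' on the same line, or None if absent."""
    # re.escape guards against labels that contain regex metacharacters.
    pattern = re.compile(rf"^{re.escape(label)}\s*:\s*(.+)$", re.MULTILINE)
    match = pattern.search(text)
    return match.group(1).strip() if match else None

print(extract_field(sample_text, "Invoice No"))
```

Because the PDFs share the same layout, one pattern per field is usually enough. The same idea should carry over to a Regex-based step inside UiPath, although the exact activity setup is not shown here.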

Hope this helps a bit!

Peter_Jenkinson · December 7, 2020, 9:57am

Thanks for the help, but this does not work for me.

Peter_Jenkinson · December 7, 2020, 9:57am

Can you use a cropped image rectangle on a pdf somehow?
