Scanned PDF files

Dear Team,
How can I read a scanned PDF? I have a scanned PDF from which I have to extract some data, such as Company Name, PO Number, Invoice Number and the table in the PDF. Google OCR is not capturing the text well.
Microsoft OCR is showing “Capture Error”.
Is this possible with UiPath tools?

Thank you
Ram

1 Like

Buddy @Ramalingaiah

You can extract the data from a scanned PDF, but I wonder why OCR didn't work… it should work for sure, buddy.
No worries buddy.
For your case you can try the Scrape Relative activity: use the company name, PO number and other field labels as anchors for the relative clipping region, and get the text next to them with OCR scraping. You can get this activity by pressing Ctrl+Alt+D, where the desktop recording wizard appears; you will find Scrape Relative under the Text option.

That would work buddy.
Cheers

1 Like

Hi @Ramalingaiah,

Use the Computer Vision activities to extract values from scanned PDFs, as plain OCR will not always give exact output.

Refer to the post below on how to install the Computer Vision activities.

1 Like

Dear Palaniyappan/anil5,

Thank you, Palaniyappan and anil5, for replying.

Regards
Ram

2 Likes

Hi Palaniyappan/Anil5,

I need to read the content of a PNG file. Is that possible in UiPath?
I have attached a screenshot.

Please help me out.

Thank you
Ram

2 Likes

@Ramalingaiah
Yes buddy, you can read that with the Read PDF With OCR activity, and the output would be text (a String).

1 Like

Hi Palaniyappan,
Thank you for replying.
I need to read the content of a PNG file. Is that possible in UiPath?
I have attached a screenshot.
I want to read the telephone number, name and invoice number, as in the examples below:
Tel:614-489-8316,
Name: Josepb j Gatto,
Invoice no: OH43017,
Please give me a suggestion or any link.
Thank you

2 Likes

Buddy @Ramalingaiah

If it's an image then you can do the following:

  1. Use Start Process to open the image, passing the file path as input.

  2. Once it is opened, you can use screen scraping with OCR; you have that option in the Design menu under Screen Scraping.

  3. Scrape the portion that you want and increase the Scale of Google OCR to 5 or 6 or more, until the text is obtained correctly. You can do that by just increasing the scale in the scrape wizard and clicking Refresh.


    After clicking this, select the region you want to scrape,
    and you will get a result like this

  4. Take the output from scraping in an output variable named out_text.

  5. Use the Split method in an Assign activity to convert the text into an array of lines, so that you can get each term you want.

  6. out_text_split_array = out_text.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries) where out_text_split_array is of type String[],

  7. Then, to get the tel, name and invoice values, use Assign activities like these (assuming the lines come out in this order: tel, name, invoice):
    out_tel_value = Split(out_text_split_array(0), ":")(1).Trim
    out_Name_value = Split(out_text_split_array(1), ":")(1).Trim
    out_invoice_value = Split(out_text_split_array(2), ":")(1).Trim
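
To see the split logic from steps 5 to 7 outside UiPath, here is a minimal Python sketch. It is only an illustration, not a UiPath activity: the function name parse_labeled_lines and the sample text are made up for this example, and it assumes the OCR returns one "Label: value" pair per line, as in the sample posted above. Looking values up by label instead of by line position means it still works if the OCR returns the lines in a different order.

```python
def parse_labeled_lines(ocr_text):
    """Turn 'Label: value' lines of OCR text into a dict keyed by lowercase label."""
    fields = {}
    for line in ocr_text.splitlines():
        # Drop surrounding whitespace and the trailing comma seen in the sample.
        line = line.strip().rstrip(",")
        if ":" not in line:
            continue  # skip empty lines and lines without a label
        # Split on the first colon only, so values containing ':' stay intact.
        label, value = line.split(":", 1)
        fields[label.strip().lower()] = value.strip()
    return fields


# Hypothetical OCR output, modelled on the example values in this thread.
sample = "Tel:614-489-8316,\nName: Josepb j Gatto,\n\nInvoice no: OH43017,"
fields = parse_labeled_lines(sample)
print(fields["tel"])
print(fields["name"])
print(fields["invoice no"])
```

In UiPath the same idea would be a For Each over out_text_split_array with a check on the label before the colon.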

That's all buddy, you are all done…
Kindly let me know whether this works or not @Ramalingaiah

Cheers

2 Likes

Did that work, buddy @Ramalingaiah?

1 Like