Can't get word positions using Microsoft OCR and Read PDF with OCR


I have a scanned document in PDF format, and I read it with the “Read PDF with OCR” activity using Microsoft OCR. The Microsoft OCR activity can output the extracted words as KeyValuePair items. I store that output in a variable and then generate the data table, passing the variable to “Input” → “Positions”. When I use “Output Data Table” to see the result, it contains only one column with the extracted PDF text. The position of each word is missing. Is it possible to fix this?
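To make the goal concrete, here is a minimal sketch in Python (not UiPath code) of the kind of table being asked for: one row per word, with its bounding box. The `Rect` type, the `positions_to_rows` helper, and the sample words are all hypothetical and only illustrate the shape of the data, assuming the OCR output is a sequence of (rectangle, word) pairs.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Hypothetical bounding box, in pixels, for one recognized word."""
    x: int
    y: int
    width: int
    height: int


def positions_to_rows(pairs):
    """Flatten (rectangle, word) pairs into one dictionary per word."""
    return [
        {"word": word, "x": r.x, "y": r.y, "width": r.width, "height": r.height}
        for r, word in pairs
    ]


# Made-up sample data standing in for the OCR output.
sample = [
    (Rect(10, 20, 50, 12), "Invoice"),
    (Rect(70, 20, 30, 12), "No."),
]

rows = positions_to_rows(sample)
for row in rows:
    print(row)
```

In the actual workflow the equivalent result would be a data table with a word column plus position columns, rather than the single text column described above.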

Can you share your XAML and the PDF?

Yeah, sure. I’ll do it in 5-6 hours; I just don’t have access to my computer right now.

@arathi Here it is.
Main.xaml (11.3 KB)
109970.pdf (502.7 KB)

Hi, this is what I am getting after reading the PDF. I have tried using the languages “rus” and “russian”. Let me know what the result is for you :)

Good day :)

Hey :-). Thank you for trying. My results are a bit better if I use “Russian” with a scale in the range 0.7-1, but I still can’t get the word positions.