How to get a specific word from a PDF that contains a scanned invoice image


#1

I want to know how to extract a specific value (e.g. Invoice No: 0012) from a PDF file.
The PDF contains only a scanned invoice image.
Sometimes the invoice image is scanned at a different angle.
How can I solve this?


#2

Hi @Ahsan_Mohamed,

If it’s a scanned PDF, you should use the Read PDF With OCR activity. Once you have its output, you need to find a pattern in the invoice text and then extract the value with string methods such as Substring, IndexOf and Split, or with Regex (the Matches activity in UiPath). That’s the only solution I can think of right now.
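
To illustrate the idea outside UiPath, here is a minimal Python sketch of both approaches (index/split and regex) applied to text that has already come back from OCR. The sample text, the label format and the function names are assumptions made up for this example; the real OCR output of your invoices will probably look different.

```python
import re

# Hypothetical OCR output; the real text returned by the OCR step will differ.
ocr_text = """ACME Trading Ltd
Invoice No: 0012
Date: 01/02/2019
Total: 150.00"""

# Regex approach: the label, an optional "." and separator, then the digits.
INVOICE_NO = re.compile(r"Invoice\s*No\.?\s*[:\-]?\s*(\d+)", re.IGNORECASE)


def extract_with_regex(text):
    """Return the digits that follow the 'Invoice No' label, or None."""
    match = INVOICE_NO.search(text)
    return match.group(1) if match else None


def extract_with_index(text, label="Invoice No:"):
    """Same idea with plain string methods (IndexOf / Substring / Split style)."""
    start = text.find(label)
    if start == -1:
        return None
    rest = text[start + len(label):]
    parts = rest.split()
    return parts[0] if parts else None


print(extract_with_regex(ocr_text))
print(extract_with_index(ocr_text))
```

The regex version tolerates small differences in spacing and punctuation around the label, while the index/split version only works when the label appears exactly as written.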


#3

Yes, that’s fine. But the invoice is positioned at a different angle in the scan. So how do I get the specific word?


#4

That’s why I said that once you have the whole text from the PDF, you can try to find a pattern that works most of the time, or even every time; it will depend on the different ways your PDF can be loaded. In summary, you will have to test it a lot to make sure you have handled all the possibilities.
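
As a rough sketch of what "test it a lot" could look like, the Python snippet below runs one tolerant pattern against several variants of the same line. The variants are invented for this example (they are not real OCR output), and the pattern is only an assumption about which misreadings might occur, such as a lowercase "l" in place of "I".

```python
import re

# Tolerant pattern: accepts I/l/1 as the first letter and o/0 in "No",
# since OCR engines can confuse these characters (assumed, not measured).
PATTERN = re.compile(r"[Il1]nvoice\s*N[o0]\.?\s*[:\-]?\s*(\d+)", re.IGNORECASE)


def find_invoice_no(text):
    """Return the invoice number found in text, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# Hypothetical variants of the same line, as different scans might produce it.
samples = [
    "Invoice No: 0012",
    "INVOICE NO 0012",
    "Invoice  No.:0012",
    "lnvoice No: 0012",
    "Invoice Date: 01/02/2019",  # should not match
]

for sample in samples:
    print(repr(sample), "->", find_invoice_no(sample))
```

Each time a new scan fails, its text can be added to the list of samples and the pattern adjusted until every case is covered.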