Extract number from PDF

Hi all,
I have to extract a specific number from PDFs that all have the same layout.
However, the PDFs have no specific UI elements. What is the best way to extract the number?


Hey @tobschroer

You can use Get OCR Text or Screen Scraping :)


Thanks @Rishabh_Lakhera.
I have tried it with the Get Text activity, but my message box is always empty.

Try Screen Scraping and play around with the OCR, FullText and Native options!
It always works for me :)


@Rishabh_Lakhera, it's working now, but only once. When the PDF isn't open in the background, a workflow exception is thrown, and if I close the PDF, open it a second time and then start the workflow, I get completely different output in my message box. Do you have any idea why?

If the PDF layout is fixed, you can use the Read PDF Text activity to get all of the PDF content into a string variable. You can then use string manipulation to extract the required number.
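
To illustrate the string manipulation step, here is a minimal sketch in Python rather than in a UiPath workflow. It assumes the number always follows a fixed label; the label "Invoice No." and the sample text are made up for illustration, so swap in whatever precedes the number in your PDFs. The same regular expression idea should carry over to a regex or string operation on the extracted text variable.

```python
import re
from typing import Optional

# Hypothetical label: replace with the text that precedes the number in your PDFs.
LABEL = "Invoice No."


def extract_number(pdf_text: str, label: str = LABEL) -> Optional[str]:
    """Return the first run of digits that follows `label`, or None if absent."""
    # re.escape keeps characters such as "." in the label literal.
    pattern = re.escape(label) + r"\s*:?\s*(\d+)"
    match = re.search(pattern, pdf_text)
    return match.group(1) if match else None


# Made-up sample standing in for the string returned by the PDF text extraction.
sample = "ACME Ltd\nInvoice No.: 4711\nDate: 01.02.2018"
print(extract_number(sample))  # prints 4711
```

Because this reads the PDF's text directly instead of scraping the screen, it should not depend on the PDF being open in a viewer, provided the PDF contains real text and not just scanned images.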

Hi @tobschroer,
Perhaps try changing the scale in the Screen Scraping wizard. For example, when I need to OCR-scrape more complex text (including capital letters and dots in dates), I increase the scale to 5. Does that help?
