Hi there @Danyel_Ural,
Just to clarify, the PDF documents are not digitised, i.e. they are scanned images?
In that case, you’d have to read them with OCR, as you’ve noted, then parse the extracted text, either via Regex or text manipulation.
Given that there are 28 pages, is there a range of pages the price can appear in, for instance:
The first 3
The last 3
Pages 10-12
This will allow you to refine the extraction to only the relevant text, ultimately speeding up execution, as OCR can take considerable time to complete.
If you can provide some example (dummy) documents, I may be able to provide a rough example.
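In the meantime, here is a minimal sketch of the idea in Python, only to illustrate the logic (it is not a UiPath workflow). It assumes the OCR step has already produced one string per page, and that the price follows a label such as "Spot"; the function name `find_price`, the label and the sample text are all hypothetical.

```python
import re
from typing import Iterable, List, Optional


def find_price(pages: List[str], page_numbers: Iterable[int],
               label: str = "Spot") -> Optional[str]:
    """Search only the given 1-based pages for the first number after `label`."""
    # Label, then any non-digit characters, then the number itself.
    pattern = re.compile(rf"{re.escape(label)}\D*?(\d[\d.,]*)", re.IGNORECASE)
    for number in page_numbers:
        if not 1 <= number <= len(pages):
            continue  # skip page numbers that do not exist in this document
        match = pattern.search(pages[number - 1])
        if match:
            # Drop trailing punctuation that OCR may attach to the number.
            return match.group(1).rstrip(".,")
    return None


# Hypothetical 28-page document where page 11 holds the value.
pages = ["cover"] * 28
pages[10] = "Date 2-Nov-18\nSpot prices 1961\n"
print(find_price(pages, range(10, 13)))  # only pages 10-12 are searched
```

Restricting `page_numbers` is the part that saves time in practice, since only those pages would need to go through OCR at all.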
For example, here I want the 1961 (spot prices, in the lower left corner).
I used the “Click” activity to show that it gets the whole page as one big element window. That’s my problem. I tried the “Double Click” activity followed by Send Hotkey with Ctrl+C and Ctrl+V to insert the value into an Excel sheet, but it copied the 2 from 2-Nov-18.
This is a nice idea.
I tried it out and now I have a text file and I see that the number I am looking for is always in row 21.
Now I could paste it into an Excel sheet and delete everything except row 21.
Is there an easier way to extract row 21 from a .txt file?
Sadly, I don’t know all the possible ways, but I could imagine there is a way to put the string into a DataTable and then write only a certain row from the DataTable into an Excel sheet?
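A DataTable is probably not needed for this: splitting the text on line breaks and taking the 21st entry should be enough. Below is a rough sketch of that logic in Python, purely as an illustration; the helper names and the file name `output.txt` are assumptions. In UiPath the equivalent would presumably be a Read Text File activity followed by splitting the string on line breaks and taking index 20 (indexes start at 0).

```python
from pathlib import Path


def get_line(text: str, line_number: int) -> str:
    """Return the 1-based line `line_number` of `text`, without surrounding spaces."""
    lines = text.splitlines()
    if not 1 <= line_number <= len(lines):
        raise IndexError(f"text has {len(lines)} lines, no line {line_number}")
    return lines[line_number - 1].strip()


def get_line_from_file(path: str, line_number: int, encoding: str = "utf-8") -> str:
    """Read the whole file and return one line from it."""
    return get_line(Path(path).read_text(encoding=encoding), line_number)


# Hypothetical usage: the OCR output was saved as output.txt
# price = get_line_from_file("output.txt", 21)
```

This only works as long as the value really is on row 21 in every document; if the OCR output shifts by a line, searching for a label with Regex would be more robust.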