Native scraping method for PDF

Hi there!
I’m trying to get information from a PDF file.

  1. “Read PDF” works, but the problem is that it gives me unstructured data (a mix of numbers and letters)
  2. “Get text” captures by block, not by specific field (which is not helpful for me)
  3. Screen scraper:
    “Full text” works like “Read PDF”: no structure, mixes numbers and letters
    “Microsoft OCR” does not recognize the text
    “Google OCR” recognizes it badly
    “Native” recognizes it very well; however, when I try to output the data through “Message Box” or “Write Line”, it gives me an empty field.

The question is: how can I output the data captured by the native scraping method from a PDF file?

P.S. In some forum topics I saw regex and splitting on spaces mentioned. What are those methods?

@Indiroy - you can use the image scraping method to scrape data from the PDF

Hi @Indiroy
You have raised quite a few issues here. Just to answer the part of your question from the postscript: a regex (regular expression) is a sequence of characters that defines a search pattern. Usually this pattern is used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation (quoting Wikipedia). In practice, you can find out more and test this on websites like the ones below (and you can certainly find many more with a quick search):



One application of regex can be found here, for example:
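
Independently of those links, here is a minimal sketch of both ideas from the P.S., written in Python only because it is compact. The sample text, the field labels and the `find_field` helper are all made up for illustration; your real “Read PDF” output will look different, so the patterns would need adjusting. In a workflow the same ideas would go through .NET's `String.Split` and `System.Text.RegularExpressions.Regex` instead.

```python
import re

# Hypothetical text, standing in for the unstructured string
# that a "Read PDF"-style activity might return.
raw_text = "Invoice No 12345 Date 14.03.2019 Total 250.00 EUR"

# Method 1: split on spaces.
# split() with no arguments cuts the string on any run of whitespace,
# so each word or number becomes one list item that you pick by position.
tokens = raw_text.split()
invoice_by_split = tokens[2]  # third token, assuming the layout never changes


# Method 2: regex.
# Describe the label and the shape of the value; the parentheses mark
# the part of the match that should be returned.
def find_field(pattern, text):
    """Return the first captured group, or None if the pattern is absent."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


invoice_no = find_field(r"Invoice No\s+(\d+)", raw_text)
date = find_field(r"Date\s+(\d{2}\.\d{2}\.\d{4})", raw_text)
total = find_field(r"Total\s+(\d+\.\d{2})", raw_text)

print(invoice_by_split, invoice_no, date, total)
```

Splitting is the simpler of the two, but it breaks as soon as a value contains a space or the order of the fields changes. A regex is tied to the label next to the value, so it tends to hold up better when the position of the field in the text shifts.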