Get info from PDF

Hello everybody,

I need to build a process where my robot reads a PDF file and extracts information from it. The problem is that the PDF can sometimes have more lines than expected. Some values are static, which makes them easy to get, but how do I handle the case where one PDF comes with 1 item and another comes with 7 items? I'm not very familiar with regex, so I would appreciate it if someone could provide an example or a guide showing how to achieve this. Thanks in advance.

Hi @Raul_Cruz

After reading the PDF, store the result in a string variable.

Then try splitting the string on new lines.

But how do I know if there's a new line while splitting?
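
On the new line question: you usually don't have to detect it yourself, because the split simply produces one entry per line, however many lines there are. Here is a rough sketch of the idea in Python rather than as a UiPath workflow. The function name `split_lines` and the sample text are made up for illustration, and I am assuming the PDF text has already been read into a single string.

```python
def split_lines(pdf_text: str) -> list[str]:
    """Split extracted PDF text into non-empty, trimmed lines."""
    # str.splitlines() breaks on \n, \r\n and \r, so it does not matter
    # which line ending the PDF reader produced.
    lines = []
    for raw_line in pdf_text.splitlines():
        line = raw_line.strip()
        if line:  # skip blank lines
            lines.append(line)
    return lines


# Hypothetical extracted text with mixed line endings and a blank line.
sample_text = "Invoice 123\r\nItem A 2 10.00\n\nItem B 1 5.50\n"
for line in split_lines(sample_text):
    print(line)
```

The result is a list whose length follows the document, so a PDF with 1 item and a PDF with 7 items go through the same loop.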


Have a read of my Regex Megapost.

First you need to get your text into UiPath.
Then decide whether to use String manipulation or Regex.

Take a look at this string manipulation mega post by @Adrian_Star.

Hopefully this helps.

I would work on providing a sample of your text from the PDF, then define the expected output, and tell us about the pattern.
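
To show what I mean by sample, expected output and pattern, here is a small sketch in Python. Everything in it is an assumption for illustration: the sample text, the line layout (description, then quantity, then unit price with two decimals), and the names `ITEM_PATTERN` and `extract_items`. Your real PDF will need its own pattern, and the named-group syntax would need adapting for the regex engine you end up using.

```python
import re

# Assumed layout of one line item: "<description> <qty> <price>", e.g. "Widget A 2 10.00".
ITEM_PATTERN = re.compile(
    r"^(?P<description>.+?)[ \t]+(?P<qty>\d+)[ \t]+(?P<price>\d+\.\d{2})[ \t]*\r?$",
    re.MULTILINE,  # let ^ and $ match at the start and end of every line
)


def extract_items(pdf_text: str) -> list[dict]:
    """Return one dict per line that matches the assumed line-item layout."""
    items = []
    for match in ITEM_PATTERN.finditer(pdf_text):
        items.append(
            {
                "description": match.group("description"),
                "qty": int(match.group("qty")),
                "price": float(match.group("price")),
            }
        )
    return items


# Hypothetical sample: static header and footer lines around a variable number of items.
sample_text = """Invoice No: 1001
Widget A 2 10.00
Widget B 1 5.50
Total: 25.50
"""

for item in extract_items(sample_text):
    print(item)
```

Because the pattern is applied to every line, the number of results follows the number of matching lines, so the same code handles 1 item or 7. Lines that do not fit the layout, such as the header and total lines here, are skipped.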




Another thing to look into, since I assume you are talking about line items for something like an invoice, is Document Understanding, where you can grab a lot of the information you need by using the machine learning extractor in combination with regex extractors or form extractors. More details on how to build a Document Understanding workflow can be found on the UiPath Academy, and if you have further questions, this forum is a great place to ask!


May I know the use case, @Raul_Cruz?