Retrieving values from PDF

pdf

#1

Hi team,
I need to get the following values from the attached PDFs.
There are 2 PDFs in different formats, so I’m not able to use Start index and End index.
I have searched the forum as well but did not find any related solution.

The values to be fetched are “invoice no”,“vendor taxid”,“date”,“currency”,“item description”,“untaxed amount”,“tax amount”,“total amount”.

Experts, looking forward to your help. :slight_smile:

Thanks in advance.

Invoice-325073.pdf (8.4 KB)
Invoice-592124.pdf (28.5 KB)


#2

Hi @akhila.a,

In your particular scenario, I would suggest using Regex to gather the information you need from the different PDF layouts. Basically, you will have to do the following steps:

  1. Read PDF Text activity to convert your PDF file to a string variable
  2. Matches activity to get the information needed from that string variable

For example, to get the invoice ID you could use the following Regex pattern (the number itself ends up in capture group 1, not in the full match):
(?<=Invoice).*?(\d+)

https://regex101.com/r/lAdgn4/1/
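To try the pattern outside UiPath, here is a minimal Python sketch of the same idea. The two sample strings are hypothetical stand-ins for what Read PDF Text might return (I have not seen the real text of the attached invoices), and `extract_invoice_no` is just an illustrative helper name.

```python
import re

# Pattern from the post above: after the literal "Invoice", lazily skip
# characters on the same line, then capture the first run of digits.
INVOICE_PATTERN = re.compile(r"(?<=Invoice).*?(\d+)")


def extract_invoice_no(pdf_text):
    """Return the captured digits as a string, or None if nothing matches."""
    match = INVOICE_PATTERN.search(pdf_text)
    return match.group(1) if match else None


# Hypothetical stand-ins for the string that Read PDF Text would return
# for two different layouts; the real PDFs may look nothing like this.
layout_a = "Vendor Tax ID: ABC123\nInvoice No: 325073\nDate: 01/02/2019"
layout_b = "Invoice #592124\nCurrency: USD\nTotal: 100.00"

for text in (layout_a, layout_b):
    print(extract_invoice_no(text))
```

In UiPath, the Matches activity returns a collection of .NET `Match` objects, so the equivalent of `match.group(1)` should be `Groups(1).Value` on the first match. One thing to watch: the pattern takes the first digits that follow the word "Invoice" on the same line, so if a layout has something like "Invoice Date: 01/02/2019" before the invoice number, it would return the wrong value and you would need a more specific label in the lookbehind.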


#3

Hi,
Thank you so much for the reply.
Yes, I’m using the Read PDF activity.
Could someone attach a workflow for this? I tried but could not get it to work.

Thanks in advance.