I have a PDF document that is an invoice, and I want to extract information such as Invoice No, Issue Date, Total Amt, Tax Amt, etc. Can anyone please tell me how I can automate this using UiPath?
- Use the Read PDF Text activity to read the data from the PDF file; it gives you the output as a String.
- Then use string manipulation methods or Regex to get the required details from it.
Hi
We can use either the Read PDF Text or the Read PDF with OCR activity: pass the file path of the PDF as input and get the output in a variable of type String named str_output.
Now use an Assign activity, where we can use either a Regex, Split, or Substring method to get the term we want.
This can be done only after reviewing the output that we get from the PDF activities.
Cheers @waseem
Hey, can you help me out with how to use Substring to extract data, e.g. Subtotal, Tax?
sure
Can I have a string obtained from the PDF, if possible with those terms in it,
so that I can come up with an expression?
cheers @waseem
BILL_2020_0002.pdf (32.0 KB)
From this I need to extract the Reference no., i.e. 2020/0002, and the PO no., i.e. P00013.
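For the PO number, a rough sketch of a pattern-based approach is shown here in Python. It assumes that every PO number is the letter P followed by exactly five digits, which is only inferred from the single example P00013; the sample line and the `extract_po_number` helper are hypothetical, since the real Read PDF output was not posted.

```python
import re

# Assumption: a PO number is "P" followed by exactly five digits (as in P00013).
PO_PATTERN = re.compile(r"\bP\d{5}\b")

def extract_po_number(text):
    """Return the first PO-like token in the text, or None if there is none."""
    match = PO_PATTERN.search(text)
    return match.group(0) if match else None

# Hypothetical line; the actual text around the PO number may look different.
print(extract_po_number("Source Document: P00013"))
```

If the PO format varies between vendors, the pattern would need to be adjusted after reviewing a few real outputs.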
Fine
I don't have my system with me.
Can you do me a favour?
Kindly read this PDF with Read PDF Text or Read PDF with OCR and share the string output so that I can give you the expression based on it.
Cheers @waseem
Is this text fixed: Vendor Bill BILL/ 2020/0002?
here you go
Check this sample workflow; I've used your sample PDF. ReadPDF.xaml (6.0 KB)
No, this is not fixed; all invoices have different reference numbers.
Is the format fixed?
4 digits/4 digits?
Yes the format is fixed
To get that value, you can use the regex below:
\d{4}/\d{4}
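As a quick check of that pattern outside UiPath, here is a minimal Python sketch run against the sample text quoted earlier in the thread. The `extract_reference` helper is a hypothetical name used only for illustration. Inside UiPath, the .NET equivalent in an Assign activity would be along the lines of `System.Text.RegularExpressions.Regex.Match(str_output, "\d{4}/\d{4}").Value`.

```python
import re

# Four digits, a slash, four digits: the pattern suggested in the thread.
REFERENCE_PATTERN = re.compile(r"\d{4}/\d{4}")

def extract_reference(text):
    """Return the first 4digits/4digits match, or None if there is none."""
    match = REFERENCE_PATTERN.search(text)
    return match.group(0) if match else None

# Sample text from the thread; real Read PDF output will contain much more.
print(extract_reference("Vendor Bill BILL/ 2020/0002"))
```

Note that this pattern matches the first such sequence anywhere in the text, so if the invoice contains other values in the same shape, anchoring on the preceding label (for example `BILL/`) may be safer.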
Thank you
Working?
@waseem
I tried using Substring and it worked.
Thank you for your help.
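The exact Substring expression was not shared, so the following is only a sketch of how such an approach might look, written in Python for illustration. It assumes the reference number always follows the label `BILL/` and is always nine characters long (4 digits, a slash, 4 digits); `reference_after_marker` and its parameters are hypothetical.

```python
def reference_after_marker(text, marker="BILL/", length=9):
    """Return `length` characters following `marker`, skipping leading spaces."""
    start = text.find(marker)
    if start == -1:
        return None  # marker not present in the text
    # Drop everything up to and including the marker, then any leading whitespace.
    remainder = text[start + len(marker):].lstrip()
    return remainder[:length]

# Sample text from the thread.
print(reference_after_marker("Vendor Bill BILL/ 2020/0002"))
```

In UiPath the same idea would use the String methods `IndexOf` and `Substring` on the Read PDF output.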