I am trying to extract specific text from a PDF into Excel. Could someone help me with how to do that?
I have a PDF file with an invoice number, invoice date, and due date. I want to extract these three values from the PDF document and put them into Excel.
Hi @venky
Read the PDF first, then use regex to extract the required data. Next, build a data table, insert the extracted data into it, and finally use the Write Range activity to write it to Excel.
If possible, share the PDF so that we can suggest a regex to get the invoice number, invoice date, and due date.
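The steps above (read text, apply regex, build a table, write it out) can be sketched outside UiPath as follows. This is only an illustration in Python: the sample text, the label formats, and the helper names `extract_fields` and `rows_to_csv` are all assumptions, and it writes CSV (which Excel opens) rather than using the Write Range activity.

```python
import csv
import io
import re

# Hypothetical text, standing in for whatever the PDF reading step returns.
pdf_text = """Invoice #12345
Invoice date: 12th Jan 2019
Due date: 26th Jan 2019
"""

# Assumed label formats; adjust these once the real PDF layout is known.
FIELD_PATTERNS = {
    "Invoice Number": r"Invoice\s+#(\d+)",
    "Invoice Date": r"Invoice date:\s*(.+)",
    "Due Date": r"Due date:\s*(.+)",
}


def extract_fields(text):
    """Return one table row: column name -> captured value ('' if not found)."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        m = re.search(pattern, text, re.IGNORECASE)
        row[name] = m.group(1).strip() if m else ""
    return row


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(FIELD_PATTERNS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


row = extract_fields(pdf_text)
print(rows_to_csv([row]))
```

Each PDF yields one row, so processing several invoices just means collecting the rows in a list before writing.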
Hi @venky
I used Anchor Base with Find Element, but this does not work in the new Adobe Reader DC.
I needed to compare the data in Excel with the data in the PDF.
Maybe it will help you.
@venky Is it possible to share a screenshot or the PDF?
We can use the Computer Vision activities to access these elements individually and get them as strings. Kindly have a look at this, buddy.
Cheers @Sajuri
@indra, @kalyanDev The above screenshot is the PDF file I am working with.
Did that activity work in your scenario, buddy @venky?
@venky You can use regular expressions to get the data from the PDF.
The regex for the invoice number, invoice date, and due date is below:
' pdfText holds the text read from the PDF; requires Imports System.Text.RegularExpressions
Dim match As Match = Regex.Match(pdfText, "Invoice [#\d]{6} ", RegexOptions.IgnoreCase)
Dim invoiceNumber As String = match.Value
match = Regex.Match(pdfText, "Invoice date: [\d]{2}[\w]{2}[\s][\w]{3}[\s][\d]{4}", RegexOptions.IgnoreCase)
Dim invoiceDate As String = match.Value
match = Regex.Match(pdfText, "due date: [\d]{2}[\w]{2}[\s][\w]{3}[\s][\d]{4}", RegexOptions.IgnoreCase)
Dim dueDate As String = match.Value
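To check patterns like these before wiring them into a workflow, they can be run against a sample string. The sketch below does that in Python with the same three patterns, adding capture groups so only the value is kept and the label is dropped. The sample text and the helper `first_group` are assumptions for illustration, not the actual PDF content.

```python
import re

# Hypothetical sample; the real text comes from the PDF discussed in the thread.
sample = "Invoice #12345 Invoice date: 12th Jan 2019 Due date: 26th Jan 2019"

# Same date shape as in the thread: two digits, two letters, a three-letter
# month, and a four-digit year, wrapped in a capture group.
DATE = r"(\d{2}\w{2}\s\w{3}\s\d{4})"

PATTERNS = {
    "invoice": r"Invoice ([#\d]{6}) ",
    "invoice_date": r"Invoice date: " + DATE,
    "due_date": r"due date: " + DATE,
}


def first_group(pattern, text):
    """Return the first capture group, or None when the pattern does not match."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(1) if m else None


values = {name: first_group(p, sample) for name, p in PATTERNS.items()}
print(values)

# A single-digit day does not fit \d{2}, so this lookup finds nothing.
print(first_group(PATTERNS["due_date"], "Due date: 1st Feb 2019"))
```

Note that the date pattern expects a two-digit day, so a date such as "1st Feb 2019" is not matched; if the invoices can contain such dates, `\d{1,2}` would be the usual adjustment.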
Thank you, buddy, but I just wanted to help him. My problem with the PDF was solved in the UI tutorials.
But thank you for your help @Palaniyappan
Fine
Cheers @Sajuri