How do I get the info I want from this PDF?

Hi,

Believe me, I’ve tried to do this myself. I need to get the date from this invoice. This is from a UiPath class, but I can’t find the solution. I tried the Get Text activity, and the closest I got to the actual element is the below:

After that, I tried with Find Element - Get Attribute Activities and I got nothing because I can’t find the “aaname” attribute.

The only way I got the expected result was with OCR, but I believe there has to be a solution without OCR.

Here’s the PDF
Session_10_exercise_1_NPO_1_perfect_match.pdf (95.7 KB)

Cheers,

1. Install the PDF package.

2. Use the Matches activity.
° If the invoices all share the same layout, you can create a regex using the US date format.
° If not, write a more advanced regex that captures the text between two words.
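The "regex between 2 words" idea can be sketched roughly as follows. This is only an illustration in Python rather than a UiPath expression; the sample text, the label words, and the helper name `between_words` are all assumptions, since the real invoice text depends on what Read PDF Text returns.

```python
import re

# Hypothetical text standing in for the Read PDF Text output;
# the real invoice layout may differ.
sample_text = "INVOICE NO 1234\nDATE 10/11/2022\nDUE DATE 11/10/2022\n"


def between_words(text, start_word, end_word):
    """Return the trimmed text between start_word and end_word, or None."""
    # re.escape keeps the label words literal; (.*?) is lazy so the
    # capture stops at the first occurrence of end_word.
    pattern = re.escape(start_word) + r"(.*?)" + re.escape(end_word)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


invoice_date = between_words(sample_text, "DATE", "DUE DATE")
print(invoice_date)
```

This approach only works when the two surrounding words are stable across invoices, which is why a plain date pattern is simpler when the layout never changes.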

Hi,

FYI, here is another regex pattern. Can you try the following on the output text of the Read PDF Text activity (let’s say strPDF)?

System.Text.RegularExpressions.Regex.Match(strPDF,"(?<=DATE\s+)\d{1,2}/\d{1,2}/\d{4}").Value

If you want to use the Matches activity, the following pattern might be better, because it excludes DUE DATE.

(?<=(?<!DUE\s)DATE\s+)\d{1,2}/\d{1,2}/\d{4}
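To see why the extra `(?<!DUE\s)` matters, here is a small sketch in Python (not the .NET engine UiPath uses). Python's `re` module does not accept the variable-width lookbehind `(?<=DATE\s+)`, so this version uses a capture group instead; the sample text, including the order of the two labels, is an assumption.

```python
import re

# Hypothetical text; in the real PDF the labels may appear in another order.
sample_text = "INVOICE NO 1234\nDUE DATE 11/10/2022\nDATE 10/11/2022\n"

DATE = r"\d{1,2}/\d{1,2}/\d{4}"

# Matches a date after any "DATE" label, including the one in "DUE DATE".
any_date_label = re.compile(r"DATE\s+(" + DATE + ")")

# The fixed-width negative lookbehind rejects "DATE" when "DUE " precedes it.
invoice_date_only = re.compile(r"(?<!DUE\s)DATE\s+(" + DATE + ")")

first_any = any_date_label.search(sample_text).group(1)
first_invoice = invoice_date_only.search(sample_text).group(1)

print(first_any)      # date found after the first "DATE" label of any kind
print(first_invoice)  # date found after a "DATE" label not preceded by "DUE "
```

With this sample, the first pattern picks up the due date because "DUE DATE" comes first in the text, while the second pattern skips it.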

Regards,

Hi,

I used the Read PDF Text activity and then the Matches activity with the US date regex, and I get the following in the Message Box:

“System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match]”
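That text looks like a type name rather than a date, which would be consistent with printing the whole collection of matches instead of the value of a single match. The Python sketch below is only an analogy with hypothetical sample text: printing the iterator shows an object description, while taking the first match and reading its value gives the date. In UiPath the equivalent would presumably be something like `dateValue(0).Value`, assuming dateValue holds the output of the Matches activity.

```python
import re

# Hypothetical text standing in for the Read PDF Text output.
sample_text = "DATE 10/11/2022\nDUE DATE 11/10/2022\n"
pattern = re.compile(r"(?<!DUE\s)DATE\s+(\d{1,2}/\d{1,2}/\d{4})")

# finditer returns a lazy collection of match objects, not a string.
matches = pattern.finditer(sample_text)

# Converting the collection itself to text describes the object,
# it does not show the matched date.
as_printed = str(matches)

# Take the first match (if any) and read the captured date from it.
first = next(matches, None)
date_value = first.group(1) if first else None

print(as_printed)
print(date_value)
```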

Hi,

Can you try the following sample?

The pattern is the following.

(?<=(?<!DUE\s)DATE\s+)\d{1,2}/\d{1,2}/\d{4}

Sample20221011-3.zip (91.1 KB)

Regards,

Hi,

I’m a bit confused here.

  1. I used the Read PDF Text activity and set strPDF as the output value.
  2. Then I used the Matches activity with strPDF as the input and dateValue as the output, and printed dateValue.

Is that how I should use your suggestion?

Hi,

I just uploaded a sample project in my previous post. Can you check it?

Regards,

Thank you so much, @Yoichi !!

I still have a lot to learn!

Cheers,
