Linking Regex With Excel Value

Hi all, I need a regex pattern that will work with an Excel cell that holds different invoice amount values. The Excel cell value plus the pattern will be run against a folder of different PDFs to try to match the value in each PDF.

Sample:

Excel value will always be changing
excel cell value - 1236

Expected Output:
Match all these possible outcomes of Invoice total amount format structures:
-1923.00
-1,923.00
-$1923

The Pattern:
I found this pattern which is able to highlight the amount regardless of format - \d+(.\d+)?

Each PDF displays the invoice total amount in a different format. I am just trying to make sure the cell value and the total amount in the invoice match.

Any help will be appreciated. Thank you.

Try this pattern to identify the amount in the PDF ($ and . are escaped so they match literally, and the leading - is optional):
-?\$?\d,?\d+(\.\d+)?
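A quick way to sanity-check a pattern of this shape is to run it against the sample formats from the question. The sketch below is in Python rather than a UiPath expression, and it assumes the `$` and `.` are escaped and the leading `-` is optional, since unescaped `$` is an end-of-line anchor and unescaped `.` matches any character.

```python
import re

# Assumed form of the suggested pattern: "$" and "." escaped, "-" optional.
AMOUNT = re.compile(r"-?\$?\d,?\d+(\.\d+)?")

# Sample formats taken from the question, plus one combined case.
samples = ["1923.00", "1,923.00", "$1923", "-$1,923.00"]

# fullmatch() requires the whole string to be consumed by the pattern.
results = {s: AMOUNT.fullmatch(s) is not None for s in samples}
print(results)
```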

Is the Excel cell value just a number every time, or do you want to match 1236 with $1236, 1,236.00 or 1236.00?

Yes, the cell value will be a number every time, and yes, I want to match 1236 with $1236, 1,236.00 or 1236.00. That is correct.

ok. In that case, try this:

  1. Save your PDF amount in a string variable (say var1).
  2. Replace the characters -, $ and , with "" using var1.Replace("$","").Replace("-","").Replace(",",""). This doesn't throw an exception even if var1 doesn't contain $ or , or -. After this runs, only the number part remains, e.g. 1236.00.
  3. Convert var1 to a double using Convert.ToDouble(var1).
  4. Also convert your Excel cell value to a double if it isn't one already.

Now you can compare the Excel cell value and the PDF amount.
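The steps above can be sketched as follows. This is a minimal Python illustration, not UiPath code; the function names `to_number` and `amounts_match` and the small tolerance are assumptions made for the example. Note that stripping `-` discards the sign, exactly as in the steps above.

```python
def to_number(text: str) -> float:
    """Strip "$", "-" and "," from an amount string and parse what is left."""
    for ch in ("$", "-", ","):
        # str.replace is a no-op when the character is absent, so no error here.
        text = text.replace(ch, "")
    return float(text.strip())


def amounts_match(pdf_amount: str, excel_value, tolerance: float = 0.005) -> bool:
    """Compare the cleaned PDF amount with the Excel value as numbers."""
    # A small tolerance avoids relying on exact floating-point equality.
    return abs(to_number(pdf_amount) - float(excel_value)) < tolerance


print(amounts_match("-$1,236.00", 1236))
print(amounts_match("1,237.00", 1236))
```

Comparing as numbers means 1236, 1236.00 and 1,236.00 are all treated as the same amount, which a plain string comparison would not do.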


Thank you for replying. I will build this automation with the suggested methods. Much appreciated.


Hi @Mpumi_Setshedi

Please mark my post as solution, if it helps you.

Another solution…

  [\$\d,.]+

What if the currency is R1236 or ZAR1,236.00?

Are those all the variations you have, or are there more?

You can remove them, or any other unwanted characters, using the Replace method as I said in my previous post, like:

var1.Replace("ZAR","").Replace("R","")

@Mpumi_Setshedi - Please check this…

Hi @Mpumi_Setshedi
You can use the Regex method to do the matching.

Assume that you are iterating through your datatable (the Excel format is preserved) using For Each Row.

Then use this to check whether the Excel amount value matches the PDF amount value:

System.Text.RegularExpressions.Regex.Matches(pdf_amount, row("Amount").ToString).Count <> 0

This checks whether the pdf_amount variable contains the value from the Excel data.
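Using the raw cell value as the pattern only finds the digits exactly as typed, so 1236 would not be found in 1,236.00. One way around that is to build the pattern from the cell value. The sketch below is a Python illustration under assumptions: the cell value is a whole number, and `amount_pattern` and `pdf_contains_amount` are hypothetical names.

```python
import re


def amount_pattern(cell_value: int) -> re.Pattern:
    """Build a pattern for one specific amount, tolerant of formatting."""
    # 1236 -> "1,236" -> "1,?236": thousands separators become optional.
    grouped = f"{cell_value:,}".replace(",", ",?")
    return re.compile(
        rf"(?<![\d.,])"      # not in the middle of a longer number
        rf"{grouped}"        # the digits, with optional commas
        rf"(?:\.0+)?"        # optional ".0" / ".00" decimals
        rf"(?![,.]?\d)"      # no further digits, e.g. rejects 12360 and 1236.50
    )


def pdf_contains_amount(pdf_text: str, cell_value: int) -> bool:
    return amount_pattern(cell_value).search(pdf_text) is not None


print(pdf_contains_amount("Total: $1,236.00", 1236))
print(pdf_contains_amount("Amount 12360.00", 1236))
```

Because the currency prefix is simply whatever sits in front of the digits, $, R and ZAR need no special handling here.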

Hope it helps you
Regards,
Nived N
Happy Automation

Thank you so much, those are the only currency variations. So if 1236 is my cell value, will I be able to make the pattern reference that value and then use it to find the invoice amount no matter how the number is formatted?

Yes…As shown above, the Regex pattern covers the variations which you gave.

Please try and let us know.

Hi Surya, may I share my flow to check whether the architecture is correct?


Main.xaml (12.1 KB)


I’m not able to figure out the complete process from your xaml file. Also, some dependencies are not resolved in my Studio.

I suggest you use the regex pattern "-?(\$|ZAR|R)?\d,?\d+(\.\d+)?" to match the amount in the PDF and in Excel ($ and . are escaped so they match literally).
Once you have the amount value (both the PDF amount and the Excel value), use the Replace method to remove all unnecessary characters, as I said in my previous replies.
Convert both the PDF amount and the Excel value to double.
Then use a For loop and an If condition to match the values as per your requirement.
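Put together, the flow described here might look like the Python sketch below. It is an illustration under assumptions, not the UiPath workflow itself: `extract_amounts` and `invoice_matches` are hypothetical names, and the digit part of the pattern is widened to `\d+(?:,\d{3})*` so that comma-grouped amounts of five or more digits are not split into two matches.

```python
import re

# Optional sign, optional currency prefix, digits with optional thousands
# separators, optional decimals. All groups are non-capturing so that
# findall() returns the full matched text.
AMOUNT = re.compile(r"-?(?:\$|ZAR|R)?\d+(?:,\d{3})*(?:\.\d+)?")


def extract_amounts(pdf_text: str) -> list[float]:
    """Find every amount-like token and convert it to a number."""
    values = []
    for raw in AMOUNT.findall(pdf_text):
        cleaned = re.sub(r"[^\d.]", "", raw)  # drop sign, currency, commas
        values.append(float(cleaned))
    return values


def invoice_matches(pdf_text: str, excel_value) -> bool:
    """Loop over the extracted amounts and compare each with the Excel value."""
    target = float(excel_value)
    return any(abs(value - target) < 0.005 for value in extract_amounts(pdf_text))


print(extract_amounts("Invoice total ZAR1,236.00 due"))
print(invoice_matches("Total Due: -$1236", 1236))
```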


Thanks for the pattern once again. In my folder of invoices, some say:

  • Total Amount Due 1234
    -Total 1234
    -Total Due 1234

How can I cover these cases to extract the number?

@Mpumi_Setshedi - You mean -1234?

Yes. I want to make sure the number is always the one that follows Total Amount Due, Total, or Total Due.

@Mpumi_Setshedi - Please check this…
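One way to tie the number to its label is to put the label in the pattern and capture only the amount. The Python sketch below is an illustration, not necessarily the pattern from the reply above; `TOTAL` and `total_from_text` are hypothetical names, and the optional colon and currency prefix are assumptions about how the invoices are laid out.

```python
import re

TOTAL = re.compile(
    r"\bTotal(?:\s+Amount)?(?:\s+Due)?"    # Total / Total Due / Total Amount Due
    r"\s*:?\s*"                            # optional colon and spacing
    r"-?(?:\$|ZAR|R)?\s*"                  # optional sign and currency prefix
    r"(\d+(?:,\d{3})*(?:\.\d+)?)",         # the amount itself (group 1)
    re.IGNORECASE,
)


def total_from_text(pdf_text: str):
    """Return the amount that follows a Total label, or None if none is found."""
    match = TOTAL.search(pdf_text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


for line in ["Total Amount Due 1234", "-Total 1234", "-Total Due 1,234.00"]:
    print(line, "->", total_from_text(line))
```

The leading `\b` keeps the pattern from firing on words such as Subtotal.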
