Please help me find a regex for the values below:
DESCRIPTION- Clear front covers
UNIT PRICE- $0.49
LINE TOTAL- $49.0
output.txt (803 Bytes)
Can you try the following sample?
Sample20211111-5.zip (3.0 KB)
Please explain how to build the regex.
To write a regex pattern, we first need to find the rule the text follows.
In this case, each line begins with a number, which can be matched with ^\d+
Another characteristic is that each line ends with two prices, each with a dollar sign.
The remainder is the DESCRIPTION, which can be matched lazily with .*?
Putting these parts together gives a pattern for the whole line.
To extract each item separately, we then wrap each part in a named group.
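The steps above can be sketched as follows. This is only an illustration in Python, not the expression from the original reply: the sample line and its quantity (100) are assumptions chosen to match the prices quoted in the question, and Python writes named groups as (?P<NAME>...) where .NET/UiPath uses (?<NAME>...).

```python
import re

# Hypothetical invoice line; the quantity (100) is an assumption that is
# consistent with the unit price and line total quoted in the question.
line = "100 Clear front covers $0.49 $49.0"

QUANTITY = r"^(?P<QUANTITY>\d+)"   # a number at the beginning of the line
PRICE = r"\$[\d.]+"                # a dollar sign followed by digits and dots

ITEM_LINE = re.compile(
    QUANTITY
    + r"\s+(?P<DESCRIPTION>.*?)"              # whatever sits in between, matched lazily
    + r"\s+(?P<UNIT_PRICE>" + PRICE + r")"    # first price
    + r"\s+(?P<LINE_TOTAL>" + PRICE + r")"    # second price
    + r"\s*$",                                # end of the line
    re.MULTILINE,
)

match = ITEM_LINE.search(line)
fields = match.groupdict() if match else {}
print(fields)
```

The lazy .*? only works because the two price groups and the end-of-line anchor follow it; without them it would match an empty string.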
Your solution works for my process.
Thanks & Regards
I am trying to create a separate variable to hold the quantity value:
Var_Quantity = System.Text.RegularExpressions.Regex.Match(Var_ReadPDF, "^(?<QUANTITY>\d+)\s").ToString
but I am unable to get the value.
Please correct me.
It would be the following, for example.
Var_Quantity = System.Text.RegularExpressions.Regex.Match(Var_ReadPDF, "^\d+(?= )", System.Text.RegularExpressions.RegexOptions.Multiline).Value
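The Multiline option is what makes the difference here: without it, ^ matches only at the very start of the whole text, not at the start of each line. A small Python sketch of the same idea, where re.MULTILINE plays the role of RegexOptions.Multiline and the sample text (including its header row) is an assumption:

```python
import re

# Hypothetical PDF text: a header row followed by one item line.
text = "QTY DESCRIPTION UNIT PRICE LINE TOTAL\n100 Clear front covers $0.49 $49.0\n"

# Without the multiline flag, ^ is tried only at the start of the text,
# where the header row begins, so nothing is found.
no_flag = re.search(r"^\d+(?= )", text)

# With the flag, ^ also matches right after every line break.
with_flag = re.search(r"^\d+(?= )", text, re.MULTILINE)

quantity = with_flag.group(0) if with_flag else ""
print(no_flag, quantity)
```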
Is there a course that explains how to write regular expressions or select strings?
Check out this string manipulation megapost.
How can I get the values separately?
E.g. description, unit price, line total.
In the sample above, each value can be read separately as follows.
Does this work for you?
item.Groups("QUANTITY").Value
item.Groups("DESCRIPTION").Value
item.Groups("UNIT_PRICE").Value
item.Groups("LINE_TOTAL").Value
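For comparison, here is a rough Python equivalent of looping over all matches and reading each named group. The pattern and both sample lines are assumptions for illustration, and extract_items is a hypothetical helper name; in UiPath the loop would run over the result of Regex.Matches.

```python
import re

# Assumed item-line pattern; Python spells named groups as (?P<NAME>...).
ITEM_LINE = re.compile(
    r"^(?P<QUANTITY>\d+)\s+(?P<DESCRIPTION>.*?)\s+"
    r"(?P<UNIT_PRICE>\$[\d.]+)\s+(?P<LINE_TOTAL>\$[\d.]+)\s*$",
    re.MULTILINE,
)


def extract_items(text):
    """Return one dict per item line, keyed by group name (hypothetical helper)."""
    return [m.groupdict() for m in ITEM_LINE.finditer(text)]


# Hypothetical text read from a PDF; the second item is made up.
sample_text = (
    "QTY DESCRIPTION UNIT PRICE LINE TOTAL\n"
    "100 Clear front covers $0.49 $49.0\n"
    "5 Binding combs $2.00 $10.00\n"
)

items = extract_items(sample_text)
for item in items:
    print(item["QUANTITY"], item["DESCRIPTION"], item["UNIT_PRICE"], item["LINE_TOTAL"])
```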
This solution works.
thanks & regards
These values are for only one PDF document. How can I automate the extraction for the remaining PDFs, whose format is fixed but whose values differ?
I ask because no selector is used here.
Test.zip (5.0 KB)
Invoices.zip (331.8 KB)
invoice_template.xlsx (10.5 KB)
1 Extract the data from the PDF and enter them in the downloaded Excel file.
2 Extract the data from the downloaded PDF based on the following condition:
a. Quantity should be greater than or equal to 2.
b. Unit price should be greater than or equal to 2.
c. Line total should be greater than or equal to 100.
d. Due date should be greater than 01-April-2019.
e. Payment term should be due on receipt.
3 Enter the extracted data in the invoice_template.xlsx.
4 Rename the invoice_template.xlsx file to “Output.xlsx”.
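Conditions a to e in step 2 could be checked along the following lines once the fields have been extracted. This is a minimal Python sketch under assumptions: money and meets_conditions are hypothetical helper names, the prices are assumed to look like "$1,250.00", and the due date is assumed to be already parsed into a date, since the date format in the PDFs is not shown in this thread.

```python
from datetime import date
from decimal import Decimal


def money(text):
    """Turn a price string such as '$1,250.00' into a Decimal (assumed format)."""
    return Decimal(text.replace("$", "").replace(",", ""))


def meets_conditions(item, due_date, payment_terms):
    """Check conditions a-e for one extracted item line."""
    return (
        int(item["QUANTITY"]) >= 2                                # a
        and money(item["UNIT_PRICE"]) >= 2                        # b
        and money(item["LINE_TOTAL"]) >= 100                      # c
        and due_date > date(2019, 4, 1)                           # d
        and payment_terms.strip().lower() == "due on receipt"     # e
    )


# Hypothetical extracted values.
item = {"QUANTITY": "50", "UNIT_PRICE": "$2.50", "LINE_TOTAL": "$125.00"}
print(meets_conditions(item, date(2019, 5, 1), "Due on receipt"))
```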
These values are for only one PDF document. I want to extract the values from the remaining PDFs; the format is fixed but the values differ.
It’s because some of the target values contain a thousands separator (.).
The following expression should work for at least the 6 PDF files you shared. Can you try it?
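To illustrate the thousands-separator problem, not to reproduce the expression from the reply, here is a Python sketch comparing a price pattern that allows only one decimal point with one that also accepts separators. The sample line, both price patterns, and the item_pattern helper are assumptions; which separator character actually appears depends on the PDFs.

```python
import re

NARROW_PRICE = r"\$\d+\.\d+"    # digits, one dot, digits: no room for a separator
WIDE_PRICE = r"\$\d[\d.,]*"     # digits mixed with dots and commas


def item_pattern(price):
    """Build the item-line pattern around a given price sub-pattern (hypothetical helper)."""
    return re.compile(
        r"^(?P<QUANTITY>\d+)\s+(?P<DESCRIPTION>.*?)\s+"
        rf"(?P<UNIT_PRICE>{price})\s+(?P<LINE_TOTAL>{price})\s*$",
        re.MULTILINE,
    )


# Hypothetical item line whose prices contain a thousands separator.
line = "4 Laser printer $1,250.00 $5,000.00"

narrow_match = item_pattern(NARROW_PRICE).search(line)
wide_match = item_pattern(WIDE_PRICE).search(line)
print(narrow_match, wide_match.groupdict() if wide_match else None)
```

The wider character class is more forgiving, at the cost of also accepting malformed prices such as "$1..2".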
Yes, I tried this expression. It returns the same values for the remaining invoices as well.
Can you elaborate on your issue?
It works without problems in my environment.
Only the last invoice's values are written to the Excel template. I want to write the values from all invoices to the same Excel template.
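One possible cause, assuming the workflow writes to the same place in the template on every pass of the loop, is that each invoice overwrites the previous one. A common way around this is to collect the rows from all invoices first and write them once at the end. The Python sketch below shows that shape only; the file names, invoice texts, and helper names are hypothetical, and it writes CSV to keep the example self-contained, whereas the real workflow would write to the Excel template.

```python
import csv
import io
import re

# Assumed item-line pattern (same shape as discussed earlier in the thread).
ITEM_LINE = re.compile(
    r"^(?P<QUANTITY>\d+)\s+(?P<DESCRIPTION>.*?)\s+"
    r"(?P<UNIT_PRICE>\$\d[\d.,]*)\s+(?P<LINE_TOTAL>\$\d[\d.,]*)\s*$",
    re.MULTILINE,
)
FIELDS = ["INVOICE", "QUANTITY", "DESCRIPTION", "UNIT_PRICE", "LINE_TOTAL"]


def collect_rows(invoice_texts):
    """Gather item rows from every invoice into ONE list (hypothetical helper)."""
    rows = []  # created once, before the loop, so nothing is overwritten
    for name, text in invoice_texts.items():
        for m in ITEM_LINE.finditer(text):
            rows.append({"INVOICE": name, **m.groupdict()})
    return rows


def write_rows(rows, handle):
    """Write all collected rows in a single pass."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical text already read from two PDFs.
invoice_texts = {
    "invoice1.pdf": "100 Clear front covers $0.49 $49.0\n",
    "invoice2.pdf": "4 Laser printer $1,250.00 $5,000.00\n",
}

all_rows = collect_rows(invoice_texts)
output = io.StringIO()
write_rows(all_rows, output)
print(output.getvalue())
```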
Also, it writes the rupee sign instead of the $ sign.