I need to extract the total, but sometimes a subtotal also appears, like this:
Total: 567
Sub Total: 3481
subtotal 38813
sub Total 38813
I need to get only the total, not the subtotal.
Please help with a regex.
If your text appears exactly as you have provided it, this regex will work:
(?<=^Total: )\d+
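As a rough illustration of how that anchored lookbehind behaves, here is a minimal Python sketch. It assumes the text is a multi-line string and uses Python's `re` module with `re.MULTILINE` so that `^` matches at the start of every line; the names `SAMPLE` and `find_line_start_totals` are made up for this example, and other regex engines may need a different way to switch on multi-line mode.

```python
import re

# Sample text copied from the question; each entry sits on its own line.
SAMPLE = """Total: 567
Sub Total: 3481
subtotal 38813
sub Total 38813"""

# With re.MULTILINE, ^ matches at the start of each line, so only a line
# that literally begins with "Total: " can satisfy the lookbehind.
TOTAL_AT_LINE_START = re.compile(r"(?<=^Total: )\d+", re.MULTILINE)


def find_line_start_totals(text):
    """Return the digits that follow a line-leading 'Total: ' label."""
    return TOTAL_AT_LINE_START.findall(text)


print(find_line_start_totals(SAMPLE))                 # ['567']
print(find_line_start_totals("Thus the Total: 567"))  # []
```

The second call returns nothing because the label no longer starts the line, so this pattern only suits input where Total: always begins a line.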
But the total can also come in the middle of a sentence.
Please provide an example for me to test.
Thus the Total: 567
Sub Total: 3481
today subtotal $38813
sub Total 38813
One additional requirement:
If Total Tax is provided, can we omit it?
Thus the Total: 567
Sub Total: 3481
today subtotal $38813
sub Total 38813
the Total Tax: 6672
I need only the total; omit Sub Total, subtotal, and Total Tax.
Does the total always come before the subtotals?
You mean the order of the lines?
That may vary from PDF to PDF.
It may be simpler in this case to iterate over the matches for the regex Total:\s+ and use a For Each loop to find the first instance not containing sub or Sub. The regex above would capture the first and second lines, and the For Each loop would omit the second line, leading you to the line with the total. You can then extract the value using my original regex without the ^ anchor, (?<=Total: )\d+ . This also works for your preceding examples.
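The loop described above could look roughly like the following in Python. This is only a sketch under a few assumptions: the document text has already been extracted into a string with one entry per line, the "sub" check is applied to the whole line (case-insensitively), and the helper name `first_total` is invented for the example.

```python
import re
from typing import Optional

LABEL = re.compile(r"Total:\s+")        # candidate lines: "Total:" plus whitespace
VALUE = re.compile(r"(?<=Total: )\d+")  # digits right after "Total: "

SAMPLE = """Thus the Total: 567
Sub Total: 3481
today subtotal $38813
sub Total 38813
the Total Tax: 6672"""


def first_total(text: str) -> Optional[str]:
    """Return the first 'Total:' value found on a line that does not mention 'sub'."""
    for line in text.splitlines():
        if not LABEL.search(line):
            continue                    # no "Total:" label on this line
        if "sub" in line.lower():
            continue                    # skip Sub Total / subtotal lines
        value = VALUE.search(line)
        if value:
            return value.group()
    return None


print(first_total(SAMPLE))  # 567
```

Because the label regex requires the colon directly after Total, the Total Tax line is never a candidate here. The line-wide "sub" check is deliberately crude and would also skip a total on a line containing a word such as "subject".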
@Sweety_Girl Can you check this regex against all the types of input that you have and verify whether it satisfies them:
Everything is good except one case.
That is, if the total is found in the middle of a sentence like this:
Ram increases the Total 456
It must take the total unless ‘sub’ comes before it, as in subtotal.
It sounds like it will be easier to use a regex or different logic for each different document type. It will either be very difficult or impossible to handle all cases for all document types.
Yup… But we have more than 30 PDF formats.
(?<![sS]ub)(?<![sS]ub\s)[Tt]otal: (?<total>\d+)
or (?<![sS]ub)(?<![sS]ub\s)[Tt]otal\s?:\s?\D?\s?(?<total>\d+)
if you expect a money symbol
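To try this kind of pattern against the sample lines from the thread, a Python sketch along these lines may help. It is not a drop-in copy of the regexes above: Python's `re` writes named groups as `(?P<total>...)`, and the sketch adds a few things on its own initiative to cover requests made earlier in the thread, namely two fixed-width lookbehinds (for "sub" directly before the label and for "sub" plus one whitespace character), a lookahead that rejects Total Tax, and an optional colon. The function name `extract_total` is hypothetical.

```python
import re
from typing import Optional

TOTAL = re.compile(
    r"(?<!sub)(?<!sub\s)"    # not preceded by "sub" or by "sub" + one whitespace
    r"total(?!\s*tax)"       # the label itself, but not "Total Tax"
    r"\s*:?\s*[^\d\s]?\s*"   # optional colon, optional one-character money symbol
    r"(?P<total>\d+)",       # the amount
    re.IGNORECASE,
)


def extract_total(line: str) -> Optional[str]:
    """Return the total on this line, or None for subtotal / tax / no match."""
    match = TOTAL.search(line)
    return match.group("total") if match else None


for line in [
    "Thus the Total: 567",
    "Sub Total: 3481",
    "today subtotal $38813",
    "sub Total 38813",
    "the Total Tax: 6672",
    "Ram increases the Total 456",
]:
    print(f"{line!r} -> {extract_total(line)}")
```

Only the first and last sample lines yield a value (567 and 456). With 30+ document formats, a single pattern like this is unlikely to cover everything, so it is best treated as a starting point to test per format.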