Extracting Data Using a Regular Expression

I am trying to extract the PAN number from a PAN card PDF using a regular expression, but I am not able to get the output. Can anyone please help me with this issue? Below are the screenshots.


image

@Seema_Jethe

It’s working

image

but can you check your input once again?

Thanks


Hi @Seema_Jethe

Here is the solution,


new pan2

Thank you.

@Seema_Jethe

Check as below

image

Hope this will help you

Thanks

@Seema_Jethe

Attached code
PANCardExtraction.zip (2.2 KB)

Thank you.

@Srini84 I am writing the whole PDF content to a text file and then trying to extract the PAN number from the text file using a regular expression. Can you help me with this?
image
image
image

@Seema_Jethe

Can you share the screenshot of the text file you are extracting to?

Thanks

@Srini84 Below is the screenshot of the text file where I am writing the extracted PDF data. I want to extract the PAN number from this text file using a regular expression. Can you please help me with this issue?

@Srini84 Below is the screenshot of the variable used as the output of the Matches activity
image

@Seema_Jethe

Then write as below

pancardoutput(0).ToString

As it is an IEnumerable (collection), you read a single result by its index, as shown above

Hope this will help you

Thanks

Hi @Seema_Jethe

Try this expression

System.Text.RegularExpressions.Regex.Match(InputString, "(?<=\d{2}.\d{2}.\d{4}\n)(\S+)").ToString
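
To show what the lookbehind in this expression does, here is a minimal sketch in Python (not the VB.NET used in UiPath; the regex syntax is the same for this pattern). The sample text and its layout, with a date line directly followed by the PAN line, are assumptions made up for illustration, not taken from the actual PDF.

```python
import re

# Hypothetical OCR output: the PAN is assumed to sit on the line right after the date.
sample_text = "Name\nJOHN DOE\n01/01/1990\nABCDE1234F\nSignature"

# (?<=...) only checks that a dd?mm?yyyy date and a newline come right before;
# (\S+) then captures the next run of non-whitespace characters.
pattern = r"(?<=\d{2}.\d{2}.\d{4}\n)(\S+)"

match = re.search(pattern, sample_text)
pan = match.group(0) if match else ""  # empty string when nothing matched
print(pan)
```

If the date and the PAN are not on consecutive lines in your text file, this pattern will not match and the result will be empty.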

Regards
Gokul

@Srini84 Thanks, your solution helps, but could you please tell me what an IEnumerable (collection) is and why we used the expression “pancardoutput(0).ToString”?

@Seema_Jethe

The output of the Matches activity is an IEnumerable by default, which means it can hold multiple results, like a list / collection

If you are familiar with lists, the results are stored as {“1st Result”, “2nd Result”…}

So to get a result, you refer to it by its index in the list

Here the list name is pancardoutput and 0 is the index of the result

So even if we have multiple results, we are calling only the 1st result
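
The same idea as a small Python sketch (Python instead of VB.NET, only for illustration). The sample text is hypothetical, and the pattern of 5 letters, 4 digits, 1 letter is an assumption about the PAN format, not the expression from the screenshots.

```python
import re

# Hypothetical input text containing one PAN-like value.
text = "Permanent Account Number: ABCDE1234F"

# findall returns every match as a list, similar to the collection
# that the Matches activity gives back.
matches = re.findall(r"[A-Z]{5}[0-9]{4}[A-Z]", text)

# Index 0 picks the first result, like pancardoutput(0).ToString.
first_result = matches[0]
print(first_result)
```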

Hope this clears your doubt

More reference check as below

If this helps, mark as solution, so that others also benefit from this

Thanks

@Srini84 thanks


@Srini84 If I use some other PAN card, the PAN number may not always be at index 0 in the list. Is there any solution for such scenarios?

@Seema_Jethe

Let’s see how Regex works

If you are using multiple PAN cards at one time, you will get multiple matches, which means multiple results

But I believe that won’t be your case, 1 PAN card at a time, right?

Then the result will always be stored at index 0

But if you are expecting multiple PAN cards at one time and multiple results at one time, then you can use a For Each activity to loop through your results
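
A rough Python sketch of that loop, standing in for the For Each activity (Python, not VB.NET). The two-card sample text and the PAN pattern are assumptions for illustration only.

```python
import re

# Hypothetical text holding two PAN-like values.
text = "Card 1: ABCDE1234F\nCard 2: PQRST5678K"

pan_numbers = []
# finditer yields one match object per hit, in the order they appear.
for m in re.finditer(r"[A-Z]{5}[0-9]{4}[A-Z]", text):
    pan_numbers.append(m.group(0))

print(pan_numbers)
```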

Hope this will help you

Thanks

@Srini84 Yes, I have only one PAN card at a time, but when I run my flow with some other PAN card I get an error at the Assign activity that has the expression pancardoutput(0).ToString. Below is the screenshot


image

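Okay, that is most likely because the regex found no match, so pancardoutput is empty and there is nothing at index 0. Maybe the OCR is not giving you correct results for this card

In this case you can use an If condition that checks for a match before reading index 0, and write as below

pancardoutput.Count > 0
Then → Place the Assign with pancardoutput(0).ToString
Else → give your logic / you can throw an exception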
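
A minimal Python sketch of this If / Else guard (Python, not VB.NET): check whether anything matched before reading the first result. The helper name extract_pan and the PAN pattern are hypothetical, chosen only for this example.

```python
import re

# Assumed PAN format: 5 letters, 4 digits, 1 letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

def extract_pan(text: str) -> str:
    """Return the first PAN-like value, or raise if the text has none."""
    matches = PAN_PATTERN.findall(text)
    if not matches:
        # Else branch: nothing matched, for example because of bad OCR output.
        raise ValueError("No PAN number found in the extracted text")
    # Then branch: safe to read index 0 now.
    return matches[0]
```

Reading index 0 without this check fails on an empty result, which is the situation the guard is meant to catch.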

It doesn’t seem like a good idea to continue this thread; I suggest opening a new post if you face any other issue

Thanks

@Srini84 ok thanks
