How do I extract specific data from a PDF file with RegEx Search? I don't understand how it works and can't find an example

Hey Everyone,

I want to use RegEx to extract data from a PDF, like this:

PDF: Day-Date:2021-11-02
Result: 2021-11-02

(This is only an example because I can’t give out the real data, so please accept that in this case it is not possible to match the date with a regex on its own. Thanks.)

I can’t use a plain RegEx for the date because the date format is different in every PDF. Now I want to use RegEx Search, but I don’t understand exactly how it works and I can’t find a description. I can’t use Anchor Base because I want to run this robot on more than one PC, and I read in the forum that Anchor Base is a bad idea if the robot has to run on more than one PC.

I hope someone can help me with that.

You can create an array that stores the different date formats you expect to find in the PDFs. For example:

{"dd-MM-yyyy","dd/MM/yyyy","dd/MM/yy","dd.MM.yy","dd.MM.yyyy"}

Then parse the extracted date string into one standard date format to use in the rest of the processing:

DateTime.ParseExact(strExtractedDayeFromPdf.ToString, arr_DateFormat, new CultureInfo("en-US"), DateTimeStyles.None).ToString("dd-MM-yyyy")

Make sure to import the System.Globalization namespace.
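
For readers outside UiPath, the same multi-format idea can be sketched in Python. This is only an illustrative analogue, not the workflow from the attachment: the helper name parse_any_date is hypothetical, and the strptime format strings are an assumed mapping of the .NET formats listed above.

```python
from datetime import datetime

# Assumed Python equivalents of the .NET formats above:
# dd-MM-yyyy, dd/MM/yyyy, dd/MM/yy, dd.MM.yy, dd.MM.yyyy
DATE_FORMATS = ["%d-%m-%Y", "%d/%m/%Y", "%d/%m/%y", "%d.%m.%y", "%d.%m.%Y"]


def parse_any_date(text):
    """Try each known format in turn and return the date as dd-MM-yyyy."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%d-%m-%Y")
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"no known date format matches {text!r}")


print(parse_any_date("02/11/2021"))
print(parse_any_date("02.11.21"))
```

Like ParseExact with an array of formats, this fails loudly when none of the formats match, which is usually safer than silently guessing.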

The attached file may help you. Cheers!

DateExtract_Asnmnt_18.4 sam.zip (20.8 KB)

@uandi_ks

Check below for your reference

Hope this may help you

Thanks

Hey @rahulsharma ,
Thanks for your help, but I can’t do this because the date was only an example. The real data is a value like 12as2345jnmad, and it changes every time. Sorry, I think my example was not the best. So I need to find the value by the word that comes before it, like:

the number: 12as2345jnmad

So I want to identify the number by the preceding ‘number:’, because the number itself can be 123ik or 12340-2340-1234. I think that is not possible to find with a regex. Or is it possible to find something like this with a regex?

Hey @Srini84 ,
Thanks for your help, but I don’t think this helps me. Please read the comment above. I think I used the wrong example, sorry for that.

Hey @uandi_ks

I’m afraid it’ll be hard to suggest a solution if we don’t get the exact text formats. I do understand that you can’t share as it’s client sensitive.

I’ll just say that if you can identify the pattern in that string, or a set of possible patterns, then you can use logic similar to the above. Just to add, you can extract only the digits from a string like 12as2345jnmad by using the regex \d. The rest depends on understanding the format of the date (which part is dd, which is MM, and which is yy); once you have that, you can simply parse it into any other date format.
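
As a rough illustration of the \d idea, here is a minimal Python sketch (Python is used only for demonstration; the variable names are hypothetical, and \d has the same basic meaning in .NET regex).

```python
import re

sample = "12as2345jnmad"  # example value from the thread

digits = re.findall(r"\d", sample)       # one match per digit
digit_runs = re.findall(r"\d+", sample)  # consecutive digits grouped together

print("".join(digits))
print(digit_runs)
```

Note that \d on its own matches one digit at a time, so the matches have to be joined; \d+ returns each run of consecutive digits instead.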

This pattern uses a lookbehind to extract everything that comes after “Date:” up to the end of the line, without including “Date:” itself:

(?<=Date:).+
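
To see how this behaves on the examples from this thread, here is a small Python sketch. It is only an illustration under a few assumptions: the function names are hypothetical, and the second pattern for the ‘number:’ label is an assumed variant that also skips an optional space after the colon.

```python
import re

# Lookbehind: match only what follows the label, not the label itself.
date_pattern = re.compile(r"(?<=Date:).+")
# Assumed variant for the "number:" label: \s* skips an optional space,
# and the group captures the value up to the next whitespace.
number_pattern = re.compile(r"(?<=number:)\s*(\S+)")


def value_after_date(line):
    match = date_pattern.search(line)
    return match.group(0).strip() if match else None


def value_after_number(line):
    match = number_pattern.search(line)
    return match.group(1) if match else None


print(value_after_date("Day-Date:2021-11-02"))
print(value_after_number("the number: 12as2345jnmad"))
print(value_after_number("the number: 12340-2340-1234"))
```

Because the match is anchored on the label rather than on the shape of the value, it works no matter how the value itself is formatted.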

Let me know if this solves your problem.
Best,
Charbel

Hey @Charbel1,
Thanks for your help, I didn’t know that this is possible in Regex. This is exactly what I was looking for. Thanks for this.

I’m happy to help! And yes, Regex can find almost any type of match you want in a string. 👌

Feel free to tag me when you have another question.
