Trimming words in a data table


#1

I am facing issues in rxtracting words in a data table,

I want to extract line number 2 in the below data datatable…as in get only the date and time part

Priority
01.01.2018 09:59:17,9276630 Egretli,Osman
(EGRETLI_O700)
ECC_U2K2_01 System added to user AITIHAD_L500 in the request for action
‘Change User’ with validity 01.01.2018-31.12.9999 01.01.2018 09:59:17,9289660 Egretli,Osman
(EGRETLI_O700)

any suggestions?


#2

@shreyaank entire text you have shared was in an single cell of datatable or you have share data in multiple cells ?


#3

it was in single data table, the data table contains many more rows n columns but I want to extract the line below Priority-just the date time stamp, also the data table each time generated for various PdF files may vary but this Priority is constant in all data tables, and date time stamp is also constant below Priority word.


#4

Is the timestamp always in the same text position?
If yes try tu use substring, if no try to use REGEX to get the text.


#5

so what would be the syntaxt for that?


#6

Or I can see other option. Always comma is at the end of the timestamp?
if yes, you can find first comma positions and then use sub strung (before reading number of chars in string.


#7

\b(\d{1,2}.){2}\d{4}\s(\d{1,2}:){2}\d{1,2}\b regex pattern for your date format


#8

it works for all date time formats in general, but the issue is that I have many date time formats in the data table, how to filter out the date time format which is right below the word Priority or near that word?


#9

@shreyaank

Can you try this

(?<=Priority\n)((\d{1,2}.){2}\d{4}\s(\d{1,2}:){2}\d{1,2}?)(?=,)

The pattern will fetch the date value in between Priority with newline and comma(,)


#10

Thanks bro . It did work.But in the below scenario I’m trying to fetch the date time below the line Approved by using

(?<=Approved by\n)((\d{1,2}.){2}\d{4}\s(\d{1,2}:){2}\d{1,2}?)(?=,)

but its not showing the date time. Below is the datatable.

Approved by EBOY at Path NEW N CHANGE (U2K2) and
Stage GRAC_ROLEOWNER 09.02.2016,11:12:12,609 Boy,Elçin (EBOY)
Role Z10.TR.ALL.CD-DISPLAY in system ECC_U2K2_01 is

Can you help>


#11

Do you this value?

Is Stage GRAC_ROLEOWNER and Boy,Elçin will be constant?


#12

No Approved by is our key word here, I have to extract date time below the word Approved by, rest all details may differ each time as im extracting the date time on which the item was approved.


#14

@shreyaank Try this

(?<=Approved\sby(.*)\n(.*))(((\d{1,2}.){2}(\d{4}),(\d{1,2}:){2}\d{1,2},\d*)?)(?=\s)


#15

(?<=Approved\sby(.)\n(.))(((\d{1,2}.){2}(\d{4}),(\d{1,2}:){2}\d{1,2},\d*)?)(?=\s) right?

It doesnt work- it says invalid pattern in look behind. im using rubular


#16

@shreyaank

Its shows pattern error in look-behind because of (.) wildcards used in the look behind portion(.)*.

Try to run the same pattern in UiPath or Dot Net regex testers like http://regexstorm.net/tester


#17

Nope. It doesnt work in ui path as well.
Approved by EBOY at Path NEW N CHANGE (U2K2) and
Stage GRAC_ROLEOWNER 09.02.2016,11:12:12,609 Boy,Elçin (EBOY)
Role Z10.TR.ALL.CD-DISPLAY in system ECC_U2K2_01 is

So any idea if we can consider the key words to be Approved by, at path, and stage will it help in extracting the date time?


#18

Hi @shreyaank,

Try this one,

((\d{2}.\d{2}.\d{4},\d{2}.\d{2}.\d{2}).*)

https://regex101.com/


#19

It isnt dynamic. My issue is it should work for any data table containing key words- Approved by, at stage,at path and the date time would be nearer to that.

Approved by EBOY at Path NEW N CHANGE (U2K2) and
Stage GRAC_ROLEOWNER 09.02.2016,11:12:12,609 Boy,Elçin (EBOY)
Role Z10.TR.ALL.CD-DISPLAY in system ECC_U2K2_01 is


#20

Also the code sent by you isn’t extracting the date time format alone, it extracts characters after date time as well


#21

@shreyaank do you want to select the datetime in a string which has all the keywords Approved by, at Path,and,Stage ?