Data extraction based on "starts with" in a text file

Hi,
I want to extract data from a text file in which all the records are merged onto one line, each beginning with a serial number (highlighted in yellow in the attachment). I want to split the data so that each record starts on a new line.
Can anyone help?

The text data is below:
0001 sdsafs kg 15.36 0002 sfrasf no 36.45 0003 dgdgskg 98.55 0004 fsafas ds 466.54 0005 jkgfvhjh no 98.36 0006 sdsafs kg 15.36 0007 sfrasf no 36.45 0008 dgdgskg 98.55 0009 fsafas ds 466.54 0010 jkgfvhjh no 98.36 0011 sdsafs kg 15.36 0012 sfrasf no 36.45 0013 dgdgskg 98.55 0014 fsafas ds 466.54 0010 jkgfvhjh no 98.36

Hi,

Check this XAML. I have split the string on the serial numbers 0001 to 0010.

RegexSplit.xaml (6.2 KB)

@sarathi125: this does not work for my input. This is my input:

0001 2652919 Artemetherand Lumefantrine Tablets [20mg/120mg] no 100,000.000 0002 1089725 PRINTED BLISTER FOIL-[208/0.025mm][COMBIART 20/120mg][18s & 24s][WHO] [KRSG][ENG][REVD. FOR SSL][1669] kg 2.659 0003 1034123 PRINTED BLISTER FOIL-[216/0.025mm][CAM][COMBIART 20/120mg ][18s & 24s] [WHO][KRSG][ENG][REVD. FOR SSL][1670] kg 2.597 0004 1034124 PRINTED BLISTER FOIL-[216/0.025mm][COMBIART 20/120mg][18s & 24s][WHO, WITH ACTM LOGO][CAM CHANGE PART][KRSG][ENG][REVD. FOR SSL][1668] kg 2.688 0005 1024201 COATED PVC-[PVC/PE/PVDC][212MM][CLEAR][0.330mm][250Mic PVC] [25Mic.PE][90Gsm PVDC][Total:453GSM] kg 16.726 0006 1022441 COATED PVC-[PVC/PE/PVDC][216MM][CLEAR][0.330mm][250Mic PVC] [25Mic.PE][90Gsm PVDC][Total:453GSM] kg 17.042 0007 1031929 LITERATURE-[COMBIART][20/120mg][GLOBAL][SSL, KRSG][ENG][REVD FOR SSL] [F:6763; B:6764] no 4,166.667 0008 1037896 PRINTED CARTON-[74x15x106mm][COMBIART][20/120mg][1x24s][WHO-NIGERIA] [KRSG][ENG][SCRATCH INFORMATION][REVD FOR SCRATCH & for SSL][4393] no 4,166.667 0009 1031234 PRINTED CARTON-[230x165x112mm][COMBIART][20/120mg][30x1x24s][OUTER CARTON][WHO][KRSG][ENG][Revd. For SSL][4639] no 138.889 0010 1234567 SHIPPER-[5PLY][510x470x350MM-ID][180/150/150/150/180GSM EACH] [B.S. NLT 14 KG/CM.SQ] no 7.716 0011 1005510 RIBBON-102MM/360 METERS/8.5 MICRONS-IT1122 SCRATCH RESISTANT WAX RESIN THERMAL TRANSFER RIBBONS no 0.006 0012 1026664 SHIPPER LABEL -[100x150mm][PLAIN YELLOW COLOURED][70 GSM AVERY DENNISON CHROMO PAPER][16-22 GSM ADHESIVE][55 GSM RELEASE PAPER][FASCOAT SUPERIOR][PRODUCT IDENTIFICATIONNUMBER: TR-1377] no 15.432 0013 1987456 BOPP TAPE-[72mm][TRANSPARENT][SSLLOGO PRINTED][30 MICRON] no 0.152\f"

Replace the regex pattern with this one and try again: "(?=00+[1-9])0\d{3}".

No, it's not working.

Hi @Shriharsha_H_N,
Would you like to include the serial numbers in your output, like this:
001 fsdfsdf sdfsd fd
002 sdfsdf fdsfsdf fdsf
003 fsdfdsf dsfdsf dfsdf

Or do you want to remove the serial numbers and display only the data, like this:
dasdad kg 12.5
adsadsad ds 35.22
dasdas ds sf 12.2

Also, what would your last serial number be? Less than 100, less than 1000, or more than that?

It is working for me, and I received the output below. If possible, please share your XAML.

This is my expression with the regex pattern:
System.Text.RegularExpressions.Regex.Split(strInput, "(?=00+[1-9])0\d{3}")

10/04/2019 17:04:56 => [Debug] Starting RegexSplit execution.
10/04/2019 17:04:57 => [Info] PathCombineCheck execution started
10/04/2019 17:04:58 => [Info] 15
10/04/2019 17:04:58 => [Info] 1089725 PRINTED BLISTER FOIL-[208/0.025mm][COMBIART 20/120mg][18s & 24s][WHO] [KRSG][ENG][REVD. FOR SSL][1669] kg 2.659
10/04/2019 17:04:58 => [Info] 2652919 Artemetherand Lumefantrine Tablets [20mg/120mg] no 100,000.000
10/04/2019 17:04:58 => [Info] 1034124 PRINTED BLISTER FOIL-[216/0.025mm][COMBIART 20/120mg][18s & 24s][WHO, WITH ACTM LOGO][CAM CHANGE PART][KRSG][ENG][REVD. FOR SSL][1668] kg 2.688
10/04/2019 17:04:58 => [Info] 1024201 COATED PVC-[PVC/PE/PVDC][212MM][CLEAR][0.330mm][250Mic PVC] [25Mic.PE][90Gsm PVDC][Total:453GSM] kg 16.726
10/04/2019 17:04:58 => [Info] 1022441 COATED PVC-[PVC/PE/PVDC][216MM][CLEAR][0.330mm][250Mic PVC] [25Mic.PE][90Gsm PVDC][Total:453GSM] kg 17.042
10/04/2019 17:04:58 => [Info] 1034123 PRINTED BLISTER FOIL-[216/0.025mm][CAM][COMBIART 20/120mg ][18s & 24s] [WHO][KRSG][ENG][REVD. FOR SSL][1670] kg 2.597
10/04/2019 17:04:58 => [Info] 1031929 LITERATURE-[COMBIART][20/120mg][GLOBAL][SSL, KRSG][ENG][REVD FOR SSL] [F:6763; B:6764] no 4,166.667
10/04/2019 17:04:58 => [Info] 1037896 PRINTED CARTON-[74x15x106mm][COMBIART][20/120mg][1x24s][WHO-NIGERIA] [KRSG][ENG][SCRATCH INFORMATION][REVD FOR SCRATCH & for SSL][4393] no 4,166.667
10/04/2019 17:04:58 => [Info] 1031234 PRINTED CARTON-[230x165x112mm][COMBIART][20/120mg][30x1x24s][OUTER CARTON][WHO][KRSG][ENG][Revd. For SSL][4639] no 138.889
10/04/2019 17:04:58 => [Info] 1234567 SHIPPER-[5PLY][510x470x350MM-ID][180/150/150/150/180GSM EACH] [B.S. NLT 14 KG/CM.SQ] no 7.716
10/04/2019 17:04:58 => [Info] 1
10/04/2019 17:04:58 => [Info] 10 RIBBON-102MM/360 METERS/8.5 MICRONS-IT1122 SCRATCH RESISTANT WAX RESIN THERMAL TRANSFER RIBBONS no 0.006
10/04/2019 17:04:58 => [Info] 1026664 SHIPPER LABEL -[100x150mm][PLAIN YELLOW COLOURED][70 GSM AVERY DENNISON CHROMO PAPER][16-22 GSM ADHESIVE][55 GSM RELEASE PAPER][FASCOAT SUPERIOR][PRODUCT IDENTIFICATIONNUMBER: TR-1377] no 15.432
10/04/2019 17:04:58 => [Info] 1987456 BOPP TAPE-[72mm][TRANSPARENT][SSLLOGO PRINTED][30 MICRON] no 0.152\f
10/04/2019 17:04:58 => [Info] PathCombineCheck execution ended in: 00:00:00
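The stray "1" and "10 RIBBON..." lines in the log above look consistent with the pattern also matching "0055" inside the item number 1005510, which would cut that record in two. A minimal Python sketch of that behaviour, assuming Python's `re` treats the lookahead and `\d` the same way .NET does for this pattern (the shortened `sample` string is made up for illustration, not the real input):

```python
import re

# Pattern copied from the expression above, with straight quotes.
PATTERN = r"(?=00+[1-9])0\d{3}"

# Hypothetical, shortened sample built from two records of the posted input.
sample = "0010 1234567 SHIPPER no 7.716 0011 1005510 RIBBON no 0.006"

# Split on every match, then drop surrounding whitespace and empty pieces.
parts = [p.strip() for p in re.split(PATTERN, sample)]
parts = [p for p in parts if p]

for p in parts:
    print(p)
```

In this sketch the pattern has nothing that ties a match to the start of a token, so a run such as "0055" in the middle of a longer number is treated as a serial number as well.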

@nimin I don't need the serial numbers, but it is not a problem if they are included.

Hi @Shriharsha_H_N,

If you want to include the serial numbers, assign:
New_String_Var = System.Text.RegularExpressions.Regex.Replace(strInput, "(?=\b0\d{3}\b)", Environment.NewLine)

If you want to exclude the serial numbers, assign:
New_String_Var = System.Text.RegularExpressions.Regex.Replace(strInput, "(\b0\d{3}\b)", Environment.NewLine)

Please verify the regex from here.
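For anyone who wants to check the two patterns outside UiPath, here is a rough Python sketch of the same idea. It assumes Python 3.7 or later (where `re.split` accepts zero-width matches) and that `\b` and the lookahead behave as they do in .NET for these patterns. The function name `split_records` and the shortened sample are made up for illustration.

```python
import re

def split_records(text, keep_serial=True):
    """Split merged records on standalone 4-digit serials that start with 0."""
    if keep_serial:
        # Zero-width lookahead: cut just before each serial, keeping it.
        pieces = re.split(r"(?=\b0\d{3}\b)", text)
    else:
        # Consume the serial itself so it is dropped from the output.
        pieces = re.split(r"\b0\d{3}\b", text)
    # Trim whitespace and discard empty pieces (e.g. before the first serial).
    return [p.strip() for p in pieces if p.strip()]

# Hypothetical, shortened sample based on the inputs posted in this thread.
sample = "0001 sdsafs kg 15.36 0002 sfrasf no 36.45 0011 1005510 RIBBON no 0.006"

with_serial = split_records(sample, keep_serial=True)
without_serial = split_records(sample, keep_serial=False)
print("\n".join(with_serial))
```

The word boundaries keep digits inside a longer number such as 1005510 from being treated as a serial. Two limits of this sketch: serial numbers of 1000 and above would not match, and any standalone four-digit token starting with 0 inside a description would also trigger a split.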

Warm regards,
Nimin

Thanks a lot, it's working.
