Regex Based Extractor

bertc · January 2, 2020, 2:13pm

Hey guys :),
I'm facing a problem, and it's my first time working with regex. Can you give me some tips?

I use the regex based extractor and have the following text:

HK - Allgemeinanteil 359,25 33,00 Ant. 10,886364 1,00 Ant. 10,89 20% 2,18
HK - Energieabgabe 264,27 2.143,02 m² 0,123317 93,52 m² 11,53 20% 2,31
HK - Gebrauchsabgabe 28,11 2.143,02 m² 0,013117 93,52 m² 1,23 20% 0,25
HK - Grundpreis 6.083,69 2.143,02 m² 2,838840 93,52 m² 265,49 20% 53,10
HK - Messpreis 530,58 2.143,02 m² 0,247585 93,52 m² 23,15 20% 4,63

This is OCR data from a PDF.

This data is written into a txt file from where it is to be processed further in SAP.

Now my problem: I only need two columns.

The HK string column (e.g. "HK - Allgemeinanteil") and the column with the values 10,89 / 11,53 / 1,23 / 265,49 / 23,15.

I need these 2 columns for further processing.

The values can have a maximum of 4 digits before the decimal comma and 2 decimal places.

Can you please help me?

Best Regards Chris

mc00476004 · January 2, 2020, 2:16pm

Can you share the pdf with me?

Yoichi · January 2, 2020, 2:20pm

Hi,

Can you try the following expressions?

System.Text.RegularExpressions.Regex.Matches(strData,"HK - .*?(?=\s)")

System.Text.RegularExpressions.Regex.Matches(strData,"[\d,]+(?=\s\d+%)")

I've attached a sample below for your reference.

Sample20200102b.zip (13.9 KB)
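
For a quick check of the two patterns above outside of Studio, here is a minimal sketch in Python (using Python's `re` module instead of the .NET `Regex.Matches` call, which is an assumption made only so the example can be run standalone). The variable names are illustrative, and the sample text is copied from the question.

```python
import re

# Sample OCR text copied from the original question.
str_data = """HK - Allgemeinanteil 359,25 33,00 Ant. 10,886364 1,00 Ant. 10,89 20% 2,18
HK - Energieabgabe 264,27 2.143,02 m² 0,123317 93,52 m² 11,53 20% 2,31
HK - Gebrauchsabgabe 28,11 2.143,02 m² 0,013117 93,52 m² 1,23 20% 0,25
HK - Grundpreis 6.083,69 2.143,02 m² 2,838840 93,52 m² 265,49 20% 53,10
HK - Messpreis 530,58 2.143,02 m² 0,247585 93,52 m² 23,15 20% 4,63"""

# "HK - " followed by as few characters as possible, up to the next whitespace.
labels = re.findall(r"HK - .*?(?=\s)", str_data)

# A run of digits and commas that is directly followed by whitespace
# and a percentage such as "20%" (the lookahead is not part of the match).
amounts = re.findall(r"[\d,]+(?=\s\d+%)", str_data)

for label, amount in zip(labels, amounts):
    print(label, amount)
```

Both patterns rely on the percentage column always following the wanted amount, so a row with a missing or misread "20%" would shift the pairing between the two lists.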

Regards,

kadiravan_kalidoss · January 2, 2020, 2:26pm

Hi @bertc,

Split the whole string output on newlines and iterate over the lines.

For each line, use the following regex patterns to identify the column 1 and column 2 values:

Regex pattern for HK String: "HK - \w+"
Regex pattern for Column 2: "\d+,\d+(?= \d+%)"
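
The split-and-iterate approach could look roughly like the following Python sketch. It is not the UiPath workflow itself. The function name `extract_rows` and the variable names are hypothetical, and the sample text is copied from the question.

```python
import re

# The two patterns suggested above, compiled once.
LABEL = re.compile(r"HK - \w+")
AMOUNT = re.compile(r"\d+,\d+(?= \d+%)")


def extract_rows(text):
    """Return (label, amount) pairs, one per line that contains both values."""
    rows = []
    for line in text.splitlines():
        label = LABEL.search(line)
        amount = AMOUNT.search(line)
        # Skip lines where either value is missing instead of mis-pairing them.
        if label and amount:
            rows.append((label.group(), amount.group()))
    return rows


sample = """HK - Allgemeinanteil 359,25 33,00 Ant. 10,886364 1,00 Ant. 10,89 20% 2,18
HK - Energieabgabe 264,27 2.143,02 m² 0,123317 93,52 m² 11,53 20% 2,31
HK - Gebrauchsabgabe 28,11 2.143,02 m² 0,013117 93,52 m² 1,23 20% 0,25
HK - Grundpreis 6.083,69 2.143,02 m² 2,838840 93,52 m² 265,49 20% 53,10
HK - Messpreis 530,58 2.143,02 m² 0,247585 93,52 m² 23,15 20% 4,63"""

rows = extract_rows(sample)
for label, amount in rows:
    print(f"{label}\t{amount}")
```

Matching line by line keeps each label tied to the amount from the same row, so a line that the OCR garbled is dropped rather than shifting every later pair.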

Thanks!

bertc · January 3, 2020, 7:32am

Thank you it works! <3

Best Regards Chris

