Dynamic String Manipulation

Hi, I have 2 PDF files.
I used the Read PDF Text and Write Text activities,
but I could not extract the dynamic data. The columns are fixed, but the values change from PDF to PDF.
How can I extract this kind of data?
I have attached a Word document.
String manipulation.docx (12.2 KB)

Hello @Anand_Designer

Use regex to get your dynamic data… :grin:

Hello @Anand_Designer,
Please mention which values you want to extract.
I’ll try to write the code for you :slight_smile:

I want to extract these values: 0.200 M3 and 08338697400.

@Anand_Designer

What is static and what is dynamic in your file?

In my drive there are about 50 PDF files. The column headers are static, like this:

IMPORT CUSTOMS BROKER WEIGHT VOLUME CHARGEABLE PACKAGES

The values change in every PDF, like this:
ABC LTD 57.000 KG 0.200 M3 57.00 KG 1KG

How do I do this with regex, @mz3bel?

So you want the number before the word M3 and, from the next line, the 11-digit number.

If the format is the same for all PDFs, then we can continue with regex.
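As a rough illustration of that idea, here is a minimal Python sketch (the thread itself targets UiPath, so treat this as a stand-in for the same patterns). The second line of the sample text is hypothetical, since the attached document is not shown here; only the values row and the number 08338697400 come from the thread.

```python
import re

# Sample text: the values row from the thread, followed by a
# HYPOTHETICAL next line that ends in the 11-digit number.
text = (
    "ABC LTD 57.000 KG 0.200 M3 57.00 KG 1KG\n"
    "REF / 123-AB / XYZ 08338697400"
)

# A decimal number followed by the unit M3.
volume_match = re.search(r"\d+\.\d+\s*M3", text)

# A standalone run of exactly 11 digits.
number_match = re.search(r"\b\d{11}\b", text)

# Fall back to None when a PDF does not contain the expected value.
volume = volume_match.group(0) if volume_match else None
number = number_match.group(0) if number_match else None

print(volume)  # 0.200 M3
print(number)  # 08338697400
```

Anchoring on the unit (M3) and on the digit count, rather than on character positions, is what lets the same pattern work when the values change from file to file.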

Thanks
@Anand_Designer

I used the Read PDF Text and Write Text activities.

The format is the same in all 50 PDF files. I want to extract 0.200 M3, 57.00 KG, and 1KG.

You want only three values:
0.200 M3
57.0 KG
1 PKG

Send one PDF file also.

Thanks
@Anand_Designer

Yes, then export the string to a DataTable.

@Anand_Designer

Try this :point_down:

@Anand_Designer

You can use the code below to extract specific text from the string.

text.Substring((text.LastIndexOf("Start") + 5), (text.IndexOf("End") - (text.LastIndexOf("Start") + 5)))

To extract 0.200 M3, you can use the code below. It uses IndexOf for both delimiters, because in the sample row LastIndexOf("KG") would find the KG in the last column, which comes after M3 and would make the length negative.

text.Substring((text.IndexOf("KG") + 2), (text.IndexOf("M3") + 2 - (text.IndexOf("KG") + 2))).Trim()

Before you use this, make sure the string delimiters are fixed.
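The delimiter approach can be sketched in Python as follows (an illustration only; the helper name `between` is made up for this example). It searches for the end delimiter only after the start delimiter, which avoids a negative length when a delimiter such as KG occurs more than once in the row.

```python
def between(text, start, end):
    """Return the trimmed text between the first `start` and the next `end`.

    Returns None when either delimiter is missing.
    """
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)            # skip past the start delimiter
    j = text.find(end, i)      # look for the end delimiter after it
    if j == -1:
        return None
    return text[i:j].strip()

# Values row taken from the thread.
row = "ABC LTD 57.000 KG 0.200 M3 57.00 KG 1KG"

volume = between(row, "KG", "M3")
print(f"{volume} M3")  # 0.200 M3
```

This still assumes the delimiters are fixed across all files; if another KG or M3 appears earlier in the full PDF text, the row would need to be isolated first.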

To extract 0.200 M3,
you can use the regex below. The match includes the leading KG (for example, KG 0.200 M3), so strip that prefix from the result:
"KG +\d+[.][\d]+ \w+"

To extract 57.00 KG,
first store the match of this regex in a variable:
"KG +\d+[.][\d]+ \w+ +\d+[.][\d]+ [\w]+"
Then remove the text matched by the first regex (the one used for the 0.200 M3 value) from it, and you will be left with the second value only.
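An alternative to matching and then stripping is to capture each value in its own named group. The Python sketch below is a minimal illustration, assuming every PDF contains a row shaped like the sample in this thread; the group names and the `extract_row` helper are made up for the example.

```python
import re

# One named group per column value; assumes the row layout
# "<weight> KG <volume> M3 <chargeable> KG <packages>".
ROW_PATTERN = re.compile(
    r"(?P<weight>\d+\.\d+\s*KG)\s+"
    r"(?P<volume>\d+\.\d+\s*M3)\s+"
    r"(?P<chargeable>\d+\.\d+\s*KG)\s+"
    r"(?P<packages>\d+\s*\w+)"
)

def extract_row(text):
    """Return a dict of column values, or None if the row is not found."""
    match = ROW_PATTERN.search(text)
    return match.groupdict() if match else None

# In practice there would be one string per PDF; here, just the sample row.
texts = ["ABC LTD 57.000 KG 0.200 M3 57.00 KG 1KG"]

# One dict per PDF, skipping files where the row was not found.
rows = [r for r in (extract_row(t) for t in texts) if r is not None]

print(rows[0]["volume"], rows[0]["chargeable"], rows[0]["packages"])
# 0.200 M3 57.00 KG 1KG
```

Each dictionary maps a column name to its value, so the resulting list is already in a row-per-PDF shape that could be loaded into a table.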

And for 08338697400,
you can use this regex expression:
"\w+ / \d+[-][\w]+ / \w+ +\d+"
Then split the match on spaces into an array and take the last element.

Please try this, and if you still face any challenges, feel free to ask :slight_smile: