Extracting text between two Str Delimiters

MarkC1500 · February 5, 2019, 8:55am

Hi.

I have 20,000 PDFs that I need to extract data from. Unfortunately the UI elements are all over the place, so I can't just scrape the data, which is very annoying.

I have used “Read PDF Text” and then removed all unwanted characters (as there was Arabic in the PDFs).

I now have a String with the full text from the PDF.

I need to extract the text in the string that lies between the words “SECOND PARTY THE BUYERS NameEnglish” & “Name Arabic”.

The data I need is consistently between these two strings.

Thank you

reda · February 5, 2019, 9:04am

Hi @MarkC1500

If your data always sits between those two strings, you can create an array of strings and assign it the result of a Split, like the expression below:

yourArrayOfStrings = yourPdfText.Split({"SECOND PARTY THE BUYERS NameEnglish","Name Arabic"},StringSplitOptions.None)

Now yourArrayOfStrings will have 3 elements (assuming each delimiter occurs only once in the text). The one you are interested in is the second one, meaning: yourArrayOfStrings(1)

Regards,
Reda

c.ciprian · February 5, 2019, 9:13am

strTest="Name english some text Name arabic"
strExtract=strTest.Substring(strTest.IndexOf("Name english")+12,strTest.IndexOf("Name arabic")-strTest.IndexOf("Name english")-12)

strExtract → “some text”
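
A slightly more defensive variant of the same Substring idea, as a sketch only: it searches for the end delimiter starting after the start delimiter and returns an empty string when either delimiter is missing. The plain version throws an ArgumentOutOfRangeException (negative length) when the end delimiter also appears earlier in the text. The function name ExtractBetween and the sample string are made up for illustration.

```vbnet
Imports System

Module SubstringBetween
    ' Returns the text between startTag and the first endTag that follows it,
    ' or an empty string when either delimiter is missing.
    Function ExtractBetween(text As String, startTag As String, endTag As String) As String
        Dim startIdx As Integer = text.IndexOf(startTag, StringComparison.Ordinal)
        If startIdx < 0 Then Return String.Empty
        startIdx += startTag.Length

        ' Look for the end delimiter only after the start delimiter,
        ' so an earlier "Name arabic" cannot produce a negative length.
        Dim endIdx As Integer = text.IndexOf(endTag, startIdx, StringComparison.Ordinal)
        If endIdx < 0 Then Return String.Empty

        Return text.Substring(startIdx, endIdx - startIdx).Trim()
    End Function

    Sub Main()
        ' Hypothetical sample: the end delimiter also occurs before the start delimiter.
        Dim strTest As String = "Name arabic header Name english some text Name arabic"
        Console.WriteLine(ExtractBetween(strTest, "Name english", "Name arabic"))
    End Sub
End Module
```

In an Assign activity the body of ExtractBetween would need to be split into a few assignments (start index, end index, result), with an If activity handling the missing-delimiter case.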

MarkC1500 · February 5, 2019, 9:39am

Hi @c.ciprian @reda

I get this error with the second option:

Source: Assign

Message: Length cannot be less than zero.
Parameter name: length

Exception Type: System.ArgumentOutOfRangeException

The first option splits after the first occurrence of “Name Arabic” and finishes at the second occurrence of “Name Arabic” in my pdftxt string.

Your guidance would be appreciated.

Thanks again

reda · February 5, 2019, 9:48am

Is the combination
“SECOND PARTY THE BUYERS NameEnglish” & “Name Arabic”
unique around the string that you want to extract?

If it is, you may want to use RegEx in this case.
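
For reference, a minimal sketch of what a RegEx version could look like, assuming the start delimiter occurs once and the wanted text ends at the next “Name Arabic” after it. The sample text and variable names are hypothetical; the real input would be the output of Read PDF Text.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexBetween
    Sub Main()
        ' Hypothetical sample text with an extra "Name Arabic" before the start delimiter.
        Dim pdfText As String = "Name Arabic x SECOND PARTY THE BUYERS NameEnglish ABC XYZ Name Arabic y"

        Dim startTag As String = "SECOND PARTY THE BUYERS NameEnglish"
        Dim endTag As String = "Name Arabic"

        ' Lookbehind for the start delimiter, lazy match, lookahead for the end delimiter.
        ' Regex.Escape keeps any special characters in the delimiters literal.
        Dim pattern As String = "(?<=" & Regex.Escape(startTag) & ").*?(?=" & Regex.Escape(endTag) & ")"

        ' Singleline lets "." match line breaks, in case the value wraps across lines.
        Dim m As Match = Regex.Match(pdfText, pattern, RegexOptions.Singleline)

        If m.Success Then
            Console.WriteLine(m.Value.Trim())
        Else
            Console.WriteLine("No match")
        End If
    End Sub
End Module
```

Because the pattern is anchored on the start delimiter, an earlier “Name Arabic” in the text does not affect the result.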

Mudita123 · February 5, 2019, 9:52am

Hey
I have generated a code by filling in a form on the web and I want to send the code to an Excel file. The code is in a popup window. Can anyone help me?

PAD · February 5, 2019, 9:55am

Hi @MarkC1500,
How about the solution below, where the whole input string was “SECOND PARTY THE BUYERS NameEnglish ABC XYZ Name Arabic” and the result I got was “ABC XYZ”?

It is the solution adapted from the one proposed here:

PAD · February 5, 2019, 9:58am

Please find the workflow in here:
string in between.xaml (7.5 KB)

Mudita123 · February 5, 2019, 10:02am

Hi,
I am generating a code from a registration page. When I fill in all the data in the form, I get a popup with the code. Can anyone tell me how to get that code into an Excel file?
Please help me as soon as possible.

reda · February 5, 2019, 10:11am

Hi @Mudita123

Have you created a separate thread for this?

Mudita123 · February 5, 2019, 10:17am

No, how do I create one?

PAD · February 5, 2019, 10:27am

Hi @Mudita123,
Just go to the right category (most likely “Rookies”) and create a post with “New Topic” - make sure it has a good title and you provide enough info to help you asap

Mudita123 · February 5, 2019, 10:39am

I have created it.
Thanks

PAD · February 5, 2019, 10:40am

no problem, @Mudita123

Mudita123 · February 5, 2019, 10:41am

Can you go through the topic named “Extracting data from web into excel”? Please help me out!!

MarkC1500 · February 5, 2019, 11:59am

Been Hijacked a Little bit

Thank you for your help guys.

@PAD I used your solution in the end. I needed to do some tweaking and add exceptions for files with no data etc. That’s the “Name” extracted…

I have set up another workflow to extract the email address. This is all fine.

But now I have one For Each in workflow 1 for the name and one For Each in workflow 2 for the email. How and when do I append all this data to the spreadsheet? Do I just need to nest the For Each loops inside each other?

Thank you

PAD · February 5, 2019, 1:10pm

@MarkC1500
I would add the values to your Excel file separately: first within the Name scope, and then within the Address scope.

MarkC1500 · February 5, 2019, 1:23pm

Sorry, but how is this done?

I’m also wary of this in case it doesn’t match the email to the correct name, for example if no data is extracted from a file. I’m not sure.

MarkC1500 · February 5, 2019, 1:25pm

Also, if I’m doing one after the other with append and the process fails halfway through, then I’ll have names and no emails. If I start again, it will do the whole thing again but start putting emails in where it left off.

If that all makes sense
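
One way to keep each name matched to its email, sketched under assumptions: process each PDF once, extract both values in the same iteration, and add a single row per file to a DataTable that is written to the spreadsheet afterwards. The sample texts, column names, and patterns below are hypothetical. In a workflow this would roughly correspond to Build Data Table, Add Data Row inside one For Each, and a single Write Range at the end.

```vbnet
Imports System
Imports System.Data
Imports System.Text.RegularExpressions

Module OneRowPerPdf
    Sub Main()
        ' Hypothetical stand-ins for the text read from each PDF.
        Dim pdfTexts As String() = {
            "SECOND PARTY THE BUYERS NameEnglish ABC XYZ Name Arabic abc@example.com",
            "a file with no usable data"
        }

        Dim results As New DataTable("Results")
        results.Columns.Add("Name", GetType(String))
        results.Columns.Add("Email", GetType(String))

        For Each pdfText As String In pdfTexts
            ' Extract both values in the same iteration so they land on the same row.
            Dim nameMatch As Match = Regex.Match(pdfText, "(?<=NameEnglish).*?(?=Name Arabic)", RegexOptions.Singleline)
            Dim emailMatch As Match = Regex.Match(pdfText, "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

            ' Match.Value is an empty string when nothing was found,
            ' so a file with no data still gets its own row with blank cells.
            results.Rows.Add(nameMatch.Value.Trim(), emailMatch.Value)
        Next

        For Each row As DataRow In results.Rows
            Console.WriteLine("{0} | {1}", row("Name"), row("Email"))
        Next
    End Sub
End Module
```

Since every file produces exactly one row, a missing name or email shows up as a blank cell instead of shifting later values onto the wrong row.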

MarkC1500 · February 5, 2019, 1:55pm

I will do it in one sequence for now until I have sorted out all the other issues.

The next problem is that the email address I’m after is not consistently surrounded by the same text!
I’m not sure extracting the text surrounding “@” will work either, because it could be the 1st, 2nd or 3rd @ symbol that I’m after.

I’m not sure there is a workaround for inconsistent data?
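
As a starting point, here is a sketch that lists every email-like string in the text together with its position, so it becomes visible which occurrence is the wanted one in each document. The sample text is hypothetical and the pattern is a simplified one that will not cover every valid address.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FindEmails
    Sub Main()
        ' Hypothetical sample text containing several addresses.
        Dim pdfText As String = "Seller: first@example.com Buyer contact second@example.org Agent third@example.net"

        ' Simplified email pattern: local part, "@", domain, dot, top-level domain.
        Dim emailPattern As String = "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
        Dim found As MatchCollection = Regex.Matches(pdfText, emailPattern)

        For i As Integer = 0 To found.Count - 1
            ' Match.Index is the character position of the address in the text.
            Console.WriteLine("{0}: {1} (at position {2})", i, found(i).Value, found(i).Index)
        Next
    End Sub
End Module
```

If the wanted address always follows some recognisable label, even a loose one, the position can be compared with the position of that label to pick the nearest match. If there is no such pattern at all, the data may need a manual check.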
