The above text is an example of the text I will be scraping. I have used a Get Full Text activity to select the Div that this text is under. However, the resulting text contains a lot of whitespace and special characters.
The regex I have tried does not return the result I am after.
Using the Replace activity, I tried the following: - Other.*
I want to remove all text after ‘- Other’.
^.*?(\d{4})
I want to remove all text up to the first occurring 4-digit number (i.e. 2008 in the above example).
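To make the two replacements above concrete, here is a minimal sketch in Python rather than in the Replace activity itself. The sample text and the function name are hypothetical, since the real Div content is not shown here. It assumes the scraped text spans several lines, in which case '.' has to be allowed to match line breaks (re.DOTALL in Python; the .NET counterpart is RegexOptions.Singleline). It also uses a lookahead so the 4-digit number itself is kept, which may or may not be what the original pattern was meant to do.

```python
import re

def trim_scraped_text(raw):
    # Drop '- Other' and everything after it; DOTALL lets '.' cross newlines.
    text = re.sub(r"- Other.*", "", raw, flags=re.DOTALL)
    # Drop everything before the first 4-digit number, keeping the number.
    text = re.sub(r"^.*?(?=\d{4})", "", text, count=1, flags=re.DOTALL)
    return text.strip()

# Hypothetical sample standing in for the scraped Div text.
sample = "Heading text\n\n 2008 Articles of stone\n - Tiles\n - Other\n footer"
print(trim_scraped_text(sample))
```

Note that this removes ‘- Other’ along with the text after it, which is what the pattern - Other.* does; if the ‘- Other’ line should stay, the pattern would need to change.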
I then used a Matches activity which has the following regex: (-\s+)[^-]
I used this to split lines containing a dash character (-) into separate lines.
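One possible reason for the unexpected output is that (-\s+)[^-] only matches a dash, the whitespace after it and a single following character, so each match is a fragment rather than a full line. A rough sketch of an alternative, in Python and with a hypothetical sample string and function name, captures each run of dashes together with the text that follows it. It assumes the descriptions themselves contain no hyphens, which will not hold for every entry.

```python
import re

def split_dash_entries(text):
    # Each match is a run of dashes ("- ", "- - ", ...) plus the text after it.
    # Assumption: the description contains no '-' of its own.
    pairs = re.findall(r"((?:-\s+)+)([^-]+)", text)
    # Return (number of dashes, description) for every entry found.
    return [(dashes.count("-"), desc.strip()) for dashes, desc in pairs]

# Hypothetical sample; text before the first dash is not returned.
sample = "2008 Articles of cement - Tiles - - Glazed - - Other"
for depth, description in split_dash_entries(sample):
    print(depth, description)
```

Each item could then be written to its own row in the worksheet.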
I then have an Excel Application Scope and I iterate through each item of the previous output to write it back to the Excel worksheet. However, the resulting text is not what I had expected the regex to return.
This is extremely helpful and another option for me. For example, if I am to search for Goods Code 68101190.
The description may contain more than one line with the same number of ‘-’ characters. In all cases, I will only need to copy the last line for a given number of dashes. For example, in the below screenshot, there are two rows with - - -, and I will only want to include the final one. What would be a way for me to check for this and ensure that I will not get two rows like this?
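One way to check for this, sketched in Python with hypothetical sample lines and a hypothetical function name, is to count the leading dashes on each line and keep a dictionary keyed by that count. A later line with the same count simply overwrites the earlier one, so only the final line per dash level survives.

```python
import re

def last_line_per_depth(lines):
    latest = {}
    for line in lines:
        # Capture the leading run of dashes, e.g. "- - - ".
        match = re.match(r"\s*((?:-\s*)+)", line)
        if not match:
            continue  # lines without leading dashes are ignored here
        depth = match.group(1).count("-")
        latest[depth] = line.strip()  # a later line replaces an earlier one
    # Return one line per dash level, ordered from fewest to most dashes.
    return [latest[depth] for depth in sorted(latest)]

# Hypothetical description lines with two rows at the "- - -" level.
rows = ["- Tiles", "- - Glazed", "- - - Of stone", "- - - Of cement"]
print(last_line_per_depth(rows))
```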
Well, I suppose if your goal is to always extract the line before the Other (that is, always the previous row), you could find the index of the row that contains the Other and subtract 1 from it.
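As a rough illustration of that idea (Python, with hypothetical sample rows and function name), the sketch below looks for the first row whose text is Other once the leading dashes are stripped, and returns the row just before it.

```python
def line_before_other(lines):
    for index, line in enumerate(lines):
        # Strip surrounding whitespace and leading dashes before comparing.
        if line.strip().lstrip("- ").strip() == "Other":
            # Nothing precedes 'Other' when it is the very first row.
            return lines[index - 1] if index > 0 else None
    return None  # no 'Other' row found

# Hypothetical rows as they might come out of the scraped description.
rows = ["- - - Of stone", "- - - Of cement", "- - - Other"]
print(line_before_other(rows))
```

This takes the first Other it finds; if several Other rows can appear, the search would need to be narrowed to the relevant block.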
I have tried 10+ different methods to do the following: I need to search the Nomenclature Excel file for a value stored in another Excel file and then copy all rows back to the original Excel file.
It is explained in the link below; if you have the time, I would appreciate it.
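For what it is worth, the lookup described above can be sketched independently of the Excel activities. The Python below assumes the Nomenclature sheet has already been read into a list of dictionaries; the column names, the sample descriptions and the function name are all hypothetical.

```python
def rows_for_code(nomenclature_rows, goods_code, column="Goods Code"):
    # Compare as trimmed strings so 68101190 and "68101190 " both match.
    wanted = str(goods_code).strip()
    return [
        row for row in nomenclature_rows
        if str(row.get(column, "")).strip() == wanted
    ]

# Hypothetical Nomenclature rows; only the code 68101190 comes from the thread.
nomenclature = [
    {"Goods Code": "68101110", "Description": "- - Blocks and bricks"},
    {"Goods Code": "68101190", "Description": "- - - Other"},
    {"Goods Code": 68101190, "Description": "- - - Other (duplicate row)"},
]
matches = rows_for_code(nomenclature, "68101190")
print(len(matches))
```

The matching rows would then be appended to the original workbook in whatever way the workflow already writes data.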