I am using the Get Text activity to capture the URL of a website. I need to extract just an ID from the URL and store it as an attribute for future use. How can I pull only the digits after “office_id=” out of this URL? In this example, I need “63137085”, but in other URLs the ID may have more or fewer than 8 digits.
Alternatively, this ID is stored in the code of the page as follows:
INPUT TYPE=“hidden” NAME=“office_id” VALUE=“63137085” so if I can capture the Value here, that would be fine as well.
// HttpUtility lives in the System.Web namespace
Uri exampleUrl = new Uri("https://qa.examplesite.com/admin/lawfirm/lf_office_overview.jsp?firm_id=63137084&office_id=63137085&view=CR");
string office_id = HttpUtility.ParseQueryString(exampleUrl.Query).Get("office_id");
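For anyone who wants to check the query-string approach outside Studio, here is a minimal sketch of the same idea in Python rather than .NET. It assumes the URL is already held in a string, and `get_query_value` is a hypothetical helper name used only for illustration.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def get_query_value(url: str, name: str) -> Optional[str]:
    """Return the first value of query parameter `name`, or None if it is absent."""
    query = urlparse(url).query          # "firm_id=...&office_id=...&view=CR"
    values = parse_qs(query).get(name)   # list of values for that key, or None
    return values[0] if values else None


example_url = (
    "https://qa.examplesite.com/admin/lawfirm/lf_office_overview.jsp"
    "?firm_id=63137084&office_id=63137085&view=CR"
)
office_id = get_query_value(example_url, "office_id")
print(office_id)
```

Parsing the query string this way does not depend on how many digits the ID has or on where the parameter sits in the URL.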
The Get Attribute activity with the Url attribute should give you the URL of the web page.
Once you have the URL in a string variable, you can use regex (the UiPath Matches activity) to pull out just the digits after ‘office_id=’ and store them in a variable.
Thanks Dave - I’ve been able to capture the Url thanks to vvaidya, and I understand what you are suggesting but my head is exploding trying to write an expression to capture just the integers after the “office_id”.
No problem - it really can be quite powerful, but does take a bit to understand. I am definitely new at it, but have found it useful in many scenarios, so add it to your toolkit when you are thinking of string manipulation in the future.
I’ll give you one way of grabbing the info based on what you’ve mentioned. Note that it may not be the best way or the only way, just the first way I could think of:
Regex Pattern: (?<=office_id=)(\d+)
This pattern uses a positive lookbehind. A positive lookbehind requires that the text in the first set of parentheses appears immediately before the match, without making that text part of the match. The second set of parentheses then describes the text that is actually returned.
(?<=office_id=) - this piece acts as a filter of sorts. The parentheses () just contain the expression. The ?<= portion marks the positive lookbehind. office_id= is the string that must immediately precede the string you actually want.
(\d+) - this is the exact string you want. The parentheses () just contain the expression. \d means any digit 0-9. + means match the preceding token (any digit, in this case) 1 or more times, so IDs of any length are covered.
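To see the lookbehind in action outside Studio, the same pattern can be tried in Python, whose `re` module also supports fixed-width lookbehinds. This is only a sketch for experimenting with the pattern. The variable names are made up for the example.

```python
import re

# Same pattern as above: require "office_id=" just before the match,
# then take one or more digits.
OFFICE_ID_PATTERN = re.compile(r"(?<=office_id=)(\d+)")

url = (
    "https://qa.examplesite.com/admin/lawfirm/lf_office_overview.jsp"
    "?firm_id=63137084&office_id=63137085&view=CR"
)

match = OFFICE_ID_PATTERN.search(url)
# group(0) is the whole match; the lookbehind text is not included in it.
matched_text = match.group(0) if match else None
print(matched_text)
```

Note that the digits after firm_id= are skipped, because they are not preceded by office_id=.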
In Studio, the Matches activity outputs the result as a collection of all matches (type IEnumerable<Match>). To get your office ID into a string, you’ll need one more Assign activity. Save the Matches output to a variable, which I’ll call RegexMatch here, and then use an Assign activity as follows:
Assign OfficeID (type string) → RegexMatch(0).Value
You have to indicate the match number within the parentheses. You want the first match in this case, which is why you put in the (0), as the first index is 0. The .Value on the end turns the match object into its matched text.
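The "take the first match" step can be sketched in Python as well. It is a rough analogue of the Studio steps, not UiPath code: `first_office_id` is a hypothetical helper, and `group(0)` plays the role that a match object's matched text plays in .NET. The sketch also guards against a URL that contains no office_id at all, where indexing the first match would otherwise fail.

```python
import re
from typing import Optional


def first_office_id(url: str) -> Optional[str]:
    """Return the digits after 'office_id=' from the first match, or None."""
    # Collect every match, like the list of matches the activity returns.
    matches = list(re.finditer(r"(?<=office_id=)\d+", url))
    if not matches:
        return None                 # no office_id in this URL
    return matches[0].group(0)      # index 0 is the first match


print(first_office_id("https://qa.examplesite.com/page.jsp?office_id=63137085&view=CR"))
print(first_office_id("https://qa.examplesite.com/page.jsp?office_id=123"))
print(first_office_id("https://qa.examplesite.com/page.jsp?view=CR"))
```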