Extract URL from webpage elements

cgoldstein · May 15, 2024, 6:21pm

I need to get a URL from a webpage, but since it’s not simple HTML, I’m not sure how to extract the information. Right now I’m using a Get Text activity and it’s getting the anchor using CV (computer vision). It’s also using CV to scrape the text, but it’s making mistakes with zeros and ohs, 5 and S, things like that. UI Explorer can’t find exactly what I’m looking for since it’s in an IFrame.

If I’m on the webpage and click Inspect on the text/URL I need, I see this:

Are there any activities that I can use to retrieve the URL after the “Direct link to download:” string?

ashokkarale · May 15, 2024, 6:30pm

@cgoldstein,

You can extract the full text of the <p class="ok"> into a string.
Split the string on the : character.
The second item of the returned array will be your link.

Thanks,
Ashok
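
One caveat with the split suggestion above: the URL itself contains a colon (in "https:"), so an unlimited split on ":" would cut the link into pieces. A minimal sketch of the idea, written in Python purely for illustration (UiPath expressions are VB.NET or C#, and the sample text is an assumption about what the extraction returns), splits only at the first colon:

```python
# Assumed sample of the text extracted from the <p class="ok"> element.
text = (
    "Direct link to download:\n"
    "https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1"
)

# partition() splits at the FIRST colon only, so the colon inside
# "https://" stays part of the link.
label, _, link = text.partition(":")
link = link.strip()  # drop the line break and any padding around the URL

print(label)
print(link)
```

The same effect should be achievable in a workflow by limiting the split to two parts rather than splitting on every colon.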

cgoldstein · May 16, 2024, 1:38pm

Thanks @ashokkarale, can you help me a little more?
I used a Get Attribute activity. I selected the target as the part of the page that will have the URL. I set the Attribute to "class=""ok""" (using two double-quotes around ok to escape them), and saved the result to my DownloadLink variable.

During runtime, I get this error: Get Attribute ‘msedge.exe Q2’: Attribute not supported by the current UiNode.

Baskar_Gurumoorthy · May 16, 2024, 1:47pm

Hi @cgoldstein

Please refer to the thread below.

First of all, update UiPath.UIAutomation.Activities to the latest version.
Also, please try the Find Children or Find Element activity.

ashokkarale · May 16, 2024, 2:40pm

@cgoldstein,

How did you build the selector? Can you show it, please?

Thanks,
Ashok

cgoldstein · May 16, 2024, 3:14pm

Thank you for helping me!
I’m updating UiPath.UIAutomation.Activities to 23.10 now, but I doubt that’s the issue. This is just above my level of development. The webpage isn’t simple HTML; it’s built in the vendor’s package and seems to be in an IFrame. UI Explorer can’t find the text/element. I’m not sure how to find the selectors/elements/children.

If I right-click and view frame source, I see the HTML and the URL I need to get:

<div class="details">
 <p class="ok">
  Report has been attached to Salesforce Case xxxxx
 </p>
 <!--<p class="ok"><a href="https://customer.yyyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1" target="_blank">Direct link to download: https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1</a></p>-->
 <p class="ok">
  Direct link to download:
  <br/>
  https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1
 </p>
</div>

I just don’t know what activities to use to get the URL after the Direct link to download. Once I have a whole string, I can figure out how to parse it and just get the URL.
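
For reference, if the frame source can be obtained as a string, the text of the `<p class="ok">` elements can be pulled out with an HTML parser. The following is only a sketch in Python using the standard library, not a UiPath activity; the class name `OkParagraphText` and the variable `frame_source` are hypothetical, and the markup is copied from the frame source above.

```python
from html.parser import HTMLParser


class OkParagraphText(HTMLParser):
    """Collects the text of every <p class="ok"> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_ok = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p" and dict(attrs).get("class") == "ok":
            self.in_ok = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_ok = False

    def handle_data(self, data):
        # Text on both sides of the <br/> is appended to the same paragraph.
        if self.in_ok:
            self.paragraphs[-1] += data


# Markup from the post; the commented-out link is skipped by the parser.
frame_source = """
<div class="details">
 <p class="ok">
  Report has been attached to Salesforce Case xxxxx
 </p>
 <!--<p class="ok"><a href="https://customer.yyyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1" target="_blank">Direct link to download: https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1</a></p>-->
 <p class="ok">
  Direct link to download:
  <br/>
  https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1
 </p>
</div>
"""

parser = OkParagraphText()
parser.feed(frame_source)

# Collapse line breaks and indentation into single spaces.
texts = [" ".join(p.split()) for p in parser.paragraphs]
for t in texts:
    print(t)
```

Note that the anchor tag exists only inside an HTML comment, so the live page has no link element to target; the URL is plain text inside the second paragraph.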

postwick · May 16, 2024, 4:50pm

Why are you using CV? If you’re getting selectors, you don’t need CV. CV is for image-based automation (e.g., Citrix).

You should be able to just indicate the link (which will be tag=‘A’) as the UI element for Get Text.

cgoldstein · May 16, 2024, 6:13pm

@postwick, when I target the element in the Get Text activity, I get a message “Could not detect any text elements using native scraping.” It automatically switches to CV.

postwick · May 16, 2024, 6:43pm

Sounds like it’s not able to identify it as an “A” tag. You’ll have to just Get Text on its parent element (tag=‘P’ class=‘ok’) and then use RegEx to get the URL out of the resulting text.

cgoldstein · May 16, 2024, 8:35pm

That’s pretty much what I just did. I’m grabbing more text than I need and using a RegEx to extract the URL.
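
The exact pattern used isn’t shown in the thread. As a rough sketch of this kind of extraction, here is one possible regular expression, written in Python for illustration; the sample text in `scraped` and the pattern itself are assumptions, not the poster’s actual values.

```python
import re

# Assumed sample of the over-captured text returned by Get Text.
scraped = (
    "Report has been attached to Salesforce Case xxxxx\n"
    "Direct link to download:\n"
    "https://customer.yyyy.com/servlet/servlet.FileDownload?file=012345g6HHRR1\n"
)

# Anchor on the label, skip any whitespace or line breaks after it,
# then capture everything up to the next whitespace as the URL.
pattern = r"Direct link to download:\s*(https?://\S+)"

match = re.search(pattern, scraped)
download_link = match.group(1) if match else None

print(download_link)
```

Anchoring on the “Direct link to download:” label avoids picking up any other URL that might appear in the surrounding text. The constructs used here (`\s`, `\S`, a capture group) are also available in .NET regular expressions, so a similar pattern should carry over to a UiPath workflow.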
