Extract Email Address from URL

Hi everyone, I’m new to UiPath (1 week) and currently using the free web version.

I have a list of 1000+ websites in a Google Sheet column. I would like to use UiPath to open each row’s website URL, extract the Contact Email, Company Logo, and LinkedIn URL, and write them back into the 2nd/3rd/4th columns of the same row in Google Sheets.

How can I do this?

Thank you.

@thrashtalker

Welcome to the community

  1. Are all the websites identical?
  2. If not, how different are they?
  3. Open UI Explorer and a website, then identify the data you need by indicating that element. Repeat for another website and see whether the selectors have anything in common, such as the same tag.
  4. If they are similar, then we can extract from all of them.

Cheers

You can approach this task this way:

1. Set up UiPath:

  • Ensure you have UiPath Studio installed on your machine.

2. Read Data from Google Sheets:

  • Use the “Read Range” activity in UiPath to read the list of website URLs from the first column in your Google Sheet. Store this data in a DataTable.

3. Loop Through the URLs:

  • Use a “For Each Row” activity in UiPath to iterate through each row of the DataTable, where each row contains a website URL.

4. Web Scraping:

  • For each website URL, use web scraping techniques to extract the Contact Email, Company Logo, and LinkedIn URL. You might need to create separate sequences for each website’s scraping, depending on the structure of the websites.
  • You can use the “Data Scraping” wizard in UiPath to extract structured data from websites. Ensure you handle any possible variations in website structures.

5. Write Data to Google Sheets:

  • Use the “Write Cell” activity in UiPath to write the extracted information (Contact Email, Company Logo, LinkedIn URL) into the respective rows of the Google Sheet in the 2nd, 3rd, and 4th columns.
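To make the read, loop, scrape, and write steps above concrete, here is a minimal sketch in Python rather than UiPath activities. It is only an illustration under stated assumptions: `extract_fields`, `fill_rows`, and the three regex patterns are hypothetical names invented for this example, the `fetch` callable stands in for whatever opens the page, and the returned rows stand in for the sheet's four columns. Real sites may hide emails or name their logo files differently, so treat the patterns as a starting point.

```python
import re

# Hypothetical patterns; real sites may obfuscate or omit these values.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LINKEDIN_RE = re.compile(
    r"https?://(?:[a-z]+\.)?linkedin\.com/[^\s\"'<>]+", re.IGNORECASE
)
# Assumption: the logo is an <img> whose src contains the word "logo".
LOGO_RE = re.compile(
    r"<img[^>]*\bsrc=[\"']([^\"']*logo[^\"']*)[\"']", re.IGNORECASE
)


def extract_fields(html):
    """Return (email, logo_url, linkedin_url); empty string when not found."""
    email = EMAIL_RE.search(html)
    logo = LOGO_RE.search(html)
    linkedin = LINKEDIN_RE.search(html)
    return (
        email.group(0) if email else "",
        logo.group(1) if logo else "",
        linkedin.group(0) if linkedin else "",
    )


def fill_rows(urls, fetch):
    """Build [url, email, logo, linkedin] rows, one per input URL.

    `fetch` is any callable that returns the page HTML for a URL.
    A failed fetch leaves the three data columns blank instead of
    stopping the whole run.
    """
    rows = []
    for url in urls:
        try:
            fields = extract_fields(fetch(url))
        except Exception:
            fields = ("", "", "")
        rows.append([url, *fields])
    return rows
```

Leaving a row blank on failure, instead of aborting, matters with 1000+ URLs: a few dead sites should not cost you the rows that did work, and the blanks show which rows need a manual look.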

@thrashtalker

Thanks all for the input. The main challenge is that all the websites are very different.

Is there a function that can recognize the item of interest on each site and extract the corresponding data?

@thrashtalker

If the names are the same, then you can use CV extraction.

If the names are different, then check the possible tags it is picking up and see if you can find a similarity.

If they are not similar, then you cannot extract them with a single flow…you need to build separate flows for each, based on the type of URL.

Cheers
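As a rough illustration of the "check the tags for a similarity" idea, the sketch below uses Python's standard `html.parser` rather than any UiPath activity. It rests on assumptions that will not hold for every site: that a contact email shows up as a `mailto:` link, that the LinkedIn page is linked with an ordinary `<a href>`, and that the logo image mentions "logo" in its `src`, `alt`, or `class`. The names `ContactLinkFinder` and `find_common_tags` are hypothetical.

```python
from html.parser import HTMLParser


class ContactLinkFinder(HTMLParser):
    """Collect values that tend to be tagged the same way on unrelated sites."""

    def __init__(self):
        super().__init__()
        self.email = ""
        self.linkedin = ""
        self.logo = ""

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <a download>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        href = attr.get("href", "")
        if tag == "a" and href.lower().startswith("mailto:") and not self.email:
            # Drop the scheme and any "?subject=..." suffix.
            self.email = href[len("mailto:"):].split("?")[0]
        elif tag == "a" and "linkedin.com/" in href.lower() and not self.linkedin:
            self.linkedin = href
        elif tag == "img" and not self.logo:
            # Assumption: the logo image mentions "logo" in src, alt, or class.
            hints = " ".join(attr.get(k, "") for k in ("src", "alt", "class"))
            if "logo" in hints.lower():
                self.logo = attr.get("src", "")


def find_common_tags(html):
    """Return the first email, LinkedIn link, and logo src found in the page."""
    finder = ContactLinkFinder()
    finder.feed(html)
    return {"email": finder.email, "linkedin": finder.linkedin, "logo": finder.logo}
```

Pages where all three values come back empty are the ones that would still need their own flow.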