Counting the most common words from a website scraping bot

Hello Community,

I want the bot to type “headphones” (or something similar) into Amazon then scrape 100 listings into a datatable.
From the results, I want the bot to find the most common words from the listings and order them:
For example:
1 Wireless - 80 times
2 Bluetooth - 50 times
3 Phone - 7 times.

What is the best approach to collecting this data as described?

Thank you

Hi there,
Are you using Studio or StudioX?

Either way, you can try this sequence:
Use a browser container to open the URL you need, then use Table Extraction. Inside a For Each Row over your DataTable, you can use, for example, a Regex to extract the words you want to count, then keep a counter for each one.

I'm using Studio. Are you thinking that each word matched by the Regex would become its own variable?

@PPIM_RPA

  1. Use the “Type Into” activity to input the search term (e.g., “headphones”) into the Amazon search bar.
  2. Use web scraping techniques to extract the necessary information from the search results. You can use the “Data Scraping” wizard in UiPath to create an extraction pattern and scrape the required data into a DataTable. Ensure that the columns in the DataTable capture relevant details such as product names, descriptions, or any other information you need.
  3. Use the “Filter Data Table” activity to remove any rows that do not contain valid listings. You can filter based on criteria like empty product names or descriptions.
  4. Once you have the filtered DataTable, you can use various methods to count the occurrence of words in the listings. One approach is to use a combination of LINQ and Regular Expressions.
     • Iterate through each row in the DataTable using the “For Each Row” activity.
     • For each row, use the Regex.Matches method to extract all the words from the product name or description. Use a regular expression pattern to define what constitutes a word (e.g., \b\w+\b).
     • Count the occurrences of each word and store them in a Dictionary or a separate DataTable.
  5. Sort the Dictionary or DataTable in descending order based on the word occurrences.
  6. Loop through the sorted data to display it or perform further actions based on your requirements.
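
The counting and sorting steps above can be sketched outside UiPath to show the logic. This is only an illustration in Python, not a UiPath workflow: the `titles` list is made-up sample data standing in for the scraped DataTable column, and `count_words` is a hypothetical helper name. In Studio the same idea would be expressed with Regex.Matches and a Dictionary, as described in the steps.

```python
import re
from collections import Counter

# Made-up sample data standing in for the product-name column
# of the scraped DataTable (assumption, not real Amazon output).
titles = [
    "Wireless Bluetooth Headphones for Phone",
    "Bluetooth Wireless Earbuds",
    "Wired Headphones, Wireless adapter included",
]

def count_words(rows):
    """Count how often each word appears across all rows (case-insensitive)."""
    counts = Counter()
    for text in rows:
        # \b\w+\b is the word pattern suggested in the steps above.
        words = re.findall(r"\b\w+\b", text.lower())
        counts.update(words)
    return counts

# most_common() returns (word, count) pairs sorted by count, descending.
ranked = count_words(titles).most_common()

for rank, (word, n) in enumerate(ranked[:3], start=1):
    print(f"{rank} {word} - {n} times")
```

A single dictionary keyed by word avoids needing one variable per word. Very common filler words such as “for” or “with” will also be counted, so you may want to skip them before sorting.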

Yes, you need a regex expression for each outcome you want.
For example, a regex that matches only the phrase ‘New Car’, then a variable and a counter for each one.

Like that for example.
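
If you only care about a fixed set of phrases, the per-phrase counters can live in one dictionary instead of separate variables. The sketch below is a Python illustration of that idea, not UiPath code: the `phrases` and `listings` values are made-up examples (only ‘New Car’ comes from the reply above), and `count_phrases` is a hypothetical helper name.

```python
import re

# Phrases to track; "New Car" is from the reply, "Used Car" is an assumed extra.
phrases = ["New Car", "Used Car"]

# Made-up listing text standing in for the scraped rows.
listings = [
    "New Car for sale, like new car condition",
    "Used car, one owner",
    "Brand new carpet",
]

def count_phrases(rows, targets):
    """Return a dict mapping each target phrase to its total match count."""
    totals = {p: 0 for p in targets}
    for text in rows:
        for p in targets:
            # re.escape keeps the phrase literal; \b stops "new car"
            # from matching inside "new carpet".
            pattern = r"\b" + re.escape(p) + r"\b"
            totals[p] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return totals

print(count_phrases(listings, phrases))
```

This only works when you know the phrases in advance. To discover the most common words, you still need to count every word and sort the result.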

Thank you for your reply

