Addressing Multi-Sector Identification Challenges in Web Scraping Workflows

Hello, I am building a workflow that uses an ‘If’ condition to check whether the ‘Economic Sector’ equals any of the sectors stored in an array of strings. If it matches, the rest of the process runs; otherwise, the bot checks the next sector in the array.

The problem arises when a website displays more than one economic sector. In that case the extracted text contains several sectors at once, so when the bot compares it for equality with a single string from the array, it concludes that they are not equal.

To make it more systematic:

  1. For each Variable: ‘Allowed Sectors’ = {“Activities related to employment”, “Public administration and economic and social policy”, “OTHER SERVICES”, “PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES”, “Research and development”}
  2. Open the website and extract the economic sector using ‘Get Text: ‘EconomicSector’ (variable)’
  3. Let’s say the ‘Get Text’ extracts the following: “REAL ESTATE ACTIVITIES PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES ACTIVITIES OF HOUSEHOLDS AS EMPLOYERS OF DOMESTIC PERSONNEL”
  4. If Condition: ‘EconomicSector’ = ‘Item’

Boolean = False

In this case, I would like the bot to recognize that “PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES” is present and set the boolean to True.
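To make the failure concrete, here is a minimal sketch in Python (not the VB.NET that UiPath uses), with the values copied from the steps above. The variable names are illustrative. An equality test compares the whole extracted text with one sector, while a substring test only asks whether the sector occurs somewhere inside it.

```python
# Values copied from the example above; variable names are illustrative.
economic_sector = (
    "REAL ESTATE ACTIVITIES PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES "
    "ACTIVITIES OF HOUSEHOLDS AS EMPLOYERS OF DOMESTIC PERSONNEL"
)
item = "PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES"

# Equality compares the whole extracted text with a single sector.
is_equal = economic_sector == item

# A substring test checks whether the sector occurs anywhere in the text.
is_contained = item in economic_sector

print(is_equal, is_contained)
```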

This is the website used in the example: Base de Datos Nacional de Subvenciones

Thank you in advance!!

This is the workflow of the example. “SectoresAdmitidos” means “AllowedSectors” and “SectorEconomico” means “EconomicSector”.

@Carla_Munoz

You can use a Break activity in the Then branch of the If. Once a sector matches, set the value to True and break out of the loop so that no further sectors are checked. If nothing matches, the value is still False after the loop.

Hope this helps

cheers

The thing is that the array variable ‘Allowed Sectors’ holds these values = {“Activities related to employment”, “Public administration and economic and social policy”, “OTHER SERVICES”, “PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES”, “Research and development”}

And the Get Text creates a variable with the value “REAL ESTATE ACTIVITIES, PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES, HOUSEHOLD EMPLOYERS OF DOMESTIC PERSONNEL; HOUSEHOLD PRODUCTION OF GOODS AND SERVICES FOR OWN USE”, while the array only contains “PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES” from that text. So, how can I achieve that?

Hi @Carla_Munoz ,

Could you maybe check with contains method instead of = ?

@Carla_Munoz

Then use Contains instead of equals.

cheers

Like: If EconomicSector.Contains(AllowedSectors) ??

@Carla_Munoz ,

I believe it should be out_SectorEconomico.ToLower.Contains(item.ToLower)

This assumes that item is the loop variable of the For Each, holding one sector value per iteration.
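As a rough illustration of that loop, here is a sketch in Python rather than UiPath's VB.NET. The function name `matches_allowed_sector` and the sample values are assumptions made for the example; the logic is the case-insensitive contains check applied to each item, stopping at the first match.

```python
def matches_allowed_sector(extracted_text, allowed_sectors):
    """Return True as soon as one allowed sector occurs in the extracted text."""
    haystack = extracted_text.lower()  # lower-case once, outside the loop
    found = False
    for item in allowed_sectors:
        # Case-insensitive substring test, like ToLower.Contains(item.ToLower).
        if item.lower() in haystack:
            found = True
            break  # no need to check the remaining sectors
    return found


allowed_sectors = [
    "Activities related to employment",
    "Public administration and economic and social policy",
    "OTHER SERVICES",
    "PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES",
    "Research and development",
]
extracted = (
    "REAL ESTATE ACTIVITIES PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES "
    "ACTIVITIES OF HOUSEHOLDS AS EMPLOYERS OF DOMESTIC PERSONNEL"
)

print(matches_allowed_sector(extracted, allowed_sectors))
```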

@Carla_Munoz

You can use the loop approach mentioned by @supermanPunch,

or do it without a loop like this:

AllowedSectors.Any(function(x) out_SectorEconomico.ToLower.Contains(x.ToLower.Trim))

This does not need a loop. It returns True if at least one value matches, and False otherwise.
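For readers who want to try the idea outside UiPath, the same check can be sketched in Python with `any`, which also stops at the first match. The variable names and sample values below are assumptions for illustration, not part of the original workflow.

```python
allowed_sectors = [
    "Activities related to employment",
    "Public administration and economic and social policy",
    "OTHER SERVICES",
    "PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES",
    "Research and development",
]
extracted = (
    " REAL ESTATE ACTIVITIES, PROFESSIONAL, SCIENTIFIC, AND TECHNICAL ACTIVITIES, "
    "HOUSEHOLD EMPLOYERS OF DOMESTIC PERSONNEL"
)

# Lower-case the extracted text once, then test each trimmed, lower-cased sector.
haystack = extracted.lower()
has_allowed_sector = any(s.lower().strip() in haystack for s in allowed_sectors)

print(has_allowed_sector)
```

One caveat with any substring test: a short entry such as “OTHER SERVICES” also matches when it appears inside a longer sector name, so it may be worth checking the site's sector names for that kind of overlap.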

cheers

Thank you so much! This was very helpful :smile:
