Data extraction: does the XML code support wildcards?


I am wondering whether I can use any kind of wildcard in the ExtractMetadata code of the ExtractData activity.
E.g., I would like to search for a div tag whose class starts with ‘active’:



I’m looking for the same answer… so far it doesn’t seem to be working. I have a table whose rows have alternating classes for highlighting, and I’d love to wildcard that part out while still being able to select on the existing class.

Has anyone received an answer on this? I’m dealing with a very similar situation. I tried wildcards inside my XML with no luck. I have a web page that I need to scrape, and it changes its classes very often. It would be nice if I could just use wildcards instead of having to update my XAML file every other week.

Is there any solution for this? I am facing the same issue. @loginerror

Unfortunately there is no support for wildcards in the XML code of the data scraping activity. Not yet, at least, we’ll look into it :slight_smile:


Is there any update on this? The webpage I’m scraping has an XML value that also changes every day or two, so I’m looking for a solution that will work reliably without changing it constantly.

And if there isn’t a way to do wildcards, I know that variables are supported.

So do you know if there is a way I can extract this boxed string from my screenshot? I looked around in UiExplorer and didn’t see anything that could produce that string.

Hi @bencod

It is possible to assign the value of the ExtractMetadata field as a string variable:

You could then ‘prepare’ your variable before the ExtractData activity runs by concatenating it into your string. See below how I included yourVariableName in the string:

"<extract><column name='" + yourVariableName + "' attr='text' exact='1'><webctrl class='normalsection todaynavigation' tag='div' idx='1' /><webctrl class='full-width' tag='div' idx='1' /><webctrl class='sectioncontent' tag='div' idx='1' /><webctrl class='stripenav' tag='div' idx='1' /><webctrl tag='ul' idx='1' /><webctrl tag='li' /><webctrl tag='a' idx='1' /></column></extract>"