Unable to attach browser in For Each Row

Apologies if I missed something obvious or if scraping wikis is not allowed. I am trying to get the names (or the file paths that contain the names) of something on this page: Akuoumaru | Genshin Impact Wiki | Fandom, and then go to every similar page, such as Genshin Impact Wiki | Fandom, and get the same element.
I am able to do steps 1-4 successfully, but it fails once I put everything in a ForEachRow over an Excel data sheet.

A. Open an Excel file that has 2 columns, Name and URL, and load them into variables for use.
-Start Loop

  1. Open Browser (URL retrieved from step A.)
  2. Data Scraping → Attach browser → Extract Structured data from a particular DIV
  3. Assign the data from the ExtractDataTable into variables and add them as a new row to MasterDataTable.
  4. Close Tab

-End Loop

B. Convert MasterDataTable to excel.
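
For readers who do not use UiPath, the intended flow of steps A to B can be sketched as a plain loop. This is only an illustration in Python, not the workflow from the attached xaml: `run_workflow`, `fake_scrape`, the `example.invalid` URLs, and the scraped values are all hypothetical stand-ins, and the check for an empty result is an added assumption about how one might avoid reading row 0 of an empty table.

```python
def run_workflow(input_rows, scrape_page):
    """Steps A to B as a plain loop; scrape_page stands in for steps 1, 2 and 4."""
    master = []  # plays the role of MasterDataTable
    for row in input_rows:  # -Start Loop
        extracted = scrape_page(row["URL"])  # fresh result for this URL only
        if not extracted:
            # Nothing was scraped, so there is no first row to read; skip it.
            continue
        # Step 3: copy the first extracted value into the master table.
        master.append({"Name": row["Name"], "Value": extracted[0]})
    return master  # step B would write this out to Excel


# Hypothetical input sheet and a stub that returns canned data per URL.
rows = [
    {"Name": "Akuoumaru", "URL": "https://example.invalid/akuoumaru"},
    {"Name": "Blackcliff Slasher", "URL": "https://example.invalid/blackcliff"},
    {"Name": "Missing Page", "URL": "https://example.invalid/missing"},
]
canned = {
    "https://example.invalid/akuoumaru": ["file-a.png"],
    "https://example.invalid/blackcliff": ["file-b.png"],
}


def fake_scrape(url):
    """Return the canned rows for a URL, or an empty list if none exist."""
    return canned.get(url, [])


result = run_workflow(rows, fake_scrape)
print(result)
```

The point of the sketch is that each iteration should produce a result that depends only on the URL of the current row.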

The problem I am facing is in step 2, Data Scraping and Attach Browser.

One of the following failures happens:

  1. The selector wildcard is accepted and all the opened browser tabs navigate to the correct URL, but every scraped result comes from the first page instead of from the browser tab that was just opened.
  2. The selector fails and no data is scraped, causing step 3 to fail because there is nothing to assign ("There is no row at position 0").

Selector Problem

The Selector Editor shows: "<html app='chrome.exe' title='Akuoumaru | Genshin Impact Wiki | Fandom' />"
I have tried variants of the following selectors, all leading to either failure scenario 1 or 2.
a. Changing the title into a wildcard like so:
"<html app='chrome.exe' title='* | Genshin Impact Wiki | Fandom' />"
→ Failure 1 or 2, depending on whether idx is also set.
b. Changing the title into a variable from my first Excel file:
"<html app='chrome.exe' title='{{currName}} | Genshin Impact Wiki | Fandom' />"
(I also logged currName to confirm it holds the expected value) → Usually failure 2
c. Changing the selector to use the URL, as suggested by UI Explorer, which I provided as a variable from my first Excel file:
"<html app='chrome.exe' url='{{currUrl}}' />"
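
To illustrate why variant (a) can pick up the wrong tab, here is a small Python sketch. It assumes that `*` in a selector behaves like a shell-style wildcard and that both tabs are open at the same time; `matching_tabs` and `title_for` are hypothetical helpers, and Python's `fnmatch` module only approximates selector matching.

```python
from fnmatch import fnmatchcase

# Titles of two tabs assumed to be open at once (both taken from the post).
open_tab_titles = [
    "Akuoumaru | Genshin Impact Wiki | Fandom",
    "Blackcliff Slasher | Genshin Impact Wiki | Fandom",
]


def matching_tabs(title_pattern, titles):
    """Return every open title that the wildcard pattern matches."""
    return [t for t in titles if fnmatchcase(t, title_pattern)]


def title_for(name):
    """Build the exact expected title for one row of the input sheet."""
    return f"{name} | Genshin Impact Wiki | Fandom"


# Variant (a): the wildcard matches both tabs, so the match is ambiguous.
wildcard_hits = matching_tabs("* | Genshin Impact Wiki | Fandom", open_tab_titles)

# Variant (b): a title built from the current name matches exactly one tab.
exact_hits = matching_tabs(title_for("Blackcliff Slasher"), open_tab_titles)

print(len(wildcard_hits))
print(exact_hits)
```

If the wildcard matches more than one open tab, whichever tab is resolved first would be scraped on every iteration, which is consistent with failure 1. Variant (b) only works if the name in the sheet is spelled exactly as it appears in the page title.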

Edit: For reference, when I use the selector wizard with two different URLs, these are the values generated. They are how I came up with the various wildcard/variable solutions above.
<html app='chrome.exe' title='Blackcliff Slasher | Genshin Impact Wiki | Fandom' />
<html app='chrome.exe' title='Akuoumaru | Genshin Impact Wiki | Fandom' />

Any help is appreciated. My guess is that the problem lies in the browser selector.
Is UiPath Studio not meant to be used this way, and does every Attach Browser have to be created through the wizard?

Edit: I have tried the suggested changes to no avail. Here are the new file and the relevant Excel files, with the current incorrect results and the expected correct results.

third.xaml (20.5 KB)
excel_files.zip (19.9 KB)

Hi @SJJ ,

Could you please share the xaml you have developed so far?

Kind Regards,
Ashwin A.K

I have uploaded it in the main post, thanks in advance!

Hi,

For now, regarding the first problem.

As the Output DataTable property is of type InOut, newly extracted data is appended to the data from the previous iteration.
To solve this, please assign ExtractDataTable = Nothing just before the Extract Data activity.
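
The effect of an in/out table that is never reset can be shown with a small Python sketch. This is an analogy only, not UiPath code: `extract_data` is a hypothetical stand-in for the extraction activity, and the page contents are made up.

```python
def extract_data(page_rows, output=None):
    """Mimic an in/out table: append to the table passed in, if there is one."""
    if output is None:
        output = []
    output.extend(page_rows)
    return output


# Made-up scraped rows for two pages, visited in this order.
pages = {"Akuoumaru": ["a1", "a2"], "Blackcliff Slasher": ["b1"]}

# Without a reset, the table keeps growing and row 0 is always from page one.
table = None
first_rows_without_reset = []
for name, rows in pages.items():
    table = extract_data(rows, table)
    first_rows_without_reset.append(table[0])

# With a reset before each extraction, row 0 belongs to the current page.
first_rows_with_reset = []
for name, rows in pages.items():
    table = None  # same idea as ExtractDataTable = Nothing
    table = extract_data(rows, table)
    first_rows_with_reset.append(table[0])

print(first_rows_without_reset)
print(first_rows_with_reset)
```

Reading row 0 of a table that was never reset would return the first page's data on every iteration, which matches the symptom described as failure 1.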

img20220324-3

Regards,

@SJJ That means the root cause is in Attach Browser. You say you have different URLs to navigate to, but Attach Browser was always pointing to the first URL because the selector could only identify the first one.

Can you check the selectors that Attach Browser generates for at least two different URLs? That way we can see which part of the selector changes and adjust the workflow accordingly.

Hi,

For the second problem:

I think it’s better to modify the browser handling as follows.

Regards,

Hi, thank you for your time. I have revised the project with your suggestions, but unfortunately the issue still persists.
I don’t seem to be able to run the project without the “Attach Browser” activity. Can you advise on that?
This time, I have attached the Excel data file required for it to run. I hope you have some time to look at it. The image below shows the rows I select (first and last element) in the Data Scraping wizard.
image

Hi, thank you for your time.
As per your request, the two selectors generated automatically when I use two different pages are as follows:
<html app='chrome.exe' title='Blackcliff Slasher | Genshin Impact Wiki | Fandom' />
<html app='chrome.exe' title='Akuoumaru | Genshin Impact Wiki | Fandom' />

In my original post, I also explained the different selector wildcards and variables I tried in order to work around this problem.

Hi,

I made some adjustments in the attached sample. Can you try it?

Sample20220324-6.zip (49.4 KB)

Note: As there is no Excel application in my current environment, I use the Workbook Read Range/Write Range activities. This has nothing to do with the problem.

Regards,
