I’m trying to extract a table that spans multiple pages from a website. I’m able to use data scraping to extract all the data; however, some rows in the table have links that contain useful information… How can I extract the entire table plus the information inside these links?
I can’t share the table due to privacy reasons but it looks something like this:
Basically, the unique ID column has a combination of unique codes and links (the ones labelled “Information ()”), so I’m looking to extract the unique codes inside the Information links along with the entire table.
I just tried this, but the URL doesn’t show in the extracted table. I highly suspect that it’s an issue with the elements on the website. For example, when I click on one of those Information links in the table, it opens a mini dialog box displaying the unique IDs, so maybe it’s actually not a URL item, if that makes sense… I even tried Inspect Element, but nothing there shows the embedded link…
Is there a way for me to do a Click & Get Full Text row by row on this web table, just for the rows that contain “Information”? It may be a slower process, but I’m willing to explore that option… Please let me know what you think… thank you
You could try the Get Attribute activity to read the attribute that contains the link, and make the selector dynamic by table row, table column, or idx, incrementing it on each iteration.
@ceceliaa34
Configure the correlated columns as described above, and configure one additional column for the link,
like:
UNIQUE ID, UNIQUE ID URL, COUNTRY, CODE
As the ID and the link alternate in the first column, we can tell the wizard properly which is the first and which is the second element. If we indicate the columns initially and post-edit the extraction afterwards, we should be able to achieve it.
If the URL is public, please share it with us. Thanks
I’m not an expert in web languages, but I’ve checked the table source code, and it looks like those “Information()” values in the table are not actual links. They don’t have any href attribute; rather, the text is nested inside the <a class tag, if that makes any sense.
Clicking on the blue Information() links in the table only displays a popup window that has the code, i.e. it doesn’t open another web page, just a small popup window that displays the code. There is no URL to be extracted, which is why I was thinking of scraping the text row by row based on an If statement,
i.e. if UniqueID Contains “Multiple”, then click on Information(), scrape the codes and exit, and go to the next row
else just scrape every text in that row as is and add to Datatable
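The row-by-row logic described here could be sketched roughly as follows. This is only a minimal Python illustration of the branching, not UiPath workflow code: the row dictionaries, the `UniqueID` key, and the `read_popup_codes` callback (standing in for "click Information(), scrape the popup, close it") are all hypothetical, and the sample values are invented because the real table is private.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def scrape_table(rows: List[Row],
                 read_popup_codes: Callable[[int], List[str]]) -> List[Row]:
    """Copy every row; for rows flagged "Multiple", swap in the popup codes."""
    result: List[Row] = []
    for index, row in enumerate(rows):
        record = dict(row)  # copy so the scraped input is left untouched
        if "Multiple" in row["UniqueID"]:
            # Hypothetical step: click Information(), read the codes, close the popup.
            codes = read_popup_codes(index)
            record["UniqueID"] = ", ".join(codes)
        # Else branch: the row is kept exactly as scraped.
        result.append(record)
    return result


# Stand-in data and a fake popup lookup, both invented for illustration.
sample_rows = [
    {"UniqueID": "A100", "Country": "X", "Code": "1"},
    {"UniqueID": "Multiple Information(2)", "Country": "Y", "Code": "2"},
]
fake_popups = {1: ["B200", "B201"]}
scraped = scrape_table(sample_rows, lambda i: fake_popups[i])
```

In a real workflow the callback would be the slow part, since it is only invoked for the rows that actually contain an Information() link.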
I’m just looking for the best way to implement this… any ideas?
I have not fully understood it yet. Can you elaborate more on what you want to achieve in terms of the business process goal, and how it is done when the process is executed manually by a human? Thanks
Hi, basically we’re just looking to copy all the information in the web table.
When it is captured manually, the person has to go row by row and copy-paste from the web table to Excel. If a particular row contains Information(), the person has to click into it, copy the unique IDs from the popup window, exit, paste into Excel, and then proceed to the next row.
I hope I made it clear, but please let me know if you want me to explain better. Thank you
Add an additional DataColumn to the DataTable (it will hold the code later)
For Each Row loop
If activity: the UNIQUE ID value contains (X) in its text, where X is any number
THEN: click and extract the code, then add it to the DataTable using the added column
ELSE: do nothing
For clicking the link, we use a dynamic selector incorporating the row index, which we can retrieve from the index output of the For Each Row activity
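As a rough illustration of the check and the dynamic selector, here is a small Python sketch. It makes several assumptions: that the cell text carries a parenthesised number such as "(2)", that the link can be addressed with `tableRow` and `tableCol` selector attributes, that the link sits in column 1, and that the table has one header row. The function names are hypothetical, and the real selector should be taken from UI Explorer.

```python
import re
from typing import List, Optional

# Matches a parenthesised number such as "(3)"; the exact cell text is assumed.
COUNT_PATTERN = re.compile(r"\(\d+\)")


def link_selector(row_index: int, header_rows: int = 1) -> str:
    """Build a selector string for the link in one table row."""
    # The loop index starts at 0, while tableRow is assumed to count from 1
    # and to include the header row(s).
    table_row = row_index + 1 + header_rows
    return f"<webctrl tag='A' tableRow='{table_row}' tableCol='1' />"


def plan_clicks(unique_ids: List[str]) -> List[Optional[str]]:
    """Return the selector to click for each row, or None when no click is needed."""
    return [
        link_selector(i) if COUNT_PATTERN.search(value) else None
        for i, value in enumerate(unique_ids)
    ]


# Invented sample values; only the second row would be clicked.
selectors = plan_clicks(["A100", "Information (2)", "C300"])
```

Rows that map to `None` correspond to the ELSE branch, so nothing is clicked and the added column stays empty for them.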