Question on Data Extractor in general and what others would have done

I hit a web page “table” which I needed to extract data from. Turns out the “table” is really a Slick Grid and not a traditional HTML table. Repeated attempts at using Data Extractor all failed, which led me to use UI Explorer heavily to figure this out.

Question: What data structures is Data Extractor capable of handling? Definitely HTML tables, but anything else?

What I ended up figuring out is that there is a definite pattern within the selectors for rows, columns, and “cell” values within this “grid”. With that now known, I’m going to loop across the row selectors using an “idx” value and pull out just the cell values I need.

To get the value at row 1, column 3, I use the following selector stack and a Get Text activity:

Get Text (

<html app='chrome.exe' title='xGL - * - Order Lines' />
<webctrl css-selector='body&gt;section&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div' parentid='orderlineGrid.dataGridView' tag='DIV' idx='26' /> **#### Row #1**
<webctrl css-selector='body&gt;section&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div&gt;div' parentid='orderlineGrid.dataGridView' tag='DIV' idx='3' /> **#### Column #3 ... (Order Line Value)**

)

The idx for row 1 is (so far always) 26. Row 2 is 27 and so on.
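The row-to-idx pattern described above can be sketched as follows. This is only an illustration in Python of how the selector text would be built for a given row and column; in UiPath itself the same idea would be a variable placed inside the selector. The function name `cell_selector` and the constant names are made up for this sketch, and the starting idx of 26 is just the value observed so far, so it is an assumption rather than something the grid guarantees.

```python
# CSS paths copied from the selectors in the post (10 and 11 nested DIVs).
ROW_CSS = "body&gt;section" + "&gt;div" * 10
CELL_CSS = ROW_CSS + "&gt;div"
PARENT_ID = "orderlineGrid.dataGridView"

# Assumption: row 1 has idx 26 and each following row adds 1.
FIRST_ROW_IDX = 26


def cell_selector(row: int, column: int) -> str:
    """Build the two webctrl lines for a 1-based row and column."""
    if row < 1 or column < 1:
        raise ValueError("row and column are 1-based")
    row_idx = FIRST_ROW_IDX + (row - 1)
    row_line = (
        f"<webctrl css-selector='{ROW_CSS}' parentid='{PARENT_ID}' "
        f"tag='DIV' idx='{row_idx}' />"
    )
    cell_line = (
        f"<webctrl css-selector='{CELL_CSS}' parentid='{PARENT_ID}' "
        f"tag='DIV' idx='{column}' />"
    )
    return row_line + "\n" + cell_line


if __name__ == "__main__":
    # Selector lines for row 2, column 3 (row idx 27, column idx 3).
    print(cell_selector(2, 3))
```

The whole approach stands or falls on the idx offset staying fixed, so it is worth checking a known cell value before trusting a full loop.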

So far the “idx” values (for rows and columns) have remained consistent across logins/logouts and screen transitions. It still makes me a little nervous, as I’d rather have Data Extractor work! Alternatively, I could pull down the data via CSV, which this web application has a facility for, but there are reasons why I can’t count on that working every time.

This works … so far. I’m still somewhat new to UiPath at this level of data extraction. For scraping complex “tables” / “grids” that are not HTML tables, I’m curious what other features UiPath has that could work here, or what others have done.

Well, the “Works So Far” pretty much wasn’t working :frowning:

The idx values are not consistent like I originally thought / hoped.

Has anyone dealt with scraping data from Slick Grids?

Hi @soneill437
Based on this, you can change the selector by unselecting the css-selector attribute.

Thanks
ashwin S

Thanks for the reply. That did not help.

From what it looks like, Slick Grids are a problem area for RPA regardless of the RPA vendor. At least, IMHO. The HTML source for the screen where the data lives is a little bit of HTML and two client-side JavaScript chunks. I have not looked at those yet.

Regardless of the JavaScript, the issue is how the data is displayed on the screen and how much of it is available in the JavaScript-rendered page source. A current “grid” page has 100 rows in it, but you can’t see all 100 rows. Slick does not create a traditional table. It apparently loads and shows only some of the 100 rows at a time. As you use the scroll bar, I think it loads more rows into the current “grid” page. In other words, I think it creates a moving window of data for display in the browser.
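If the grid really does behave like a moving window, one way to scrape it is to read whatever is visible, scroll, read again, and stop once a pass turns up nothing new. Below is a minimal Python sketch of that loop, not UiPath code. Everything in it is hypothetical: `read_visible` and `scroll_down` stand in for whatever the automation uses to read the rendered rows and move the scroll bar, and `FakeGrid` only simulates a windowed grid so the loop can be run. It also assumes each row carries some stable key to deduplicate on.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, str]  # (stable row key, cell text), both hypothetical


def collect_all_rows(
    read_visible: Callable[[], List[Row]],
    scroll_down: Callable[[], None],
    max_scrolls: int = 1000,
) -> Dict[int, str]:
    """Read the visible window repeatedly until a pass finds no new rows."""
    seen: Dict[int, str] = {}
    for _ in range(max_scrolls):
        before = len(seen)
        for key, text in read_visible():
            seen.setdefault(key, text)  # keep the first copy of each row
        if len(seen) == before:
            break  # nothing new was rendered, assume the end was reached
        scroll_down()
    return seen


class FakeGrid:
    """Simulated grid that only exposes a window of its rows at a time."""

    def __init__(self, total: int, window: int) -> None:
        self.rows = [(i, f"line-{i}") for i in range(total)]
        self.window = window
        self.top = 0

    def read_visible(self) -> List[Row]:
        return self.rows[self.top:self.top + self.window]

    def scroll_down(self) -> None:
        # Overlap consecutive windows by two rows so no row is skipped.
        self.top = min(self.top + self.window - 2, len(self.rows) - 1)


if __name__ == "__main__":
    grid = FakeGrid(total=100, window=30)
    rows = collect_all_rows(grid.read_visible, grid.scroll_down)
    print(len(rows))
```

Deduplicating on a key rather than on position means it does not matter how many rows happen to be loaded in each window.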

Another issue is that the number of rows I’ve found loaded at a time isn’t consistent.

I’m going to try messing with the scroll bar in UiPath to see what that does for helping me here.

What I found is that scraping data from Slick Grids (specifically in how the xGL Ad platform uses them) is VERY VERY painful. I did finally get the scraping working. To give some idea of how painful it is, scraping 100 rows of data from a Slick “grid” took 8 minutes. That’s minutes … not seconds. The rows are known as “Order Lines” on the screen I was (“was” as in no longer going down this path) working with. The ultimate goal is simply to come up with a unique list of Orders that contain Order Lines that need approval. Conceptually very simple.
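To show how simple the end goal is once the rows are in hand, here is a small Python sketch of the "unique list of Orders" step. The row shape, an `(order_id, needs_approval)` pair, and the function name are assumptions made for illustration; the real screen's columns may look different.

```python
from typing import Iterable, List, Tuple


def orders_needing_approval(lines: Iterable[Tuple[str, bool]]) -> List[str]:
    """Return each Order ID once, in first-seen order, if any line needs approval."""
    seen = set()
    result: List[str] = []
    for order_id, needs_approval in lines:
        if needs_approval and order_id not in seen:
            seen.add(order_id)
            result.append(order_id)
    return result


if __name__ == "__main__":
    sample = [("A", True), ("A", True), ("B", False), ("C", True)]
    print(orders_needing_approval(sample))
```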

I just found an order that contains 13,000+ Order Lines. I just need the Order ID. BUT, I have to scrape through ALL 13,000+ lines to get to the Order Lines belonging to the next Order ID. If my math is right, at 8 minutes per 100 rows that is about 17 HOURS of “white noise” scraping just to get to the next group of Order Lines related to another Order.
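The estimate checks out. Here is the arithmetic as a short Python sketch, assuming the 8 minutes per 100 rows scales linearly and taking 13,000 as the row count (the real order has somewhat more):

```python
def estimated_minutes(rows: int, minutes_per_batch: float = 8.0,
                      batch_size: int = 100) -> float:
    """Scale the measured time per batch linearly to the full row count."""
    return rows / batch_size * minutes_per_batch


if __name__ == "__main__":
    minutes = estimated_minutes(13_000)
    print(f"{minutes:.0f} minutes, about {minutes / 60:.1f} hours")
    # prints "1040 minutes, about 17.3 hours"
```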

So another approach is being developed.
