Data Scraping - columns with the same ID

Hello!

I have a problem scraping data from a table whose columns share the same selector in the HTML code.

Example:

<extract>
	<row exact="1">
		<webctrl tag="tr"/>
	</row>
	<column exact="1" name="Column1" attr="text">
		<webctrl tag="div" class="specific_div_name"/>
	</column>
	<column exact="1" name="Column2" attr="text">
		<webctrl tag="td" class="specific_td_name"/>
		<webctrl tag="a"/>
	</column>
	<column exact="1" name="Column3" attr="text">
		<webctrl tag="td"/>
		<webctrl tag="b"/>
	</column>
	<column exact="1" name="Column4" attr="text">
		<webctrl tag="td" class="specific_td_name_2"/>
	</column>
	<column exact="1" name="Column5" attr="text">
		<webctrl tag="td" style="white-space: nowrap;"/>
		<webctrl tag="b"/>
	</column>
</extract>

In the situation above I get the same values for columns 3 and 5, although the source has different values. The problem is that these columns don't have specific IDs in the HTML code of the website.

Is there any way to distinguish them?

Thanks in advance

@pboleszc
Let's assume Column3 is the third column in the web table. Try forcing the retrieval by adding an index to the column:

<column exact="1" name="Column3" attr="text">
	<webctrl tag="td" idx="3"/>
	<webctrl tag="b"/>
</column>

Do the same for the other column as well.

Thanks for trying, but unfortunately it doesn’t help. Adding “idx” causes the column not to be read (there isn’t any “idx” tag in the HTML source).

The idx tag is not needed in the source; it is managed by UiPath internally.
Unfortunately we cannot inspect the web element structure, but maybe the selector can be simplified as follows:

<column exact="1" name="Column3" attr="text">
	<webctrl tag="td" idx="3"/>
</column>

Can you post a screenshot of this table structure?

Here you are: a screenshot plus part of the HTML code of the table.

@pboleszc
That helped as a first step, but not everything was inspectable.

Try the following (just for analysis purposes):

<extract>
	<row exact="1">
		<webctrl tag="tr"/>
	</row>
	<column exact="1" name="Column1" attr="text">
		<webctrl tag="td" idx="3"/>
	</column>
	<column exact="1" name="Column2" attr="text">
		<webctrl tag="td" idx="4"/>
	</column>
	<column exact="1" name="Column3" attr="text">
		<webctrl tag="td" idx="5"/>
	</column>
	<column exact="1" name="Column4" attr="text">
		<webctrl tag="td" idx="6"/>
	</column>
</extract>

Then check whether the third through sixth columns are extracted (the extract datatable selector must be valid).
For extracting images, fields, etc., have a look here:

If it fails again:

  • redo it by extracting the entire datatable (img, field, etc. values will be missing)
  • redo it by reconfiguring the column definitions, but with simplified selectors (no classes, idx on columns, etc.)

Great thanks! It works when adding idx to all column tags! That’s exactly what I was looking for!
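
For anyone landing here later, this is a minimal sketch of how the two colliding columns from the original question might look once idx is added. The idx values 3 and 5 are assumptions for illustration only (the real positions depend on the table), and the comment line is just an annotation, not part of the original definition:

```xml
<extract>
	<row exact="1">
		<webctrl tag="tr"/>
	</row>
	<!-- idx values below are hypothetical; use each column's actual position in the row -->
	<column exact="1" name="Column3" attr="text">
		<webctrl tag="td" idx="3"/>
	</column>
	<column exact="1" name="Column5" attr="text">
		<webctrl tag="td" idx="5"/>
	</column>
</extract>
```

The idea is that a positional idx tells the two td cells apart even though neither has a class or ID of its own. As reported above, it worked once idx was added to all column definitions, so the remaining columns would presumably get the same treatment.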
