covid19india.org - Table Data Extraction - Including Difference Info and Totals Row

Hi All,
I am unable to extract the table data from the COVID-19 site (https://www.covid19india.org/); the result is not what I expected. Please let me know the exact steps or the data definition.

I followed the thread linked below but could not get it to work. I want the data in the same form as described there.

https://forum.uipath.com/t/i-fail-to-extract-data-table-in-https-www-covid19india-org/229958

Expected table:
State | Confirmed | Active | Recovered | Deceased
Delhi | 53,116 | 27,512 | 23,569 | 2,035

The attached screenshot shows the data I am currently getting.

@sachinsm
Have a look at this extract definition (put together quickly). You can fine-tune it from here:

<extract>
	<row exact="1">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
	</row>
	<column exact="1" name="Column1" attr="text">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
		<webctrl tag="div" class="cell" idx="1"/>
		<webctrl tag="div" class="state-name" idx="1"/>
	</column>
	<column exact="1" name="Column2" attr="text">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
		<webctrl tag="div" class="cell statistic" idx="1"/>
		<webctrl tag="div" class="total" idx="1"/>
	</column>
	<column exact="1" name="Column3" attr="text">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
		<webctrl tag="div" class="cell statistic" idx="2"/>
		<webctrl tag="div" class="total" idx="1"/>
	</column>
	<column exact="1" name="Column4" attr="text">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
		<webctrl tag="div" class="cell statistic" idx="3"/>
		<webctrl tag="div" class="total" idx="1"/>
	</column>
	<column exact="1" name="Column5" attr="text">
		<webctrl tag="div" class="App" idx="1"/>
		<webctrl tag="div" class="Home" idx="1"/>
		<webctrl tag="div" class="home-left" idx="1"/>
		<webctrl tag="div" class="table" idx="1"/>
		<webctrl tag="div"/>
		<webctrl tag="div" class="cell statistic" idx="4"/>
		<webctrl tag="div" class="total" idx="1"/>
	</column>
</extract>

This gives the following result:
[screenshot]

Awesome, thanks a lot, ppr. If possible, please let me know how you did this, I mean the steps :smiley: :smiley:

@sachinsm
Give me some time and I will set up a more detailed one if possible. Once it is done, I will share it with you along with some notes on the steps.

OK, thanks for this, it was really helpful. If you created a workflow, please share it if possible.

@sachinsm
Before investing time in web data retrieval, can you please check whether the following API link offered by the site also provides the data of interest?
https://api.covid19india.org/v2/state_district_wise.json

If yes, then I would recommend that we focus on this approach.

Let us know your feedback
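
To make the API suggestion concrete, here is a minimal Python sketch of fetching and parsing that endpoint without any UI automation. The function names `parse_states` and `fetch_states` are hypothetical, and the assumption that the endpoint returns a JSON array of state records is not confirmed in this thread, so check the actual response first.

```python
import json
from urllib.request import urlopen

# URL taken from the post above; whether it is still reachable is not verified here.
API_URL = "https://api.covid19india.org/v2/state_district_wise.json"


def parse_states(payload: str) -> list:
    """Parse the raw JSON text and check it has the assumed top-level shape."""
    data = json.loads(payload)
    # Assumption: the response is a JSON array with one object per state.
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of state records")
    return data


def fetch_states(url: str = API_URL, timeout: float = 30.0) -> list:
    """Download the JSON document and return the parsed list of state records."""
    with urlopen(url, timeout=timeout) as response:
        return parse_states(response.read().decode("utf-8"))
```

Calling `fetch_states()` needs network access; `parse_states` can be tried offline on any saved copy of the response.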

@ppr
This is a really good approach. I was looking for a different approach, and you gave me more information. :smile: :smile:
This way we can get in-depth information without going through the UI.
We can send a request for state, district, or both, and it should return the report. But I did not find the total cases in the API :thinking:

@sachinsm
Let me think it over during dinner.

About the totals etc.: the JSON will be parsed and then some aggregations (e.g. sums) will be computed.
However, please check on the API page (https://api.covid19india.org/) which data set is of interest to you.
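
The parse-then-aggregate idea can be sketched in Python as follows. The record layout (a `state` name plus a `districtData` list with `confirmed`, `active`, `recovered` and `deceased` counts) is an assumption about the JSON, and the sample values are made up for illustration, not real figures.

```python
import json

# Made-up sample in the ASSUMED shape of the API response (not real data).
SAMPLE = """[
  {"state": "StateA", "districtData": [
    {"district": "D1", "confirmed": 10, "active": 4, "recovered": 5, "deceased": 1},
    {"district": "D2", "confirmed": 20, "active": 10, "recovered": 9, "deceased": 1}]},
  {"state": "StateB", "districtData": [
    {"district": "D3", "confirmed": 5, "active": 5, "recovered": 0, "deceased": 0}]}
]"""

FIELDS = ("confirmed", "active", "recovered", "deceased")


def state_totals(states):
    """Sum the district counts of each state into one row per state."""
    rows = []
    for state in states:
        totals = {field: 0 for field in FIELDS}
        for district in state.get("districtData", []):
            for field in FIELDS:
                totals[field] += int(district.get(field, 0))
        rows.append({"state": state["state"], **totals})
    return rows


def grand_total(rows):
    """Build the totals row by summing every per-state row."""
    return {"state": "Total", **{f: sum(r[f] for r in rows) for f in FIELDS}}


rows = state_totals(json.loads(SAMPLE))
total = grand_total(rows)
for row in rows + [total]:
    # Same column order as the expected table: State | Confirmed | Active | Recovered | Deceased
    print(" | ".join([row["state"]] + [f"{row[f]:,}" for f in FIELDS]))
```

The totals row is computed from the per-state rows, so it stays consistent with whatever subset of states is passed in.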

@sachinsm
So as not to delay you in your tasks, here is some starter help for the following:

  • reading the table including the differences info:
    [screenshot]

  • reading the totals line as a separate datatable (feel free to merge it if you wish)
    [screenshot]

How it was done:

  • for the data, most of the configuration was done by point and click
  • as the first row has a different class, the point and click started on the first data line

The totals extraction was done by post-editing, adapting the extract config XML from the data extraction.
Here you can refer to a cleaner row and column definition without duplicates within the selectors.

Find the starter help here:
covid19india.org_V2.0.xaml (14.0 KB)

Update: V2 hovers over an h5 element far down the page. This triggers loading of all data before the extraction.