Get row item after extracting from data scraping

MOCKUP DATA of how my data look like.
i cannot share the workflow due to confidential data.

  1. go to site
  2. data scraping
  3. for each ( each extract data table)
    then on top is the outcome.
    I cannot get more than 1 row of data in each time…look at “qualification” and “hobby” columns that has more than 1 row for each individual but on the right image there is only 1 row of data for “qualification” and “hobby” columns.

How do i get those missing row of data?
any advice?

1 Like

tried adding data to build datatable and it output exactly the same.
I think the data scrapping only scrap the first row of each table …i assume

1 Like

Hi @jackyWHJ,

By looking at the two tables you have shared, my understanding is, the one on the left is the structure of the source where you are getting data from, and one on the right is the output that you got after the data scraping. Please correct me if I’m wrong…

Is that the extract structure of the source that you are scraping the data from? or is it some sort of a grid/ table in the site which has some data for the blank fields as well?

looking at the one on the left and right, by comparing, the pattern I see is that the output had captured only the rows that had values to all the columns.

I’m thinking, if the source has blank values like that, may be it is considering the rows that only has all the cells populated with some sort of a value…

Please correct if I have misunderstood the scenario… Just trying to figure out the cause… :slight_smile:

2 Likes

Thank you very much for your reply.

My workflow is to enter “name” and scrap their profile with more than 1 datatable.
I used for each to loop thru name and enter site then scrap the data.
So therefore, in my uipath there are lots of datatable.

Each color is a datatable on the site so i need to sort of combined the “extracted datatables” then output to excel.
the columns are fixed, rows are varies

For e.g. look at the left, a person can have 1 or more qualification.
so therefore i extract all datatable then tried to use for each activity to loop thru each extracted datatable and the outcome is on the right side.

Some datatable can have blanks or multiple rows of informations. my goal now is to combined them such that left is what i can achieve

1 Like

break it down simpler

DT 1 (3X3)
JOHN COOK 30
MARY ACCOUNTANT 25
AMY CLEANER 40

DT 2 (3X3)
USA 2200 4
USA 3000 2
SPAIN 1800 7

DT 3 (3X2)

180 75
165 50
163 50

DT4 (1 COLUMN, ROWS ARE VARIES)
HIGH SCHOOL
DIPLOMA
DEGREE
MASTER
(BELONGS TO JOHN)

HIGH SCHOOL
DIPLOMA
(BELONGS TO MARY)

HIGH SCHOOL
(BELONGS TO AMY)

then i used for each on each of these extract data table hope to get all the rows of data out.

1 Like

Hi @jackyWHJ,

Yes, now I get the idea on what you are trying to do. So you are actually trying to combine the datatables to get a particular person’s qualification, hobby, country etc. Cool…

The challenging thing here is that one person can have many rows for qualification & hobby. great…

I will try to explain this in a simpler way using the database table joining since I’m more experienced in that domain :slight_smile:

Assuming that you have some idea about how table joins work in databases, we use some unique common key to join two tables. Let’s apply that to this scenario.

So you have a person datatable (yellow). and qualification (blue) datatable. For example, let’s say John has a ID value of 1. so, there should be a mechanism to say that John (ID=1) has the said qualifications. So, ideally, in your qualification datatable, there should be another column indicating the link to John (ID = 1). In that way, you can use those two columns to join the two datasets without any problem as both has a matching value for that ID column…

Person dataset
ID | Name | Job | Age
1 | John | cook | 30

Qualification dataset
Qualification | Person ID
High school | 1
Diploma | 1
etc. etc.

So you can join the two datatables Person.ID = Qualification.PersonID

This article may be helpful in getting an idea about what I explained more…

Hope it helps…

Thanks
Lahiru

2 Likes

i get what u mean but im new to uipath and not exactly knows how uipath works.
u are trying to feed in primary key in each table so that i could join them together.
Do you have any idea on using get row item?

1 Like

Sorry for my late response. I was thinking whether there is anyway to identify what qualification each person has… Doesnt it show in the source at one go?

This is something that came into my mind btw… how about this…

  1. Lets keep one datatable instead of many for each… so the one common datatable has all the required columns. From person to sport…

  2. Scrape the person data into few string variables. Then the rest again to string variables.

  3. Add all the values in all variablse into the datatable as a row after collecting each…

  4. Loop to the next and do the same…

This way you can populate all the data without the need of a join… and will be accurate.

This is something similar and easy…

Does this sound like something easy for you? Let know if its complex so i can try to help to make it much easy for you…

2 Likes

I think what I did is also something like that I use build data table then after scrap write into the build datatable which is the image on the right side

2 Likes

Another approach,

Get data to yellow, red and green datatables as you do now. And merge them into one datatable as you have done now…

Then, loop through that data table to get qualification and sport data for each person into a set of variables…

Ex.

  1. Ger one person… check qualifications… assume you have two qualifications for the person…
  2. get the first qualofication into a variable…
  3. Add the qualification to the main datatable (similar to the example post i quoted in my previous comment)
  4. Get the next qualification, add to the datatable
  5. Loop to the next person
    Continue the same way…

Hope this works for you as well… its quite easy than having multiple datatables

2 Likes

The issue now is idk how to get the “second” qualification out. I only can get one “qualification”

1 Like

Is it possible to write datatable variable to string sort of like reformat it then write to excel

1 Like

The most efficient way is to write to excel directly using the datatable. We can use write range activity and specify the datatable as the source for the excel. It is the easiest.

2 Likes

I tried without merging and write directly into exce
U can refer to my other post as well

2 Likes

Cool… Thanks for sharing… that approach seems to be much easier… better solutions than mine actually :slight_smile:

That was something new for me as well… Im also bit new to uipath. Got the certification recently though and trying to help everyone with what i gathered during mu learning while learning things like this. Still there is a lot to learn obviously… Happy to help… and thanks for sharing your post…

2 Likes

Well that wasn’t what I wanted because in the excel… After new a new datatable, it will end up in a new row new column which is very hard for me filter and condition my data.

No problem bruh this community let us learn from one another :slight_smile: if you have a solution to align thr datatable u can let me know thanks man

3 Likes

Sure… I’m currently thinking of a way to get it solved… i will let you know if i come up with a suitable solution…

2 Likes

Hi @jackyWHJ,

I was working on your issue for sometime now. I have actually come up with a solution… which is still in progress. But I thought of sharing my work up to what I have done…

SO I have a excel file that has few tabs for person, and for each person’s country data etc… It is actually used to simulate the web scraping you have.

This is how it work…
First, it extracts the person data from the person tab into a datatable. Then loops through the datatable and step by step, it extracts relevant country data from the respective person detail tab. So for Person 1, there is a tab named ‘Person1Country’. the ‘1’ in ‘Person1Country’ is dynamycalling changing in the query through a loop counter… For the 2nd person, there is another tab named ‘Person2Country’ and like wise…

The approach here to write the excel file is…

  1. at this point, we are looping throug the person datatable. So at a given point in the loop, we are accessing a datarow of the person datatable. So I have used invoke method and datatable.clone() options to add the selected person datarow to another datatable which I use to write into the excel (only applicable for Person since we use it to loop through).

  2. Write the given person data to the excel

  3. Write Country data to the same excel by specially specifying the cell we need to write. This cell number is dynamically changed using the loop counter.

attached is the solution I have upto now. I have added comments as well for easy understandability of the workflow… I will proceed for the qualification part as well… for now, This can be used for you to get an idea… and you may get some idea based on that for writing your excel…

Don’t get confused with the reading excel part which I have used to extract data to simulate your data scraping thing… :slight_smile: I have included the source and target excel files as well…

CombineDataTable_95775.zip (29.4 KB)

Hope it helps for now!!!

3 Likes

Thanks alot man. U have spent lots of effort and time sorry about that. This idea is very good worth trying… Do u think it’s possible to take each person’s information (multiple datatable) and write to excel for each datatable in each tab. So 1 person = 1 excel file
So my master list will reference to the number of person I have to check

1 Like

Hi @jackyWHJ,

I was able to get the qualification as well. the Excel looks great and is exactly what you want to see as the output… I actually didnt include the Sport part. This will actually give you the idea on how to do it…

Let me know how it looks like for you!!CombineDataTable_95775.zip (31.6 KB)

Checkout the target excel file as well…

2 Likes