Group by sum and add the grouped data to a new datatable

Hi Devs, please I have a rouped sum, the challenge is I will like to add or copy the grouped data to a new datatable.I have the below linq code but it takes a lot of time to complete if the records are large.Please is there a better alternative.
(From c in dtGroup.AsEnumerable Group c by k=c(“Item”).ToString.Trim into grp=Group
let summation = grp.Summation(Function(x) cdbl(“amt”)))
let rb = New Object())k,summation)
Select u = dtNew.Rows.Add(rb).CopyToDatatable

1 Like

large can be relative as also a lot of time

On a 250 000 row case we did following
Assign activity:
Groups | List(Of List(Of DataRow)) =
(From c in dtGroup.AsEnumerable
Group c by k=c(“Item”).ToString.Trim into grp=Group
Select g=grp.ToList).toList

Then we used a Parallel For each Activity and modelled the the group member processing

We never found out how long our not-parallel approach will consume time, as we aborted after 15 / 30 / 45 mins and only a small set was processed. Doing it with the parallel approach we succeeded in 15 seconds. After some optimization we reduced to 9 seconds.

1 Like

Please how do I add sum to the group list @ppr

the same way as you have done within the LINQ

assign:
summation = YourLoopVarFromParallelForEach.Sum(Function(x) cdbl(“amt”)))

assign:
str Key= YourLoopVarFromParallelForEach.First()(“Item”).ToString.Trim

1 Like

I’m a bit confused, please can you share a xaml workflow for this as it will help in clear understanding

have a look here:

Outerlist: Groups, inner List group members

1 Like

It’s clear @ppr, a question I’ll like to ask is why add .First to StrKey

In the LINQ you are grouping on item column and did use later the key with k
As this information is not present we are retrieving the key value from the first group member

1 Like

Thanks for the explanation, the code is still running cause I have up to 800000 records.Once it finish running I will revert the number of minutes it ran and mark as solution.Regards

@ppr, please confirm if this is the right input for add data row - New Object(){strKey, summation} as arrayrow and output dt as dtResult

Are you there?

yes

dtResult is of datatype: DataTable and is an empty datatable defining two Columns:
the first for the key, the seond for the sum

Ok,I have implemented that but my worry is that the code has run for almost an hour and is still running so I taught maybe there’s an issue with my inputs or whether infinite loop occurred

maybe you can share your xaml or the screenshot of your current implementation
hope it is verified that a PARALLEL For Each was used

  • how many groups are involved?
  • What is the average of group members count?
  • No other unneeded activities are in used e.g. log messages?

Explore it by drag out the add datarows from parallel for each body and comment it out
and let only get executed the sum computation and or the key retrieval (this allows us to identify which action is consuming time)

Never run it on debug mode, only on run mode for getting a better undestanding on the execution performance

perfect,
maybe for temporary reasons drag out the comment out block, so it will not cause processing costs for the test.

if test run will rund more then 15 mins, then please answer all questions from above. thanks

1 Question: we do see the variable ListBuilddt - is it the List(Of List(Of Datarows)) returned back from the grouping LINQ?

[quote=“ppr, post:16, topic:390230”]
ListBuilddt - is it the List(Of List(Of Datarows)) returned back from the grouping LINQ?
[/quote] Yes, it is

I also tried debug so as to know the issue, I discovered that the indicating arrow does not step into the activity inside parallel for each

perfomance optimization can be tricky on the begin. Currently you are on an exploration phase.

Please tell how us many groups are sent into the test?
What is the average group member size?

? was not asked

1 Like

I don’t really know the average group count but the number of rows dt for second data is 270000