Find duplicate rows in excel and add particular column value

Hi I do have an excel where I have duplicate rows also

I have to find duplicate rows in basis of multiple column and once it found I have to add " P column value at one row and then I have remove duplicate rows

So, basically first I have to add duplicate rows value and then I hace to remove duplicate rows

hi @Chirag1991 !

You can find the rows that are identical (all columns match) by using this LINQ below:

var_DT.AsEnumerable.GroupBy(Function(i) i.Field(Of String)("Col1")).Where(function(y) y.Count()>1).Select(function(z) z.Key)

Example of the input:

As seen in the image, rows “c/d and e/f” are duplicated. The result of this Linq will give you the following:

image

On a later stage, you can use the Lookup DataTable activity, to identify the indexes of that rows, and therefore add the required value that you need.

Hope it helps!
Ignasi

DuplicateRows.zip (43.7 KB)

1 Like

Thanks for response what i need is I have to remove duplicate rows and also the rows which i remove I need sum of particular column, for example if I have removed 3 rows and those 3 has a column P and P has value (10,20,20) so I need 50 at remaining row i have attached the input and output example


output

Got it!

So from duplicate rows, you need to sum the column P right?

Hi @Chirag1991 , please find below the updated xaml.

I’ve used the following LINQ expression to sum the values of the Descriptions that are duplicated.

the output (just for demo) is a dictionary, of the description, and its own sum of amounts

image

var_InputFile.AsEnumerable.where(function(y) y("Description").equals(description)).Sum(Function (x) If(Double.TryParse(x.item("Amount Bought (MT300)").ToString, Nothing), Double.Parse(x.Item("Amount Bought (MT300)").ToString), 0)).ToString

DuplicateRowsCount.zip (45.1 KB)

hope it now suits your needs!

Ignasi

Thanks I will check and will update you what I need I hope you got it I will try and update

Hi I did run I get no output


I am not getting anything it’s blank…
P_ST64 - Forex provisioning report - AMAS.xlsx (58.9 KB)
I have attached the Dummy of excel so if you wll see sheet no '1 that will be input and sheet '2 that should be output …

And distinct filter will be combination of

“BIC”, message type , place of Custody, value date , currency bought, currency sold and

If you will see column Amount bought and Amount sold . For these two columns I need sum so if you will see excel you will find

Hi @Chirag1991 , you were getting an empty result, since the data on my xaml was a test excel that I created.

I’ve loaded yours and with this 3 activities, I’m extracting your sums:
image
image
image
image

The workflow now uses your input. Be aware that the results are being stored in a dictionary for demo purpouses (var_dict_UniqueDescriptions), if you want to do anything else with that Sums, please use the following variables, inside the For Each Loop:

image

Best Regards,
Ignasi

DuplicateRowsCount.zip (96.4 KB)

Hi, that is not what accepting output for your refernce I have attached the excel, in excel you will see one sheet as “Input”, so that what will what excel it look like and “output” sheet is what desired output

so if you will look you will see that sum of the respective column has to write on the distinct rows of duplicate, not below.

if you will look into an excel that "output"sheet has what it should be look like

so could you please create in same way, and i would request if you can create in data table because I would need to create excel

P_ST64 - Forex provisioning report - AMAS.xlsx (58.5 KB)
this the excel which has “input” sheet and “output” sheet

Hi @Chirag1991 , please note that this screenshots were just to demonstrate that the bot is doing the sum correctly and therefore, I did the sum manually in excel as can be seen in this image below.

- Values in Yellow and green on the image below, were added manually to prove the correct Sum being done by the bot.

The workflow I sent is NOT modifying (Not adding data neither removing data) your excel at any point.
If you want to modify the remaining rows, with the result of the sum of the duplicated rows, you can do it with the values of the following variables:

var_SumAmountBought
var_SumAmountSold
  • Again, this values will keep changing for each different set of rows, therefore please use them inside the For Each, and save them somewhere else, like I did with the dictionary.

Sure,
I will try it .

Meanwhile if you can tell me how we can put in an excel will be helpful…

Hi @Chirag1991 I did add the part you need.

It is just a matter of getting the data that you obtained on the Dictionary, and wit that data, prepare a DT that you can write on the “Output” tab.

Find attached below the workflow.

Notice that the bot is typing the result on the “OutputDemo” tab. Update that part according to your requirements. Also I’m getting each combination of BIC + Currency to build the last Datatable. Make sure to check that the result is the expected.

DuplicateRowsCount.zip (98.5 KB)

Update us here if the workflow fits your request!

Best Regards,
Ignasi



no, above one is not coming as expected,

I have attached the screen shot, if you look into output sheet, that is expected output sheet,

so the sum as per above logic is not correct and line of items also not correct.

if you see output sheet 5 lines are coming that should be output. Even the rows who are already unique they should be there too

Also if you will see your output, Trade date and Value date is coming wrong

Good morning @Chirag1991,

the only difference between your output and mine, are the dates and the number of columns. My workflow considered the same date, and you only need to add the filter to each unique date, inside the for each.

→ This can be done with the Filter DataTable activity, or with LINQ, as you prefer.

You only need to add this extra filter, and for each date, get your new row based on the filtered result. And remove the unwanted rows before typing it back to the output sheet.

Please remember that the forum is used to provide support, and we will not normally develop full functional workflows as per request.

→ Feel free to reuse any components on the actual workflow, to make that extra filter, and you’ll have your desired output.

Best Regards,
Ignasi Peiris

Okay I did that and thanks one thing if my input header type is text and row type is custom, and in output sheet it is coming as text for rows, how can I make it custom?

On the example workflow, the datarow array, is introduced as “Array Of String”, but if you want to have custom data in it, you can change it to “Array Of Object”:

image

On this example you can see that the array of objects accepts Strings, Int32, DateTime, and so on. therefore you can use it also on the “Add Data Row” activity, under “ArrayRow” Property to get custom formats.

image

thnks

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.