Remove duplicate column excel using readrange

I have a logic to check if there are duplicate columns and wanted to remove it after I remove it if I read range the sheet again it should have the update data , the one with the duplicates remove. Any idea guys ? please check the workflow below thank you

Main.xaml (8.1 KB) Book1.xlsx (9.0 KB) project.json (996 Bytes)

I got an index out of range in your if statement, but the general logic of it looks fine.

However, I would recommend a different way personally. I would use a try-catch on your first read range statement, and make sure the ‘add headers’ property is checked. This will take the first row as column names, but since you can’t have a duplicate column name it will throw an error (make sure the catch is just for System.Data.DuplicateNameException types). So if no error is thrown there is no duplicates, but if an error is thrown you can change the column name in the ‘catch’ portion.

In the catch you should use a read range with no headers. Then you can iterate through the first row of data (Which is the column names) and use your linq statement or other logic to find the duplicate(s), delete the column(s), and re-write to the new workbook

1 Like

But I cant remove duplicate columns using add headers . here is the update files Main.xaml (7.5 KB) Book1.xlsx (9.0 KB) project.json (996 Bytes)

Yep that looks good and the new file seems to be working great.

The logic I mentioned above was just used so you don’t need to do any processing at all if the workbook had no duplicates. Your logic would still be needed if duplicate were found. Either way it seems to be working great, good work!

Can you give me example of your implementation ? thanks

Main.xaml (9.5 KB)

I left your logic as-is, I just put it inside the ‘catch’ portion of a try catch. One thing to note with your logic is that how it currently works could change the column order. If that doesn’t matter then it’s fine. Otherwise, you may want to change your logic so it iterates through the columns backwards.

1 Like

Hi @RajivKumar12,

Take a look on this. This will solve your problem.

Thank you
Balamurugan.S

1 Like

For example I already have identified and remove duplicates , how would I read again the new sheet ? based on the example I give you , should I get back to the top which is the read range to read again ? can you provide me an example of that Sir that after I remove the duplicates and created a new workbook I wanted to go back to the read range process and read the new workbook again . Thanks

After I created the workbook I want to read again the file since the duplicate is already removed. what is the best way to do that rather than adding another readranger below. Thanks

@Dave

Why not just use the read range activity again? FYI that error is occurring because you are trying to use the write range activity within the While Loop and the robot is working too fast, so it is trying to write to it multiple times before the file can be properly saved & closed.

If it was my workflow I would leave the first read range as-is. Then in the catch section I would change so I iterate through the columns backwards and when a duplicate is found, I would use the ‘Insert/Delete Columns’ activity to remove the duplicate rows. At the very end of that catch actiivty, I would use a ‘read range’ again, but this time make sure the ‘add headers’ is checked.

If you did it that way, the excel file you are working with AND the datatable you are working with are exactly the same, regardless of whether there were duplicates to begin with or not

1 Like

Yes just give me a moment. To clarify, will the machine your robot is running be licensed and have access to microsoft excel?

1 Like

Ok - I made the change to include it all within the excel application scope. Since you have the excel license I would almost always recommend using that instead of the workbook activities.

The changes I made were:

  1. Excel application scope rather than individual workbook read/write range.
  2. Iterate through columns backwards in your while loop
  3. Remove columns from excel sheet in while loop
  4. Got rid of the extra .xlsx file that was being created.

Feel free to ask if you have any questions or if it still doesn’t seem to be working correctly

Main (1).xaml (13.1 KB)

1 Like

@RajivKumar12 - Try using some delay… Whenever I use excel application scope I faced HRESULT error. If Delay doesn’t work then Kill application or Try catch(you might have to search this forum on this)…

1 Like

where do we put the delay?

when i added this to my workflow everything is fine ,except it was not able to read the read range , maybe because its too fast ? does it need delay?

I see you are getting the error in the “Insert/Delete Columns” activity. So try adding before that.

1 Like

I already have added the exception but why I am receiving error , object is not set to an instance of an object ? what causes that
?

@Dave

I am not sure why you are getting that since that since it’s essentially a null reference error. As it seems to be specific to that activity, I would recommend surrounding only that activity within a retry activity. Have it retry ~5 times at 1 second (or half second)

1 Like