Delete duplicate columns in a CSV

SAMANTA_COTTACKAL1 · January 29, 2024, 10:32pm

Hi Team,

A bot of mine downloads a CSV file from a portal and copies it to a Google Sheet. Today the file contained two columns with the same name, so the bot threw an error saying that the column already exists in the data table. In future, if any column appears more than once, I need to delete the duplicate in the data table itself before writing it to the Google Sheet. Could anyone help me with how to do this?

Attached below is the CSV file downloaded from the portal:

@Yoichi could you please help here?

postwick · January 30, 2024, 12:13am

You’d have to read in the CSV without headers. Then your first row of data is the headers.

Loop through the header names in row 0 of the datatable, look for duplicates, and delete the necessary columns.

Then write the cleaned up datatable to a temp CSV without headers, then read it back in with headers.
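As a rough illustration of this round trip outside UiPath, here is a minimal Python sketch using only the standard `csv` and `io` modules. The function name `drop_duplicate_columns_via_roundtrip` and the sample data are hypothetical, an in-memory buffer stands in for the temp CSV file, and the sketch assumes the first occurrence of each header is the one to keep and that every row has as many fields as the header row.

```python
import csv
import io

def drop_duplicate_columns_via_roundtrip(csv_text):
    # Step 1: read without treating the first line as headers.
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    header = rows[0]

    # Step 2: keep only the index of the first occurrence of each header name.
    seen = set()
    keep = []
    for i, name in enumerate(header):
        if name not in seen:
            seen.add(name)
            keep.append(i)
    cleaned = [[row[i] for i in keep] for row in rows]

    # Step 3: write the cleaned rows out again (buffer stands in for a temp file).
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(cleaned)

    # Step 4: read it back, this time treating the first line as headers.
    buffer.seek(0)
    return list(csv.DictReader(buffer))

sample = "Name,License,License,Role\nAna,Yes,No,Admin\n"
records = drop_duplicate_columns_via_roundtrip(sample)
```

After the second read, each record is keyed by the real header names, so later steps can refer to columns by name.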

Nguyen_Van_Luong1 · January 30, 2024, 12:19am

Hi @SAMANTA_COTTACKAL1 ,

Column names in a data table must be unique, so duplicates are not allowed. Please uncheck ‘has header’ in this activity.

Hope it helps,

Yoichi · January 30, 2024, 12:21am

Hi,

The following post and sample will help you. (This renames one of the duplicated column names.)
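The linked post and sample are not reproduced in this thread. As a rough idea of what renaming (rather than deleting) a duplicated header can look like, here is a small Python sketch; the helper `make_unique` and its `_2`, `_3` suffix scheme are assumptions for illustration, not the method from the linked post.

```python
def make_unique(names):
    # Count how many times each name has been seen so far and
    # append a numeric suffix to the second and later occurrences.
    counts = {}
    result = []
    for name in names:
        counts[name] = counts.get(name, 0) + 1
        if counts[name] == 1:
            result.append(name)
        else:
            result.append(f"{name}_{counts[name]}")
    return result

headers = ["Role", "Lucid Suite License", "Lucid Suite License"]
unique_headers = make_unique(headers)
```

This simple scheme does not guard against a file that already contains a column literally named `Lucid Suite License_2`.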

Regards,

SAMANTA_COTTACKAL1 · January 30, 2024, 12:43am

Hi @Nguyen_Van_Luong1 and @postwick ,

When I unchecked the headers checkbox, I got the error below.

[Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column8,Column9,Column10,Column11,Column12,Column13,Column14,Column15,Column16
First Name,Last Name,Email,Username,Role,Account Created,Groups (sharing),Organizational Group,Lucid Suite License,Lucid Suite License,Lucidscale Creator License,Lucidscale Explorer License,Trial - Lucid Suite License,Trial - Lucid Suite License,Trial - Lucidscale Creator License,Trial - Lucidscale Explorer License

Nguyen_Van_Luong1 · January 30, 2024, 12:49am

You can use the column index; column indexes start from 0.

Put the index of that column in the Column Index property.

Hope it helps

postwick · January 30, 2024, 1:01am

You have to find the index of the column you want to remove. In this example, Organizational Group is index 7 (8th column). Use that column index in the ColumnIndex property of Remove Data Column.

You can’t remove the column by name because the column names are Column1, Column2, etc., and your actual headers are in the first data row.

I’m curious why you’re removing Organizational Group, though, since it’s not a duplicate.
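To make the zero-based index concrete, here is a short Python sketch that looks up a header's position in the first row and removes that column from every row. `remove_column_at` is a hypothetical helper, and the header list is only the first nine names from the file above.

```python
def remove_column_at(rows, index):
    # Return new rows with the column at the given zero-based index left out.
    return [row[:index] + row[index + 1:] for row in rows]

header = [
    "First Name", "Last Name", "Email", "Username", "Role",
    "Account Created", "Groups (sharing)", "Organizational Group",
    "Lucid Suite License",
]
position = header.index("Organizational Group")  # zero-based, so the 8th column is 7
trimmed_header = remove_column_at([header], position)[0]
```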

SAMANTA_COTTACKAL1 · January 30, 2024, 1:26am

Hi @postwick ,

I had to remove that column as per a business requirement. Additionally, I need to add 3 columns: Status, Product Name and Extract date. With this index approach, how do I do that?

postwick · January 30, 2024, 1:58pm

You wouldn’t add the new columns until after you’re done removing the duplicate columns and renaming the headers.

1. Read without headers
2. Use data row 0 to determine duplicate headers and delete by index
3. Write to CSV without headers
4. Read back in from CSV with headers
5. Delete any other columns you need to, like Organizational Group, which can now be done by name instead of index
6. Add additional columns you need
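The last two steps, once the headers are real column names, might look like the following Python sketch. The helper `finalize` is hypothetical, and the values placed in Status, Product Name and Extract date are placeholders, since the thread does not say what they should contain.

```python
def finalize(records, drop, extra):
    # Remove unwanted columns by name, then append the new columns.
    result = []
    for record in records:
        row = {key: value for key, value in record.items() if key not in drop}
        row.update(extra)
        result.append(row)
    return result

records = [{"Email": "a@example.com", "Organizational Group": "Ops", "Role": "Admin"}]
final_rows = finalize(
    records,
    drop={"Organizational Group"},
    # Placeholder values: the real ones depend on the business requirement.
    extra={"Status": "", "Product Name": "", "Extract date": "2024-01-30"},
)
```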

SAMANTA_COTTACKAL1 · January 30, 2024, 9:42pm

Hi @postwick ,

Team Users_.zip (409 Bytes)

I am not sure how to do step two that you mentioned. Could you please help me out with the attached file?

postwick · January 30, 2024, 11:33pm

I came up with a better way.

1. Read CSV without headers (so column names are in row 0)
2. Loop through row 0 values. If a value appears more than once, delete the duplicate column
3. Loop through columns and rename each from its row 0 value
4. Remove row 0

Main.xaml (22.3 KB)
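The attached workflow is not reproduced here. As a sketch of the same four steps outside UiPath, the Python below uses the standard `csv` module; `read_csv_dedup` is a hypothetical name, and keeping the first occurrence of each header is an assumption. Duplicate columns are deleted from the highest index down so that the remaining indexes stay valid while deleting.

```python
import csv
import io

def read_csv_dedup(csv_text):
    # 1. Read without headers: row 0 holds the column names.
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    names = rows[0]

    # 2. Find every index whose name already appeared earlier in row 0,
    #    then delete those columns from right to left.
    duplicate_indexes = [i for i, name in enumerate(names) if name in names[:i]]
    for row in rows:
        for i in reversed(duplicate_indexes):
            del row[i]

    # 3. Use the (now unique) row 0 values as the column names.
    headers = rows[0]

    # 4. Drop row 0 from the data and key each remaining row by header.
    return [dict(zip(headers, row)) for row in rows[1:]]

sample = (
    "Email,Lucid Suite License,Lucid Suite License,"
    "Trial - Lucid Suite License,Trial - Lucid Suite License\n"
    "a@example.com,Yes,Yes,No,No\n"
)
records = read_csv_dedup(sample)
```

Unlike the earlier temp-CSV approach, nothing is written back to disk; the header row is simply promoted to column names in memory.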

SAMANTA_COTTACKAL1 · January 31, 2024, 12:49am

Thank you so much, @postwick! It worked perfectly.

postwick · January 31, 2024, 1:05am

Please mark as solution.

SAMANTA_COTTACKAL1 · January 31, 2024, 1:07am

Hi @postwick ,

Done…

system · February 3, 2024, 1:07am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
