Create a dictionary from a CSV, for each column

excel
csv
activities

#1

Hello,

I’ve tried searching the forum and have been struggling to solve my problem. Apologies if the solution is out there, but I haven’t been able to find exactly what I’m looking for.

I’m exporting CSV files that have many columns and the column headers change between files. Each row then has the specific information for an Account ID. I want to create a dictionary for a specific Account ID, that assigns the value from that row to the representative column header.

I’ve attached a sample excel file to illustrate. The XAML I provided is for a similar solution I was looking for. It reads from the second sheet of the sample excel workbook, but that sheet only has information for a single Account ID. I’ve tried using a similar approach to read for a specific Account ID from sheet 1, but have had no luck.

Any help is appreciated!

Cheers

For Each Row.zip (13.0 KB)


#2

Hi there.

I think you mean .xlsx, although it’s close to the same as a .csv except .csv only let’s you have 1 sheet so it’s slightly different.

There’s different approaches you can take, like you could make a dictionary of datatables or a list of datatables, then just use each datatable. Or you can create a 3-dimensional dictionary so you can reference each sheet (like a group) and each column. Or you can create a 2-dimensional dictionary where all sheets get put in to the same group and the second dimension is the columns. You can also go further and use Arrays in the cells and convert them to an array as you place it into the dictionary ( like “[ ‘a’,‘b’,‘c’ ]” then using json parsing ), since each type would be an Object.

What I would say you need to do is before each loop create a new dictionary with the needed key. And, using an Excel scope with a wb variable you can loop through each sheet and apply it to the dictionary. Then within that loop, loop through each column and apply the column into the dictionary as one of the dimensions.

Example pseudocode would be like this:

out_d = new dictionary( of String, Dictionary( of String, Object ) )
Excel Scope with wb variable
    For each sheet In wb.GetSheets(0)
        Read Range of sheet to dt variable
        For each row In dt.AsEnumerable.Where(Function(r) r(0).ToString.Trim<>"" ).CopyToDataTable
            out_d(row(0).ToString.Trim) = new dictionary( of String, Object )    
        For each col In dt.Columns.Skip(1)      // skipping first column (syntax might be off)
                out_d(row(0).ToString.Trim)(col.Name) = row(col.Name).ToString

Close wb

The above example is if you want to combine all keys to the same group for each sheet for a 2-dimensional dictionary. EDIT: if you want to group them by sheet just adjust the initialization of the dictionary to be 3-dimensional and add an assign in the “For each sheet”

If you would like to reference my version of an InitSettings workflow which will do .xlsx/.xls, .csv., .json, and other text-based extensions with a delimiter… feel free, here it is: InitSettings.xaml (56.2 KB)

Regards.

Clayton.


#3

I also just realized that in your excel file the rows are in the columns on one sheet, so you will probably need to adjust your logic so one sheet it uses the Row loop and the other it uses the Column loop. Thanks.


#4

@ClaytonM -

Thanks for the thorough response! I tried mimicking your pseudocode but was unsuccessful. Do you think you could take a look and help me resolve some of the errors? Also, your InitSettings you shared was really helpful as a reference for some other solutions I’m considering.

To clarify, this is for reading .csv files. I uploaded a .xlsx previously because it wouldn’t let me attach a .csv on this forum, but I ended up zipping it and didn’t convert it back before uploading. I didn’t communicate that above so I see why you were advising for a .xlsx and not a .csv.

Thanks for your help!

SampleCSV.zip (2.2 KB)


#5

Hello,

I took a look.

See below screenshots of the fixes needed:

Since you are using a regular For Each you need to change the TypeArgument to DataRow and for Value, add .Rows which will create a list of the rows in that datatable.
DT.AsEnumerable.Where(Function(r) r(0).ToString.Trim<>"" ).CopyToDataTable.Rows

You had a typo in the first Assign where you had an underscore, it should be like the below image:
image

Again, for the next For Each, change the Argument Type to DataColumn. Also, I was unsure on how to get it to skip the first column through vb.net, so I add an If activity that checks if the .IndexOf is > 0.

Lastly,

In the last Assign, you need to use .ColumnName, not .Name
image
image

Hopefully that helps and the fixes should be fast.

Regards!

C


#6

Another way:

If you 1st column is Account ID, you could do something like below without looping to create the dictionary.

dict =new Dictionary(of string,List(Of String))

dict = dt.AsEnumerable().ToDictionary(Function(s) s.Field(Of String)(0), Function(r) r.ItemArray.Skip(1).Cast(Of String).ToList)

Here For your Account ID key, values will be in a List

Or

If you are sure about your column count

dict = new Dictionary(of string,string)
dict = dt.AsEnumerable().ToDictionary(Function(s) s.Field(Of String)(0), Function(s) s.Field(Of String)(1)+"|"+s.Field(Of String)(2)+"|"+s.Field(Of String)(3))

here your Account ID key will have values in String with pipeline(|) delimiter and you need to split while iterating.


#7

Thank you!! This is really clear and helpful. One last super simple question, though. I’ve never read from a nested dictionary and I’m struggling to find a solution. I’ve searched around and found a few possibilities but nothing has worked for me. What’s the proper syntax for reading from this example? Say I want to find the address for the Account ID 456?

Thanks again!


#8

Yeah, so you read the dictionary similar to a datatable.
The first column for instance is being stored into the dictionary as a “key”, so you use the key to access the value in it.

Since it’s a 2-dimensional dictionary you will have 2 keys.
out_d("456")("Name").ToString

Also, since you are using “Object” for its type, you will need to convert it to a String, Number, Date, etc when you use the value.

Additionally, you can create a list of the keys, for example, if you wanted to run the keys through a loop or something. I can’t remember the syntax for sure but it’s something like out_d.keys or out_d("456").keys

You can actually find more info on dictionaries using Google or search engines.

Regards.