Trying to read a single excel sheet as two different data tables

My excel sheet has 6 columns and I wish to read only one column as one data table and all the remaining rows as another data table. I don’t have any idea how to do that. Can you please give me a procedure how to do that. Thanks in advance.

Hello @chintham,

Use two “Read Range” activities on the same file, create 2 data table variables, lets say dt1 and dt2.

Assuming the coumn you what to get is in “A” :

For the first read range, set the range as “A1:A10” (assuming you have 10 rows),set output as dt1
For the second read range, set the range as “B1:F10” (this will give you all the other rows),set output as dt2

We can’t know the exact number of rows, because it varies for each file.

For example, If we I want column A as separate data table, Can I use “A1:B1” as range. Or if Use like that, will it consider all the empty fields in column A too ?

Hi,
You can use the range like this to take entire column
“A:A” or “B:B”

Also, another way is to Read Range entire table and copy the data table to another data table variable. Then, you can use Remove Column activity (w/ or w/o a For Each) to remove each column you don’t want for both data tables. Ultimately, you end up with 1 table having just one column while the other has all columns except that one.

Regards.

One more thing is, Read Range is only valid in Excel Application Scope, in that properties, we need to fill Workbook under Output. Is that just Excel file path to be filled in?

You can also use Read CSV if it’s of a .csv filetype

If you use the Excel Scope, you only need to input the Workbook Filepath property.
The Workbook property in Output is for accessing various properties from the excel application window that is opened, like .GetSheets() and other things; you can also use this in the Close Workbook activity which is necessary at the end of each scope when you are done with it.

EDIT: you can leave the Output properties blank, and only use them when you want to access the information from the variable (ie workbook.GetSheets)

After using Read Csv , how can I select a particular column and store it in a separate data table?

You can use “Remove Data Column” activity inside a For Each.

Search for the activity “For Each”, then change its TypeArgument property to DataColumn (you probably need to browse for DataColumn).

Once you have done that, then input the main field where it says…
For each col In dt1.Columns

—dt1 is the datatable variable used

Then, inside the For each block, place an “If” activity, and in the If use the condition like:
col.ColumnName <> columnname
—columnname is the string that represents the particular column header you want to keep

Now, finally, place the “Remove Data Column” activity inside the If block and in the Column property place
col
—col is the variable used in the For each that represents each column being looped through

Assuming you made no simple mistakes here, you should now have a sequence that will remove all columns except a particular column.

Read CSV to dt1
For each col In dt1.Columns
    If col.ColumnName <> columnname // columnname is the string of particular column header
        Remove Data Column // col

If you want a table of the other columns, then just use another Read CSV (or an Assign with dt2 = dt1) followed by the Remove Data Column to remove the particular column header placing the string in the Column Header property.

To go further, if you have a list of columns you want to keep, then place them in an array. And, for your If activity, use the condition Not arrayColumns.Contains(col.ColumnName), which would check if the columnname is Not contained in the array of columns.

I hope this helps answer it in more detail.

CORRECTION: changed .AsEnumerable to .Columns

Regards.

Just a clarification on Read Range activities - there are two:
image

The first one requires the Excel Application scope while the second will read the Excel file directly (it is much faster for the purpose of reading the file)

As far as reading the particular columns, I would do it like that:
image

2 Likes

Hi,

Thank you so much for detail explanation. I am getting an error for the condition col.ColumnName <> columnname saying Option Strict On disallows late binding. ‘Servers’ is not declared. It may be inaccessible due to its protection level. FYI Servers is my Column header

Hi. To resolve that error, I think “Servers” should be a string with quotes around it.
col.ColumnName <> "Servers"
or if “Servers” is stored in another string variable, then use col.ColumnName <> variable

Regards.

Yeah Thanks. That worked

Thank you. It helped

Hi. I think I made an error with my above explanation. In the for each loop, you should be referencing the .Columns in the data table, like:
For each col In dt1.Columns
so it loop through each column. // dt1.AsEnumerable is if you want to loop through each row

Sorry.

Regards.

Yeah, I found it and worked on it. Thanks :slight_smile: