Extract beginning portion of data table only

datatable

#1

I have an excel file in which the first row contains the proper headers, and I want to take the first N rows as my actual table, ignoring the rest. I will be using a loop on many excel files doing this same thing, so I am trying to find an efficient way of doing things to avoid possible memory issues

For each excel file in list do the following:

  1. Read excel file
  2. remove excess rows from file
  3. Add to consolidated datatable

This consolidated datatable will then be used within the business process. What is the best way to go about this?

Within each file, row N + 1 contains a blank row, so I was planning on finding N by looking for the first blank row & subtracting 1. However, now that I have the row index, I’m not really sure the best way to get only rows 1 through N in my datatable. Any ideas?


Extensive utilization of Excel
#2

Hi @Dave,
This sample will help you to ur situation.

File : Extract beginning portion of data table only.zip (8.2 KB)

Regards
Balamurugan.S


#3

@Dave
You can try like this.

First read your datatable and store it in a datatable variable dta by using read range activity, check the add headers property

Then if you want to take only first N rows then use the below query
dt1=dta.AsEnumerable().Skip(N)

Where N is the number of rows you want to skip.

Or If you want to remove only empty rows then you can use this query.
dt1=dta.Rows.Cast(Of DataRow)().Where(Function(row) Not row.ItemArray.All(Function(field) field
Is DBNull.Value Or field.Equals(""))).CopyToDataTable()

Regards,
Mahesh


#4

@balupad14 It seems like this is just giving me the index of where the blank row is in the datatable, correct? I was hoping to add all rows prior to the blank row into a separate table

@MAHESH1
First read your datatable and store it in a datatable variable dta by using read range activity, check the add headers property

Then if you want to take only first N rows then use the below query
dt1=dta.AsEnumerable().Skip(N)

Where N is the number of rows you want to skip.

Wouldn’t this skip N rows, so that only rows after N are in dt1?


#5

@Dave

I didn,t get you.

Regards,
Mahesh.


#6

The .skip method you reference would skip the first N rows of the table, whereas I was hoping to include N rows.

However, you have helped because after you showed me the .skip method I was able to find it’s sister - the .take method! https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/linq/return-or-skip-elements-in-a-sequence

I haven’t tested it out yet, but it looks like .take is exactly what I was looking for. For each excel file I’ll create a temp table using the .take method on the original table and then use the import rows method to a new consolidated table created from all of the excel workbooks


#7

@Dave

Do you want to remove only empty rows from your datatable

Regards,
Mahesh


#8

No, I wanted to take only the first N rows from many different tables and combine them together into one consolidated table. The blank row I mentioned was just used because that blank row would be N+1 in every excel file.

Each excel file has multiple tables within it separated by blank rows. I was hoping to get the 1st table in each excel file and combine into a single table utilized by UiPath


#9

@Dave

Oh K. :grinning:

Regards,
Mahesh


#10

HI @Dave,
I understand that before the empty row . it is consider as table . My first sample took only the fist blank row as table.After that if you have blank row, it won’t consider. Now I have changed it. Where ever you have blank row, before the data it considers as a table.

To test : I have appended values to the excel sheet. Each time before you do the test. Please check it that have more than sheet3. If it has please delete it . And do the test.

Here is the file : Extract beginning portion of data table only.zip (11.0 KB)

Regards
Balamurugan.S