What is the most efficient way to read 1800 sheets from 200 different excel files?

Hi,

Anyone know what is the most efficient way to read 200 excel files and each excel file contains around 9 sheets? So the total sheet count is 1800 and my automation will need to loop through all of them and then extract data from all the sheets and output into a master excel file.

Currently, my automation (Using ReFramework) takes about 24 minutes to do a full run like this. I am trying to cut down the run time if possible.

Here is an example of what an excel sheet looks like:

Row 1 - 15 contains address info

Row 16 - 30 is a table that contains Box number, cost, and some other info.

Box number        Cost
1841               $118
2733               $118
1543               $118

The master excel file that I am generating at the end is appending the Item number and Cost row by row from all 1800 sheets.

Right now my process workflow looks like this:

Build Data Table Activity (dt_allBoxes)

Excel Application Scope (Workbook path: In_TransactionItem("File_Path").ToString) 
  For Each sheet in workbook.GetSheets
      Read Range - (sheet name: sheet.ToString) (Range: "A1:J39) (Output DataTable: dt_fullSheet)
      Read Range - (sheet name: sheet.ToString) (Range: "A16:I35) (Output DataTable: dt_boxInfo)
    For Each Row in dt_boxInfo
        Assign drRowsAllBoxes (Variable Type: DataRow) = dt_allBoxes.NewRow
        Assign drRowsAllBoxes("Box number") = row(0).ToString
        Assign drRowsAllBoxes("Cost") = row(1).ToString
        Assign drRowsAllBoxes("Code") = dt_fullSheet.Rows(4)(5).ToString
        Add Data Row activity (DataRow: drRowsAllBoxes) (DataTable: dt_allBoxes)
Close Workbook
Append Range (writing records to excel from dt_allBoxes)

Please let me know where I can do some improvement to make the automation run faster with ReFramework.

Thanks!

Hi,

    For Each Row in dt_boxInfo
        Assign drRowsAllBoxes (Variable Type: DataRow) = dt_allBoxes.NewRow
        Assign drRowsAllBoxes("Box number") = row(0).ToString
        Assign drRowsAllBoxes("Cost") = row(1).ToString
        Assign drRowsAllBoxes("Code") = dt_fullSheet.Rows(4)(5).ToString
        Add Data Row activity (DataRow: drRowsAllBoxes) (DataTable: dt_allBoxes)

For now, we can write the following code which same as the above. This will be faster because of less number of activities.

dt_Temp = dt_boxInfo.AsEnumerable.Select(Function(r) dt_allBoxes.Clone.LoadDataRow({r(0).ToString,r(1).ToString,dt_fullSheet.Rows(4)(5).ToString},False)).CopyToDataTable()
dt_allBoxes.Merge(dt_Temp)
Append Range (writing records to excel from dt_allBoxes)

This may be take long time if file size is very large.
It might be better to append data as text (csv) using TextWriter class etc.

Regards,

1 Like