Select Random Sample in Excel, then mark selected samples in Source workbook

The purpose is to randomly select 25 rows (for sampling) from a much larger source population in Excel (in the same workbook). The 25 random rows are to be copied to a new sheet (‘Sample’). Then, in the larger population sheet (‘Population’), a new column (‘Selected’) is to be added at the end and marked with an ‘X’ for each row of the 25 samples selected - to show which were selected out of the larger population.

My question is - what is the most practical way to add a column to the main population sheet marking ‘X’ for each of the 25 that was used - and then add the 25 rows in a new sheet. Which part should be done first?

The attached .xlsx file shows the process so it is clearer. Let me know if you have any questions. Thanks so much in advance!

Sample Selection Example.xlsx (83.3 KB)

High level approach below :

  1. Use Random Number Generator activity to generate Random numbers from 1 to (excel row count -25 )
  2. In your data table , select those rows based on row number range generated above and copy into temporary data table e.g. if randome number is 100 ,you copy rows having index from 100 to 124
  3. Copy temp data table to other sheet using write range
  4. For updating source excel you can use add column in excel scope or just keep it all in memory I e. Update data table with new column, set value X and perform write range to write again.
    Approach would depend on how many rows are present in the excel. You wouldn’t want to copy entire data table again if there are too many rows

Hi,

Hope the following sample helps you.

arrTargetRowNumbers = Enumerable.Range(0,dt.Rows.Count).OrderBy(Function(i) rnd.NextDouble).Take(25).ToArray()

Sample20230520-1aL.zip (208.9 KB)

Regards.

1 Like

Hi @n.sandel

  1. Read Range (Main Population Sheet) => **dtPopulation**
  2. Assign activity:
  • Variables:
    • **randomIndices** (List(Of Int32)): Enumerable.Range(0, dtPopulation.Rows.Count).OrderBy(Function(i) Guid.NewGuid()).Take(25).ToList()
    • dtSample (DataTable): New DataTable
  1. For Each activity (item: **index** in **randomIndices**)
  • Add Data Row (ArrayRow: **dtPopulation.Rows(index).ItemArray**) to **dtSample**
  1. Add Data Column (ColumnName: “Selected”) to **dtPopulation**
  2. For Each Row activity (row: **row** in **dtPopulation**)
  • If (randomIndices.Contains(row.Table.Rows.IndexOf(row)))
    • Assign activity: row(“Selected”) = “X”
  1. Write Range (Main Population Sheet) => **dtPopulation**
  2. Write Range (New Sheet ‘Sample’) => **dtSample**

Hope it helps!!

Regards,

Thank you! This is amazing. Is there a way to copy over headers from the population sheet to the sample sheet using your method (without excel activities)? They are dynamic and change often. Thanks so much for your help!

Nevermind, checked add headers. Got it, thank you!

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.