Split a CSV with 1 million rows

Does anyone here know LINQ? I need a LINQ expression that splits a CSV file into multiple CSV files with a fixed number of rows each. For example, I have a CSV with 1.5 million rows and need to split it into files of 100,000 (100K) rows each, which in this case should give 15 .csv files. I can't use a For Each activity here because it takes too long, and LINQ is more efficient.

Hi @Fabio_Villamil

Please try the workflow attached. Replace the CSV path with yours.

You can set the number of rows per split (100000 here) in this expression:

Enumerable.Range(0,csvLineList.Count\100000 -CInt(csvLineList.Count Mod 100000>0)).Select(Function(i) csvLineList.Skip(i*100000).Take(100000).ToList()).ToArray
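
To make the chunk-count arithmetic in that expression easier to follow, here is a minimal sketch on a small in-memory list. It relies on the fact that CInt(True) is -1 in VB.NET, so subtracting it adds one extra chunk whenever there is a partial remainder. The names ChunkCountDemo, ChunkCount and chunkSize are hypothetical and not part of the attached workflow; only csvLineList comes from the expression above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ChunkCountDemo
    ' Same formula as the expression above, with the chunk size as a parameter (hypothetical helper).
    Function ChunkCount(lineCount As Integer, chunkSize As Integer) As Integer
        ' CInt(True) = -1 and CInt(False) = 0, so a non-zero remainder adds one chunk.
        Return lineCount \ chunkSize - CInt(lineCount Mod chunkSize > 0)
    End Function

    Sub Main()
        ' 25 sample lines stand in for the CSV content.
        Dim csvLineList As List(Of String) =
            Enumerable.Range(1, 25).Select(Function(n) "row" & n.ToString()).ToList()
        Dim chunkSize As Integer = 10

        Dim chunks = Enumerable.Range(0, ChunkCount(csvLineList.Count, chunkSize)).
            Select(Function(i) csvLineList.Skip(i * chunkSize).Take(chunkSize).ToList()).
            ToArray()

        Console.WriteLine(chunks.Length)       ' 3 chunks: 10 + 10 + 5 lines
        Console.WriteLine(chunks.Last().Count) ' 5 lines in the last chunk
    End Sub
End Module
```

With 1,500,000 lines and a chunk size of 100000 the remainder is zero, so the same formula gives exactly 15 chunks. The expression treats every line alike, so a header line, if present, ends up only in the first chunk.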

Spliting CSV File.zip (53.8 KB)


We assume the following:

  • The project is set to Windows compatibility
  • The first line in the CSV is the header line
  • The header line is also needed in each split file

Assign Activity
arrLines | String Array =
File.ReadAllLines("YourFullPathToOriginCSV")

Assign Activity
strHeader = arrLines.First()

Assign Activity
CSVSplits | List(Of String) =

(From s In arrLines.Skip(1).Chunk(100000)
Let sf = String.Join(Environment.NewLine, s.Prepend(strHeader))
Select sgm = sf).ToList()

Then loop over CSVSplits and write each part to its own CSV file.
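
Outside of UiPath activities, the steps above, including the final write loop, could be sketched as a single VB.NET routine. This assumes .NET 6 or later (needed for Chunk). The names SplitCsvDemo, SplitCsv, sourcePath, outputFolder and rowsPerFile, as well as the part_NN.csv file naming, are hypothetical and chosen only for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module SplitCsvDemo
    ' Splits sourcePath into files of at most rowsPerFile data rows, each starting with the header.
    Sub SplitCsv(sourcePath As String, outputFolder As String, rowsPerFile As Integer)
        Dim arrLines As String() = File.ReadAllLines(sourcePath)
        Dim strHeader As String = arrLines.First()

        ' Skip the header, cut the data rows into chunks, and prepend the header to each chunk.
        Dim CSVSplits As List(Of String) =
            (From s In arrLines.Skip(1).Chunk(rowsPerFile)
             Select String.Join(Environment.NewLine, s.Prepend(strHeader))).ToList()

        ' Make sure the target folder exists, then write one file per chunk.
        Directory.CreateDirectory(outputFolder)
        For i As Integer = 0 To CSVSplits.Count - 1
            ' Hypothetical naming scheme: part_01.csv, part_02.csv, ...
            Dim targetPath As String = Path.Combine(outputFolder, $"part_{i + 1:D2}.csv")
            File.WriteAllText(targetPath, CSVSplits(i))
        Next
    End Sub
End Module
```

In a workflow, the body of the For loop corresponds to a For Each over CSVSplits with a Write Text File activity inside. Note that File.ReadAllLines splits on every line break, so this approach assumes no quoted field in the CSV contains an embedded newline.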

