To split a big CSV file into mutiple csv's

Hi,

I have a CSV file containing 250000 records in it . I want to split the csv file into multiple files each of 25000 records .
Is it possible ?
if you suggest me code then do paste the code !

Thanks,
Shabaz

Hi
–yah thats possible
–use READ CSV FILE activity and pass the filepath as input and get the output with a variable of type datatable named dt

–followed by that use a while loop activity with a condition like this
counter > dt.Rows.Count
where counter is a variable of type int32 with default value as 25000 defined in the variable panel
–inside the loop use a assign activity like this
dt1 = dt.AsEnumerable().Skip(dt.Rows.Count-counter).CopyToDatatable()
where dt1 is a variable of type system.data.datatable with default value as New System.Data.Datatable

–now use a WRITE CSV FILE activity and mention the input as dt1 and mention the file path as
“yourfolderpath\yourfilename”+Now.ToString(“hh_mm_ss”)+“.csv”

–then use CLEAR DATATABLE activity where mention the input as dt1
–and inside the while loop atlast use a assign activity like this
counter = counter + 25000

hope this would help you
kindly try this and let know for any queries or clarification
Cheers @SyedShabaz03

1 Like

Hi,

You can get it using Matches activity as the following steps.

First, read your whole csv fils as String type variable (strCsvData) using Read Text Activity.
Then, use Matches activity with the following settings.

  • Input : strCsvData
  • Pattern : "((.*?\n){25000}|((.*?\n?)){1,25000}$)"
  • Result : result (IEnunmerable<match> type variable)

Finally you use For Each Activity and set result in Value property, and use Write Text Activity in the loop

If you need header in each file, it requires to append the result string to header text before Write Text activity in the For Each loop.

Regards,

Hi @SyedShabaz03,

Here is the package that it has many to activity to facilitate the Text format file.

Here is the steps in picture

image

Sample : EasyText.zip (19.1 KB)

Regards
Balamurugan.S

1 Like

I’m having the csv file containing 84,536 rows.
My requirement is to split the 64,536 rows into new CSV file.
First split file can contain 64,536 and second file can contain the remaining rows.

Similarly if the count is more, it can be splitted to more csv files.

Could you please help to achieve this scenario.

1 Like

Hi @hari2010haran,

Welcome to UiPath Community !!

In above you can able to under well to split the file using the Read Specific line activity.

Regards
Balamurugan.S

Thanks Palaniyappan! The run ran well, but I could not find the csv files in the folder (indicated by the file path) created by write.csv activity.

If I understand your code correctly: “yourfolderpath\yourfilename”+Now.ToString(“hh_mm_ss”)+“.csv”, the folder is the one where i run the project in uipath?

Would appreciate your advice. Thank you.

1 Like

Yah it can be
It’s the folder path where we want to save the file

Cheers @eagerlearner

Hi , I have a same requirement …In excel sheel I have 1000 or more data… I need to split into 50 rows.

I tried this logic … it doesn’t give me any result. It’s showing blank on folder path

edit: this should be
–followed by that use a while loop activity with a condition like this
counter < dt.Rows.Count
and by this logic it deletes rows from beginning not from last…and doesn’t create any multiple excel

@chiranjit.saha1 can you explain what you mean by split into 50 rows?

So you have 1 excel with 1000+ rows, and you want to take 50 rows from it and move it to a new excel document, then move onto the next 50 rows and repeat until all rows have been put into new excel documents? Or what exactly do you want to do with the 50 rows your are splitting?

I have 1000+ rows. I need first 50 rows in one excel, 2nd 50 in next excel and keep moving.
here counter value is 50…
dt1 = dt.AsEnumerable().Take(counter).CopyToDatatable()

[ problem is in while condition – counter < dt.Rows.Count… suppose I have 53 data it’s taking first 50 data …missing 3 data like that.]

While counter <= dt.rows.count
dt1 = Dt.AsEnumerable().skip(counter).Take(50).CopyToDatatable()
Write range dt1 (make sure writing to uniquely named excel)
counter = counter + 50

This will work through your entire original table called dt and move 50 rows at a time to your new excel workbooks. Note that even though take will be set to 50, if there are less than 50 rows available it will take whatever is left without throwing an error (e.g. with 53 rows it will take 50 rows, then 3 rows, then be done)

EDIT: counter is assumed to be an integer which initializes at 0, so no need to declare it first

2 Likes

Thanks Dave. It’s working fine :slight_smile:

1 Like