UiPath Studio (2017) crashes when working with large volumes of data

Hi, my UiPath crashes every time I run this workflow:

  • Read several Excel files from a folder
  • From every Excel file, read every sheet and collate all the sheets together
  • Collate the data from all files (sheets may have additional columns; append any column that is not already present; see the collation sketch after this list)
  • Evaluate the combined file row by row, checking the validity of email, contact details etc. (see the validity-check sketch below)
  • Check for duplicate rows; this takes a lot of time because every pair of rows has to be compared. The actual content per row may differ, but two rows can still be considered duplicates, because I need to compare the content of first name, last name, email, company etc.
  • Build reverse references for duplicate rows, e.g. if rows 1 and 5 are duplicates, row 5 also has to refer back to row 1
  • Append an error column
  • Append a duplicate column
  • Highlight rows with errors and duplicates
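
To show what I mean by collating, here is a rough Python/pandas sketch of the logic I am after (illustrative only; the real workflow uses Excel read activities, and the folder path and file pattern below are made up):

    # Rough sketch of the collation step (illustrative only).
    import glob
    import pandas as pd

    frames = []
    for path in glob.glob(r"C:\input\*.xlsx"):          # example folder, not my real path
        sheets = pd.read_excel(path, sheet_name=None)   # dict: sheet name -> DataFrame
        frames.extend(sheets.values())

    # concat takes the union of the columns, so a sheet with extra columns
    # simply adds those columns and the missing cells stay empty
    combined = pd.concat(frames, ignore_index=True, sort=False)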

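For the validity check, the per-row rule is along these lines (again just a sketch; the regex and the column names "Email" and "Phone" are examples, not my actual rules):

    import re

    EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # deliberately simple email pattern

    def row_errors(row):
        # Return a short error string for one row; empty string means no errors.
        errors = []
        if not EMAIL_RE.match(str(row.get("Email", "")).strip()):
            errors.append("invalid email")
        if not str(row.get("Phone", "")).strip():
            errors.append("missing contact details")
        return "; ".join(errors)

    print(row_errors({"Email": "john@example.com", "Phone": ""}))   # -> missing contact details
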
Even if I work with just two files, I get about 30 columns and 120 rows, and while the workflow runs, UiPath Studio crashes… UiPath Robot continues executing, but it does not reach the point where it highlights the errors and duplicates…

How do I prevent Studio from crashing?

It should not crash… Can you share the input files and the workflow?

Read everything into a DataTable, add a column with the key, then remove the duplicates.
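
Roughly this idea, in Python terms (the key columns below are only an example; build the key from whatever fields define a duplicate for you):

    import pandas as pd

    # Tiny example table; in the workflow this would be your combined DataTable.
    combined = pd.DataFrame({
        "FirstName": ["John ", "john"],
        "LastName":  ["Smith", "Smith"],
        "Email":     ["john@x.com", "JOHN@X.COM"],
    })

    # Build one normalized key per row, then keep only the first row per key.
    combined["Key"] = (
        combined["FirstName"].str.strip().str.lower() + "|" +
        combined["LastName"].str.strip().str.lower() + "|" +
        combined["Email"].str.strip().str.lower()
    )
    deduplicated = combined.drop_duplicates(subset="Key", keep="first")
    print(deduplicated)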

Sorry, can’t give the workflow.

I don’t want to remove duplicates, just mark them for review. Also, the definition of “duplicate” here is not that both rows have exactly the same content. I need to compare first name and last name, and sometimes the data has the full name in the first-name field, e.g. “John Smith” vs “Smith John”. I also check whether both rows have an email and a company, or whether one or both of them is missing the email or the company, and then decide from the available information whether the rows refer to the same person. The rest of the information can differ.
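
To make that rule concrete, this is roughly what I am trying to implement (a Python sketch; the field names and the exact conditions are simplified):

    def name_tokens(first, last):
        # "John Smith" in the first-name field vs "Smith John" should compare equal,
        # so compare the set of name words rather than the raw strings.
        return frozenset((str(first) + " " + str(last)).lower().split())

    def same_person(a, b):
        # a and b are row dicts; the field names here are just an example.
        names_a = name_tokens(a.get("FirstName", ""), a.get("LastName", ""))
        names_b = name_tokens(b.get("FirstName", ""), b.get("LastName", ""))
        if names_a != names_b:
            return False
        email_a = str(a.get("Email", "")).strip().lower()
        email_b = str(b.get("Email", "")).strip().lower()
        comp_a = str(a.get("Company", "")).strip().lower()
        comp_b = str(b.get("Company", "")).strip().lower()
        # If both rows have an email it must match, and the same goes for company;
        # a value missing on either side does not rule the match out.
        if email_a and email_b and email_a != email_b:
            return False
        if comp_a and comp_b and comp_a != comp_b:
            return False
        return True

    print(same_person({"FirstName": "John Smith", "LastName": "", "Email": "j@x.com"},
                      {"FirstName": "Smith", "LastName": "John", "Company": "Acme"}))   # True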

I notice that the Write Line activities produce tens of thousands of output log entries as the workflow runs. Could this be causing the crash?

That should not be the cause… But remove the Write Lines in production anyway; keep them for development/debugging only.

Could it be that comparing every pair of rows is a problem that grows much more complex as the input gets more data?
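
For context, the duplicate check compares every row against every later row, so the number of comparisons grows roughly with the square of the row count. A quick illustration (120 rows is what I get from two files; 1200 is just a hypothetical larger input):

    # With n rows, comparing every pair means n * (n - 1) / 2 comparisons.
    def count_comparisons(n_rows):
        return n_rows * (n_rows - 1) // 2

    print(count_comparisons(120))    # 7,140 comparisons
    print(count_comparisons(1200))   # 719,400 comparisons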