How to go about deleting a datarow immediately after processing it

I’m using nested “for each row” loops to transfer data from one datatable to another based on matches in specific columns. I have an OutputTable (the datatable I want to write to excel) and an InputTable(where my data extracted from an XML file is stored). I check for matches in the first few rows, and if found, copy the data from the other columns to the output table. Since the size of the data table gets quite big, I wanted to delete the rows from the input table after processing them, since they will not be needed anymore, and they cause a huge increase in processing time. I tried using the “Delete datarow” activity from within the inner loop, but it threw an error “Datatable altered, might not be able to process next iteration.” I then tried collecting all the indexes of the processed rows from the input table in a List and deleting these rows before the next iteration of the outer/super loop. It all went fine, but 3 times I got an error “Remove Data Row: There is no row at position 258.” (which I “Ignored” during the debugging run). The output spreadsheet had some faulty data in some of the lower rows.
So, could someone guide me to a proper way of doing this? Attaching my .xaml for the loops.
Debugging.xaml (128.7 KB)

This is because the For Each Row activity uses the index of the row item. This would change if you delete the row, so the process will not allow it.

What I’ve done to get around this is to create a list of indices from the table when I want to delete. Once iteration over the table is complete, iterate over the list, and delete the row at each indicated index. It is very important that you create an offset for the row you’re deleting when you do this, however. Once you’ve deleted a row, the indices are shifted down by 1, so your row list indices will be 1 off each time you delete.

If you really want to delete the rows you’ve already copied to your output table, you’ll have to wrap your logic within a do-while loop. Once you’ve found and copied from Input to Output, break the for-each, alter the InputDT (remove row) and start the for-each loop again. Set a control variable for the while condition.

@rpa4
I assume the reason why this Error is thrown is already cleared. Different stragegies are possible

  • A: Store the row Index to a list as @Anthony_Humphries mentioned
  • B: Store the row as datarow to a list

Removal can be organized e.g.
For A:

  • As mentioned by Anthony along Handling the corrected Index
  • Use yourlist.reverse and remove from highest to lowest Index, then No Index correction IS to do

For B:

  • Remove datarow and use the datarow aß argument
  • Datatable Set Operation except and do an except on Origin Datatable except the rows from the rows to remove list

I think your strategy of deleting each row from the source data table will increase even more of cpu and memory usage in your process, what you most likely should avoid is to nest those loops and think of a more performatic approach.

To bcorrea’s point, you could do it in the same loop, but you would need a While or Do While loop that stops when the current row index reaches the datatable row count (as mentioned by pedro.arroyo). Be careful with this, though, as when you delete in the while loop, you’ll want to not increment your row index, as the deletion shifts the data up.

Thanks for all the replies, guys. I guess the main problem was that I didn’t realize that the indices were shifting up after each deletion.Trying to optimize performance is a pain :sweat_smile:

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.