Quickest way to compare a list of files in a folder to a list of files in Excel

I am running through the files in a folder and recording each file name, plus a single piece of text data from the file, in a two-column Excel file. The folder has 500k folders in it.

In case of an error, power loss, etc., I would like my process to be able to scan the Excel document, compare it to the folder, and pick up where it left off. Besides using a loop of some sort, is there a quicker way to find the file it left off on?

Files are numbered sequentially, 1 to 9999999.
No new files are being added.


@godzilla_king_of_mon Funny you should ask this. The other day I logged the ID of each record from an Excel file to a log file as it was processed. If the app was killed partway through the records, then on restart the first step was to read the log file rows into a list, sort it in descending order, and take the top record. I put that ID in a variable and used it as my “resume” point for processing the records in the Excel file. Kind of clunky, but it worked. However, I did see one small problem: if I killed the task at a particular point, it didn’t get a chance to write the last ID to the log, so I do still have some duplicate processing. :frowning:
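
For anyone who wants to see that resume-from-log idea spelled out, here is a minimal sketch in Python rather than a UiPath workflow. It assumes the log is a plain CSV with one numeric ID per row and that IDs are processed in ascending order; `last_logged_id`, `process_all`, and the `handle` callback are made-up names for illustration, not anything from the workflow described above. Flushing after every row narrows the window in which a kill loses the last ID, but it cannot close it completely, so `handle` should still be safe to repeat for one record.

```python
import csv
import os


def last_logged_id(log_path):
    """Return the highest ID in the log, or None if the log is missing or empty."""
    if not os.path.exists(log_path):
        return None
    with open(log_path, newline="") as f:
        # Skip blank or non-numeric rows instead of failing on them.
        ids = [int(row[0]) for row in csv.reader(f) if row and row[0].isdigit()]
    return max(ids) if ids else None


def process_all(record_ids, log_path, handle):
    """Process IDs in ascending order, skipping everything up to the resume point."""
    resume_after = last_logged_id(log_path)
    with open(log_path, "a", newline="") as log:
        writer = csv.writer(log)
        for record_id in sorted(record_ids):
            if resume_after is not None and record_id <= resume_after:
                continue  # already logged in an earlier run
            handle(record_id)
            writer.writerow([record_id])
            # Push the row to disk right away so a kill loses at most one ID.
            log.flush()
            os.fsync(log.fileno())
```

If the process dies between `handle(record_id)` and the log write, that one record is processed again on restart, which matches the duplicate processing described above.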

Yeah, I think you will need to store some status for the item in a file. Let’s say you log this status in excel, then you can use LINQ to check the status before processing the item.

Here is some pseudocode that might help explain this better:

Read Range of log file //I'll call dt1

ForEach file in System.IO.Directory.GetFiles(folder,"*.*",System.IO.SearchOption.AllDirectories)
    Assign activity: status = "" //reset so the previous file's status does not carry over
    Assign activity: rowMatches = dt1.AsEnumerable.Where(Function(r) r("Filename").ToString = System.IO.Path.GetFileName(file)).ToArray
    If activity: condition: rowMatches.Count > 0
        Assign activity: status = rowMatches.Last.Item("Status").ToString.Trim

    If activity: condition: Not status.ToUpper.Contains("COMPLETE")
        <process file>
        Update status for row with Add Data Row or Assign activity

So, it finds the row in the datatable and, if it found a row, gets the status of that row. If the status does not contain “complete”, then do something with the file.
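
One thing to watch with 500k items: a lookup that filters the whole datatable for every file rescans the log each time. A possible alternative is to build a set of completed file names once and then test membership per file. The sketch below is in Python, not UiPath, and assumes the log has been saved as a CSV with `Filename` and `Status` columns kept outside the scanned folder; `completed_filenames` and `pending_files` are hypothetical helper names.

```python
import csv
import os


def completed_filenames(log_path):
    """Read the log once and return the set of file names marked complete."""
    done = set()
    if not os.path.exists(log_path):
        return done
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            name = (row.get("Filename") or "").strip()
            status = (row.get("Status") or "").upper()
            if "COMPLETE" in status:
                done.add(name)
            else:
                # The last row for a file wins, like taking the last match.
                done.discard(name)
    return done


def pending_files(folder, log_path):
    """Yield the full path of every file under folder that is not yet complete."""
    done = completed_filenames(log_path)
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name not in done:  # set lookup, no rescan of the log
                yield os.path.join(root, name)
```

Usage would be something like `for path in pending_files(folder, log_path): ...`, appending a row to the log after each file is processed.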



@godfathr @ClaytonM

I figured I would end up doing something along those lines. The reason I asked is that one of the RPA tools I’ve used in the past had a feature built in just for this kind of thing. I needed to make sure I wasn’t making more work for myself than I needed to.


Wow. I like this. I might go back and update my workflow to use this idea. :smiley:
