Search duplicate files



I want to delete all duplicate files from a directory. I am creating a list of the files in the 1st folder and checking those names against the 2nd folder. If the same file is found in the 2nd folder, I move it to a duplicate folder. If no duplicates are found, a new list is created combining folders 1 and 2, which is then checked against folder 3 for duplicates, and so on. Is there a better way to find duplicate files in a directory? Please suggest if there is any other way to achieve this. Many thanks.



From my understanding, you want to check for duplicate files and, if there are any, create a new “folder #”.

So my thinking was you could get all files in the home directory, select only the filenames, and then take the Distinct filenames. Then compare the Count of the distinct list with the Count of the full list:

System.IO.Directory.GetFiles("C:\HomeDirectory","*",SearchOption.AllDirectories).Select(Function(x) System.IO.Path.GetFileName(x)).Distinct.Count < System.IO.Directory.GetFiles("C:\HomeDirectory","*",SearchOption.AllDirectories).Select(Function(x) System.IO.Path.GetFileName(x)).Count

So basically you do If .Distinct.Count < .Count. When that is true, Distinct dropped at least one repeated filename, so a duplicate exists somewhere under the home directory.
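
If you also need to know which names repeat, and not just whether any do, a GroupBy on the filename can give you that. This is only a rough sketch: the module name and the FindDuplicateNames helper are made up for illustration, and it compares names only (ignoring case), not file contents.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module DuplicateNameSketch
    ' Hypothetical helper: maps each repeated filename to every full path that carries it.
    Function FindDuplicateNames(root As String) As Dictionary(Of String, List(Of String))
        Return Directory.GetFiles(root, "*", SearchOption.AllDirectories) _
            .GroupBy(Function(p) Path.GetFileName(p), StringComparer.OrdinalIgnoreCase) _
            .Where(Function(g) g.Count() > 1) _
            .ToDictionary(Function(g) g.Key, Function(g) g.ToList(), StringComparer.OrdinalIgnoreCase)
    End Function

    Sub Main()
        ' "C:\HomeDirectory" is the same placeholder path used in the expressions above.
        For Each pair As KeyValuePair(Of String, List(Of String)) In FindDuplicateNames("C:\HomeDirectory")
            Console.WriteLine(pair.Key & ": " & String.Join(", ", pair.Value))
        Next
    End Sub
End Module
```

The expression after Return should also work on its own in an Assign, with a variable of type Dictionary(Of String, List(Of String)).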

If that condition is true, create a folder named after the last existing folder, with its number incremented using Regex.Replace.

lastFolder = System.IO.Directory.GetDirectories("C:\HomeDirectory").OrderBy(Function(x) x).Last

If(System.Text.RegularExpressions.Regex.IsMatch(lastFolder, "[0-9]+$"), _
    System.Text.RegularExpressions.Regex.Replace(lastFolder, "[0-9]+$", Function(m) (CInt(m.Value) + 1).ToString), _
    lastFolder + " 1")

So you can use something like this in the Create Directory activity, and feel free to store some of the code segments in variables.

Anyway, that was my idea.

Apologies if I got some of the logic wrong. What I suggested works only if combining all the folders into one list normally yields no duplicates. If the duplicates stay in those folders, the check will always see a duplicate and create a new folder on every run. So you would probably want to remove or move the duplicate files as well, and if that is desired, I might be able to provide a suggestion on that later.
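
In the meantime, here is a rough sketch of what moving the duplicates could look like, under a few assumptions of mine: files count as duplicates when their names match (ignoring case), the first copy in sorted path order is the one that stays, and the MoveDuplicates helper and the "C:\HomeDirectory\Duplicates" folder are made-up names for illustration. It does not compare file contents.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module MoveDuplicatesSketch
    ' Hypothetical helper: keeps the first file seen for each name, moves later ones to duplicateDir.
    Sub MoveDuplicates(root As String, duplicateDir As String)
        Directory.CreateDirectory(duplicateDir)
        Dim dupPrefix As String = duplicateDir.TrimEnd(Path.DirectorySeparatorChar) & Path.DirectorySeparatorChar
        Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
        Dim files As String() = Directory.GetFiles(root, "*", SearchOption.AllDirectories)
        Array.Sort(files, StringComparer.OrdinalIgnoreCase)

        For Each filePath As String In files
            ' Skip anything already sitting in the duplicates folder.
            If filePath.StartsWith(dupPrefix, StringComparison.OrdinalIgnoreCase) Then Continue For

            Dim name As String = Path.GetFileName(filePath)
            ' HashSet.Add returns True the first time a name is seen, so that copy stays put.
            If seen.Add(name) Then Continue For

            ' Avoid overwriting an earlier duplicate: prefix a counter until the target is free.
            Dim target As String = Path.Combine(duplicateDir, name)
            Dim n As Integer = 1
            While File.Exists(target)
                target = Path.Combine(duplicateDir, n.ToString() & "_" & name)
                n += 1
            End While
            File.Move(filePath, target)
        Next
    End Sub

    Sub Main()
        MoveDuplicates("C:\HomeDirectory", "C:\HomeDirectory\Duplicates")
    End Sub
End Module
```

Once the duplicates have been moved out, the Distinct.Count check should come back False on the next run, so no extra folder gets created.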