File matching and sorting

Hello guys

I'm trying some new things with UiPath and I ran into a problem.

I have a folder of PDF files that are named after values extracted from the PDFs. The PDFs without a date in the name contain the name of the sender (I have it saved in a data table). The sender varies from file to file. I need to do the following:

  • pair each PDF without a date with the PDF with a date (by matching the number in the file name)
  • sort the PDFs by sender into prepared folders (always moving the matched pair together)

So, for example: match 5865105 with 5865105_27.04.2025, then read the extracted sender, and then move the pair to that sender's folder.

I hope this makes sense. I tried everything I'm capable of, but nothing worked, so any hint helps.

Hey @michalhyza12, you can use the Directory.GetFiles method to get the files from the folder:

pdfFiles = Directory.GetFiles(folderPath, "*.pdf")

Or else you can use the Getfiles from folder activity and filter by the *.pdf extension.

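Group Files by Number
Use a For Each to loop over the files and extract the number part using Regex.
Example Regex for the file name: ^\d+
Apply it to the file name only (Path.GetFileNameWithoutExtension), not the full path, so digits in folder names are not picked up.
Use Match.Value to get the number.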

Build a dictionary:

filePairs = New Dictionary(Of String, List(Of String))

For each file, add it to the dictionary:

If filePairs.ContainsKey(number) Then
filePairs(number).Add(filePath)
Else
filePairs(number) = New List(Of String) From {filePath}
End If
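
The grouping step above could be put together roughly like this, for example inside an Invoke Code activity. This is only a sketch: GroupByNumber is a made-up helper name, and it assumes every file name starts with the number.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module GroupPdfsByNumber
    ' Hypothetical helper: groups file paths by the leading digits of the file name.
    Function GroupByNumber(filePaths As IEnumerable(Of String)) As Dictionary(Of String, List(Of String))
        Dim filePairs As New Dictionary(Of String, List(Of String))
        For Each filePath As String In filePaths
            ' Look only at the file name so digits in folder names are ignored.
            Dim name As String = Path.GetFileNameWithoutExtension(filePath)
            Dim m As Match = Regex.Match(name, "^\d+")
            If Not m.Success Then Continue For ' skip files without a leading number
            If Not filePairs.ContainsKey(m.Value) Then
                filePairs(m.Value) = New List(Of String)
            End If
            filePairs(m.Value).Add(filePath)
        Next
        Return filePairs
    End Function

    Sub Main()
        ' Sample names only; in the real workflow pass Directory.GetFiles(folderPath, "*.pdf").
        Dim sample As String() = {"5865105.pdf", "5865105_27.04.2025.pdf", "7001.pdf"}
        For Each pair As KeyValuePair(Of String, List(Of String)) In GroupByNumber(sample)
            Console.WriteLine(pair.Key & ": " & pair.Value.Count & " file(s)")
        Next
    End Sub
End Module
```

A key with only one file in its list is a PDF without a partner, so you can skip or log it before the sorting step.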

For Each Pair → Extract Sender from the One Without Date
Loop over the dictionary:

Identify which of the two files does NOT contain a date in the name (Not Regex.IsMatch(filename, "_\d{2}\.\d{2}\.\d{4}")). Note the escaped dots: an unescaped . matches any character.

From that file, extract sender name using whatever method you have (DataTable lookup, keyword matching, or actual PDF text extraction with Read PDF Text).
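
If the sender comes from the data table, the step above might look something like this. It is a sketch under assumptions: the column names "Number" and "Sender" and both helper names are invented for illustration, so adjust them to whatever your data table really contains.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.IO
Imports System.Linq
Imports System.Text.RegularExpressions

Module SenderLookup
    ' Hypothetical helper: returns the first file whose name does not end in _dd.MM.yyyy.
    Function FindUndatedFile(pairFiles As IEnumerable(Of String)) As String
        Return pairFiles.FirstOrDefault(
            Function(f) Not Regex.IsMatch(Path.GetFileNameWithoutExtension(f), "_\d{2}\.\d{2}\.\d{4}$"))
    End Function

    ' Hypothetical helper: assumes columns "Number" and "Sender" exist in the table.
    Function LookupSender(senders As DataTable, number As String) As String
        For Each row As DataRow In senders.Rows
            If row("Number").ToString() = number Then Return row("Sender").ToString()
        Next
        Return Nothing ' no sender known for this number
    End Function

    Sub Main()
        ' Made-up sample data, only to show the calls.
        Dim senders As New DataTable()
        senders.Columns.Add("Number", GetType(String))
        senders.Columns.Add("Sender", GetType(String))
        senders.Rows.Add("5865105", "Acme Ltd")

        Dim pair As String() = {"5865105_27.04.2025.pdf", "5865105.pdf"}
        Dim undated As String = FindUndatedFile(pair)
        Console.WriteLine(undated)
        Console.WriteLine(LookupSender(senders, Path.GetFileNameWithoutExtension(undated)))
    End Sub
End Module
```

LookupSender returns Nothing when the number is not in the table, so check for that before building the target folder.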

Move Files to a Folder Named After the Sender
targetFolder = Path.Combine(rootFolder, senderName)
Use Directory.CreateDirectory to ensure the folder exists (it does nothing if the folder is already there, so a separate Directory.Exists check is optional).

Then use Move File activity to move both files (the pair) to the sender folder.
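
If you would rather do the move in code than with the Move File activity, a rough sketch could look like this. MovePairToSenderFolder is a made-up name, and replacing invalid characters with "_" is my own assumption about how to handle sender names that are not valid folder names.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module MovePair
    ' Hypothetical helper: moves all files of one pair into rootFolder\senderName.
    Sub MovePairToSenderFolder(rootFolder As String, senderName As String, pairFiles As IEnumerable(Of String))
        ' Assumption: characters that are invalid in file names are replaced with "_".
        Dim safeName As String = senderName.Trim()
        For Each c As Char In Path.GetInvalidFileNameChars()
            safeName = safeName.Replace(c, "_"c)
        Next

        Dim targetFolder As String = Path.Combine(rootFolder, safeName)
        Directory.CreateDirectory(targetFolder) ' does nothing if the folder already exists

        For Each source As String In pairFiles
            Dim destination As String = Path.Combine(targetFolder, Path.GetFileName(source))
            ' Skip instead of failing when a file with that name is already there.
            If Not File.Exists(destination) Then File.Move(source, destination)
        Next
    End Sub

    Sub Main()
        ' Demo in a temp folder with two empty placeholder files.
        Dim root As String = Path.Combine(Path.GetTempPath(), "pdf-sort-demo")
        Directory.CreateDirectory(root)
        Dim a As String = Path.Combine(root, "5865105.pdf")
        Dim b As String = Path.Combine(root, "5865105_27.04.2025.pdf")
        File.WriteAllText(a, "")
        File.WriteAllText(b, "")
        MovePairToSenderFolder(root, "Acme Ltd", New String() {a, b})
    End Sub
End Module
```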

cheers


Hi @michalhyza12

Could you try this approach for the matching part of your process?
Input:
Dummy PDF Files with the same name:

Workflow:

Explanation:

  1. Assign the list of file names:
listOfFileNames = System.IO.Directory.GetFiles("Data\PDF files\").Select(Function(filePath) System.IO.Path.GetFileName(filePath)).ToList()
  2. Iterate through the list and use an If condition to pick the files without a date in the name:
    The If condition: Not currentFile.Contains("_")
    The Assign statement inside the If (the trailing "_" keeps a shorter number such as 586 from matching 5865105_27.04.2025.pdf):
match = listOfFileNames.FirstOrDefault(Function(fileName) fileName.StartsWith(System.IO.Path.GetFileNameWithoutExtension(currentFile) & "_"))

The Output:

I hope this helps.
Could you tell me more about the sorting part? I need further explanation.

Please try this:
Use allFiles = Directory.GetFiles(folderPath, "*.pdf") to get all PDFs.

Separate into Two Lists:

Use LINQ or For Each to split:

  • Files with date (e.g., contain "_").
  • Files without date.

filesWithDate = allFiles.Where(Function(f) Path.GetFileNameWithoutExtension(f).Contains("_")).ToList()

filesWithoutDate = allFiles.Where(Function(f) Not Path.GetFileNameWithoutExtension(f).Contains("_")).ToList()

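Then loop over the files without a date and move each matched pair (GetNumberFromFilename, ReadPDF, ExtractSenderFromText and MoveFile are placeholders for your own logic or activities):

For Each fileNoDate In filesWithoutDate
    number = GetNumberFromFilename(fileNoDate)
    ' The trailing "_" keeps a shorter number such as 586 from matching 5865105_27.04.2025
    matchFile = filesWithDate.FirstOrDefault(Function(f) Path.GetFileNameWithoutExtension(f).StartsWith(number & "_"))

    If matchFile IsNot Nothing Then
        pdfText = ReadPDF(fileNoDate)
        sender = ExtractSenderFromText(pdfText)

        destinationFolder = Path.Combine(baseOutputFolder, sender)
        Directory.CreateDirectory(destinationFolder)

        MoveFile(fileNoDate, Path.Combine(destinationFolder, Path.GetFileName(fileNoDate)))
        MoveFile(matchFile, Path.Combine(destinationFolder, Path.GetFileName(matchFile)))
    End If
Next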

I tried it with your workflow, but it just gave me this without any match (I added some random extra numbers).

Extra numbers?
Could you explain the variety of possible inputs, so that I can give you a dynamic solution?

Just some random numbers added to the file names, nothing serious. Now the file names are set as they are in the screenshot.
