Compare the phone numbers in excel to a list of PDF files, write the file names in the excel file

Hello,

I have been searching the forum for a few hours and I cannot solve a data comparison problem.

Background

  1. I use document understanding to extract data from a pdf file
  2. One column of the data set includes a phone number
  3. The phone number is found in the file name, which I need the full name in order to carry out my excel manipulation

Issue

  1. After hours of searching it did not appear to be possible to include the PDF file name in the excel output during the document understanding data extraction process

Question:

  1. Could someone help me identify a solution to add the full pdf file name to the corresponding document understanding output row in excel?

Idea 1: If it is possible to add the file name during the document understanding process this would be the easiest option

Idea 2: The next option in my mind would be to build a list of file names and an excel list, then for each row in the excel list, if the file name matches the phone number record the file name in the excel file

If either of these ideas would work, I am having trouble executing them and could use some help.

Thank you,

Hello,
You can do it by getting the list of files as an array of strings using Directory.GetFiles command and then check if the array contains the phone number using an iterative loop.
I have created an example where you can the bot goes through the numbers in each row of the excel sheet and then checks if any of the pdf files contain the phone number. Please have a look and do let me know if you have any more queries.

PhoneNumberExcelName.xaml (11.0 KB)

1 Like

Hi,

Thank you for the suggestion.

In your workflow the last activity was invalid or missing. Could you share what you did in the final step?

Hi @MF.RPA ,

The above Idea is very much a Possibility as I have also accomplished a similar updation after exporting the extraction results.

During the Exporting of the Datasets to an Excel, we could use the datatable for updating the Column Value of the Phone Number.

Suppose table is your Datatable variable, then we could update the value in the Phone Number Column using an Assign Activity in the Below way :

table(0)("Phone Number") = FileName

FileName is the data which should already be available as it is the document file being processed.

Also, we would also need to Check for TableName of the Datatable first to Check it is not Formatted fields table.

Let us know after you have tried this method and revert back if your facing difficulties.

1 Like

All the activities I used were Excel Scope, Read Range, For Each Row in Data Table, If, Assign and a Write Line.

This is what I see:

Hi,

This solution is actually working for me now! Thank you for the explanation.

My new issue is that the File Name is writing to the wrong column but I can probably figure that one out.

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.