I have been searching the forum for a few hours and I cannot solve a data comparison problem.
- I use document understanding to extract data from a pdf file
- One column of the data set includes a phone number
- The phone number is found in the file name, which I need the full name in order to carry out my excel manipulation
- After hours of searching it did not appear to be possible to include the PDF file name in the excel output during the document understanding data extraction process
- Could someone help me identify a solution to add the full pdf file name to the corresponding document understanding output row in excel?
Idea 1: If it is possible to add the file name during the document understanding process this would be the easiest option
Idea 2: The next option in my mind would be to build a list of file names and an excel list, then for each row in the excel list, if the file name matches the phone number record the file name in the excel file
If either of these ideas would work, I am having trouble executing them and could use some help.
You can do it by getting the list of files as an array of strings using Directory.GetFiles command and then check if the array contains the phone number using an iterative loop.
I have created an example where you can the bot goes through the numbers in each row of the excel sheet and then checks if any of the pdf files contain the phone number. Please have a look and do let me know if you have any more queries.
PhoneNumberExcelName.xaml (11.0 KB)
Thank you for the suggestion.
In your workflow the last activity was invalid or missing. Could you share what you did in the final step?
Hi @MF.RPA ,
The above Idea is very much a Possibility as I have also accomplished a similar updation after exporting the extraction results.
During the Exporting of the Datasets to an Excel, we could use the datatable for updating the Column Value of the Phone Number.
table is your Datatable variable, then we could update the value in the Phone Number Column using an Assign Activity in the Below way :
table(0)("Phone Number") = FileName
FileName is the data which should already be available as it is the document file being processed.
Also, we would also need to Check for
TableName of the Datatable first to Check it is not
Formatted fields table.
Let us know after you have tried this method and revert back if your facing difficulties.
All the activities I used were Excel Scope, Read Range, For Each Row in Data Table, If, Assign and a Write Line.
This solution is actually working for me now! Thank you for the explanation.
My new issue is that the File Name is writing to the wrong column but I can probably figure that one out.
This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.