PDF to Excel: Multiple RegEx matches in multiple columns

Hi everyone,

I’m very new to UiPath and programming as a whole, so I don’t have a lot of experience in this.

I’m trying to create a program that reads a single, random PDF file, scrapes data from it using regex, and writes the matches into separate columns in Excel. It should read a “Country”, then a number of “Customer Reference” values, each with a corresponding “Customer Item”.
The number of “Reference” and “Item” values may vary from file to file.
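
To make the goal concrete outside UiPath, here is a minimal Python sketch of collecting every regex match per field and lining the matches up as columns. The sample text, the label format ("Country: ...") and the patterns are assumptions made for illustration; the real layout of the anonymized file is not shown in this thread.

```python
import re
from itertools import zip_longest

# Hypothetical sample text; the actual PDF/txt layout is an assumption.
SAMPLE = """Country: Denmark
Customer Reference: REF-001
Customer Item: ITEM-A
Customer Reference: REF-002
Customer Item: ITEM-B
"""

# Assumed patterns: each label is followed by its value on the same line.
PATTERNS = {
    "Country": r"Country:\s*(.+)",
    "Customer Reference": r"Customer Reference:\s*(.+)",
    "Customer Item": r"Customer Item:\s*(.+)",
}


def extract_columns(text):
    """Return every match for each field, not just the first one."""
    return {
        name: [value.strip() for value in re.findall(pattern, text)]
        for name, pattern in PATTERNS.items()
    }


def columns_to_rows(columns):
    """Turn per-field match lists into table rows, padding short columns."""
    header = list(columns)
    body = zip_longest(*columns.values(), fillvalue="")
    return [header] + [list(row) for row in body]


if __name__ == "__main__":
    for row in columns_to_rows(extract_columns(SAMPLE)):
        print(row)
```

The key point is that each field yields a list of matches rather than a single value, so columns of different lengths have to be padded (or the country repeated) before they are written out.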

I have created a program that returns the first match, but nothing further. I have also created a program that finds every reference, but I can’t make it fit into my first program. (I tried, but I have disabled the activity, since it didn’t work.)

The picture below might describe my problem better:

image

Due to company policies, I cannot share the PDF, but I have created a txt-based version of the program instead, along with an anonymized sample file.

Sample txt-file:
PDF_output_anonymized.txt (387 Bytes)

Excel_output-file:
Test_CreateTableTest.xlsx (9.7 KB)

Program that returns the first match in each column:
Txt_to_xlsx.xaml (15.9 KB)

Program that returns every “Customer Reference” match:
MatchCustomerReference.xaml (7.2 KB)

The problem seems very similar to another user’s issue, but their solution does not work for me:

YouTube video that I took inspiration from:
How to extract data from PDF's with RegEx in UiPath - Full Tutorial - YouTube

I would really appreciate your help, thank you.

Best regards,
Daniel Petersen

Use the Add Data Row activity, put the expression below in its array (ArrayRow) field, and then append the result to the worksheet.

This goes in the array field of the Add Data Row activity, with a DataTable (dt) as the target: {CountryVar,CustRefVAr,CustItemVar}
After this, use Append Range and pass it that DataTable.
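
As a rough illustration of the same idea outside UiPath, the Python sketch below builds one row per reference/item pair, which mirrors calling Add Data Row once per match inside a loop before appending the table to the sheet. The function name and the sample values are hypothetical, and it assumes each reference pairs with the item at the same position.

```python
def build_rows(country, references, items):
    """Build one [country, reference, item] row per matched pair.

    Assumption: the n-th reference belongs to the n-th item.
    """
    rows = []
    for reference, item in zip(references, items):
        # Equivalent in spirit to one Add Data Row call with a 3-element array.
        rows.append([country, reference, item])
    return rows


if __name__ == "__main__":
    # Hypothetical values standing in for the regex results.
    table = build_rows("Denmark", ["REF-001", "REF-002"], ["ITEM-A", "ITEM-B"])
    for row in table:
        print(row)
```

With a single Add Data Row call and single-valued variables, only one row (the first match) ends up in the table, which matches the behaviour described in the question.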

Thank you for your answer!

Where should I add the Add Data Row activity? After the first one, or instead of it?
I don’t think I have created them as variables; rather, I created them in the Data Table using RegEx.

Would it be possible to show me?

Sincerely,
Daniel

After extracting each value with regex, store it in a variable. Then, once all the extractions are done, use Add Data Row and pass the variables in as I showed above.

Hi Sree,

I tried the “store it in a variable” approach, but it didn’t work out. Could you guide me a bit?

Best,
Daniel Petersen

Send me the workflow. It should contain the extracted variables; then I will make the changes in it and get back to you.

Txt_to_xlsx.xaml (18.7 KB)
Hi @daniel.petersen.contracto, I am attaching my solution. Let me know whether it works for you.

Hi Sree,

This is the unedited workflow that I also posted in my original post:

Txt_to_xlsx.xaml (15.9 KB)

I have tried to store the extracted text as a variable both before and after the “Read Text File” activity, but it only resulted in errors. I must admit that I still don’t understand your instructions, but I truly value the time you are spending on helping me.

Thank you!

Best,
Daniel Petersen

Hi Naveen,

Thanks a lot for your help. I can run the code without errors, but when I add a “Write Range” to get the results into Excel, the output is blank. How did you manage to write it to Excel?

I have posted my attempt at the Excel implementation below this text.

Thanks for your help!

Best,
Daniel Petersen

Hi @daniel.petersen.contracto, thanks for looking at my solution. Instead of “Write Range”, my solution uses “Write Cell”. I have used a Switch to map each column name in dtData to the matching column header in Excel. For example:
Country matches are written under the Country header (column A)
Customer references go in column B
Customer items go in column C.
So forget about Add Data Row and Write Range :joy: as far as this solution is concerned.
Let me know if you have further questions.
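
For readers without the attachment, the mapping idea can be sketched in Python roughly as follows. This is not the attached workflow: the dictionary stands in for the Switch, the function name is hypothetical, and the assumption that data starts in row 2 (below a header row) is mine.

```python
# Stand-in for the Switch: column name -> Excel column letter.
COLUMN_LETTERS = {
    "Country": "A",
    "Customer Reference": "B",
    "Customer Item": "C",
}


def cell_writes(columns, first_row=2):
    """List (cell address, value) pairs, one per match.

    Assumption: row 1 holds the headers, so data starts at first_row=2.
    """
    writes = []
    for name, values in columns.items():
        letter = COLUMN_LETTERS[name]
        for offset, value in enumerate(values):
            # Each pair corresponds to one Write Cell call, e.g. "B3".
            writes.append((f"{letter}{first_row + offset}", value))
    return writes


if __name__ == "__main__":
    # Hypothetical matches standing in for the contents of dtData.
    matches = {
        "Country": ["Denmark"],
        "Customer Reference": ["REF-001", "REF-002"],
        "Customer Item": ["ITEM-A", "ITEM-B"],
    }
    for address, value in cell_writes(matches):
        print(address, value)
```

Writing cell by cell means each column can have a different number of matches without any padding, at the cost of one write per value.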

@daniel.petersen.contracto, I am attaching my output for your reference.
TableTest.xlsx (8.5 KB)