Extract specific data from multiple PDF files and add it to an Excel file

Hi, I need help extracting data from PDF files (in a source folder) and adding it as rows and columns in Excel.
I have attached a sample PDF file. There is usually more than one business trip summary file.
I have also attached a sample of the required Excel output.
Mika test.pdf (183.7 KB)
The Excel sheet needs to list one row for each business trip's extracted data.
BTMS.xlsx (8.0 KB)

Hi @Mikaeel_Shaik ,

Welcome to the community!

If the PDF files are rule-based and have the same fixed structure every time, you can use the PDF package activities to read them and then write to your Excel template.

If the PDF structure changes from file to file, “Document Understanding” is the better option for extracting the labelled data, which you can then write to your Excel template.
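
To make the fixed-structure route concrete, here is a rough sketch of the idea in Python rather than UiPath: go through the text read from each PDF, pull the labelled value out with a regex, and collect one row per file. It is only an illustration and not the PDF package activities themselves. The sample text, the file names, the helper names (`extract_name`, `build_rows`, `rows_to_csv`) and the CSV output are all assumptions, and the step that reads text out of the PDF is left out.

```python
import csv
import io
import re

# Pattern taken from this thread: the label "Name", the value, then the next label "Employee".
NAME_PATTERN = re.compile(r"(Name\s+)([\w\s]+)(Employee)")

def extract_name(pdf_text):
    """Return the text between 'Name' and 'Employee', or '' if the pattern is not found."""
    match = NAME_PATTERN.search(pdf_text)
    return match.group(2).strip() if match else ""

def build_rows(texts_by_file):
    """Build one output row per PDF: (file name, extracted name)."""
    return [(file_name, extract_name(text)) for file_name, text in texts_by_file.items()]

def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["File", "Name"])
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up text standing in for what was read from two PDFs (hypothetical layout).
sample_texts = {
    "trip1.pdf": "Business Trip Summary\nName  Jane Doe\nEmployee No 123",
    "trip2.pdf": "Business Trip Summary\nName  John Smith\nEmployee No 456",
}

rows = build_rows(sample_texts)
print(rows_to_csv(rows))
```

In a real workflow each additional field would get its own pattern and its own column, and the rows would go to the Excel template instead of CSV text.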

Hope this helps.


Thanks @abhilashmohanty86
I am pretty new, so I am watching YouTube videos to try to do this.

However, I am stuck on the error below when using regex.
I have attached a screenshot of my Sequence.

The regex pattern is “(Name\s+)([\w\s]+)(Employee)”
But I get the following error when I use a message box to view the regex matches.

I think you’re going a little bit too fast.

For example, try to print what you read from the PDF first.

Then paste that text into an online regex tester to see whether your pattern gets any matches.

Finally, if you still get the error, you could google the error message.

Hope that helps.

Adrian. fanaca

Hey @Mikaeel_Shaik, instead of EmpName use EmpName(0).Value
The Matches output is a collection of Match objects, so taking the first one and reading its Value gives you the regex result as a string.
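
To show why the index is needed, here is the same situation as a small Python sketch (Python rather than the UiPath activity, purely as an illustration, with made-up sample text). The regex step hands back a collection of match objects, not a string, so you pick the first match and then read its value. Note that the whole match includes the “Name” and “Employee” labels; the second capture group holds just the name.

```python
import re

# Pattern from the question; the sample text below is an assumption.
pattern = re.compile(r"(Name\s+)([\w\s]+)(Employee)")
text = "Name  Jane Doe\nEmployee No 123"

# A collection of match objects, not a string.
matches = list(pattern.finditer(text))

first = matches[0]                   # counterpart of EmpName(0)
whole_match = first.group(0)         # counterpart of EmpName(0).Value, labels included
name_only = first.group(2).strip()   # second capture group: just the name

print(repr(whole_match))
print(name_only)
```

In UiPath the counterpart of the last step should be EmpName(0).Groups(2).Value, if you only want the name without the labels.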
Thanks and Regards.

Hi @Mikaeel_Shaik, I have created a workflow for you; please go through it.
MikaForumQuestion.zip (189.8 KB)
I am also attaching below the sample output I got by running the bot.
PDFOUT.xlsx (7.3 KB)
Thanks and Regards.

Thank you so much @sreejith.ss. I have learnt a lot from this template.
