How to read all PDF files from a directory and save the output to one Data Table?

Dear all,

How can I read all PDF files from a directory and save the output to one Data Table?

Thanks in advance.

Hi @hsendel

  1. Use Directory.GetFiles(FolderPath, "*.pdf") to get a string array containing the paths of all PDFs inside FolderPath
  2. Iterate through the array using the For Each activity
  3. Use Read PDF Text to read the text of the PDF file in the current iteration
  4. Use string manipulation (for example regex with the Matches activity) to extract the required data and write it to a data row of the output Data Table

These are the general steps to follow.
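As a rough illustration of the same loop outside UiPath, here is a minimal Python sketch. The function name `collect_rows`, the `read_text` callback (a stand-in for whatever reads the PDF text, such as Read PDF Text) and the "table as a list of dicts" representation are all assumptions made for this example, not part of any UiPath API.

```python
import re
from pathlib import Path
from typing import Callable


def collect_rows(folder: str, read_text: Callable[[Path], str], pattern: str) -> list[dict]:
    """Build one table (a list of row dicts) from every PDF in `folder`.

    `read_text` is a hypothetical callback that returns the text of one PDF.
    `pattern` must use named groups; each group becomes a column.
    """
    regex = re.compile(pattern)
    rows = []
    # Steps 1 and 2: list the PDF paths and loop over them (sorted for a stable row order).
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        # Step 3: read the text of the current file.
        text = read_text(pdf_path)
        # Step 4: one row per regex match, tagged with the source file name.
        for match in regex.finditer(text):
            rows.append({"file": pdf_path.name, **match.groupdict()})
    return rows
```

Because all rows are appended to the same list, the result is already a single table covering every file in the folder.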

Thanks @kumar.varun2, can you explain how to achieve step 4?

@hsendel

Let us suppose you are extracting information from an invoice PDF. The regular expression depends on the template of the PDF. To extract the invoice amount, use one regex; for the invoice number, use another, and save the results in variables. Depending on the structure of the PDF, a single regex can sometimes extract more than one piece of information.
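A minimal sketch of that idea in Python, assuming a made-up invoice layout with "Invoice No:" and "Total:" labels. Both patterns and the function name `extract_invoice_fields` are hypothetical and would need to be adapted to the real template.

```python
import re

# Hypothetical patterns for an assumed layout; real ones depend on the PDF template.
INVOICE_NUMBER = re.compile(r"Invoice\s+No\.?\s*:\s*(\S+)")
INVOICE_AMOUNT = re.compile(r"Total\s*:\s*([\d,]+\.\d{2})")


def extract_invoice_fields(text: str) -> dict:
    """Return the invoice number and amount found in `text`, or None if missing."""
    number = INVOICE_NUMBER.search(text)
    amount = INVOICE_AMOUNT.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        "invoice_amount": amount.group(1) if amount else None,
    }


sample = "ACME Ltd\nInvoice No: INV-0042\nTotal: 1,250.00 USD\n"
print(extract_invoice_fields(sample))
# → {'invoice_number': 'INV-0042', 'invoice_amount': '1,250.00'}
```

Returning None for a missing field keeps the row shape the same for every file, which makes it easier to spot PDFs that do not follow the expected template.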

After extracting all the required information from the PDF, you have to write it to a data row of a Data Table. Use the Add Data Row activity for this.

In fact, I want to join all the Data Tables extracted from each file. How can I do this?

Problem fixed by following this topic: How to read multiple pdf files from the folder - Help - UiPath Community Forum
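For readers who want the general idea of the join asked about above, here is a small Python sketch. It assumes each per-file table is a list of row dicts, and `merge_tables` is a hypothetical helper name. In UiPath itself, the Merge Data Table activity inside the For Each loop is the usual way to do this.

```python
from itertools import chain


def merge_tables(tables: list[list[dict]]) -> list[dict]:
    """Append the rows of every per-file table into one combined table.

    Columns missing from a given row are filled with None so that
    every row in the result has the same set of columns.
    """
    # Collect column names in first-seen order across all tables.
    columns = []
    for row in chain.from_iterable(tables):
        for name in row:
            if name not in columns:
                columns.append(name)
    # Rebuild each row against the full column list.
    return [
        {name: row.get(name) for name in columns}
        for row in chain.from_iterable(tables)
    ]
```

For example, merging `[{"number": "A1"}]` with `[{"number": "B2", "amount": "10.00"}]` gives two rows, where the first row has `amount` set to None.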