Extracting text from several PDF files in a folder using Read PDF Text

I need to extract text from several PDF files and populate an Excel file with it using write to text. I have looked at videos and directions online, but they are all about 3 years old. The current UI in UiPath StudioX is totally different.

Hey @Olu_Emmanuel, these are the steps you need to follow to extract the data and write it to Excel.

First, create your Excel format (for example, your column names) using the Build Data Table activity.

After this, use the For Each File in Folder activity, which gives you every PDF file in the folder. Inside the loop, use the Read PDF Text or Read PDF With OCR activity, depending on your PDF format.

Next, use string manipulation or regex to extract the specific content from the PDF text. Do this in an Assign activity, then pass the assigned variable to an Add Data Row activity. Also pass the data table variable from Build Data Table to Add Data Row, so the row is stored in that data table.

Finally, use the Write Range activity to write the data table to the Excel file.
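
For anyone who wants to see the extract-then-add-row logic outside UiPath, here is a minimal Python sketch of it. It is not UiPath code: the column names, the regex patterns, and the sample invoice text are all hypothetical and would need to match your own PDFs.

```python
import re

# Hypothetical column names and patterns; adjust them to your own PDF layout.
COLUMNS = ["FileName", "InvoiceNo", "Total"]
PATTERNS = {
    "InvoiceNo": re.compile(r"Invoice\s*No[.:]?\s*(\S+)"),
    "Total": re.compile(r"Total[.:]?\s*([\d.,]+)"),
}


def extract_row(file_name, pdf_text):
    """Build one row from the text of one PDF (the Add Data Row step)."""
    row = {"FileName": file_name}
    for column, pattern in PATTERNS.items():
        match = pattern.search(pdf_text)
        # Leave the cell empty when the pattern is not found in this PDF.
        row[column] = match.group(1) if match else ""
    return [row[column] for column in COLUMNS]


def build_table(texts_by_file):
    """Collect one row per file, like filling the data table inside the loop."""
    return [extract_row(name, text) for name, text in texts_by_file.items()]
```

Calling `build_table({"a.pdf": "Invoice No: INV-001\nTotal: 1,250.00"})` gives a single row, `["a.pdf", "INV-001", "1,250.00"]`, which corresponds to what Write Range would then put into the sheet.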

cheers


You can follow the steps below.

1. Start a New Project

  • Open UiPath StudioX.
  • Click “New Task” or “Blank Task”.
  • Name your project (e.g., ExtractPDFtoExcel) and click Create.

2. Add Excel File

  • In the Activities Panel, search for “Use Excel File”.
  • Drag Use Excel File to the Main workflow.
  • Click “+” to create a new Excel file (or browse to an existing one).
  • Rename the file reference (e.g., OutputExcel.xlsx).

3. Set Up the PDF Folder

  • In the Activities panel, search “For Each File in Folder”.
  • Drag it inside the Use Excel File container (not before it), so that the activities in the loop can write to the workbook.
  • In the Folder field, click “Browse” and select the folder containing your PDF files.
  • In the Filter field, enter *.pdf to target only PDF files.

4. Extract Text from PDF

  • Inside the For Each File in Folder, search and drag “Read PDF Text” (from the PDF activities package).
    • If not available, go to Manage Packages > Official tab and install UiPath.PDF.Activities.
  • Set the file path to CurrentFile.FullPath.
  • Store the output in a variable (e.g., pdfText).

5. Write Text to Excel

  • Inside the Use Excel File container:
    • Drag a “Write Cell” activity (or use “Append to Excel” if you want to keep adding rows).
    • Set the Sheet Name (e.g., Sheet1).
    • Enter the target cell address, e.g., A2 for the first row.
    • In the value field, put the pdfText variable.

If you’re looping through multiple PDFs:

  • Use an index variable to move down rows (e.g., A2, A3, A4, etc.).
  • Or use the “Append to Excel” activity to auto-append rows.
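
The index-to-row idea can be sketched in a few lines of Python, shown here only to illustrate the arithmetic. The function names, the 0-based index, and the default start at row 2 (row 1 being assumed to hold headers) are assumptions for this example, not anything UiPath provides.

```python
def cell_for_index(index, column="A", first_row=2):
    """Return the cell address for the index-th PDF (0-based), e.g. 0 -> A2."""
    return f"{column}{first_row + index}"


def plan_writes(texts):
    """Pair each extracted text with the cell it should be written to."""
    return [(cell_for_index(i), text) for i, text in enumerate(texts)]
```

With three PDFs, `plan_writes` pairs the texts with A2, A3, and A4 in order, which is the same progression the index variable produces in the Write Cell activity.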

Do you have images to show this? That might be helpful. It was a bit difficult to follow the whole explanation.

Please refer to the attached solution.
PDF Automation.zip (296.2 KB)
