Looping through PDF files to extract specific selected data

Hi,
I want to extract selected data from many PDFs (all in the same format),
for example: Name, Address, Email, Mobile.
How can I scan the files and extract this data automatically?

Regards,
Soe


@Soe_Min_Latt

Welcome to the UiPath Community.

You can use Regular Expressions or String manipulation functions to retrieve required text from PDF files.
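To illustrate the regular-expression idea outside of Studio, here is a minimal Python sketch. It assumes the PDF text has already been read into a string and that each field sits on its own line as `Label: value`. The sample text, the `FIELD_PATTERNS` table and the `extract_fields` helper are all made up for illustration, so the patterns would need adjusting to your real layout.

```python
import re

# Hypothetical text as it might come out of a PDF-to-text step; the labels
# and layout are assumptions, not taken from the original files.
sample_text = """Name: Jane Doe
Address: 12 Example Street, Yangon
Email: jane.doe@example.com
Mobile: +95 9 123 456 789
"""

# One pattern per field: match the label at the start of a line,
# then capture the value up to the end of that line.
FIELD_PATTERNS = {
    "Name": re.compile(r"^Name[ \t]*:[ \t]*(.+)$", re.MULTILINE),
    "Address": re.compile(r"^Address[ \t]*:[ \t]*(.+)$", re.MULTILINE),
    "Email": re.compile(r"^Email[ \t]*:[ \t]*([\w.+-]+@[\w.-]+\.\w+)", re.MULTILINE),
    "Mobile": re.compile(r"^Mobile[ \t]*:[ \t]*([+\d][\d \-]*\d)", re.MULTILINE),
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None if not found)."""
    result = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[field] = match.group(1).strip() if match else None
    return result


fields = extract_fields(sample_text)
print(fields)
```

The same patterns should carry over to a regex-based activity in Studio with little change, but that is something to verify against your own documents.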


Hello @Soe_Min_Latt,

As a solution, I am thinking of this kind of scenario:

  1. Put all the PDFs in one folder. Use the snippet ‘For each file in folder’ (Studio > Snippets).
  2. In the body of the loop, read the file with one of the following:
  • Read PDF Text
  • Read PDF with OCR
    or
  • Get Text activity
  3. To get the exact text that you need, use the following operations:
  4. Write the extracted data to a txt or Excel file.
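For anyone who wants to see the same loop as plain code, the steps above can be sketched roughly like this in Python. This is not the UiPath workflow itself, only the same idea: loop over a folder, read each PDF as text, pull out the fields, and write one row per file. It assumes the third-party `pypdf` package for reading, `Label: value` lines in the text, and CSV output instead of Excel. The names `read_pdf_text`, `extract_fields` and `process_folder` are hypothetical.

```python
import csv
import re
from pathlib import Path

# Field labels are an assumption based on the example in the question.
FIELDS = ["Name", "Address", "Email", "Mobile"]


def read_pdf_text(pdf_path):
    """Read all page text from a PDF. Assumes the pypdf package is installed."""
    from pypdf import PdfReader  # imported lazily so the rest works without it

    reader = PdfReader(str(pdf_path))
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def extract_fields(text):
    """Capture the value after each 'Label:' at the start of a line."""
    row = {}
    for field in FIELDS:
        match = re.search(rf"^{field}[ \t]*:[ \t]*(.+)$", text, re.MULTILINE)
        row[field] = match.group(1).strip() if match else ""
    return row


def process_folder(folder, out_csv, read_text=read_pdf_text):
    """Loop over every .pdf in folder and write one CSV row per file."""
    rows = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        row = {"File": pdf_path.name}
        row.update(extract_fields(read_text(pdf_path)))
        rows.append(row)

    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["File"] + FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call such as `process_folder("C:/invoices", "output.csv")` (both paths are placeholders) would then produce one line per PDF. For scanned PDFs the reader step would have to be swapped for an OCR-based one, which is why `read_text` is passed in as a parameter.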

Vasile.


@Soe_Min_Latt

  1. You can use the PDF to text activity, and then apply string manipulation to the resulting text file.
  2. If the data is always in the same position, you can use Get Text or scraping methods to get it.
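As a rough idea of what the string-manipulation route looks like without regular expressions, here is a small Python sketch. It assumes the text was already extracted and that each value follows its label on the same line. The helper name `value_after_label` and the sample text are made up for illustration.

```python
def value_after_label(text, label):
    """Return the rest of the line after the first occurrence of label, or None.

    Note: this is a plain substring search, so a label such as "Name"
    would also match inside a longer word like "Username".
    """
    start = text.find(label)
    if start == -1:
        return None
    start += len(label)
    end = text.find("\n", start)
    if end == -1:
        end = len(text)
    # Drop the separator after the label, then any surrounding whitespace.
    return text[start:end].lstrip(" :\t").strip()


# Hypothetical extracted text; the layout is an assumption.
sample = "Name: Jane Doe\nMobile: 0912345678"
print(value_after_label(sample, "Name"))
print(value_after_label(sample, "Mobile"))
```

This mirrors what IndexOf and Substring would do in a Studio Assign activity; it tends to be simpler than regex but breaks more easily if the layout shifts.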

Hi wasea! Hope you are doing well. I have multiple PDF documents in the same format, and I want to extract specific data such as invoice number, date, and price from all of them. I have read and opened all the files in Studio, but I am unable to read the fields with the Get OCR Text activity. Please suggest a suitable solution. Many thanks in advance.