How to open a PDF file

Hi All,

I have created a workflow to open a PDF file and read data from it, but the workflow fails to open the file. Please help me open the PDF file.

Thanks in advance
Niranjan

Hi @Niranjan_k

Use the Read PDF Text activity.

@lrtetala the file does not open from the given path. The text is read only when the file is already open.

Hi @Niranjan_k

Use the Start Process activity to open the file.

Hope it helps!!

Hi Niranjan, you can use the “Read PDF Text” activity to read the text from a PDF file. However, if you’re having trouble opening the PDF file itself, you might want to use the “Start Process” activity to open the PDF reader application (e.g., Adobe Acrobat Reader) and then use the “Read PDF Text” activity to extract the text.

Here’s a simple example of a UiPath workflow to open and read a PDF file:

  1. Use the Start Process Activity:
  • Drag and drop the “Start Process” activity.
  • In the “FileName” property, provide the path to the PDF reader application (e.g., "C:\Program Files\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe" for Adobe Acrobat Reader).
  2. Use the Read PDF Text Activity:
  • Drag and drop the “Read PDF Text” activity.
  • In the “FileName” property, provide the path to the PDF file you want to read.
  • Assign the output to a variable (let’s call it pdfText).
  3. Do Something with the Extracted Text:
  • You can now use the pdfText variable in subsequent activities to perform tasks with the extracted text.
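
As a rough illustration of step 3, here is a minimal Python sketch that turns extracted text into table rows. It assumes the text in pdfText keeps its columns separated by two or more spaces (or a tab), which depends on the PDF. The function name parse_table_rows and the sample text are hypothetical, made up for this example.

```python
import re

def parse_table_rows(pdf_text, min_columns=2):
    """Split whitespace-aligned lines into lists of cell strings."""
    rows = []
    for line in pdf_text.splitlines():
        # Assumption: columns are separated by two or more spaces, or by a tab.
        cells = [c.strip() for c in re.split(r"\s{2,}|\t", line.strip()) if c.strip()]
        # Lines with too few cells are treated as ordinary prose and skipped.
        if len(cells) >= min_columns:
            rows.append(cells)
    return rows

# Hypothetical extracted text, for illustration only.
sample_text = """Invoice summary
Item        Qty    Price
Widget      2      10.50
Gadget      1      99.00
Thank you for your business
"""

rows = parse_table_rows(sample_text, min_columns=3)
for row in rows:
    print(row)
```

If the real text does not keep the column spacing, this split will not work and a table-aware approach such as Document Understanding is likely a better fit.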

Hi @Niranjan_k

You can try this

@rikulsilva I can’t find this path on my system.

@Shekar_Ch thanks for the detailed info. Could you please provide a sample workflow?

Why are you trying to open it? You don’t have to open PDF files to read them. Have you installed the PDF package and tried the Read PDF Text activity? If it’s not text, but a scanned document, then you use OCR/Document Understanding.
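
To rule out a path problem before blaming the activity, a quick check like the one below can help. It is only a sketch: it reads the first bytes of the file directly, without any viewer, and assumes the file starts with the usual "%PDF-" signature. The function name looks_like_pdf and the stub file are hypothetical.

```python
import tempfile
from pathlib import Path

def looks_like_pdf(path):
    """Return True if the path is an existing file that starts with the PDF signature."""
    p = Path(path)
    if not p.is_file():
        # Wrong path, or the path points to a folder.
        return False
    with p.open("rb") as f:
        return f.read(5) == b"%PDF-"

# Demo with a throwaway stub file (not a complete PDF, just the header bytes).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.pdf"
    demo.write_bytes(b"%PDF-1.7\n% hypothetical stub\n")
    print(looks_like_pdf(demo))                       # True
    print(looks_like_pdf(Path(tmp) / "missing.pdf"))  # False
```

If this returns False for your file, the path passed to the workflow is the first thing to check.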

@postwick Actually, I have a few tables on different pages. I want to read 2 tables out of 50 pages. Please suggest the best way to implement this.

Can you please share a sample file so we can help you? Table extraction can be done using DU (Document Understanding), or you can send the extracted data to an AI service (for example OpenAI ChatGPT or a generic AI model) and have it return the table output. You can search YouTube for how to integrate with AI; @RAKESH_KUMAR_BEHERA and @nisargkadam23 explain it in detail.
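
Before sending all 50 pages anywhere, it may help to narrow down which pages hold the two tables. Below is a small sketch that assumes you already have the text of each page in a list (for example by running Read PDF Text once per page through its Range property) and that each table has a recognizable heading. The function name find_table_pages, the headings, and the page texts are all hypothetical.

```python
def find_table_pages(page_texts, headings):
    """Map each heading to the 1-based page numbers whose text contains it."""
    hits = {heading: [] for heading in headings}
    for page_number, text in enumerate(page_texts, start=1):
        lowered = text.lower()
        for heading in headings:
            # Case-insensitive substring match on the page text.
            if heading.lower() in lowered:
                hits[heading].append(page_number)
    return hits

# Hypothetical page texts, one string per page.
pages = [
    "Cover page",
    "Table 1: Sales by region\nNorth  120",
    "Notes",
    "Table 2: Costs by month\nJan  40",
]

result = find_table_pages(pages, ["Sales by region", "Costs by month"])
print(result)
```

Only the matching pages then need to go through table extraction.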

@Niranjan_k you can use the approach below.

Install Libraries:
pip install beautifulsoup4 requests

Create Python Script:

> import requests
> from bs4 import BeautifulSoup
> 
> def extract_table_data(url):
>     # Download the page; fail early on HTTP errors or a hung connection
>     response = requests.get(url, timeout=30)
>     response.raise_for_status()
>     soup = BeautifulSoup(response.text, 'html.parser')
> 
>     # This reads HTML <table> elements, so the page must contain at least two
>     tables = soup.find_all('table')
>     if len(tables) < 2:
>         raise ValueError(f'Expected at least 2 tables, found {len(tables)}')
> 
>     def rows_of(table):
>         # One list of cell texts per row; rows that only use <th> come out empty
>         return [[td.text.strip() for td in row.find_all('td')] for row in table.find_all('tr')]
> 
>     return rows_of(tables[0]), rows_of(tables[1])
> 
> # Example usage
> url = 'https://example.com/page1'
> table1_data, table2_data = extract_table_data(url)
> print(table1_data)
> print(table2_data)

Replace 'https://example.com/page1' with the actual URL of the page containing the tables.

For Each pageUrl in ListOfUrls
    Invoke Python Method activity
        Input: pageUrl
        Output: table1Data, table2Data

    # Use the table data as needed
End For
