Hi All,
I have a PDF file with multiple pages of details, and I would like to search for and find specific details in it. How can I achieve this?
Thanks in advance,
Niranjan
Hi @Niranjan_k ,
The question is broad and we could point you to many suggestions. For suggestions specific to your case, please provide more details on your requirements.
Generally, for digital PDF documents, a first check would be with the below methods:
For scanned documents or a mixed set of documents:
@supermanPunch I’m new to UiPath and not sure how to use the above functionality. It would be great if you could provide some examples of it. Thanks.
Thanks, Niranjan
Use the “Read PDF Text” activity in UiPath to extract the text content from the PDF file.
You can specify whether you want to extract text from all pages or from a specific range of pages.
Use string manipulation or regular expressions to search for specific details within the extracted text. You can search for keywords, patterns, or specific data elements within the text variable.
After searching for details, you can process the results in various ways. For example, you can store the results in variables, create a data table, or perform specific actions based on the extracted details.
Workflow:
PDF Read PDF Text="path"
  PDFPageRange
Assign Name="pdfText" Value="extractedTextVariable"
For Each x="line" In="pdfText.Split({Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)"
  If Condition="line.Contains("Invoice Number")"
    Extract and process the invoice number from the line
    Assign Name="invoiceNumber" Value="line"
    Do something with the extracted invoice number
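The workflow above boils down to a line-by-line scan of the extracted text. Here is a minimal sketch of that logic in Python, for illustration only; in UiPath the same steps would be activities with VB.NET expressions. The sample text, the helper name `find_after_keyword`, and the idea that the value sits on the same line right after the keyword are all assumptions, not something taken from your PDF.

```python
def find_after_keyword(pdf_text, keyword):
    """Return the text that follows `keyword` on the first line containing it, or None."""
    for line in pdf_text.splitlines():
        if keyword in line:
            # Keep only what comes after the keyword, minus separators and spaces.
            _, _, rest = line.partition(keyword)
            return rest.strip(" :#\t")
    return None


# Hypothetical sample standing in for the output of Read PDF Text.
sample_text = "ACME Ltd\nInvoice Number: INV-1042\nTotal: 250.00"
invoice_number = find_after_keyword(sample_text, "Invoice Number")
print(invoice_number)
```

If the value can appear on the line after the keyword instead, the loop would need to look ahead one line, which this sketch does not do.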
@Dilli_Reddy @supermanPunch if you have any recorded video on YouTube, please share it so that I can follow the steps. Thanks in advance.
We could provide more suggestions for the broad case, but do you want to solve this for knowledge/learning purposes, or do you have a deadline for your requirement? If you do, then as already mentioned we would need more specific details on the requirements so that we can direct you to specific suggestions/solutions.
@supermanPunch The requirement is that the team has some keywords, and based on them I have to get the data from the PDF file. For some keywords I have to take the table details from the PDF file. The data in the PDF file changes daily.
@Dilli_Reddy thanks for the video, but I don’t see a keyword search in it. Please help me with how to get the data from the PDF file based on a keyword search. Sometimes I need to export table details from the PDF file if a keyword matches. The data in the PDF file changes daily.
Use the “Read PDF Text” activity in UiPath to extract the text content from the PDF file. Make sure to specify the PDF file’s path.
Use string manipulation or regular expressions to search for specific keywords within the extracted text. You can use the String.Contains() method.
Based on the presence of the keyword, you can conditionally extract data.
For extracting table details, you might consider using UiPath’s “Data Scraping” activity or a custom solution based on string manipulation and regular expressions.
Store the extracted data in a DataTable or another data structure for further processing.
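The steps above could be sketched roughly as follows. This is Python used only to show the logic (keyword lookup first, then table rows only if a keyword matched), not UiPath code. The sample text, the function names, and especially the assumption that table columns are separated by two or more spaces are all hypothetical; real extracted text may be laid out differently.

```python
import re


def extract_by_keywords(pdf_text, keywords):
    """Map each keyword to the rest of the first line that contains it."""
    found = {}
    for line in pdf_text.splitlines():
        for keyword in keywords:
            if keyword in line and keyword not in found:
                found[keyword] = line.split(keyword, 1)[1].strip(" :")
    return found


def extract_table(pdf_text, header_keyword):
    """Collect rows after the header line until a blank line, one dict per row.

    Assumes columns are separated by two or more spaces, which is
    only a guess about how the extracted text is laid out.
    """
    lines = pdf_text.splitlines()
    rows = []
    for i, line in enumerate(lines):
        if header_keyword in line:
            columns = re.split(r"\s{2,}", line.strip())
            for row_line in lines[i + 1:]:
                if not row_line.strip():
                    break  # a blank line ends the table
                cells = re.split(r"\s{2,}", row_line.strip())
                rows.append(dict(zip(columns, cells)))
            break
    return rows


# Hypothetical sample standing in for the extracted PDF text.
sample_text = (
    "Customer: ACME Ltd\n"
    "Item  Qty  Price\n"
    "Widget  2  10.00\n"
    "Gadget  1  25.50\n"
    "\n"
    "Total: 45.50\n"
)
details = extract_by_keywords(sample_text, ["Customer", "Total"])
# Only pull the table when the keyword of interest was found.
table = extract_table(sample_text, "Item") if "Customer" in details else []
```

The list of dicts plays the role of a DataTable here: one entry per row, keyed by column name.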
Hi @Niranjan_k once you are done with the extraction, use regex. If you don’t know how to extract the value using regex, please share the extracted output so we can help you extract the value.
We have many ways to extract the value from the string.
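As one example of the regex route, here is a small sketch in Python. The pattern and the sample strings are assumptions made up for illustration, since the real extracted text has not been shared. Note that Python writes a named group as `(?P<value>...)`, while .NET regex (which UiPath expressions use) writes it as `(?<value>...)`.

```python
import re

# Hypothetical pattern: the label, an optional ':' or '#', then the value token.
INVOICE_PATTERN = re.compile(
    r"Invoice\s+Number\s*[:#]?\s*(?P<value>[A-Za-z0-9-]+)",
    re.IGNORECASE,
)


def extract_invoice_number(text):
    """Return the token following 'Invoice Number', or None if there is no match."""
    match = INVOICE_PATTERN.search(text)
    return match.group("value") if match else None


print(extract_invoice_number("Invoice number # INV-2024-001 dated 01/02"))
```

A regex like this tolerates small differences in spacing and separators, which plain Contains plus Split does not.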
@Dilli_Reddy @copy_writes @supermanPunch I have tried to extract data from the PDF, but it is not going to the exact page where the data is in the PDF file. How can I read dynamically from where the match is found?