How to extract specific text and table data from a PDF file

Hi All,

I have a 50-page PDF file, and I want to extract only specific text and the details of some of its tables. How should I do this? Thanks in advance.

Thanks,
Niranjan

@Niranjan_k

You can try Document Understanding. If that is not an option, you can iterate through the PDF pages and use regular expressions to pull out the values you need.
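To illustrate the page-by-page regular expression approach outside UiPath, here is a minimal Python sketch. It assumes you already have the text of each page as a string (for example, from reading the PDF one page at a time). The field names and patterns (invoice_no, total) are hypothetical and would need to match the labels in your own document.

```python
import re

# Hypothetical field patterns; adjust them to the labels in your own PDF.
PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*No[.:]?\s*(\w+)"),
    "total": re.compile(r"Total\s*[:=]?\s*([\d,]+\.\d{2})"),
}


def extract_fields(pages, patterns=PATTERNS):
    """Return {field: [(page_number, value), ...]} for every match on every page."""
    found = {name: [] for name in patterns}
    for page_number, text in enumerate(pages, start=1):
        for name, pattern in patterns.items():
            for match in pattern.finditer(text):
                found[name].append((page_number, match.group(1)))
    return found


# Sample page texts standing in for the output of a per-page PDF read.
sample_pages = [
    "Invoice No: A123\nCustomer: Example Ltd",
    "Item list continues...\nTotal: 1,250.00",
]
result = extract_fields(sample_pages)
print(result)
```

Keeping the page number with each match makes it easier to check the result against the original PDF. In UiPath the same patterns could be used with the regular expression activities, but the exact workflow depends on your document.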

@Niranjan_k

Use the “Read PDF Text” activity to read the entire content of the PDF file. This gives you the text of all 50 pages as a single string.
After that, you can use string manipulation or regular expressions to extract the specific text you need.
Use the Extract Data Table activity to extract structured data from tables within the PDF.
-Use the “Anchor Base” activity in combination with “Find Element” or “Find Image” to locate the table on the page.
-Then, use the “Get OCR Text” activity to extract the text from the table.
Once you have extracted the specific text and table details, you can store them in variables, DataTables, or other data structures for further processing.
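As a rough illustration of turning extracted table text into rows, here is a small Python sketch. It assumes the table survives text extraction as lines whose columns are separated by two or more spaces, which is not guaranteed for every PDF. The function name parse_table and the sample data are hypothetical.

```python
import re


def parse_table(text, header_prefix):
    """Parse a space-aligned table whose header line starts with header_prefix."""
    header = None
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if header is None:
            # Skip everything until the header line is found.
            if stripped.startswith(header_prefix):
                header = re.split(r"\s{2,}", stripped)
            continue
        if not stripped:
            break  # a blank line ends the table
        cells = re.split(r"\s{2,}", stripped)
        if len(cells) != len(header):
            break  # a line with a different column count ends the table
        rows.append(dict(zip(header, cells)))
    return rows


# Sample text standing in for the output of a PDF text read.
sample_text = (
    "Order summary\n"
    "Item        Qty   Price\n"
    "Blue pen    2     3.50\n"
    "Notebook    1     4.25\n"
    "\n"
    "Thank you for your order."
)
table = parse_table(sample_text, "Item")
print(table)
```

Each row becomes a dictionary keyed by column name, which plays roughly the same role as a DataTable row. If the extracted text does not keep the column spacing, an OCR-based or Document Understanding approach is likely a better fit.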

OR

You can try Document Understanding.

@Dilli_Reddy Thanks for your reply. Since I am new to UiPath, could you share a sample workflow?

D.U 1.zip (298.2 KB)

@Dilli_Reddy @monsieurrahul I'm unable to view the workflow. Could you share a screenshot?

@Niranjan_k

Can you share a sample PDF if possible?