PDF extraction data


I’m trying to extract only specific text from multiple PDF files. So far I have managed to read all the PDF files using Document Understanding. How do I get only the text I want using Document Understanding?

These are screenshots of the DU workflow. I’m writing the PDF text to a text file, but there is a problem: the text of the first PDF is written by Write Text File, and then when the second PDF is written, the earlier data is removed from the text file.

Hi @suraj_gaikwad

As the Write Text File activity is inside the For Each loop, it creates a new file on each iteration, so the previous content is lost. Remove the Write Text File activity from the For Each loop and create the file at the start of the program.
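
To illustrate the point outside of UiPath, here is a minimal Python sketch, not the actual workflow. The function names and the sample strings are made up for illustration. It contrasts writing inside the loop (each write replaces the file) with collecting the text first and writing once after the loop.

```python
import os
import tempfile


def write_inside_loop(texts, path):
    """Mimics a write-file step placed inside the loop: each pass replaces the file."""
    for text in texts:
        with open(path, "w", encoding="utf-8") as f:  # "w" truncates the file every time
            f.write(text)


def write_once_after_loop(texts, path):
    """Collects the text of every document, then writes a single file after the loop."""
    collected = []
    for text in texts:
        collected.append(text)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(collected))


def read_back(path):
    """Reads the whole file back as one string."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical stand-ins for the text read from two PDFs.
pdf_texts = ["text from first pdf", "text from second pdf"]

out_dir = tempfile.mkdtemp()
overwritten_path = os.path.join(out_dir, "overwritten.txt")
combined_path = os.path.join(out_dir, "combined.txt")

write_inside_loop(pdf_texts, overwritten_path)
write_once_after_loop(pdf_texts, combined_path)

overwritten = read_back(overwritten_path)  # only the second PDF's text survives
combined = read_back(combined_path)        # both texts, one per line
```

In workflow terms, the second variant would correspond to building up a string variable inside the For Each and writing it once afterwards. Appending to the file inside the loop instead of overwriting it would be another option.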

Hope it works!

You could actually get a DataSet containing the extraction results using the Export Extraction Results activity, then get the header fields and table fields as DataTables by using DataSet.Tables("Simple Fields") and DataSet.Tables("Line Items").

Thereafter just use Excel Write Range activities to create one Excel file per PDF, so that the data is not overwritten.
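
Once the extraction results are available as rows, keeping only the wanted fields is a simple filter. The following Python sketch shows the idea under stated assumptions: the column names "Field" and "Value", the helper `pick_fields`, and the sample rows are all hypothetical, and the real exported table may be laid out differently.

```python
def pick_fields(rows, wanted):
    """Return {field name: value} for the wanted fields only, ignoring every other row."""
    wanted_set = set(wanted)
    return {row["Field"]: row["Value"] for row in rows if row["Field"] in wanted_set}


# Hypothetical rows standing in for the "Simple Fields" table of one PDF.
simple_fields = [
    {"Field": "Invoice Number", "Value": "INV-001"},
    {"Field": "Invoice Date", "Value": "2021-03-04"},
    {"Field": "Total", "Value": "150.00"},
]

selected = pick_fields(simple_fields, ["Invoice Number", "Total"])

# One "name: value" line per selected field, ready to be written to any output.
lines = [f"{name}: {value}" for name, value in selected.items()]
```

The resulting lines are plain strings, so they could go to a text file just as well as to a spreadsheet.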

We don’t want to write to an Excel sheet.

In the Data Extraction Scope, all PDF pages are selected, which is why it is writing all pages. How do I remove that? I’m trying to remove it, but it’s not working.

Data is not being extracted after putting Write Text File outside of the Try Catch?

The checkbox just indicates which fields you’ve validated in the Validation Station. Could you please explain the problems you’re facing in greater detail?

Hello Suraj,
Go to the Data Extraction Scope activity and click on Configure Extractor. There you can select which fields you want to extract and which you do not.


I have done this using match text instead of DU, because I was not able to get the output with DU.
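
For readers wondering what a text-matching approach can look like, here is a small Python sketch using a regular expression. It is only an illustration and not the workflow used above: the "Invoice No" label, the pattern, and the sample page texts are assumptions.

```python
import re

# Hypothetical pattern: the label "Invoice No", an optional dot and colon, then the value.
INVOICE_NO = re.compile(r"Invoice\s+No\.?\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE)


def extract_invoice_no(text):
    """Return the first invoice number found in the text, or None if there is no match."""
    match = INVOICE_NO.search(text)
    return match.group(1) if match else None


# Hypothetical text read from two PDFs.
sample_pages = [
    "ACME Ltd\nInvoice No: INV-1001\nTotal: 150.00",
    "Cover letter without any invoice details",
]

results = [extract_invoice_no(page) for page in sample_pages]
```

A pattern like this only works when the wanted text always follows a predictable label, so it would need adjusting for each document layout.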
