Multipage PDF document processing: page detection error when matching keywords for a specific page extraction

Hi All,

I am extracting data from a PDF file that contains 40 pages, any one of which may be the invoice page. I have used 3 keywords to detect the invoice page with the intelligent keyword identifier and then process it.

Problem: the 3 keywords mentioned above may also appear on other pages, so multiple pages are detected as matches, which results in multiple pages going through DU processing.

I am looking for the best approach, with an on-prem Orchestrator, for a document classifier that detects a single invoice match.

Note:
1. The single multipage PDF is split into 40 single-page PDF files.
2. The files are stored in a folder location.
3. The files are digitized and processed one by one.
4. A string match is checked before DU processing.
5. Once the string matches, the page is sent for DU processing.

@Palaniyappan
@UiPath_Community

@Ritaman_Baral - please add your comments to this post.


@Nabisab_NabiNadeel,

Check if this is the issue.

Thanks,
Ashok :slight_smile:

Thanks for your response.

I am using the Enterprise version of UiPath for extraction, so I hope the page limit is not a constraint.

@Nabisab_NabiNadeel,

I think it would be as per the documentation.

Thanks,
Ashok :slight_smile:

Let me see. I will check it in the project design against the specified page pixels and page numbers.

@Palaniyappan
@marian.platonov

Could you please share your inputs?

This suggests the keywords you selected are not distinctive enough.

You need to identify keywords that are unique to the invoice page.

You can use as many keywords as you like to identify the correct page.

  1. Read the PDF page by page.
  2. Check whether the current page's text contains all the keywords.
  3. If it matches, you know the required page number and can proceed as required.
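The steps above could be sketched roughly as follows. This is only an illustration of the matching logic in Python, not a UiPath workflow: it assumes the text of each page has already been extracted (for example by digitization/OCR) into a list of strings, and the function name `find_invoice_page`, the sample page texts, and the keywords are all hypothetical.

```python
from typing import List, Optional


def find_invoice_page(page_texts: List[str], keywords: List[str]) -> Optional[int]:
    """Return the 1-based number of the first page containing ALL keywords, else None."""
    if not keywords:
        # Without keywords every page would match, so treat this as "no match".
        return None
    lowered_keywords = [k.lower() for k in keywords]
    for page_number, text in enumerate(page_texts, start=1):
        lowered_text = text.lower()  # case-insensitive comparison
        if all(k in lowered_text for k in lowered_keywords):
            return page_number
    return None


# Hypothetical page texts standing in for the OCR output of each page.
pages = [
    "Delivery note. Total packages: 3",
    "Tax Invoice. Invoice No: INV-001. Total Amount Due: 120.00",
    "Terms and conditions. Quote the Invoice No on all payments.",
]
keywords = ["Tax Invoice", "Invoice No", "Total Amount Due"]

invoice_page = find_invoice_page(pages, keywords)
print(invoice_page)
```

In this sample the third page also contains "Invoice No", but it is not selected because it does not contain the other two keywords. Requiring every keyword, rather than any one of them, is what narrows the match down to a single page.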

I am using the intelligent keyword identifier to match keywords against the OCR text. Even with a limited set of up to 5 keywords, I am still facing challenges.

For context, my use case has 10 different sets of documents to identify, so it is difficult to always check the maximum number of keywords.

If you are facing issues with the intelligent keyword identifier, you can remove that step and identify and categorize the pages in your own code.
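One possible way to do this in code, when there are several document types, is to score every page against each type's keyword set and keep only the best-scoring page per type. The sketch below is a plain Python illustration under stated assumptions, not a UiPath activity: the function name `best_page_per_type`, the `min_score` threshold of 0.5, the document type names, and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple


def best_page_per_type(
    page_texts: List[str],
    keyword_sets: Dict[str, List[str]],
    min_score: float = 0.5,  # assumed threshold; tune it for your documents
) -> Dict[str, Tuple[int, float]]:
    """For each document type, return (1-based page number, score) of the best page.

    The score is the fraction of that type's keywords found on the page.
    Types with no page reaching min_score are left out of the result.
    """
    result: Dict[str, Tuple[int, float]] = {}
    for doc_type, keywords in keyword_sets.items():
        if not keywords:
            continue  # skip types with no keywords defined
        lowered_keywords = [k.lower() for k in keywords]
        for page_number, text in enumerate(page_texts, start=1):
            lowered_text = text.lower()
            hits = sum(1 for k in lowered_keywords if k in lowered_text)
            score = hits / len(lowered_keywords)
            # Strict ">" keeps the earliest page when two pages tie.
            if score >= min_score and (
                doc_type not in result or score > result[doc_type][1]
            ):
                result[doc_type] = (page_number, score)
    return result


# Hypothetical page texts and keyword sets.
pages = [
    "Delivery note. Total packages: 3",
    "Tax Invoice. Invoice No: INV-001. Total Amount Due: 120.00",
    "Terms and conditions. Quote the Invoice No on all payments.",
]
keyword_sets = {
    "invoice": ["Tax Invoice", "Invoice No", "Total Amount Due"],
    "delivery_note": ["Delivery note", "Total packages"],
}

matches = best_page_per_type(pages, keyword_sets)
print(matches)
```

Because only the highest-scoring page is kept for each type, a page that shares one or two keywords with the invoice page does not trigger a second round of DU processing.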

I also suggest raising a support ticket or contacting your TAM about this kind of issue.