Classification of Empty Pages

Hello!

We’re working on a project that classifies and splits documents from a continuous (scanned) PDF, which can contain anywhere from one to hundreds of documents. The problem arises when one or more of these documents is empty: it breaks the Action Center UI, and the classification validation action stops working entirely.

Has anyone run into something similar?

The only workaround I can think of is to run OCR on the PDF up front, drop every page whose extracted text is empty, and then put the PDF back together from the remaining pages.
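To make the filtering step of that idea concrete, here is a minimal Python sketch. It is not UiPath activity code: it assumes the OCR step has already produced one text string per page, and `non_empty_page_indices` and `min_chars` are hypothetical names made up for illustration. Rebuilding the PDF from the kept page indices is left to whatever PDF tooling the workflow already uses.

```python
from typing import List


def non_empty_page_indices(page_texts: List[str], min_chars: int = 1) -> List[int]:
    """Return 0-based indices of pages whose OCR text contains at least
    `min_chars` non-whitespace characters.

    `page_texts` is assumed to hold one OCR result per page, in page order.
    """
    keep = []
    for index, text in enumerate(page_texts):
        # Remove all whitespace so pages with only spaces/newlines count as empty.
        visible = "".join(text.split())
        if len(visible) >= min_chars:
            keep.append(index)
    return keep


if __name__ == "__main__":
    # Hypothetical OCR output for a four-page scan.
    sample = ["Invoice 001", "   \n", "", "Contract page 1"]
    print(non_empty_page_indices(sample))  # prints [0, 3]
```

Raising `min_chars` above 1 would also drop pages where OCR picked up only a stray character or two from scanner noise, at the risk of discarding pages with very little real text.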

However, what if one of the pages I exclude is a picture that contains no text but is still relevant? :expressionless:
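One possible way to tell a truly blank page from a text-free picture is to look at how much "ink" the page image contains in addition to the OCR text. The sketch below is only an illustration under stated assumptions: each page is assumed to be already rendered to a grayscale grid of values from 0 (black) to 255 (white), and the names `ink_ratio` and `is_blank_page` as well as both threshold defaults are hypothetical and would need tuning on real scans.

```python
from typing import Sequence


def ink_ratio(pixels: Sequence[Sequence[int]], dark_threshold: int = 200) -> float:
    """Fraction of pixels darker than `dark_threshold` (0 = black, 255 = white)."""
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 0.0
    dark = sum(1 for row in pixels for value in row if value < dark_threshold)
    return dark / total


def is_blank_page(
    ocr_text: str,
    pixels: Sequence[Sequence[int]],
    max_ink_ratio: float = 0.005,  # assumed threshold, tune on real scans
    dark_threshold: int = 200,     # assumed threshold, tune on real scans
) -> bool:
    """Treat a page as blank only if it has no OCR text AND almost no dark pixels."""
    if ocr_text.strip():
        return False
    return ink_ratio(pixels, dark_threshold) <= max_ink_ratio


if __name__ == "__main__":
    white_page = [[255] * 10 for _ in range(10)]  # empty scan
    photo_page = [[40] * 10 for _ in range(10)]   # dark picture, no text
    print(is_blank_page("", white_page))  # prints True
    print(is_blank_page("", photo_page))  # prints False
```

With a check like this, a page is only dropped when both signals agree, so a photo or stamp with no recognizable text would still be kept. Faint or very light images could still slip under the threshold, so the values above are a starting point rather than a recommendation.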

I’d appreciate any input you might have. Thanks!

@Monica_Secelean appreciate any help here!