Digitize Document Error: Indirect reference was not dereferenced

Hey everyone!

I am trying to build a process that:

  1. Takes 10-K filing PDFs, which can run anywhere from 50 to 500+ pages
  2. Splits each PDF into smaller files of up to 100 pages
  3. For each split file, uses Digitize Document, then a For Each to go through each page searching for keywords
  4. For pages where the keywords are found, uses Extract PDF Page Range to produce a 2-page document and runs it through Extract Document Data (with the Generative extractor) to grab the paragraph needed
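
The split in step 2 boils down to simple page-range arithmetic. The Python sketch below is for illustration only: `split_ranges` is a hypothetical helper, and the actual workflow does the split with UiPath activities rather than this code.

```python
def split_ranges(total_pages, max_pages):
    """Return 1-based, inclusive (start, end) page ranges.

    Each range covers at most max_pages pages; the last one may be shorter.
    Hypothetical helper for illustration, not part of any UiPath package.
    """
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    ranges = []
    start = 1
    while start <= total_pages:
        end = min(start + max_pages - 1, total_pages)
        ranges.append((start, end))
        start = end + 1
    return ranges


# A 293-page filing with a 150-page cap gives two files.
print(split_ranges(293, 150))  # [(1, 150), (151, 293)]
```

Changing the cap only changes where the cut points fall; every page still lands in exactly one split file.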

The first file has 293 pages, which I split into 2 files: one with pages 1-150 and another with pages 151-293. The 1-150 file processes fine, but for the second one Digitize Document throws the following error: “Digitize Document: 890 0 R : Indirect reference was not dereferenced.”
The “890” number changes depending on how I split the PDF (150 max pages, 100 max pages, etc.).
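
For context, `890 0 R` matches the PDF syntax for an indirect reference: object number 890, generation 0. A rough way to check whether a split file defines that object at all is sketched below in Python. The helper names are hypothetical, and the raw byte scan is a big assumption: PDFs that store objects inside compressed object streams will not show an `890 0 obj` marker even when the object exists, so a negative result here is only a hint, not proof.

```python
import re

# Matches the "<object> <generation> R" token in the error text.
INDIRECT_REF = re.compile(r"(\d+) (\d+) R\b")


def parse_indirect_ref(message):
    """Return (object_number, generation) from an error message, or None."""
    match = INDIRECT_REF.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def defines_object(pdf_bytes, obj_num, gen_num):
    """Check for an uncompressed '<obj_num> <gen_num> obj' marker in raw PDF bytes.

    Assumption: the object is not hidden inside a compressed object stream.
    """
    pattern = re.compile(rb"(?<!\d)%d\s+%d\s+obj\b" % (obj_num, gen_num))
    return pattern.search(pdf_bytes) is not None


error = "Digitize Document: 890 0 R : Indirect reference was not dereferenced."
print(parse_indirect_ref(error))  # (890, 0)
```

To try it on a real file, read the split PDF with `open(path, "rb").read()` and pass the bytes to `defines_object` together with the numbers parsed from the error.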

I’ve tried a bunch of other ways to process these documents, but I can’t find the root cause of this problem. Any ideas?

exceptionDetails:
RemoteException wrapping System.InvalidOperationException: 890 0 R : Indirect reference was not dereferenced.

  • at .()
  • at .( ,
    Predicate`1 )
  • at .[]( ,
    Predicate`1 )
  • at .( ,
     )
  • at .( ,
     )
  • at .(Int32 ,
     )
  • at .(Int32 ,
     )
  • at UiPath.DocumentUnderstanding.Digitizer.Pdf.Docotic.DocoticPdfDocument.<>c__DisplayClass25_0.b__0()
  • at UiPath.DocumentUnderstanding.Digitizer.Pdf.Docotic.DocoticExceptionHelper.HandleDocoticExceptions[T](Func`1 func)
  • at UiPath.DocumentUnderstanding.Digitizer.Digitization.Preprocessing.PdfDigitizationDocument.GetPageStreamThreadSafe(Int32 pageNumber)
  • at UiPath.DocumentUnderstanding.Digitizer.Digitization.Preprocessing.PdfDigitizationDocument.GetPage(Int32 pageNumber)
  • at UiPath.IntelligentOCR.Activities.Digitization.DigitizationActivityScheduler.ScheduleProcessingTask[T](Func`1 func,
    CancellationToken token)
  • at UiPath.DocumentUnderstanding.Digitizer.Digitization.PageDigitizer.ProcessPage(IDigitizationDocument digitizationDocument,
    Int32 pageNumber,
    IOcrEngine ocrEngine,
    Boolean shouldApplyOcr,
    DigitizationSettings settings,
    String contentId,
    CancellationTokenSource source)
  • at UiPath.DocumentUnderstanding.Digitizer.Digitization.DocumentDigitizer.GetPages(Content content,
    DigitizationSettings settings,
    IOcrEngine ocrEngine,
    CancellationToken token)
  • at UiPath.DocumentUnderstanding.Digitizer.Digitization.DocumentDigitizer.Digitize(Content content,
    DigitizationSettings settings,
    IOcrEngine ocrEngine,
    CancellationToken token)
  • at UiPath.IntelligentOCR.Digitization.IntelligentOcrDigitizer.Digitize(Content content,
    IOcrEngine ocrEngine,
    ApplyOcrOnPdf applyOcrOnPdf,
    Boolean detectCheckboxes,
    IDigitizationScheduler scheduler,
    IDigitizerTelemetryService telemetryService,
    CancellationToken token)
  • at UiPath.IntelligentOCR.Activities.Digitization.DigitizeDocument.ExecuteAsync(NativeActivityContext context,
    CancellationToken cancellationToken)
  • at UiPath.Shared.Activities.AsyncTaskNativeImplementation.BookmarkResumptionCallback(NativeActivityContext context,
    Object value)
  • at UiPath.Shared.Activities.AsyncTaskNativeActivity.BookmarkResumptionCallback(NativeActivityContext context,
    Bookmark bookmark,
    Object value)
  • at System.Activities.Runtime.BookmarkCallbackWrapper.Invoke(NativeActivityContext context,
    Bookmark bookmark,
    Object value)
  • at System.Activities.Runtime.BookmarkWorkItem.Execute(ActivityExecutor executor,
    BookmarkManager bookmarkManager)