I’m working on a document processing flow using Document Understanding. The challenge is that some PDFs are digital (native) and others are scanned images. Using a single extractor doesn’t give good results across both types. Has anyone built a DU pipeline that can handle both in one process without branching into two separate workflows?
Hey @Masuma_Khatun,
You can handle both scanned and native PDFs in one DU process without splitting workflows.
Here’s how:
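- Use Digitize Document with OmniPage OCR. It works for both scanned and digital PDFs: OCR is applied where a page needs it, and the existing text layer is used for native PDFs (check the ApplyOcrOnPDF property if results look off).
- In Data Extraction Scope, add both the Form Extractor (for digital) and an ML Extractor or RegEx Based Extractor (for scanned).
- Map extractors to specific fields in the Configure Extractors wizard, and set a minimum confidence for each one.
- For each field, Data Extraction Scope tries the extractors in the order you set and falls back to the next one when the first returns nothing or falls below its minimum confidence.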
No need to branch. Just configure the extractors smartly.
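To make the fallback idea concrete, here is a rough Python sketch of "try extractors in priority order, accept the first result above its confidence threshold". This is only a conceptual model, not UiPath code or its API: `Extraction`, `extract_field`, `template_extractor`, `regex_extractor`, the sample text, and the confidence numbers are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Extraction:
    value: str
    confidence: float  # 0.0 to 1.0, hypothetical scale


# Hypothetical extractor signature: (document text, field name) -> result or None
Extractor = Callable[[str, str], Optional[Extraction]]


def extract_field(
    text: str, field: str, chain: List[Tuple[Extractor, float]]
) -> Optional[Extraction]:
    """Try each (extractor, min_confidence) pair in order.

    Return the first result that meets its threshold, or None if none does.
    """
    for extractor, min_confidence in chain:
        result = extractor(text, field)
        if result is not None and result.confidence >= min_confidence:
            return result
    return None


def template_extractor(text: str, field: str) -> Optional[Extraction]:
    # Stand-in for a strict, layout-based extractor: needs the exact label line.
    match = re.search(r"^Invoice No: (\S+)$", text, re.MULTILINE)
    if field == "invoice_no" and match:
        return Extraction(match.group(1), 0.95)
    return None


def regex_extractor(text: str, field: str) -> Optional[Extraction]:
    # Stand-in for a looser extractor that tolerates OCR spacing and casing.
    match = re.search(r"Invoice\s*No\s*[:.]?\s*(\S+)", text, re.IGNORECASE)
    if field == "invoice_no" and match:
        return Extraction(match.group(1), 0.70)
    return None


# Priority order: strict extractor first, looser one as the fallback.
chain = [(template_extractor, 0.80), (regex_extractor, 0.60)]

native_text = "Invoice No: INV-1001\nTotal: 50.00"   # clean text layer
scanned_text = "INVOICE  NO . INV-2002 Total 50.00"  # noisy OCR output

native_result = extract_field(native_text, "invoice_no", chain)
scanned_result = extract_field(scanned_text, "invoice_no", chain)
```

With this setup the clean text is handled by the strict extractor, while the noisy OCR text falls through to the looser one, all inside a single flow with no branch on document type.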
Thank you for the solution.