How can I use all Document Understanding (DU) components completely offline?

I need to verify whether the DU components can run fully offline, without relying on API keys or endpoints. These are the options I'm considering for each stage to avoid APIs and endpoints:

  1. Taxonomy: No API required.
  2. Digitization: OmniPage OCR, which does not need an API key.
  3. Classifier: Keyword Based Classifier.
  4. Data Extraction: Considering Form Extractor instead of the regex-based extractor, although it requires an API key (according to the documentation).
  5. Validation Station: No API key is necessary.
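
To show why items 3 and 4 can in principle run without any endpoint, here is a minimal sketch in plain Python of keyword-based classification and regex-based extraction. It is not UiPath's implementation and does not use any UiPath API; the keyword lists, field names, patterns, and function names are all hypothetical and only illustrate that the logic is purely local.

```python
import re

# Hypothetical keyword sets per document type (not taken from any real taxonomy).
KEYWORDS = {
    "invoice": ["invoice", "amount due", "bill to"],
    "receipt": ["receipt", "cash", "change"],
}

# Hypothetical regex patterns, one per field to extract.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*(\w+)", re.IGNORECASE),
    "total": re.compile(r"Total\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def classify(text):
    """Return the document type with the most keyword hits, or None if there are no hits."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(keyword) for keyword in keywords)
        for doc_type, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_fields(text):
    """Return the first regex match for each field, or None when a field is not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    # Stand-in for text produced by a local OCR step.
    sample = "INVOICE\nInvoice No: A123\nBill To: ACME\nTotal: $1,250.00"
    print(classify(sample))
    print(extract_fields(sample))
```

Both functions operate only on text already produced by the OCR step, so nothing here depends on a network call or a key.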