How to Extract Data From a Scanned Document

I need to extract data from a scanned PDF that contains multiple documents. Each document has its own template (structure), and even small changes in the layout can affect extraction.
Because the documents are scanned, the position of the data within a template may shift.

  • How can data be extracted in these situations?

We tried these methods:

  1. Form Extractor
  2. Intelligent OCR
  3. ML Extractor
  4. Regex

None of them gave good results. Can anyone suggest another method to extract data from this PDF file?

Are there any AI/ML methods?

The main problem we are facing is the change in data position. The data fields do have specific labels that can be used to fetch the data.

How can we collect the actual data from a scanned PDF using UiPath?
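
Since each field has a fixed label, one position-independent idea is to search the OCR text for the label and take whatever follows it, wherever it appears on the page. Below is a minimal Python sketch of that idea, not a UiPath activity. It assumes the OCR text is already available as a string and that each value sits on the same line as its label; the function name `extract_by_label` and the sample labels are hypothetical.

```python
import re


def extract_by_label(ocr_text, labels):
    """Return {label: value or None} for each label, wherever it appears in the text."""
    results = {}
    for label in labels:
        # Tolerate variable spacing between the words of a label (common in OCR output).
        label_pattern = r"\s+".join(re.escape(word) for word in label.split())
        # After the label, allow an optional ':' or '-' separator, then capture
        # the rest of the line as the value.
        pattern = re.compile(label_pattern + r"\s*[:\-]?\s*(.+)", re.IGNORECASE)
        match = pattern.search(ocr_text)
        results[label] = match.group(1).strip() if match else None
    return results


sample_text = "ACME Ltd\nInvoice  No: INV-001\nTotal - 42.50\n"
fields = extract_by_label(sample_text, ["Invoice No", "Total", "Due Date"])
```

A label that is not found maps to `None`, so missing fields can be sent for manual validation instead of failing silently. OCR errors inside the label itself would need fuzzy matching, which this sketch does not attempt.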

Thanks and Regards

Is the scanned PDF an invoice? If yes, then ML Extractor can handle changing positions. @basil_aiqmis


Hi @Parth_Doshi,
It contains more than just invoices. Also, ML Extractor provides only a limited set of values.

  • How can we add more custom values?

To add custom fields, you have to create ML models in AI Fabric and then use them.
I'm not sure how to do it; I think @Lahiru.Fernando or @nisargkadam23 can help you.


Hi @Lahiru.Fernando & @nisargkadam23,

Can you help me solve this?

Thank you, Mr. @Parth_Doshi. I can't access AI Fabric; it's not in my UiPath dashboard.
Also, while I was installing the MLService [ML Skill] in UiPath, 3 errors showed up in my output panel.

Hi Guys @Parth_Doshi @basil_aiqmis

Just saw the discussion here… :slight_smile:
If you are processing invoices, yes, we can create AI models with custom fields from what we have in AI Fabric.

Don't worry about creating custom AI models. Use the out-of-the-box packages for Document Understanding, and you will see a package as follows.

Sorry for terrible handwriting on-screen :rofl:

The one I highlighted can be trained and used for the custom fields you need. You have to use Data Manager for this.

If you are using the Community or trial version, sign in to the Insider program and request access to the cloud Data Manager, which comes as part of AI Fabric once the request is accepted.

Insider program
