Partial Data Extraction from Document

dimple.khurana · June 24, 2024, 11:26am

Dear Forum Members,

I am working on a solution that needs to extract details from different documents. For this, I am using the out-of-the-box models “Passport” and “Document Understanding”, and the models have been trained on a sufficient number of documents. In some cases the data is extracted completely, and in others it is only partially extracted. For another document type, the model extracts an incomplete number.

Below are the details for the cases of Partial Data Extraction and Incorrect Data Extraction:

For Partial Data Extraction, I checked the OCR output under Digitize Document. With UiPath Document OCR the data is present, but with the other OCR engines I checked (OmniPage, Tesseract, Microsoft Computer Vision) the OCR does not return any data. The input to the ML Extractor contains the data, but its output does not.
For Incorrect Data Extraction, I am trying to extract data from an Aadhaar card. From the back page the data is extracted correctly, but from the front page the Aadhaar number is extracted incompletely. Here, the UiPath Document OCR output does not contain the complete number. The outputs of the other OCR engines, apart from Microsoft, contain the complete number but not the other details. The Microsoft OCR output is blank.
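
To narrow down which OCR engine actually returns the full number, the text from each engine could be checked for a complete 12-digit Aadhaar number. Below is a minimal sketch in Python, assuming the OCR text of each engine has been exported as a plain string. The function names and the engine labels are hypothetical and are not part of any UiPath API. The check only looks at the digit count and grouping, so it does not confirm that the number is a valid Aadhaar number.

```python
import re

# Assumption: an Aadhaar number is 12 digits, usually printed as three
# groups of four separated by a space (a hyphen is also accepted here).
AADHAAR_PATTERN = re.compile(r"(?<!\d)(\d{4})[ -]?(\d{4})[ -]?(\d{4})(?!\d)")


def find_aadhaar_candidates(ocr_text: str) -> list[str]:
    """Return every complete 12-digit candidate found in the OCR text."""
    return ["".join(m.groups()) for m in AADHAAR_PATTERN.finditer(ocr_text)]


def compare_engines(outputs: dict[str, str]) -> dict[str, list[str]]:
    """Map each (hypothetical) engine label to the candidates in its text."""
    return {engine: find_aadhaar_candidates(text) for engine, text in outputs.items()}
```

An engine whose list comes back empty either returned no text or cut the number short, which matches the two failure cases described above.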

I hope I have explained the problem clearly. Could you please suggest a solution? I have already retrained the model multiple times. Is there anything else I can do to get complete and correct data extraction?

Thanks,
Dimple

system · June 26, 2024, 2:00pm

Hello @dimple.khurana!

It seems that you have trouble getting an answer to your question in the first 24 hours.
Let us give you a few hints and helpful links.

First, make sure you browsed through our Forum FAQ Beginner’s Guide. It will teach you what should be included in your topic.

You can check out some of our resources directly, see below:

Always search first. It is the best way to quickly find your answer. Check out the icon for that.
Clicking the options button will let you set more specific topic search filters, i.e. only the ones with a solution.
Topic that contains most common solutions with example project files can be found here.
Read our official documentation, where you can find a lot of information and instructions about each of our products.
Watch the videos on our official YouTube channel for more visual tutorials.

Hopefully this will let you easily find the solution/information you need. Once you have it, we would be happy if you could share your findings here and mark it as a solution. This will help other users find it in the future.

Thank you for helping us build our UiPath Community!

Cheers from your friendly
Forum_Staff
