Step-by-step guide to creating a Document Understanding process

I’m looking for a step-by-step guide to practice Document Understanding.
I’m currently using Studio Desktop.

Thanks in advance.

Hi @n.bounoun

Please confirm: do you need steps you can follow to extract data from a document, or do you want to understand how Document Understanding (DU) works?

Hi @n.bounoun

Welcome to UiPath,

I suggest going through the links below:

and

If this was helpful, please mark it as the solution. Happy automation with UiPath!

Hi @n.bounoun

Watch these videos; they cover DU step by step.

Watch here: https://www.youtube.com/watch?v=VHozzZiybJU&list=PLEYSwx3duQ2BWUnJORX4YVK54geNDJInW


Hi @n.bounoun

Please follow the steps below to extract data from invoices:

1. Create a new project.
2. Install the required packages: UiPath.IntelligentOCR.Activities and UiPath.DocumentUnderstanding.ML.Activities.
3. Create a folder and place your sample invoices there.
4. Add a For Each activity (TypeArgument = String) to loop through the file paths (use Directory.GetFiles("path", "*.pdf")). Inside the loop, use Load Document or just pass the file path to Digitize Document.
5. Digitize Document
• Activity: Digitize Document (from IntelligentOCR).
• Properties: DocumentPath = current file, Output = digitizedDocument (type Document).
• Choose an OCR engine: Tesseract (offline) or Microsoft/Google OCR (if available). For scanned documents, try OmniPage/Abbyy if licensed; otherwise tune the Tesseract options.
6. Classify Document (optional for a single document type)
• If you expect multiple document types, use Classify Document Scope with classifiers (Keyword Based Classifier or ML Classifier). For a single invoice type you can skip classification.
7. Data Extraction Scope
• Activity: Data Extraction Scope. Input: digitizedDocument.
• Inside Data Extraction Scope, add extractors (order matters):
• Regex Based Extractor: good for invoice numbers, dates, and totals.
• Form Extractor or Intelligent Form Extractor: good for structured tables.
• Keyword Based Extractor: fallback based on field proximity.
• ML Extractor (pretrained), if you have it: better for variable layouts.
• Configure fields: use a Taxonomy file or define the fields manually (for example InvoiceNumber, InvoiceDate, SupplierName, TotalAmount, LineItems).
8. Present Validation Station
• Add Present Validation Station (shows the results to a human for confirmation).
• Output: validationResult (Document object with validated answers).
9. Export / Save Results
• After validation, convert the extraction results to a DataTable: iterate validationResult.DataExtractionResult.Documents(0).Taxonomy or use helper methods. The simplest approach is to build a DataTable with the columns FileName and InvoiceNumber, then use Add Data Row with values taken from the validated fields (validationResult…GetTaxonomyValue("InvoiceNumber"), depending on how you stored it).
• Use Write Range to save the DataTable to Excel.
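
If it helps to see the logic outside Studio, here is a rough plain-Python sketch of the file loop, the regex-style extraction, and the export described above. It is not UiPath code and calls no UiPath API. The patterns, the function names (list_pdfs, extract_fields, write_rows), and the CSV output are all assumptions made for illustration, and it assumes the document text has already been produced by some OCR/digitization step.

```python
import csv
import re
from pathlib import Path

# Hypothetical patterns for illustration only; real invoices will need
# expressions tuned to their own layout and wording.
FIELD_PATTERNS = {
    "InvoiceNumber": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "InvoiceDate": re.compile(
        r"Date\s*[:\-]?\s*(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
    "TotalAmount": re.compile(
        r"Total\s*(?:Amount|Due)?\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def list_pdfs(folder):
    """Return the PDF paths in a folder, sorted (the role of the For Each loop source)."""
    return sorted(str(path) for path in Path(folder).glob("*.pdf"))


def extract_fields(text):
    """Return the first match for each field, or an empty string when nothing matches."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else ""
    return result


def write_rows(rows, out_path):
    """Write one row per document to a CSV file with a FileName column plus the fields."""
    columns = ["FileName", *FIELD_PATTERNS]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
```

In a real run you would call extract_fields on the text of each digitized file, build a row such as {"FileName": path, **fields}, and pass the collected rows to write_rows. Empty strings mark the fields that a human would need to fill in during validation.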
Cheers

If you find this solution helpful, please mark it as solved.

Hi @n.bounoun

I recommend starting with the Document Understanding and Document Understanding Framework courses in UiPath Academy, which will give you an idea of the Document Understanding process. If you prefer to learn by watching videos, the playlist below is for you.

Hope it helps!!

Do you have a zip file of a sample project?
Thank you.

Hi @n.bounoun

I don’t have one at the moment. You can follow the steps above and practice with them; that will help you understand the concept.

If this solution worked for you, please mark it as solved.

Cheers

Hi,
Depending on your scenario, you might want to consider UiPath IXP, which is the latest way of delivering document processing functionality.