Document extraction and search

rizvana.mohammed · May 31, 2026, 2:04pm

Can anyone propose a solution for this requirement using UiPath?
1. Read customer contracts from a specific folder.
2. Extract certain fields from those contracts.
3. File the contracts in a specific order.
4. Enable users to search for contracts based on the extracted fields.
Is there any LLM-based solution available in UiPath? The solution needs to be on-prem and must not use Document Understanding.

ForsakenFiji · May 31, 2026, 10:39pm

UiPath can support this, but with your constraints, on-prem + not Document Understanding, I would not position it as a pure UiPath-native LLM solution.

A practical architecture for your use case would be:

UiPath Robot monitors/reads the contract folder
Use File System activities to pick up PDFs/DOCX files from a configured input folder.
Convert contract content to text
For digital PDFs/DOCX:

Read PDF Text
Word activities / Office activities
Python/.NET library if needed

For scanned contracts, you still need OCR. If Document Understanding is excluded, use an on-prem OCR engine such as:
Tesseract
ABBYY on-prem
Azure/AWS would not fit if strict on-prem is required
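
As a rough illustration of the pickup and OCR-routing steps above, here is a minimal Python sketch that a robot could call through a Python scope or a command line. The function names, the extension list, and the 40-character threshold are my own assumptions, not UiPath settings. The idea is that a scanned PDF returns little or no text from a plain text read, so an almost empty result is a hint to send the file to OCR.

```python
from pathlib import Path

# Assumed set of contract formats; adjust to what the input folder really holds.
SUPPORTED = {".pdf", ".docx"}


def pick_up_contracts(input_dir):
    """Return the supported contract files in input_dir, sorted by path."""
    folder = Path(input_dir)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def needs_ocr(extracted_text, min_chars=40):
    """Heuristic: treat a near-empty text read as a sign of a scanned document."""
    return len(extracted_text.strip()) < min_chars
```

The threshold is only a starting point. A scanned contract with a small embedded text layer (for example a stamped footer) could still pass it, so it is worth tuning against a sample of real files.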

Extract fields without DU
Options:

Regex/rule-based extraction for predictable contracts
Custom Python/.NET parser
On-prem LLM hosted separately, for example Llama/Mistral running through Ollama, vLLM, LM Studio, or an internal API
UiPath calls that local LLM API using HTTP Request

UiPath itself can orchestrate the process, but the on-prem LLM would usually be hosted outside UiPath.
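
For the regex/rule-based option, a small sketch may help show what "predictable contracts" means in practice. The labels ("Contract No", "Effective Date", "Expiry Date") and the ISO date format are assumptions for illustration only. Real contracts will need patterns written against their actual wording.

```python
import re

# Hypothetical label patterns; each captures the value that follows the label.
PATTERNS = {
    "contract_number": re.compile(
        r"Contract\s+(?:No\.?|Number)\s*[:#]\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "effective_date": re.compile(
        r"Effective\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
    "expiry_date": re.compile(
        r"Expiry\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
}


def extract_by_rules(text):
    """Return one value per field, or None when the label is not found."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result
```

Fields that come back as None are good candidates to route to the LLM option or to manual review, rather than filing the contract with incomplete metadata.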

Store extracted metadata
Save fields such as:

Customer name
Contract number
Effective date
Expiry date
Contract type
Region
Status
File path

Store this in SQL Server, PostgreSQL, Elasticsearch, or even SharePoint/on-prem DB depending on search needs.
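
To make the metadata store concrete, here is a minimal sketch of a table holding the fields listed above. It uses Python's built-in sqlite3 purely as a stand-in so the example runs anywhere. The table name, column names, and the choice of contract number as the key are assumptions, and the upsert syntax would differ on SQL Server or PostgreSQL.

```python
import sqlite3

# Column names mirror the fields listed above; the naming itself is an assumption.
COLUMNS = [
    "contract_number", "customer_name", "effective_date", "expiry_date",
    "contract_type", "region", "status", "file_path",
]

SCHEMA = """
CREATE TABLE IF NOT EXISTS contracts (
    contract_number TEXT PRIMARY KEY,
    customer_name   TEXT,
    effective_date  TEXT,
    expiry_date     TEXT,
    contract_type   TEXT,
    region          TEXT,
    status          TEXT,
    file_path       TEXT
)
"""


def open_store(db_path=":memory:"):
    """Open the database and create the contracts table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def save_contract(conn, record):
    """Insert a contract, replacing any earlier row with the same number."""
    values = [record.get(col) for col in COLUMNS]
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT OR REPLACE INTO contracts ({', '.join(COLUMNS)}) "
        f"VALUES ({placeholders})",
        values,
    )
    conn.commit()
```

Keying on the contract number means a reprocessed file updates its row instead of creating a duplicate, which keeps reruns of the robot safe.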

File contracts in required order
UiPath can rename/move files based on extracted metadata, for example:
CustomerName\Year\ContractType\ContractNumber.pdf

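
A small sketch of how that target path could be built from the extracted metadata. The helper names are hypothetical, and two things are assumed: dates are stored as ISO strings (YYYY-MM-DD), and "Year" means the year of the effective date. Characters that Windows does not allow in file names are replaced, since customer names often contain slashes or colons.

```python
import re
from pathlib import PureWindowsPath


def safe_part(value):
    """Replace characters that Windows forbids in file and folder names."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", str(value)).strip(" .")
    return cleaned or "Unknown"


def target_path(root, record, extension=".pdf"):
    """Build root\\CustomerName\\Year\\ContractType\\ContractNumber.pdf."""
    year = str(record["effective_date"])[:4]  # assumes ISO dates (YYYY-MM-DD)
    return PureWindowsPath(
        root,
        safe_part(record["customer_name"]),
        safe_part(year),
        safe_part(record["contract_type"]),
        safe_part(record["contract_number"]) + extension,
    )
```

The resulting string can be handed to a Move File activity, and the same value is what you would store in the File path column.
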
Enable search
Best option: build a small internal web app/search screen over the metadata database.
Simpler option: use Excel/SQL view/Power BI report, but a web UI is better for end users.
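
Whatever UI you put on top, the search itself can be a parameterized query over the metadata table. The sketch below is one possible shape, again using sqlite3 as a stand-in. The `contracts` table, its column names, and the list of searchable fields are assumptions for illustration.

```python
import sqlite3

# Whitelist of columns users may filter on (assumed names).
SEARCHABLE = {"customer_name", "contract_number", "contract_type", "region", "status"}


def search_contracts(conn, **filters):
    """Return (contract_number, customer_name, file_path) rows matching all filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in SEARCHABLE:
            raise ValueError(f"Unsupported search field: {column}")
        clauses.append(f"{column} LIKE ?")
        params.append(f"%{value}%")
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = (
        "SELECT contract_number, customer_name, file_path "
        f"FROM contracts WHERE {where} ORDER BY contract_number"
    )
    return conn.execute(sql, params).fetchall()
```

Column names come only from the whitelist and values are always bound as parameters, so user input from the search screen never ends up inside the SQL text.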

Regarding UiPath LLM support: UiPath has LLM-based extraction through Helix Extractor, but that is part of Document Understanding, which you said is excluded. UiPath also has Communications Mining/IXP, but that is mainly cloud/service based and not a simple on-prem contract repository solution. Helix Extractor is documented as a Document Understanding LLM extractor.

kkapula4 · June 1, 2026, 2:03am

Yes, this can be implemented without Document Understanding and fully on-prem.

You can try the approach below:

Read contracts from a folder.
Extract text using OCR/PDF activities.
Use an on-prem LLM (e.g., Llama 3/Mistral via Ollama) to extract required fields.
Store metadata in SQL and file/rename contracts accordingly.
Build a search UI (Apps/Web app) that queries the stored metadata.
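
For the LLM step, this is roughly what the call to a local Ollama server looks like, shown in Python for readability. The same JSON body can be sent from UiPath with an HTTP Request activity. The field list, the prompt wording, and the model name `llama3` are assumptions. The URL is Ollama's default local endpoint, so change it if your server runs elsewhere.

```python
import json
import urllib.request

# Assumed field list; extend it to match the contract fields you need.
FIELDS = ["customer_name", "contract_number", "effective_date", "expiry_date"]


def build_payload(contract_text, model="llama3"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    prompt = (
        "Extract these fields from the contract and answer with one JSON object "
        f"using exactly these keys: {', '.join(FIELDS)}. "
        "Use null for anything not stated.\n\n" + contract_text
    )
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def parse_fields(llm_response_text):
    """Keep only the expected keys; missing ones become None."""
    data = json.loads(llm_response_text)
    if not isinstance(data, dict):
        raise ValueError("LLM did not return a JSON object")
    return {name: data.get(name) for name in FIELDS}


def extract_fields(contract_text, url="http://localhost:11434/api/generate"):
    """Send the contract text to the local LLM and return the parsed fields."""
    body = json.dumps(build_payload(contract_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        outer = json.loads(response.read().decode("utf-8"))
    return parse_fields(outer["response"])
```

Long contracts may exceed the model's context window, so in practice you may need to send only the relevant pages or split the text. It is also worth validating dates and contract numbers after parsing, because the model can return plausible but wrong values.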

NITHISHKUMAR_RACHAKONDA · June 1, 2026, 7:30am

Yes, this can be achieved using UiPath, but not entirely with native UiPath LLM features if Document Understanding is excluded.

A simple approach would be:

UiPath reads contract files from a specific folder.
Extract the text from PDF/DOCX files (using PDF/Word activities and OCR for scanned documents).
Send the extracted text to an on-prem LLM such as Llama or Mistral hosted internally.
The LLM extracts required fields like Customer Name, Contract Number, Start Date, End Date, etc.
Store the extracted data in a database.
UiPath renames and moves the contracts into the required folder structure.
Build a simple search interface (or use a database search) so users can find contracts using the extracted fields.

So, an LLM-based solution is possible and can remain fully on-premises. UiPath would mainly orchestrate the process, while the LLM handles the field extraction.
