Can anyone propose a solution for this requirement using UiPath?
1. Read customer contracts from a specific folder.
2. Extract certain fields from those contracts.
3. File the contracts in a specific order.
4. Enable users to search for contracts based on the extracted fields.
Is any LLM-based solution available in UiPath? The solution needs to be on-prem and must not use Document Understanding.
UiPath can support this, but given your constraints (on-prem, no Document Understanding), I would not position it as a pure UiPath-native LLM solution.
A practical architecture for your use case would be:
- UiPath Robot monitors/reads the contract folder
Use File System activities to pick up PDFs/DOCX files from a configured input folder.
- Convert contract content to text
For digital PDFs/DOCX:
- Read PDF Text
- Word activities / Office activities
- Python/.NET library if needed
For scanned contracts, you still need OCR. If Document Understanding is excluded, use an on-prem OCR engine such as:
- Tesseract
- ABBYY on-prem
Cloud OCR services from Azure/AWS would not fit if strict on-prem is required.
- Extract fields without DU
Options:
- Regex/rule-based extraction for predictable contracts
- Custom Python/.NET parser
- On-prem LLM hosted separately, for example Llama/Mistral running through Ollama, vLLM, LM Studio, or an internal API
- UiPath calls that local LLM API using HTTP Request
UiPath itself can orchestrate the process, but the on-prem LLM would usually be hosted outside UiPath.
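For the regex/rule-based option, here is a minimal Python sketch of what such a parser could look like. The field labels and patterns are assumptions about how the contracts are worded (for example, ISO-formatted dates after a "Effective Date:" label), so they would need tuning against real templates.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns: adjust labels and formats to match your contract templates.
FIELD_PATTERNS = {
    "contract_number": re.compile(
        r"Contract\s+(?:No\.?|Number)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "customer_name": re.compile(r"Customer\s*(?:Name)?\s*:\s*(.+)", re.IGNORECASE),
    "effective_date": re.compile(
        r"Effective\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
    "expiry_date": re.compile(
        r"(?:Expiry|Expiration)\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each field, or None when a field is not found."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result
```

Fields that come back as None are good candidates to route to a human reviewer or to the LLM option instead.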
- Store extracted metadata
Save fields such as:
- Customer name
- Contract number
- Effective date
- Expiry date
- Contract type
- Region
- Status
- File path
Store this in SQL Server, PostgreSQL, Elasticsearch, or even SharePoint/on-prem DB depending on search needs.
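To make the metadata store concrete, here is a rough Python sketch. It uses SQLite only because it ships with Python; the table name `contracts`, the column names, and the helper functions are all assumptions for illustration, and the same schema would map onto SQL Server or PostgreSQL.

```python
import sqlite3

# Hypothetical schema mirroring the fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS contracts (
    contract_number TEXT PRIMARY KEY,
    customer_name   TEXT,
    effective_date  TEXT,
    expiry_date     TEXT,
    contract_type   TEXT,
    region          TEXT,
    status          TEXT,
    file_path       TEXT
)
"""

COLUMNS = (
    "contract_number", "customer_name", "effective_date", "expiry_date",
    "contract_type", "region", "status", "file_path",
)


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def save_contract(conn: sqlite3.Connection, metadata: dict) -> None:
    """Insert one contract; re-processing the same contract number replaces its row."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT OR REPLACE INTO contracts ({', '.join(COLUMNS)}) VALUES ({placeholders})",
        [metadata.get(col) for col in COLUMNS],  # missing fields are stored as NULL
    )
    conn.commit()
```

Using the contract number as the primary key is a design choice that keeps re-runs idempotent; if contract numbers are not unique across customers, a composite key would be safer.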
- File contracts in required order
UiPath can rename/move files based on extracted metadata, for example: CustomerName\Year\ContractType\ContractNumber.pdf
- Enable search
Best option: build a small internal web app/search screen over the metadata database.
Simpler option: use Excel/SQL view/Power BI report, but a web UI is better for end users.
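Whatever UI you pick, the search itself can be a simple parameterised query over the metadata table. A minimal sketch in Python with SQLite, assuming a hypothetical `contracts` table with the columns named below; the function name and the allow-list are illustrative only.

```python
import sqlite3
from typing import Dict, List, Tuple

# Only these (assumed) columns may be searched; this also blocks SQL injection
# through the column name, since values are always passed as parameters.
SEARCHABLE = ("customer_name", "contract_number", "contract_type", "region", "status")


def search_contracts(conn: sqlite3.Connection, filters: Dict[str, str]) -> List[Tuple]:
    """Return (contract_number, customer_name, file_path) rows matching all filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in SEARCHABLE:
            raise ValueError(f"Unsupported search field: {column}")
        clauses.append(f"{column} LIKE ?")
        params.append(f"%{value}%")  # substring match
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = (
        "SELECT contract_number, customer_name, file_path "
        f"FROM contracts WHERE {where} ORDER BY contract_number"
    )
    return conn.execute(sql, params).fetchall()
```

A web app or a UiPath Apps screen would call something like this with whatever fields the user filled in and show the returned file paths as links.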
Regarding UiPath LLM support: UiPath has LLM-based extraction through Helix Extractor, but it is documented as a Document Understanding LLM extractor, which you said is excluded. UiPath also has Communications Mining/IXP, but that is mainly cloud/service based and not a simple on-prem contract repository solution.
Yes, this can be implemented without Document Understanding and fully on-prem.
You can try the approach below:
- Read contracts from a folder.
- Extract text using OCR/PDF activities.
- Use an on-prem LLM (e.g., Llama 3/Mistral via Ollama) to extract required fields.
- Store metadata in SQL and file/rename contracts accordingly.
- Build a search UI (Apps/Web app) that queries the stored metadata.
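For the LLM step, this is roughly what the call to a local Ollama server could look like, sketched in Python so the request and response shapes are visible. It assumes Ollama is listening on its default port and that a model tagged `llama3` has been pulled; the field list and function names are made up for illustration. In UiPath you would send the same JSON body with the HTTP Request activity.

```python
import json
import urllib.request
from typing import Dict, Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint
FIELDS = ["customer_name", "contract_number", "effective_date", "expiry_date"]


def build_payload(contract_text: str, model: str = "llama3") -> dict:
    """Build the JSON body: one prompt, no streaming, JSON-formatted output."""
    prompt = (
        "Extract the following fields from the contract and reply with a JSON object "
        f"using exactly these keys: {', '.join(FIELDS)}. Use null for missing values.\n\n"
        f"Contract:\n{contract_text}"
    )
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def parse_response(body: str) -> Dict[str, Optional[str]]:
    """The non-streaming reply carries the model output as a string under 'response'."""
    model_output = json.loads(json.loads(body)["response"])
    return {field: model_output.get(field) for field in FIELDS}


def extract_with_llm(contract_text: str) -> Dict[str, Optional[str]]:
    """Send the contract text to the local model and return the extracted fields."""
    data = json.dumps(build_payload(contract_text)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as reply:
        return parse_response(reply.read().decode("utf-8"))
```

LLM output should still be validated (date formats, contract number patterns) before it is written to the database, since the model can return plausible but wrong values.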
Yes, this can be achieved using UiPath, but not entirely with native UiPath LLM features if Document Understanding is excluded.
A simple approach would be:
- UiPath reads contract files from a specific folder.
- Extract the text from PDF/DOCX files (using PDF/Word activities and OCR for scanned documents).
- Send the extracted text to an on-prem LLM such as Llama or Mistral hosted internally.
- The LLM extracts required fields like Customer Name, Contract Number, Start Date, End Date, etc.
- Store the extracted data in a database.
- UiPath renames and moves the contracts into the required folder structure.
- Build a simple search interface (or use a database search) so users can find contracts using the extracted fields.
So, an LLM-based solution is possible and can remain fully on-premises. UiPath would mainly orchestrate the process, while the LLM handles the field extraction.
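As a small illustration of the rename-and-move step, here is a Python sketch that builds a target path in the CustomerName\Year\ContractType\ContractNumber.pdf layout suggested earlier in the thread. The metadata keys, the ISO date assumption, and the helper names are hypothetical; in UiPath the same logic would be a few string operations followed by Move File.

```python
import re
import shutil
from pathlib import Path


def safe_part(value: str) -> str:
    """Replace characters that are not allowed in Windows file or folder names."""
    cleaned = re.sub(r'[<>:"/\\|?*]+', "_", value).strip(" .")
    return cleaned or "Unknown"


def target_path(root: Path, metadata: dict) -> Path:
    """Build root/CustomerName/Year/ContractType/ContractNumber.pdf."""
    year = metadata["effective_date"][:4]  # assumes ISO dates (YYYY-MM-DD)
    return (
        root
        / safe_part(metadata["customer_name"])
        / year
        / safe_part(metadata["contract_type"])
        / (safe_part(metadata["contract_number"]) + ".pdf")
    )


def file_contract(source: Path, root: Path, metadata: dict) -> Path:
    """Move the source file into the folder structure and return its new location."""
    destination = target_path(root, metadata)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    return destination
```

The returned path is what you would store as the file path in the metadata database so the search results can link straight to the filed contract.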