Intelligent OCR architecture for large volume


What is the recommended architecture for deploying UiPath in a scenario where a large number of documents need to be processed (semi-structured, multiple formats or languages)? I see that UiPath seems to be moving towards ABBYY FlexiCapture, which requires a distributed FlexiCapture server to be installed. Has anyone deployed an architecture with UiPath and FlexiCapture for a large enterprise with various departments and processes, and are there any guidelines for it?

Thanks !

In general, as OCR is quite a CPU-intensive process, you may want to add at least one application server to the mix, two if high availability is a requirement. Everything else depends on your document format, the number of pages processed per day, and service level agreements. Let’s say you need to process 10,000 documents with an average of 2 pages, and each page takes 2 seconds for full-page OCR: that totals 40,000 seconds, or just over 11 hours, for OCR alone. If your workday is 8 hours, you will need additional servers or more resources on that server.
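The arithmetic above can be turned into a small sizing helper. This is only a rough sketch: the function names are made up for illustration, and it assumes OCR time scales linearly with page count and parallelises evenly across workers, which real workloads may not do.

```python
import math


def ocr_hours(documents: int, pages_per_document: float, seconds_per_page: float) -> float:
    """Total sequential OCR time in hours for a daily batch."""
    return documents * pages_per_document * seconds_per_page / 3600


def workers_needed(total_hours: float, window_hours: float) -> int:
    """Parallel OCR workers needed to finish within the processing window."""
    return math.ceil(total_hours / window_hours)


# Figures from the post: 10,000 documents, 2 pages each, 2 seconds per page.
hours = ocr_hours(10_000, 2, 2)
workers = workers_needed(hours, window_hours=8)
print(f"{hours:.1f} OCR hours, {workers} parallel workers for an 8-hour window")
```

Whether a "worker" maps to a server, a core, or an OCR engine license depends on the product, so treat the result as a starting point for sizing rather than a final number.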

Disclaimer: I have not worked with ABBYY’s FlexiCapture, but we integrated Kofax into UiPath. The process and the idea are the same, though.


Can you explain some more about your Kofax/UiPath integration? @redlynx82

It is something we are currently considering, and I would be interested to hear your view on it. Were there many challenges? Would you go with the same solution again?

Thanks for your answer. I am also thinking about this from a deployment point of view in a large, multi-country federated organisation. Not all bots will process documents, and document processing will need to be segregated across different departments, which means the human roles that validate exceptions in OCRed documents will also differ. I drew an indicative architecture, shown below. What do you think about it, based on your experience?

Also Abbyy versus Kofax could be a good topic if you could share some insight on it!

@ronanpeter with regard to Kofax, there are two possible approaches. One is well-suited for transactional processing - that would be via Kofax Real-Time Transformation Interface, which essentially is classification and extraction via a RESTful service. Kofax would return field data as JSON, allowing you to parse and process them further in UiPath. Here’s a sample response:

You can use the HTTP Request activity to call the service, and Deserialize JSON to parse data returned by it.
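To illustrate the parsing step, here is a minimal Python sketch of what Deserialize JSON plus a bit of follow-up logic would do. The response shape, the field names, and the `confidence` value are all hypothetical and not Kofax's actual schema; the real service response should be checked against the vendor documentation.

```python
import json

# Hypothetical response body; the real service defines its own structure.
sample_response = """
{
  "documentClass": "Invoice",
  "fields": [
    {"name": "InvoiceNumber", "value": "INV-0001", "confidence": 0.97},
    {"name": "TotalAmount", "value": "1250.00", "confidence": 0.62}
  ]
}
"""


def split_by_confidence(payload: str, threshold: float = 0.8):
    """Parse the payload and separate fields that may need human review."""
    data = json.loads(payload)
    accepted, needs_review = {}, {}
    for field in data["fields"]:
        target = accepted if field["confidence"] >= threshold else needs_review
        target[field["name"]] = field["value"]
    return data["documentClass"], accepted, needs_review


doc_class, accepted, needs_review = split_by_confidence(sample_response)
print(doc_class, accepted, needs_review)
```

In a UiPath workflow the same split could decide whether a transaction goes straight to the downstream system or is routed to a person for validation.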

The other approach is better for high-volume processing, and it’s required if you want to do any sanity checks (i.e. data validation and correction done by humans). This can be done either by using Kofax Capture along with Kofax Transformation, or with Kofax TotalAgility. Integration can be done via SOAP web service calls, and Kofax can also import data coming from miscellaneous sources (such as email, folder, fax).

@Gaurav_Sharma I’m not sure if I understand the integration you had in mind. Since FlexiCapture can import documents from various sources as well such as folders or email boxes, where and when does UiPath come into play?


UiPath here will be used to process the data retrieved from OCRed documents. It is not a "digitisation" use case but transaction processing: specific data is retrieved from documents such as invoices, contracts and POs, and RPA then acts on it, either to update downstream systems or to do further processing.

While I am not an expert on FlexiCapture, I believe that you could leverage their API in order to include document capture into your process - especially if transaction processing is a requirement, and not batch processing.

UiPath could then gather documents from various sources - such as email boxes, vendor portals, and more - call FlexiCapture for processing and potentially validation; another robot could fetch the processed results, forwarding data towards an ERP or ECM system.
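The two-robot split described above can be sketched as follows. This is a conceptual illustration only: the in-memory queue, the `CaptureJob` type, and the `fake_capture` stand-in are all hypothetical, and a real deployment would use Orchestrator queues and the capture product's API instead.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CaptureJob:
    source: str
    document: str
    fields: dict = field(default_factory=dict)


def collect_and_submit(sources: dict, pending: deque) -> None:
    """First robot: gather documents from each source and queue them for capture."""
    for source, documents in sources.items():
        for document in documents:
            pending.append(CaptureJob(source, document))


def fetch_and_forward(pending: deque, capture, downstream: list) -> None:
    """Second robot: take queued jobs, attach extracted fields, pass them on."""
    while pending:
        job = pending.popleft()
        job.fields = capture(job.document)
        downstream.append(job)


def fake_capture(document: str) -> dict:
    # Stand-in for the capture service call; returns a dummy field.
    return {"characters": len(document)}


pending, erp_inbox = deque(), []
collect_and_submit({"email": ["invoice-a"], "portal": ["po-b", "contract-c"]}, pending)
fetch_and_forward(pending, fake_capture, erp_inbox)
print([(job.source, job.fields) for job in erp_inbox])
```

Keeping collection and forwarding as separate steps means each one can be scaled or scheduled independently, which matters when capture is the slow part.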

UiPath's so-called Intelligent OCR activity is in fact nothing but a deep integration with FlexiCapture that avoids having to call the APIs explicitly. So strategically, UiPath seems to be going the ABBYY way, or at least making it easier to incorporate "non-batch" transaction processing that involves documents and unstructured data.

I don't know if anyone has already rolled it out in a large enterprise setup, i.e. the UiPath IntelligentOCR activity with FlexiCapture.

You’re right, there are activities dedicated to FlexiCapture. They still rely on the API, but apparently they require you to install the SDK (i.e. client libraries) on the same machine:


The ABBYY FlexiCapture SDK must be installed on the machine you want to run activities from the IntelligentOCR activity pack. You can find the official ABBYY installation documentation here.

Unfortunately, I have no experience that I could share. With regard to the preferred integration, we can only guess. I think that ABBYY is still one of the companies in the field that have remained independent, with Ephesoft being another player.

Many of the vendors in the Document Capture industry were consolidated over the last couple of years, and big players such as ReadSoft or TIS are now owned by Kofax. Now, since Kofax offers their own RPA Suite, it’s unlikely that we will see an official integration :slight_smile:


You are right. I call this the "Data Capture" segment rather than "Document Capture", with a focus on zonal OCR and extracting specific bits of data from documents rather than digitising the entire document. In this segment, OCR players will either get bought by RPA vendors or acquire RPA capabilities themselves, as Kofax has done. So in that sense Kofax will compete with UiPath by providing end-to-end features, from data capture to RPA.
