How an RPA Platform Democratized AI and Transformed Document Understanding

Editor's note: the following is an excerpt from an interview David Eddy conducted with Prabhdeep (PD) Singh, Head of Artificial Intelligence (AI) Products at UiPath. This is a guest post by Eddy and is the second of a two-part series.

David Eddy: You said your team has baked artificial intelligence (AI) into almost every aspect of the UiPath Platform. What comes to mind as a good example?

PD Singh: I’d have to say UiPath’s activity at the United States (U.S.) patent office is a good example of how AI is making an impact across the entire platform. It’s a matter of public record, so there’s no mystery. If you look, you’ll see that our attended robot automations can leverage a recognition framework that uses an ensemble of models, including deep learning neural networks. You’ll also see our robots use convolutional neural networks to detect graphical user interface (UI) elements. There’s much more to see, but these give some idea of what’s there.

Eddy: Thanks for bringing patents to my attention; I will take a look. What about products or features that may be closer to the user experience?

Singh: In that case, UiPath AI Center, our machine learning (ML) product I mentioned earlier in our conversation, strikes me as a good example. It’s really the beating heart of our AI/ML capabilities, making it easy for customers to drag and drop ML models into robotic process automation (RPA), either their own or one of our out-of-the-box models. And it comes with tools to retrain and support those models.

Since AI Center can deliver cognitive capabilities to any number of operational processes, it’s a big story to tell. So, let’s stick to document understanding for our purposes today. Actually, the story of document understanding leads to AI Center and provides a good introduction.

Just as I had a charter at Microsoft to transform sales and marketing with AI, Daniel Dines gave me a charter at UiPath to create unique product lines using AI.

After careful study, we focused on document understanding. It caught our eye for two reasons: customer interest was strong, and the field was dominated by a few entrenched players using really old-world techniques.

The team was convinced we could easily disrupt document understanding using modern ML techniques and tools—and that’s what happened. Now, UiPath customers can use AI Center’s validation and annotation tools to take user data and quickly retrain models for semi-structured and unstructured data.

Eddy: Does this mean you’ve solved the problem where even world-class AI teams took too long to build effective ML models?

Singh: Yes. But I misstated the problem. It was more than time; it was also money.

So, as we developed document understanding, we said to ourselves, 'Let’s not wind up with time and money problems. We need an underlying platform for deploying and managing ML models that an IT person can handle.'

Let me frame it this way: think of DevOps functions. The IT people in every company know how to deploy applications; it's a well-understood process and a well-oiled machine.

But that's not always the case with MLOps. It’s a complex process, and traditionally the IT employees at many organizations wouldn’t know how to deploy models. For example, what’s needed in terms of hardware, software, operating system functions, or the necessary system libraries? IT may not know those answers.

Frankly, the AI community hasn’t done a good job of sharing information on what it takes to develop and deploy ML models in real-world settings.

We wanted the opposite of that. Our goal was to abstract out MLOps complexity, so our customers’ IT people could manage AI/ML models with their existing skillsets. The product that emerged was called AI Fabric, now known as UiPath AI Center. Now you can see why I said the story of document understanding would lead to it.

Eddy: Is it accurate to say your team is democratizing AI/ML, just as UiPath has democratized RPA?

Singh: That’s exactly right. But remember, democratizing RPA was about more than just abstracting out complexity. UiPath also had to deliver the readiness and scale our customers, particularly the large global enterprises, require. So too, democratizing AI/ML must go beyond simplification and give customers flexibility and scalability.

Eddy: Is there a specific capability or feature that illustrates what you mean?

Singh: Sure. In fact, we can stick with document understanding as an example and dig deeper into why the team was confident we could innovate and disrupt the competition in that field. Bear in mind that UiPath Document Understanding means giving customers good ways to process structured, semi-structured, or unstructured data.

The sweet spot for third-party optical character recognition (OCR) engines is structured data, something our team was able to replicate quickly.

But our goal wasn’t just to match competitors' capabilities. It was to do that while also providing the technology free of charge. Here’s why. Effective OCR solutions based on third-party engines triggered significant licensing costs as customers scaled up their automation. Using a third-party engine may have been the smarter and cheaper option, but customers found the escalating license fees to be a real pain point.

While our OCR engine was a significant achievement, it was in the areas of semi-structured and unstructured data where we created real separation from the competition. But before I get into why that’s the case, let me make this point.

As I said earlier, we’re not like some of the OCR vendors that claim to have the best out-of-the-box models and talk about 99% accuracy and so forth.

Instead, our document understanding product is really a collection of different components, like pre-built ML models for invoicing, onboarding, sales orders, etc. Then, it uses UiPath AI Center as a reconfigurable workbench, putting your own custom or our pre-built models in the hands of RPA developers and providing tools for model re-training and centralized model support.

What we’ve really designed is a self-improvement system. Take, for instance, our pre-built invoice model. You can quickly drag it into your automation and see if it performs well or not. Likely it will for a high percentage of your use cases. But regardless, our tools use your own data to quickly retrain and optimize the model to the accuracy level you want. Think of our out-of-the-box models as starter models. With only a reasonable amount of real-world data, customers' IT teams can quickly fine-tune the results.

That is why the document understanding product we have is well differentiated from anything else out there and why customers are using our technology to run tens of millions of documents through it every month.
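Editor's illustration: the starter-model idea Singh describes above can be sketched in a few lines of Python. This is a toy under stated assumptions, not UiPath's implementation and not the AI Center API. The class `StarterInvoiceModel`, its `extract` and `retrain` methods, the `accuracy` helper, and the sample invoices are all hypothetical. The "model" here is only a lookup from label text to field names, and "retraining" simply learns new label aliases from a few annotated documents.

```python
class StarterInvoiceModel:
    """Hypothetical stand-in for a pre-built model: maps label text to a field name."""

    def __init__(self):
        # Labels the starter model already knows out of the box (assumed for this sketch).
        self.label_to_field = {"invoice number": "invoice_number", "total": "total"}

    def extract(self, document):
        """document is a list of (label, value) pairs as they appear on the page."""
        fields = {}
        for label, value in document:
            field = self.label_to_field.get(label.strip().lower())
            if field is not None:
                fields[field] = value
        return fields

    def retrain(self, annotated_docs):
        """Learn new label aliases from (document, truth) pairs annotated by the customer."""
        for document, truth in annotated_docs:
            value_to_field = {value: field for field, value in truth.items()}
            for label, value in document:
                if value in value_to_field:
                    self.label_to_field[label.strip().lower()] = value_to_field[value]


def accuracy(model, annotated_docs):
    """Fraction of annotated fields the model extracts with the correct value."""
    correct = total = 0
    for document, truth in annotated_docs:
        predicted = model.extract(document)
        for field, value in truth.items():
            total += 1
            correct += predicted.get(field) == value
    return correct / total


# Made-up vendor invoices that do not follow the template the starter model expects.
vendor_docs = [
    ([("Inv #", "A-17"), ("Amount Due", "120.00")],
     {"invoice_number": "A-17", "total": "120.00"}),
    ([("Invoice Number", "B-02"), ("Amount Due", "75.50")],
     {"invoice_number": "B-02", "total": "75.50"}),
]
holdout = [
    ([("Inv #", "C-33"), ("Amount Due", "9.99")],
     {"invoice_number": "C-33", "total": "9.99"}),
]

model = StarterInvoiceModel()
before = accuracy(model, holdout)   # starter model on unseen vendor labels
model.retrain(vendor_docs)          # customer supplies a few annotated documents
after = accuracy(model, holdout)    # same holdout after retraining
print(f"accuracy before retraining: {before:.2f}, after: {after:.2f}")
```

The point of the sketch is the shape of the loop (measure, annotate, retrain, measure again), not the technique. A real document understanding model would be a trained ML model rather than a lookup table.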

Eddy: Before we leave document understanding, I want to apply what I’ve heard to an actual customer scenario and get your thoughts. About three years ago, I remember UiPath and a third-party OCR engine being brought into a troublesome invoice problem at a global automaker. Every month the company would receive a large number of invoices from small, but essential, vendors that either couldn’t or wouldn’t comply with the automaker’s SAP® invoice template. Those were pre-AI/ML days, and RPA automation was considered a success, even though a significant volume of invoice exceptions remained.

Suppose that automaker came to you today with that same problem. Would the UiPath pre-trained ML model be able to successfully reconfigure, right from the start, a large percentage of those noncompliant invoices? And could IT people then load that data and quickly retrain the model to very high levels of accuracy?

Singh: Let me start by telling you what happens with that type of problem in the AI world. Usually, you as a customer know your business better than anyone else, right? Automaker ABC is Automaker ABC, and it has a huge partner network. They know how to work with these partners, they know the issues and circumstances, and so they also know what different levers of acceleration are possible and desirable.

There's nobody else out there (and this is the problem with most AI companies), even in that automaker’s own industry, that can match its experience, knowledge, and expertise. Yet there are AI companies, filled with uninformed hubris, who tout themselves as capable of changing and accelerating any arbitrary process that automaker may have.

In reality, the invoices that automaker receives from those small vendors are drastically different from the invoices someone else gets from doing a Google search and training a model. That’s why we've designed an ML system which can quickly retrain itself based on the inputs customers get from their own operations.

And this is where our platform shines. As an example, there are a lot of customers we work with who take our invoice model, pump a hundred real-world documents through it, and get really good recognition rates. They're like, 'Wow, I didn't even know that that was possible.' And that’s before any runtime retraining has been done.

Some customers, particularly BPO [business process outsourcing] organizations with low-cost operations, can actually annotate some of their data. Other customers may already have annotated data. In those situations, they can use the Data Manager component of AI Center to batch upload training data and retrain the model.

Eddy: PD, I can’t thank you enough for the time you’ve given to this interview and the insights you’ve provided. The one that particularly stands out for me is your vision of RPA as a uniquely compelling vehicle for delivering AI to the enterprise. It’s right on the money.
