Anton McGonnell is Director of AI Product Management for UiPath AI Fabric at UiPath.
Tony Tzeng is Director of AI Product Management for UiPath Document Understanding at UiPath.
As we explained in a recent blog post, UiPath has made it easy for Robotic Process Automation (RPA) developers and centers of excellence (CoE) to adopt artificial intelligence (AI) using out-of-box starter models.
These machine learning models make it simple to get started using AI. No extensive data science background is required.
That said, there is one important caveat to keep in mind before getting started. Just because UiPath starter models are easy to deploy doesn’t mean they are limited, primitive, or confined in any way.
In fact, they can be incredibly powerful.
The best way to think about these models is to view them as a foundation for your AI needs.
As time goes on and you continue deploying AI across your organization, you may need to modify or retrain the models (or build your own) to achieve higher accuracy or to fit your custom data.
Some models are not specific to a customer's data. Models such as language translation or sentiment analysis generalize broadly and have already been trained on a huge, representative dataset, so they do not need to be retrained.
When is model retraining required?
There are several different reasons why you might need to retrain your models, which we’ll explore next.
1. Insufficient model training
You may need to retrain your model when it hasn't been trained to handle specific data sets (for example, when it doesn't work well on a specific document layout), so humans have to validate and correct the results manually.
2. Model improvements
Further retraining may be required to improve a model incrementally over time, as humans validate its results on real-world data.
3. Changing real-world data
As real-world data fluctuates, a model will need to be adapted to keep pace.
As you can see, models should not be viewed as restrictive in any way. They are completely customizable, serving as a powerful springboard to further develop your AI use cases.
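One way to make the retraining triggers above concrete is to track how often humans have to correct a model's output for each document layout. The sketch below is a minimal illustration, not part of any UiPath product: the `ValidationRecord` structure, the function names, and the 20% threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ValidationRecord:
    """One extracted field reviewed by a human (hypothetical structure)."""
    document_layout: str
    field_name: str
    was_corrected: bool


def correction_rate(records):
    """Fraction of reviewed fields that a human had to correct."""
    if not records:
        return 0.0
    return sum(r.was_corrected for r in records) / len(records)


def layouts_needing_retraining(records, threshold=0.2, min_samples=5):
    """Return layouts whose correction rate exceeds the threshold.

    The threshold and minimum sample size are illustrative assumptions;
    layouts with too few reviewed fields are skipped as inconclusive.
    """
    by_layout = {}
    for record in records:
        by_layout.setdefault(record.document_layout, []).append(record)
    return sorted(
        layout
        for layout, layout_records in by_layout.items()
        if len(layout_records) >= min_samples
        and correction_rate(layout_records) > threshold
    )
```

Running the same check on a recent window of records and on an older baseline gives a rough signal of changing real-world data: a correction rate that climbs over time suggests the model is falling behind.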
With that in mind, let’s take a closer look at our newly released retrainable models.
Newly released UiPath retrainable models
UiPath recently released five models for preview—all of which can be used in both cloud and on-premises deployments. Here is a breakdown of the new retrainable models.
1. Purchase Orders (PO) model
The Purchase Orders model allows you to extract a PO number, date, client names and addresses, vendor names and addresses, billing information, tax amount, and other fields (see the full list).
The Purchase Orders model can be called from the ML Extractor activity and is available in UiPath Studio version 2019.6.
2. Utility Bills model
The Utility Bills model enables you to extract billing names and addresses, vendor names and addresses, account and invoice numbers, invoice date, due dates, previous balances, and other fields.
This model can be called from the ML Extractor activity released in UiPath Studio 2019.6.
3. Invoices-India model
The Invoices-India model allows you to extract nine additional key invoice fields including:
- Supplier and vendor Goods and Services Tax Identification Number (GSTIN)
- State goods and services tax (SGST)
- Central goods and services tax (CGST) percentages
- SGST and CGST totals
- Integrated goods and services tax (IGST) percentage
- IGST total
This model is specific to the India market. It can also be called from the ML Extractor activity.
4. Invoices-Australia model
The Invoices-Australia model mirrors the Invoices-India model, except that it extracts two additional fields: Australian business number (ABN) and bank state branch (BSB) number. For a full list of the fields this model can extract, see the model documentation.
This model is specific to the Australia market. Like the others, it can be called from the ML Extractor activity.
Other retrainable models that UiPath offers include:
- Generic Document Understanding model for extracting commonly occurring data points from semi-structured or structured documents. You can use this model as a base and retrain it for any other custom documents.
- Receipts model for extracting commonly occurring data points from receipts, including header fields and line items.
- Invoices model for extracting commonly occurring data points from invoices (while the Invoices-India and Invoices-Australia models mentioned above additionally extract some regional data points).
UiPath also offers retrainable models for English and French text classification, tabular classification with TPOT AutoML, TPOT XGBoost classification, and more.
How to retrain models
Ready to get started retraining models?
One way is to add the Upload File activity to your workflow. This activity sends data (including data validated by humans) from your RPA workflow to a given dataset.
Another way to retrain your model is to start by exporting your newly labeled data from Data Manager. Upload the dataset to UiPath AI Fabric, run a training pipeline with your new dataset, and create a new ML skill out of it. Now you’re ready to use it in your workflow.
You can also retrain your model using the new ML Trainer activity. It can save data corrected and validated by employees in the Validation Station and automatically upload it to AI Fabric for model retraining. This way, the model learns to extract data with higher accuracy, so you can expect improved extraction results over time.
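The feedback loop described above boils down to pairing what the model extracted with what a human confirmed or corrected, then saving the result as labeled data for the next training run. The following sketch shows that idea in plain Python. It does not use any UiPath API or reproduce the dataset format AI Fabric expects; the function names, field names, and JSON Lines layout are hypothetical and serve only to illustrate the data flow.

```python
import json


def to_training_example(document_id, predicted, corrections):
    """Merge model output with human corrections into one labeled example.

    Human-validated values take priority over the model's predictions.
    The dictionary layout here is an assumption made for illustration.
    """
    labels = {**predicted, **corrections}
    corrected_fields = sorted(
        name for name, value in corrections.items()
        if predicted.get(name) != value
    )
    return {
        "document_id": document_id,
        "labels": labels,
        "corrected_fields": corrected_fields,
    }


def to_jsonl(examples):
    """Serialize examples as JSON Lines, one labeled document per line."""
    return "\n".join(json.dumps(example, sort_keys=True) for example in examples)


# Example: the model misread one digit of the PO number and a human fixed it.
example = to_training_example(
    "doc-001",
    predicted={"po_number": "12345", "date": "2020-08-01"},
    corrections={"po_number": "12845"},
)
dataset_text = to_jsonl([example])
```

Keeping a record of which fields were corrected, as `corrected_fields` does here, also makes it easy to see where the model struggles most before deciding what to retrain.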
UiPath supports retraining on cloud and on-premises
UiPath models can be retrained on either cloud-based or on-premises infrastructure for added flexibility and convenience.
On UiPath Automation Cloud, you can train your models without having to worry about setting up expensive, complex infrastructure. This is helpful for cloud-native companies that don’t have on-premises infrastructure. UiPath AI Fabric processes training pipelines and allows you to scale up and down as needed.
In addition, a full library of hardware and software documentation is available for customers that want to run training pipelines with AI Fabric on-premises. You can access the library here.
Q: Do we have to purchase retrainable models separately?
A: No. In order to use these models, all you have to do is enable the 60-day UiPath Automation Cloud for enterprise trial. You can also access them by purchasing UiPath AI Fabric or Document Understanding licenses. Please talk to your customer success manager for more details or contact sales.
Q: What languages are available?
A: UiPath Invoice models support a variety of languages, including English, Spanish, Portuguese, German, French, and Romanian. The UiPath Receipts model supports English, Spanish, German, French, Norwegian, Finnish, and Romanian. More information is available here. In addition, you can train your own model for documents in any language; the only limitation is that right-to-left languages are not supported.
Q: Do you need to be a data expert to get started?
A: UiPath retrainable models are simple enough for novice users with little-to-no understanding of AI. At the same time, they are flexible and dynamic enough for advanced, highly customized usage. UiPath can help regardless of your level of experience.
Next up: See you at UiPath DevCon!
We’re looking forward to the two-day virtual event—which takes place September 2 and 3, 2020. UiPath DevCon 2020 will be packed with information, ideas, and inspiration from our robust, global user Community, and we can’t wait to kick things off.
Why attend DevCon?
You’ll learn more about UiPath Document Understanding during the session titled Fast and Accurate Document Understanding with AI, with UiPath Machine Learning Product Manager Tarun Singh and RPA Developer Victor Bautista. This session (on day one of DevCon) will highlight the newest AI-enhanced capabilities for intelligent document processing while demonstrating how machine learning can ensure fast and accurate data extraction.
Right after that, UiPath Product Manager Ioana Gligan and Executive RPA Lead and Solution Architect Lahiru Fernando will host a session titled Document Understanding Framework: An End-to-End Solution for Document Processing Automation. Attendees will learn all about how the UiPath Document Understanding framework can help process documents in different use cases.
UiPath AI Fabric will also be on full display during the session titled Machine Learning Model Lifecycle Management on AI Fabric with UiPath Engineering Director Shashank Shrivastata on day two of the conference. This session will provide an overview of the UiPath Platform ML capabilities, while demonstrating how AI Fabric can help manage your ML lifecycle from packaging to deploying to retraining ML models.
Editor's note: If you missed DevCon, don't worry! You can still access the on-demand recordings of any presentations you missed.
Whether you’re just getting started in this space—or you’re a seasoned AI veteran—UiPath is here to help. And you don't have to wait until September to see our ML capabilities in action!
Sign up and try UiPath for free today.
This is a companion discussion topic for the original entry at https://www.uipath.com/blog/easily-retrain-ai-starter-models-more-accurate-results