In July 2024, we released to general availability the initial slate of UiPath GenAI Activities (announcement), a first-party connector that gives Cloud users access to a variety of UiPath-managed, industry-leading large language models.
We have since seen excellent early adoption, with users leveling up the complexity and quality of their automations with the help of GenAI inferences. Curated prompt activities like Summarize and Generate Email offer consistent, reliable GenAI outputs for common RPA tasks and are easy to configure and deploy via Studio Web, Studio Desktop, and now Apps.
Furthermore, the ability to craft a custom prompt (via the Content Generation activity) that incorporates variables and arguments at run time has proven to be a versatile tool for tackling a wide variety of bespoke use cases. With the recent roll-out of Context Grounding in public preview, Content Generation has become even more powerful by grounding prompts and LLM outputs in a business's unique data assets, whether those are long, complex documents or data uploaded to UiPath Storage Buckets from common enterprise business applications.
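To make the variable-substitution pattern concrete, here is a minimal sketch in Python (outside of Studio, purely for illustration) of a prompt template that pulls run-time values and a Context Grounding excerpt into a single prompt. The names used (customer_name, invoice_rows, policy_excerpt) are hypothetical; in the activity itself, this wiring happens through the prompt field and workflow variables rather than code.

```python
# Illustrative only: the Content Generation activity is configured in Studio,
# but the prompt it sends follows this variables-in-a-template pattern.
# All names below (customer_name, invoice_rows, policy_excerpt) are hypothetical.

PROMPT_TEMPLATE = (
    "You are an assistant for the accounts team.\n"
    "Customer: {customer_name}\n"
    "Open invoices:\n{invoice_rows}\n\n"
    "Using only the policy excerpt below, draft a polite payment reminder.\n"
    "Policy excerpt (retrieved via Context Grounding):\n{policy_excerpt}"
)

def build_prompt(customer_name: str, invoice_rows: str, policy_excerpt: str) -> str:
    """Substitute run-time values into the template, mirroring how workflow
    variables and arguments are pulled into a Content Generation prompt."""
    return PROMPT_TEMPLATE.format(
        customer_name=customer_name,
        invoice_rows=invoice_rows,
        policy_excerpt=policy_excerpt,
    )

if __name__ == "__main__":
    print(build_prompt(
        "Acme Corp",
        "INV-1042 | $1,250 | 30 days overdue",
        "Payment reminders are sent once an invoice is 21 days overdue.",
    ))
```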
We're now excited to roll out what we expect to be several high-value curated activities tackling complex text analysis and image-based tasks:
- Image Analysis: Offers the ability to craft a custom prompt and upload an image alongside it. Just like Content Generation, you can pull in variables or arguments from other workflows, and you can also include an image in common formats for the model to analyze.
- Categorize: Given a string to analyze, users can make use of a new dictionary expression editor (only available in Studio Web for the time being) to create and define custom categories. The activity outputs the best category match based on your definitions and the model's best judgment. See an example below of the activity being used to automate the categorization of help desk tickets, and an illustrative category map sketched after this list.
- Named Entity Recognition: Given a string input and a custom-defined list of entities to search the string for, the activity outputs the entity name and the entity discovered in the text. See a silly example below, where we ask the activity to return animals and morals (of a story) from Aesop's The Tortoise and the Hare.
And the output is nicely structured JSON (as a string): [{"text":"Hare","type":"animal"},{"text":"Tortoise","type":"animal"},{"text":"Plodding wins the race.","type":"moral"}] (Some entries removed for brevity, but every entity discovered would be represented in the output.)
Keep an eye out for upcoming improvements that will help determine where in the string each entity was discovered. In the meantime, a short sketch of parsing this output follows the list below.
- Detect Object: By uploading an image and a list of custom objects with their definitions/attributes, users can determine whether those objects are present in the image. Furthermore, users can apply custom conditions for the model to consider. For example, perhaps a logistics company needs to determine whether a package was appropriately placed on a doorstep. With Detect Object, a photo with a custom condition for package placement can easily be set up and run with reliable results.
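To give a feel for the category definitions mentioned above, here is a minimal sketch of the kind of name-to-definition mapping you would build in the dictionary expression editor. The categories and ticket text below are made up for illustration; in practice you define them in Studio Web, not in code.

```python
# Hypothetical help desk categories, shown as a plain name -> definition mapping.
# In Studio Web, these would be entered through the dictionary expression editor.
TICKET_CATEGORIES = {
    "Access": "Login failures, password resets, permission or MFA issues.",
    "Hardware": "Laptops, monitors, docking stations, printers, and peripherals.",
    "Software": "Application errors, installation requests, and license questions.",
    "Network": "VPN, Wi-Fi, connectivity, or bandwidth problems.",
}

ticket_text = "I can't sign in after resetting my password this morning."

# Given the definitions above, the Categorize activity would return the single
# best-matching category name for ticket_text -- here, most likely "Access".
```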
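And here is the parsing sketch promised above: a few lines showing how the Named Entity Recognition output (the JSON string from the Tortoise and the Hare example) could be consumed by a downstream step. The variable names are just for illustration.

```python
import json

# The NER activity returns its entities as a JSON string, as in the
# Tortoise and the Hare example above.
ner_output = (
    '[{"text":"Hare","type":"animal"},'
    '{"text":"Tortoise","type":"animal"},'
    '{"text":"Plodding wins the race.","type":"moral"}]'
)

entities = json.loads(ner_output)

# Group the discovered entities by their type for downstream use.
by_type = {}
for entity in entities:
    by_type.setdefault(entity["type"], []).append(entity["text"])

print(by_type)
# {'animal': ['Hare', 'Tortoise'], 'moral': ['Plodding wins the race.']}
```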
Other exciting upcoming activities:
- Signature Comparison
- Reformat (e.g., JSON to CSV, or malformed JSON to properly formatted JSON)
- Semantic Similarity - string to string or string to list of strings
- Deep Sentiment Analysis
- Image Comparison - comparing two images with custom categories
- Image Classification
- Semantic Search - search your Context Grounding indices for semantically similar data
All of these activities cost only 1 AI Unit to execute. By chaining them together with other GenAI activities or UiPath products (Document Understanding, Communications Mining, Integration Service activities), the sky is the limit! Let us know your feedback!