UiPath Community 23.1 Preview Release - Document Understanding

Monica_Secelean · January 30, 2023, 11:00pm

We’re happy to report on our latest work for the Document Understanding product, encompassing updates to both classification & extraction.

Definition of a default value for a Field
Have you ever wished for a fallback value to be populated in the Extraction Result when a field could not be extracted? Have you ever wondered “Why keep manually entering the same value whenever it comes back empty from documents?” - wonder no more! With the latest release, you can provide a “Default value” for a field, which is populated in the Extraction Result when no other value for that field has been found in the document. This means you no longer need to repeatedly select “False” when no checkbox is checked, or fill in a fallback by hand just to avoid leaving an input empty because nothing was extracted - it just works!
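
To make the fallback behaviour concrete, here is a minimal sketch in Python. It is not the product's implementation: the field names, the `FIELD_DEFAULTS` mapping and the `apply_defaults` helper are all hypothetical and only illustrate "use the default when nothing was extracted".

```python
from typing import Optional

# Hypothetical defaults, standing in for the "Default value" set per field.
FIELD_DEFAULTS = {"is-signed": "False", "currency": "USD"}


def apply_defaults(
    extracted: dict[str, Optional[str]], defaults: dict[str, str]
) -> dict[str, Optional[str]]:
    """Return a copy of `extracted` where missing or empty fields take their default."""
    result = dict(extracted)
    for field, default in defaults.items():
        value = result.get(field)
        # A field counts as "not extracted" if it is absent, None, or blank.
        if value is None or value.strip() == "":
            result[field] = default
    return result


extraction = {"is-signed": None, "currency": "EUR", "total": "10"}
filled = apply_defaults(extraction, FIELD_DEFAULTS)
# "is-signed" falls back to "False"; "currency" keeps the extracted "EUR".
```

Note that an extracted value always wins over the default; the default only fills a gap.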

OCR updates
We have migrated the OmniPage OCR to .NET 5 portable, so you can now use it with Linux robots.

Improved Classification Experience using the Intelligent Keyword Classifier
We are happy to report that we have improved the splitting algorithm: it can now take page numbers into consideration and does a better job of identifying where documents start and end. For example, it looks for “Page 1”, “3/3” or “Page 3 out of 3” to identify the start and end of a document, resulting in more accurate splitting.
And in case you do not want to use the splitting algorithm provided with the Intelligent Keyword Classifier, you now have the option to disable it. Until now, the algorithm would split documents even if splitting wasn’t necessary. Now, the splitting feature can be disabled using a checkbox option.
Finally, we have also improved the splitting algorithm to better split documents of the same type within a file - should you not notice our improvements, please reach out - we’re happy to help!
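
As a rough illustration of how page markers can drive splitting, here is a small Python sketch. It is not the Intelligent Keyword Classifier's actual algorithm: the regular expression, `find_page_marker` and `split_on_page_markers` are assumptions built only from the three example markers mentioned above.

```python
import re

# Illustrative pattern covering "Page 1", "Page 3 out of 3", "Page 3 of 3" and "3/3".
# Caveat: the "N/M" form would also match things like short dates.
PAGE_MARKER = re.compile(
    r"\bpage\s+(\d+)(?:\s+(?:out\s+of|of)\s+(\d+))?\b|\b(\d+)\s*/\s*(\d+)\b",
    re.IGNORECASE,
)


def find_page_marker(text):
    """Return (current, total) for the first marker in `text`, or None.

    `total` is None when the marker does not state it (e.g. "Page 1").
    """
    match = PAGE_MARKER.search(text)
    if match is None:
        return None
    if match.group(1) is not None:
        total = int(match.group(2)) if match.group(2) else None
        return int(match.group(1)), total
    return int(match.group(3)), int(match.group(4))


def split_on_page_markers(pages):
    """Group page indices into documents; a marker for page 1 starts a new document."""
    documents = []
    for index, text in enumerate(pages):
        marker = find_page_marker(text)
        starts_new = marker is not None and marker[0] == 1
        if starts_new or not documents:
            documents.append([index])
        else:
            documents[-1].append(index)
    return documents


pages = ["Invoice A Page 1 of 2", "Page 2 of 2", "Invoice B Page 1", "2/2"]
groups = split_on_page_markers(pages)
```

In this sketch, pages without any marker simply stay with the current document, which is one simple way to avoid over-splitting.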

Reporting of the Text Type in the Extraction Result
Text in documents can be handwritten or printed, or come as checkboxes or other elements. With the latest release, the Extraction Result also makes this information available for you to consume, enabling use cases in which handwritten documents are sent for validation or checkboxes are collected & processed further.
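
The two use cases mentioned here could look roughly like the Python sketch below. The data shape is a hypothetical, simplified stand-in for an Extraction Result, and the type names "printed", "handwritten" and "checkbox" are assumptions, not the product's actual values.

```python
# Hypothetical, simplified stand-in for an Extraction Result:
# field name -> (extracted value, reported text type).
HANDWRITTEN = "handwritten"
CHECKBOX = "checkbox"


def needs_validation(fields):
    """Return True if any field was reported as handwritten."""
    return any(text_type == HANDWRITTEN for _value, text_type in fields.values())


def collect_checkboxes(fields):
    """Return only the checkbox fields as a name -> value mapping."""
    return {
        name: value
        for name, (value, text_type) in fields.items()
        if text_type == CHECKBOX
    }


sample = {
    "vendor-name": ("Contoso", "printed"),
    "notes": ("call back Monday", "handwritten"),
    "is-paid": ("True", "checkbox"),
}

send_to_validation = needs_validation(sample)
checkbox_values = collect_checkboxes(sample)
```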

balaraman.ramiya · February 1, 2023, 8:26am

Waited for this one… will check it out. Can a similar feature be made available for the ML Classifier and One Click Classification?
Regards,
Balram

rikulsilva · February 5, 2023, 5:59pm

It would be really nice if the default value could accept an expression. Take Invoice Due Date as an example: if Due Date is empty, set it to Today + 10 days.

Monica_Secelean · February 9, 2023, 5:08pm

Thanks for your feedback @rikulsilva - we will add it to our backlog.

Murtuza_Kapadia · February 25, 2023, 4:05am

Hi Monica,

Those look like really good additions to the DU capability.

Can I please check whether the enhancements will include the capability to split/extract multiple invoices from the same or different vendors in a single PDF? Also, if there are multiple invoices on one page, can they be split/extracted?

Thanks
Murtuza

Monica_Secelean · March 1, 2023, 10:47am

@Murtuza_Kapadia yes, ideally your use case of processing a PDF containing multiple invoices from multiple vendors should now be supported: IKC lets you split the file so that you can iterate over each invoice and extract data from it. However, multiple invoices on the same page are not detected (maybe share some sample docs with us if you can?). We hope you will give your use case a try & let us know how it works.

Think_Blue_Management · March 16, 2023, 12:31am

This is great news, Monica. Am I able to use the feature now? How can I use it? Is it automatic?

Monica_Secelean · March 16, 2023, 9:57am

Hey @Think_Blue_Management !
Not sure I understand - which feature do you mean (I have listed multiple)? If you are referring to the “definition of the default value”, then yes, it works automatically: you define it in the Taxonomy Manager, and the algorithm populates the Extraction Result with it automatically.

Let me know if I didn’t answer,
Monica

Think_Blue_Management · March 18, 2023, 3:15am

Hi Monica, my apologies, I was referring to the Improved Classification Experience using the Intelligent Keyword Classifier, i.e. the ability to extract invoices that span multiple pages (e.g. 1-3).
