Please help me understand what Invoice and Receipt AI does

Hi talents,

I was working with the latest beta of Invoice and Receipt AI using the Machine Learning Extractor.
I understand these activities theoretically, but I would like to know what each individual activity does to produce the output:

  1. What does Taxonomy Manager do after loading the taxonomy?
  2. What does Digitize Document really do to the documents, and what are its outputs, the DocumentObjectModel and the DocumentText?
  3. After that, in the Data Extraction Scope, what really happens to the documents inside the Machine Learning Extractor?
  4. Why do the results from Export Extraction Results look like a .json file?

Kindly help me understand what is happening in the back end of those activities.

Thanks in advance!


Hi @Sriram07,
Please check this out (if you haven’t seen it):

@Pablito Yes, I saw it and have already worked through it.
But I still don’t understand technically what those activities are doing in the back end - for example, Digitize Document.
I’m asking you to explain it technically.

Hi @Sriram07

Maybe @Ioana_Gligan will be able to assist here :slight_smile:


Yes @loginerror.

You gave a video demo and steps, and I replicated what you did and got the output, but I really didn’t understand anything technically.

So @Ioana_Gligan,
please assist with all those details in layman’s terms if possible!

Thanks a lot for all your effort!

Hello @Sriram07 ,

I will try to answer your questions:

  1. Taxonomy Manager is a wizard in Studio that allows you to manage a Taxonomy. A Taxonomy is a collection of document types, and a document type has a collection of fields. This information is the metadata that the system works with: it’s the general schema of information for the files you want to process. Example: you define document type A and document type B, you define fields X, Y and Z for document type A, and fields M, N and P for document type B (follow this structure through the subsequent notes please :slight_smile: )
  2. Digitize Document is a component that reads the content of any incoming PDF or image file, regardless of whether it is a scanned or a native (clean) PDF. The component uses the OCR engine dragged and dropped inside ONLY if necessary, and, for a multi-page PDF, ONLY on the pages that do not yield sufficient text information. After getting the OCR results or reading the raw text, the activity structures the information into words, sentences, paragraphs, etc. The outputs are a string containing all the text in the document, and a Document Object Model (an object that contains all the information about this structure, down to the exact position of each word on each page, and even the OCR confidence, where applicable, for each word). Example: you have a file, and you get a Text and a DOM for it.
  3. Classify Document Scope - allows you to use multiple classification methods, allows you to prioritize between them, activate / deactivate them at document type level. Example: you can drag and drop a FlexiCapture classifier and only activate it for Document Type A and Document Type B, and only report the result if the confidence level is above 70%, and also use the Keyword Based Classifier as a backup in case FlexiCapture classifier does not report a good enough (or any) result.
  4. Data Extraction Scope - allows you to use multiple data extraction methods, in a certain ordered priority, and by activating each extractor for certain document types and certain fields. For example, you can use FlexiCapture Extractor to process fields X and Y from Document Type A, and use the Machine Learning Extractor to process fields Y and Z. In this case, if FlexiCapture extractor retrieves a value for field Y, then any value returned by the Machine Learning Extractor for field Y will be ignored, so the final output, if a document of type Document Type A is fed for processing, will contain: the value reported by FlexiCapture Extractor for field X, value reported by FlexiCapture extractor for field Y OR value reported by Machine Learning Extractor if FlexiCapture is not capable of giving a good enough result, and value reported by the Machine Learning Extractor for field Z.
    4.1) the Machine Learning Extractor is an activity that takes in a file and a list of requested fields (as configured in the “Configure Extractors” wizard of Data Extraction Scope), calls a selected HTTP service, which reports results for the requested fields, and then maps those results to the expected extraction results output.
  5. Train Classifiers Scope and Train Extractors Scope - scopes into which you can drag and drop as many classification and extraction activities as you like, IF they are capable of learning from human feedback. These scope activities ensure that human-validated data reaches the algorithm, so that the algorithm can improve its own performance.
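The prioritized-merge behavior described in point 4 can be sketched roughly as follows. This is a conceptual Python illustration, not UiPath's actual implementation; the extractor names, field names, and confidence threshold are invented for the example:

```python
# Conceptual sketch of how Data Extraction Scope merges extractor results.
# Extractors are listed in priority order; each is configured with the
# fields it is allowed to report for a given document type.

MIN_CONFIDENCE = 0.7  # hypothetical acceptance threshold

def run_extraction_scope(document, extractors):
    """extractors: list of (name, allowed_fields, extract_fn) in priority order.
    Each extract_fn returns {field: (value, confidence)}."""
    final_results = {}
    for name, allowed_fields, extract_fn in extractors:
        results = extract_fn(document)
        for field, (value, confidence) in results.items():
            if field not in allowed_fields:
                continue  # extractor not activated for this field
            if field in final_results:
                continue  # a higher-priority extractor already reported it
            if confidence >= MIN_CONFIDENCE:
                final_results[field] = (value, name)
    return final_results

# Example mirroring the post: FlexiCapture handles fields X and Y,
# the Machine Learning Extractor handles fields Y and Z.
flexi = ("FlexiCapture", {"X", "Y"},
         lambda doc: {"X": ("123", 0.9), "Y": ("abc", 0.4)})
ml = ("MachineLearning", {"Y", "Z"},
      lambda doc: {"Y": ("abc", 0.8), "Z": ("xyz", 0.95)})

print(run_extraction_scope("invoice.pdf", [flexi, ml]))
# X comes from FlexiCapture; Y falls back to MachineLearning because
# FlexiCapture's confidence was too low; Z comes from MachineLearning.
```

The key design point is that priority order plus per-field activation decides which extractor "wins" each field, exactly as in the FlexiCapture/Machine Learning example above.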

In between, there is the Validation Station, which is a complex User Interface that allows humans to interact with documents and with automatically extracted data, with the purpose of correcting them.
This can also be used by itself, as a simpler and faster interface for data entry, even if you don’t have any automatic classification or data extraction.
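To make the Digitize Document outputs from point 2 more concrete, here is a rough Python picture of the kind of structure a Document Object Model holds. The field names below are illustrative only, not the real UiPath DOM schema:

```python
import json

# Illustrative shape of digitization output: a plain text string plus a
# DOM-like structure with per-word positions and OCR confidence.
# This is NOT the real UiPath DOM schema, just a conceptual picture.
dom = {
    "pages": [
        {
            "page_index": 0,
            "words": [
                {"text": "Invoice", "box": [72, 40, 130, 18], "ocr_confidence": 0.98},
                {"text": "#1042",   "box": [140, 40, 60, 18], "ocr_confidence": 0.91},
            ],
        }
    ]
}

# The DocumentText output is essentially the concatenated text:
document_text = " ".join(
    w["text"] for page in dom["pages"] for w in page["words"]
)
print(document_text)         # Invoice #1042
print(json.dumps(dom)[:40])  # a nested structure like this serializes
                             # naturally to JSON, which is also why
                             # exported results end up as .json files
```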

And on top of everything, there is a public .nupkg called UiPath.DocumentProcessing.Contracts, that contains abstract classes that you can implement and build your own classifier and extractor!

WOW, that was a lot of text :slight_smile:

Please let me know what part you would like to get more details about.


Thanks For all the efforts.
I have one use case, but I don’t know how to implement it with this feature - please guide me.

This is the image I want to feed in.

From it, I want to extract the information below:

1st Half Due
Sewer Rent
City tax
Depth, Assessed Value, Frontage, Type, and School District.

How can I get all those fields - using the Machine Learning Extractor?

And also, please explain the Keyword Based Classifier scope - what is the learning file path inside this activity?
Also, the Train Classifiers and Train Extractors scopes show “Drop trainable activities here” - I don’t understand which activities in UiPath are trainable.

Will you @loginerror @Ioana_Gligan @alexcabuz guide me on extracting the fields above from the bills? And please explain those activities too.


Hello Sriram,

The Machine Learning Extractor is not a generic data extractor - it can ONLY process invoices and receipts, and it only targets the fields described in the release announcement - so it is not at all suitable for your use case.

If you want to extract the information in your case, try using the Abbyy FlexiCapture Extractor, if you have a license, or try building your own extractor targeting your required fields. This implies writing custom code for your use case.

The Keyword Based Classifier helps automatically classify incoming files when the same process may receive multiple document types - e.g., you might get invoices, contracts, and medical records, and you need to extract different data according to the type of incoming document.

The learning file path needs to point to at least an empty (but existing) file on disk. This file is written automatically by the Keyword Based Classifier when used inside a Train Classifiers scope.
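A keyword-based classifier of the kind described above can be sketched like this. This is a simplified illustration, not the actual Keyword Based Classifier implementation; the keyword lists stand in for the learning-file contents and are invented:

```python
# Simplified sketch: score each document type by how many of its known
# keywords appear in the digitized text, and pick the best-scoring type.
LEARNING_DATA = {
    "Invoice":       ["invoice", "invoice number", "due date"],
    "Contract":      ["agreement", "hereinafter", "party"],
    "MedicalRecord": ["patient", "diagnosis", "prescription"],
}

def classify(document_text, learning_data=LEARNING_DATA):
    text = document_text.lower()
    scores = {
        doc_type: sum(1 for kw in keywords if kw in text)
        for doc_type, keywords in learning_data.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(classify("INVOICE number 1042, due date 2019-10-01"))  # Invoice
```

Inside a Train Classifiers scope, human-confirmed classifications would feed new keywords back into this learning data - which is why the learning file must already exist on disk for the classifier to update it.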

The training scopes are used to provide classifiers and extractors a means to ingest human feedback, IF and only if they have the feedback ability implemented as a training activity.

Currently the only component capable of learning is the Keyword Based Classifier.

Hope this helps,





First of all, thanks for your time and effort - it was very useful.

Will you please walk me through how we can build our own extractor for the required fields?

And also, sorry for asking again -

I am not able to understand this. If possible, please explain it again.


Hello @Sriram07 -

For the first question - building your own extractor, please have a look here: - you will need to build a custom activity by implementing one of the abstract classes provided in the UiPath.DocumentProcessing.Contracts package. You can have a look at how to build one, here:
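The real Contracts package is a .NET library of abstract activity classes, but the shape of what a custom extractor must do can be sketched in Python. Everything here is hypothetical (class name, regex patterns, result format); consult the actual UiPath.DocumentProcessing.Contracts documentation for the real base classes and signatures:

```python
# Hypothetical sketch of the contract a custom extractor fulfils:
# given digitized text and a list of requested fields, return a value
# (and confidence) per field. Real UiPath extractors implement abstract
# .NET activity classes instead; this only illustrates the data flow.
import re

class RegexBillExtractor:
    """Toy extractor pulling labeled amounts out of a utility bill."""

    # field name -> regex capturing its value (invented patterns)
    PATTERNS = {
        "SewerRent": re.compile(r"Sewer Rent[:\s]+\$?([\d.,]+)"),
        "CityTax":   re.compile(r"City Tax[:\s]+\$?([\d.,]+)", re.IGNORECASE),
    }

    def extract(self, document_text, requested_fields):
        results = {}
        for field in requested_fields:
            pattern = self.PATTERNS.get(field)
            match = pattern.search(document_text) if pattern else None
            if match:
                results[field] = {"value": match.group(1), "confidence": 1.0}
        return results

extractor = RegexBillExtractor()
print(extractor.extract("Sewer Rent: $120.50  City Tax: $980.00",
                        ["SewerRent", "CityTax", "SchoolDistrict"]))
```

A regex-based extractor like this is only workable when the bill layout is stable; the point of implementing the Contracts abstract classes is that whatever logic you write plugs into the Data Extraction Scope and Validation Station just like the built-in extractors.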

For the second question, please have a look here: . We are working on a series of improvements to make it more easily understandable (have a wizard for generating an initial learning file). Meanwhile, you can also have a look at this: - the UiPath 2019.4 Updates course, in which you can enroll for free. You can view any section you want - and the related one is the Activities Updates - Intelligent OCR – Document Processing chapter.

Hope this helps.


Update: in the latest IntelligentOCR package, the Keyword Based Classifier has a “Manage Learning” wizard that allows you to create and edit the contents of the learning data for the activity :slight_smile:

You can also pass in a string as LearningData if you wish to keep it in a central place and have it available to all robots without needing to write it on disk first :slight_smile:

Have a great day,



Hi Ioana - This is helpful info. The GitHub process code sample link doesn’t seem to work. Can you provide a new location for it? My company is looking to use the new intelligent OCR tools, but the extractors are very limited at this point. The UiPath ML models built so far (invoices and contracts) likely won’t work for our large array of formats. Will there eventually be a Studio tool to build these ICR extractors ourselves, based on our specific use cases and formats? For now, I’d like to try creating an extractor specific to my scenario that is compatible with the Extraction Scope and training activities.

Hello @abetts2114,

Please try accessing the GitHub repo again, there were some permission issues a few days ago.

Also, yes indeed, we will soon be releasing an on-prem machine learning model that you will be able to install on your own machines and train on whatever fields you need, and on whatever document types you need.

Please let me know how building your own extractor works for you - any feedback would be welcome!


Hi @Ioana_Gligan

We are glad to hear of this on-prem machine learning model that will arrive soon. Can’t wait - it will be a massive help with a few of our clients.

Kind regards,

