Intelligent OCR - Machine Learning Extractor

Hello Guys…

I need a small help…

I have been trying out the Intelligent OCR activities to test data extraction from different types of documents, such as invoices.

I also downloaded some examples through the forum from posts related to Intelligent OCR packages…

There I came across this scenario where we use the Data Extraction Scope to extract data. Inside that, we use the Machine Learning Extractor.

What I don’t get is this part.

In here, the circled part has to have some name in it for this thing to work and identify the elements we need. However, I’m not sure what exactly I should put here for it to work.

I went through the Machine Learning Extractor documentation, and it said that these refer to the internal taxonomy names. But I couldn’t find what those names are. The values I have put in are just guesses. Some of them work, but most do not.

Does anyone have any idea what should go in here?
Or am I doing it totally wrong :sweat_smile:

This is the example I downloaded

@Jan_Brian_Despi @Lakshman @sandeep13 @Palaniyappan @HareeshMR @loginerror - Do you guys have any idea on what I’m talking about here?


Sorry bro, I haven’t used this activity so far.
But in the Academy there is a chapter on this under the 2019 update course. It should be helpful for you.
As far as I know, the circled part in the image above refers to the fields already created in the Taxonomy Manager.


Thanks a lot bro!!! Just went through the academy course… I think it gives me a good idea of how this stuff works. And for my scenario, it looks like the machine learning extractor is not the one I have to use. In my scenario, we are focused on extracting data from utility bills, which are very different from a normal invoice.

I think machine learning extractor is pre-built specially for invoices and receipts. So I guess I have to use the regex extractor…

Will try it out with a fresh xaml file and see how it goes :slight_smile:

Thanks a lot bro… Will post here if I come across any issues


I will also check on Monday if I get some time


Hi @Lahiru.Fernando,

Did you solve your problem? I want to extract some data from different types of bills. I tried to extract it with the machine learning extractor, but couldn’t find a proper source of information on it.

Which academy course did you enroll in? What do you suggest I do?

Hey @ercanebiler

Yeah… I was able to figure some things out. The machine learning extractor is actually one that was pre-built for invoices and receipts. It was helpful for extracting some of the information, but I couldn’t extract everything I needed with it. So I also used the regex extractor, with some regex patterns, to get the remaining fields. Basically I used both extractors…
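To give a rough idea of the regex side, here is a minimal Python sketch. This is not UiPath code and not the patterns from my workflow; the field names, patterns, and sample bill text are all hypothetical, just to show how each field can get its own pattern that runs against the OCR text.

```python
import re

# Hypothetical patterns for fields a utility bill might contain.
# Real bills will need patterns tuned to their own layout and wording.
FIELD_PATTERNS = {
    "account_number": re.compile(
        r"Account\s*(?:No\.?|Number)\s*[:#]?\s*(\d{6,12})", re.IGNORECASE
    ),
    "billing_date": re.compile(
        r"Billing\s*Date\s*:?\s*(\d{2}/\d{2}/\d{4})", re.IGNORECASE
    ),
    "amount_due": re.compile(
        r"Amount\s*Due\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(ocr_text):
    """Return the first match for each field, or None when nothing matches."""
    results = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        results[name] = match.group(1) if match else None
    return results


# Made-up OCR output, for illustration only.
sample_text = """
City Power & Water
Account Number: 12345678
Billing Date: 05/03/2019
Amount Due: $1,234.50
"""

fields = extract_fields(sample_text)
print(fields)
```

Fields that no pattern matches come back as None, so it is easy to see which ones still need another extractor or a better pattern.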

To get some idea of these activities, I followed the 2019.4 Updates academy course. It has a separate section for Intelligent OCR. Go through that and you’ll get a good understanding of the activities and what they can do.


Thanks for your answer. I will check them.
