Repetition of values in Data Extraction Scope

nashrahkhan · February 20, 2021, 12:50pm

I don't understand why my values are being repeated in the Data Extraction Scope. The repeated values show up in the Present Validation Station.

Can anyone help me on this?

prasath17 · February 20, 2021, 2:14pm

Hi @nashrahkhan, to be specific: only the mentioned values are getting repeated, right? Not everything?

If you have multiple extractors in the Data Extraction Scope, this field might be checked in two places.
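A quick way to reason about this check is to list which fields are ticked under each extractor and look for overlaps. The snippet below is only a sketch in plain Python: the dictionary layout, the function name and the field names are made up for illustration and are not a UiPath API or export format.

```python
from collections import defaultdict


def fields_in_multiple_extractors(config):
    """Return {field: [extractors]} for fields ticked under more than one extractor.

    `config` is a hypothetical, hand-written mapping of extractor name to the
    list of fields checked for it in the Configure Extractors wizard.
    """
    owners = defaultdict(list)
    for extractor, fields in config.items():
        for field in fields:
            owners[field].append(extractor)
    # Keep only the fields that appear under two or more extractors.
    return {field: names for field, names in owners.items() if len(names) > 1}


# Illustrative configuration (assumed names, not taken from the original post).
config = {
    "Intelligent Form Extractor": ["Name", "Signature"],
    "Regex Based Extractor": ["Signature", "Date"],
}
overlaps = fields_in_multiple_extractors(config)
print(overlaps)
```

An overlap is not necessarily a problem, but it is the first place to look when a field shows up twice.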

nashrahkhan · February 20, 2021, 2:19pm

@prasath17 everything is getting repeated twice.
I have also unchecked that, but it is not working.
Moreover, we can check multiple extractors, because if one extractor's confidence is not sufficient, the field is sent to the next one.
Isn’t that right?

prasath17 · February 20, 2021, 2:52pm

@nashrahkhan, let me open up my setup and check quickly. I have used 3-4 extractors but have never faced this problem. It’s weird.

nashrahkhan · February 20, 2021, 3:05pm

@prasath17 Sure. That would be a great help.
Moreover, can you tell me how we can use the metadata, predictions and documents produced by the Train Extractors Scope to train our Data Extraction Scope model?

prasath17 · February 20, 2021, 3:45pm

@nashrahkhan - Here is my Intelligent Form Extractor (IFE) and Regex Based Extractor (RBE) setup. Even though two fields are checked in both extractors, I don't have anything mapped in IFE, so this is fine.

If I am not wrong, only the Machine Learning Extractor Trainer is allowed inside the Train Extractors Scope, because it is the only trainable one; the others are not.

I am adding @AndyMenon @Lahiru.Fernando to assist with some of your questions.

AndyMenon · February 20, 2021, 6:27pm

@prasath17 - there isn’t much choice here. If we try to drop in any extractor other than the Machine Learning Extractor Trainer, Studio won’t allow it. That said, we can always build our own trainable extractors by implementing the classes indicated in the documentation.

@nashrahkhan - yes that is how it is supposed to work.

This is how the ML Extractor Trainer is supposed to work. The community is free to correct me if I have misunderstood something.

You put in the results of your human validation (from the Present Validation Station) at 1.
Specify your skill or the endpoint at 2.
Define the output path where the Trainer creates the training results at 3.

Once you run your flow, validate your results. For example, in this case you would manually correct the duplicated signature fields, and the output from the Present Validation Station would then be used by the MLE Trainer. In the “Configure Extractor” step you map the Signature fields to be the focus of training the extractor.

Once your flow has run, the MLE Trainer will create a set of files named documents, metadata and predictions in the output folder.

The entire folder has to be zipped and uploaded to Data Manager. If I’m correct, this feature is still in Preview.
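If you would rather script the zipping step than do it by hand, something like the following should work. It is a minimal sketch using only the Python standard library; the function name and paths are placeholders, and the check assumes the three entries are named documents, metadata and predictions as described above. It does not cover the upload to Data Manager.

```python
import shutil
from pathlib import Path

# Entry names the MLE Trainer is described as producing in its output folder.
EXPECTED = ("documents", "metadata", "predictions")


def zip_trainer_output(output_dir, archive_base):
    """Zip the whole trainer output folder and return the path of the .zip file.

    `output_dir` is the folder set as the Trainer's output path;
    `archive_base` is the target archive path without the .zip extension.
    Keep `archive_base` outside `output_dir` so the archive does not include itself.
    """
    out = Path(output_dir)
    # Accept either a folder or a file for each expected name (e.g. "metadata" or "metadata.json").
    missing = [name for name in EXPECTED if not any(out.glob(name + "*"))]
    if missing:
        raise FileNotFoundError(f"missing in {out}: {', '.join(missing)}")
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(out))
```

For example, `zip_trainer_output(r"C:\DU\TrainerOutput", r"C:\DU\trainer_export")` (both paths are made up) would create `C:\DU\trainer_export.zip`.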

And here is the fun part: you export the data out of Data Manager, and it will be in a format acceptable to AI Fabric.

You upload this data into AI Fabric and re-run the training pipelines.

And then you run your Flow again to see if your manually validated results have made any difference.
