ML skill not extracting proper details

Veera_Boddula · March 27, 2023, 9:15am

I created a Document Understanding package, trained it on sample files using a pipeline, and then created an ML skill.

When I ran the skill in Studio, it fetched the desired data for the documents I provided, except for one particular field in one particular document. I then added that document to the dataset, retrained the package, selected the trained version, and deployed it. DU still failed to extract that field.

Please check the image below, which shows what was trained in AI Center.

I wanted to process the same document with the funding amount as 970, but the skill is not able to fetch it. What could be the potential issue?

Anil_G · March 27, 2023, 9:18am

@Veera_Boddula

Did you do the data labeling again for that one field?

cheers

Veera_Boddula · March 27, 2023, 9:21am

@Anil_G Thank you for the reply. Please check the screenshot attached to my original post. I already did the data labeling.

Anil_G · March 27, 2023, 9:23am

@Veera_Boddula

Did you label only one document?

At least 5 documents need to be labelled for the model to get trained.

Please include more files of this type.

cheers

Veera_Boddula · March 27, 2023, 9:28am

Hi Anil,

Could I kindly ask you to check the screenshot I attached? I trained on 10 documents of the same format.

Anil_G · March 27, 2023, 9:30am

@Veera_Boddula

If you check the question, it clearly says you added the field only for one particular document, not for all of them, so it is not clear. As written, it looks like no other document has that field.

Can you confirm that all 10 have that field and that you labelled it in all 10 documents?

cheers
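
The check asked for above can be sketched in Python. This is only an illustration and does not use any UiPath API: `labeled_docs` is a hypothetical in-memory mapping from document name to its labelled fields, which you would have to build yourself from your exported dataset, and the threshold of 5 is taken from the reply earlier in this thread rather than from official documentation.

```python
from collections import Counter


def label_coverage(labeled_docs, min_docs=5):
    """Return {field: count} for fields labelled in fewer than min_docs documents.

    labeled_docs is a hypothetical structure: {document_name: {field_name: value}}.
    An empty or missing value counts as "not labelled" for that document.
    """
    counts = Counter()
    for fields in labeled_docs.values():
        # Count each field at most once per document, skipping empty labels.
        counts.update({name for name, value in fields.items() if value})

    # Every field name that appears anywhere, labelled or not.
    all_fields = {name for fields in labeled_docs.values() for name in fields}
    return {name: counts[name] for name in sorted(all_fields) if counts[name] < min_docs}
```

An empty result means every field is labelled in at least `min_docs` documents. Any field that shows up in the result is a candidate for more labelling before retraining.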

Veera_Boddula · March 27, 2023, 9:36am

Hi @Anil_G ,

Sorry if my question created confusion. I trained on 10 documents of the same format, with all the fields labeled in all 10 documents. The last document, which is the one in the screenshot, is the one I am trying to process with the funding amount as 470, but it was trained with the value 480. If I process it with the funding amount as 460, the bot can fetch the data. The only problem is with the funding amount as 470.

supermanPunch · March 27, 2023, 10:19am

Hi @Veera_Boddula ,

Generally, we would train with more than 10 documents. Even if 10 documents is the minimum, we go with 20-25 of the same format and then train the model with this dataset.

Also, remember that at each stage of retraining, you should train the base version with the full set of labelled data.

Let us know if you are able to label 20-25 documents of the same format; we can then check whether the issue is related to the field mapping.

Also, not much information has been given about the fields or about how the training was done. How many versions of the model have already been created?

Veera_Boddula · March 27, 2023, 11:38am

Hi @supermanPunch,

The training was done via a Train pipeline using the dataset exported from the Data Labeling session. I will try feeding 25 docs and let you know the outcome. It was trained only on base version 0. The ML skill was deployed using version 1.

Thanks
Veera

Anil_G · March 27, 2023, 11:43am

@Veera_Boddula

When you retrain the model from version 0, you will also get a new version other than 1, and that is the one to select. Did you select the latest one?

Training on more documents would also help.

cheers

supermanPunch · March 27, 2023, 11:44am

@Veera_Boddula ,

Alright. Let’s keep the base version as 0 and move forward with the pipeline training. When an additional training dataset is added, train on the whole dataset and not just the additional data that was added.

Let us know once you are able to use the ML Skill and whether any improvements are observed.
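
The advice to retrain on the whole dataset, not just the newly added documents, can be illustrated with a small Python sketch. It is not UiPath code: `existing_docs` and `new_docs` are hypothetical mappings from document name to labelled fields, standing in for whatever your exported datasets contain.

```python
def build_training_set(existing_docs, new_docs):
    """Combine previously labelled documents with newly labelled ones.

    Both arguments are hypothetical {document_name: {field_name: value}} mappings.
    If a document appears in both, the newer labelling replaces the older one.
    The inputs are left unmodified.
    """
    merged = dict(existing_docs)  # shallow copy so the original set is untouched
    merged.update(new_docs)       # add new documents, overriding relabelled ones
    return merged
```

The merged set, rather than `new_docs` alone, is what would then be used for the next training run against base version 0.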

Veera_Boddula · March 27, 2023, 12:27pm

@Anil_G

Please check the screens below.

ML Package Version

Pipeline

The pipeline ran once, with type “Train”.

Thanks
Veera

Anil_G · March 27, 2023, 12:47pm

@Veera_Boddula

Can you create a new pipeline instead of restarting the same one, please?

cheers

Jefh · May 5, 2023, 9:19pm

Hi, try creating a new regular field, then export the dataset with a name, making sure to check the (ALL LABELED) option.

Then go to the pipeline, use the minor version, and use the Machine Learning Extractor in Studio.
Wait until the pipeline status shows the package as successful, and use it only if it is successful.
Then create a new evaluation to see the score.

Most importantly, create a new training run, because this produces the package to use in ML_SKILL:

Then update the ML_SKILL to the new version like this:
