How to improve the Confidence of the Keyword Classifier or Intelligent Keyword Classifier

We have only one document type, Invoices. Some incoming documents may not be invoices, and for those the classifier should return a lower confidence score. However, for invoices we are getting a confidence score of only 60-70% with both the Keyword Classifier and the Intelligent Keyword Classifier. We used around 80 invoices to train the Intelligent Keyword Classifier. Please help: how can we get high confidence scores for invoices and low scores for other documents, so that those can be treated as business exceptions?

Hi @madhu6393

Train the model with more invoices to achieve a higher confidence score, and make sure the trained fields in the invoices and in the other documents are different.


Do we need to add the other document types to the taxonomy? We are interested only in invoices; other documents such as statements and credit notes are out of scope.

The classification confidence is 60-70% for statements and credit notes as well, the same as for invoices.

Only invoices need to be added to the taxonomy.

To get a high-performing model, make sure that the invoices you train the Intelligent Keyword Classifier on are representative of the fields you are trying to extract.

Once you have a representative sample set of invoices, you can pass the documents that are classified as invoices into extraction. There needs to be a significant distinction between invoice and non-invoice documents. You can also try post-processing methods to ensure that only invoices are passed into extraction.
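
As a rough illustration of what such a post-processing step could look like, here is a minimal Python sketch. It is not UiPath code: `ClassificationResult`, `is_invoice`, the threshold, and the keyword lists are all hypothetical names and values made up for this example. The idea is simply to combine the classifier's confidence with a few text checks, so that statements and credit notes that also score 60-70% can be routed to a business exception instead of extraction.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; tune them on your own documents.
CONFIDENCE_THRESHOLD = 0.60
REQUIRED_KEYWORDS = ("invoice",)
REJECT_KEYWORDS = ("credit note", "statement of account")


@dataclass
class ClassificationResult:
    """Hypothetical container for what a classifier returns."""
    document_type: str
    confidence: float  # 0.0 to 1.0


def is_invoice(result: ClassificationResult, text: str) -> bool:
    """Return True only if the document passes every post-classification check."""
    lowered = text.lower()
    if result.document_type != "Invoice":
        return False
    if result.confidence < CONFIDENCE_THRESHOLD:
        return False
    # Reject documents that contain phrases typical of out-of-scope types.
    if any(keyword in lowered for keyword in REJECT_KEYWORDS):
        return False
    # Require at least one phrase that is expected on a real invoice.
    return any(keyword in lowered for keyword in REQUIRED_KEYWORDS)
```

Documents for which `is_invoice` returns `False` would be the ones to raise as business exceptions. The keyword lists here are only placeholders; in practice they would need to come from phrases that really separate your invoices from your statements and credit notes.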