I’m working on a use case where we are extracting data from invoices for multiple vendors. Currently, we have implemented this using Modern DU and have trained it for 4 vendor formats so far.
However, for every new vendor, additional effort and time are going into training and improving extraction accuracy. Since all of them are invoices and most of the fields are fairly standard, I wanted to understand whether this is the expected approach or whether I might be missing a better design or practice.
We are expecting 50+ vendors in total, so I would appreciate suggestions on the best scalable approach for this scenario.
For Document Understanding processes like this, it’s a good idea to implement a classification station in the process.
The classification station lets you route a document to human review if a “new” invoice does not meet the confidence score that you set. For example, let’s say you have trained invoices for 10 different vendors and set a confidence level of 90%, and a completely new vendor invoice comes in. IXP “should” understand that it’s still an invoice and let it through. But let’s say it doesn’t: the document won’t meet the 90% confidence level and will immediately default to the classification station (human-in-the-loop) in Action Center for manual classification.
What happens at that point is that a human will manually go into Action Center and classify it as an Invoice, and once complete, it will automatically train into your existing invoice model as a new vendor in (almost) real time. Next time around, if that same new vendor comes in, the confidence will now increase. Repeat this process.
After, let’s say, 6 months, you should theoretically be able to disable the classification station, because the model will have trained automatically on all the new documents that were manually classified in Classification Station. But if the client consistently gets new vendors, then just leave it in.
That’s my personal opinion. I implemented it recently for two clients, and it works like a charm.
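To make the routing rule in the reply above concrete, here is a minimal Python sketch of the threshold check. It is not UiPath code: the ClassificationResult type, the route function, and the "auto" / "human_review" labels are hypothetical names used only for illustration, and the 0.90 threshold is the 90% example from the reply.

```python
from dataclasses import dataclass

# Hypothetical threshold, taken from the 90% example in the reply above.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class ClassificationResult:
    """Illustrative stand-in for a classifier output (not a UiPath type)."""
    document_id: str
    document_type: str  # e.g. "Invoice"
    confidence: float   # 0.0 to 1.0


def route(result: ClassificationResult,
          threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "auto" if the classifier is confident enough, else "human_review"."""
    if result.confidence >= threshold:
        return "auto"
    # Below the threshold: send the document to a person for manual classification.
    return "human_review"


if __name__ == "__main__":
    known = ClassificationResult("doc-1", "Invoice", 0.97)
    unseen = ClassificationResult("doc-2", "Invoice", 0.62)
    print(route(known))   # auto
    print(route(unseen))  # human_review
```

In a real workflow, the "human_review" branch would correspond to creating the classification task in Action Center, and the manually confirmed result would feed back into the model as described above.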
Validation can still be handled through Validation Station or Action Center.
Only edge cases may require additional improvement/training.
For 50+ vendors, maintaining individual training cycles for every layout may become difficult operationally. Using IXP/Agent + pre-trained models first, and then selectively training only low-confidence vendors, could be a more scalable approach.
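One way to decide which vendors are "low-confidence" is to aggregate extraction confidence per vendor and only flag those whose average falls below a cutoff. The sketch below assumes you already log (vendor, confidence) pairs somewhere; the function name, the 0.85 cutoff, and the minimum of 3 documents are hypothetical choices, not values from any UiPath product.

```python
from collections import defaultdict
from statistics import mean


def vendors_needing_training(extractions, threshold=0.85, min_docs=3):
    """Return vendors whose mean extraction confidence is below `threshold`.

    `extractions` is an iterable of (vendor_name, confidence) pairs.
    Vendors with fewer than `min_docs` documents are skipped, since an
    average over one or two documents is not very informative.
    """
    by_vendor = defaultdict(list)
    for vendor, confidence in extractions:
        by_vendor[vendor].append(confidence)

    return sorted(
        vendor
        for vendor, scores in by_vendor.items()
        if len(scores) >= min_docs and mean(scores) < threshold
    )


if __name__ == "__main__":
    # Made-up sample log, for illustration only.
    log = [
        ("Vendor A", 0.95), ("Vendor A", 0.92), ("Vendor A", 0.97),
        ("Vendor B", 0.60), ("Vendor B", 0.70), ("Vendor B", 0.65),
        ("Vendor C", 0.50),  # only one document so far, so not flagged yet
    ]
    print(vendors_needing_training(log))  # ['Vendor B']
```

Only the flagged vendors would then get extra training samples, while the rest stay on the pre-trained or generic model.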
What you’re doing is normal, but training a separate model for every vendor is not a scalable approach. Since all documents are invoices, it’s better to use one common Invoice model and handle most fields in a generic way. Then you can improve accuracy by adding a few samples, using regex or rules for specific fields, and using validation for edge cases. Only go for separate models if a vendor’s format is completely different. This way you avoid maintaining 50 models and keep the solution much simpler and more scalable.
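As a rough illustration of the "regex or rules for specific fields" idea, the Python sketch below keeps the model's value when it is confident and otherwise falls back to a pattern match on the raw text. The INV-style pattern, the 0.80 threshold, and the function name are all assumptions made for this example; real invoice numbers will need patterns that fit your vendors.

```python
import re

# Hypothetical pattern: "INV", an optional hyphen or space, then 4 to 10 digits.
INVOICE_NO_PATTERN = re.compile(r"\bINV[-\s]?(\d{4,10})\b", re.IGNORECASE)


def invoice_number(model_value, model_confidence, raw_text, threshold=0.80):
    """Prefer the model's value; fall back to a regex rule when it is unsure.

    Returns None when neither the model nor the rule yields a value, which
    is the case you would send to validation.
    """
    if model_value and model_confidence >= threshold:
        return model_value

    match = INVOICE_NO_PATTERN.search(raw_text)
    if match:
        # Normalize to a single canonical format.
        return f"INV-{match.group(1)}"
    return None


if __name__ == "__main__":
    text = "Invoice No: inv 204577  Date: 2024-01-15"
    print(invoice_number("INV-1001", 0.95, text))  # INV-1001
    print(invoice_number("1NV", 0.40, text))       # INV-204577
    print(invoice_number(None, 0.0, "no number"))  # None
```

The same shape works for other standard fields such as dates or totals: one generic model first, a small rule as a safety net, and human validation for whatever is left.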