2022.2 - Document Understanding ML Extraction Packages - now available in Cloud!

Hello friends of Document Understanding,

We are pleased to announce the 2022.2 release of the Document Understanding ML Packages. These packages include some important changes that everyone should be aware of.

First, the behavior when training on GPU versus CPU has changed compared to the 2021.10 release. In 2021.10, models trained on CPU were automatically made smaller, so they trained faster than before but with slightly lower accuracy. This has been reversed in 2022.2: the model trained on CPU is now exactly the same as the model trained on GPU, and training speed has reverted to what it was before 2021.10, which means training on CPU is again 10-20x slower than on GPU.
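To get a feel for what the 10-20x figure means in practice, here is a rough back-of-the-envelope sketch. The function name and the example GPU duration are made up for illustration; only the 10-20x range comes from this announcement, and actual times will depend on your dataset and hardware.

```python
def estimate_cpu_training_hours(gpu_hours, slowdown_range=(10, 20)):
    """Return a (low, high) estimate of CPU training time in hours.

    gpu_hours is how long the same training run takes on GPU.
    slowdown_range is the CPU-vs-GPU slowdown quoted in the post (10-20x).
    """
    low, high = slowdown_range
    return gpu_hours * low, gpu_hours * high


# Hypothetical example: a run that takes 2 hours on GPU.
low, high = estimate_cpu_training_hours(2)
print(f"Expect roughly {low} to {high} hours on CPU")
```

So if you have a GPU available, it is worth using it for anything beyond a small experiment.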

Second, the DocumentUnderstanding ML Package includes a significant accuracy enhancement. You should see improved scores when training on this package, compared to previous releases.

Third, dates in column fields are now parsed correctly, and date parsing now recognizes Turkish month names such as ocak, nisan, and ekim.
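For anyone curious what recognizing Turkish month names involves, here is a minimal sketch in Python. This is not the code inside the ML Package; the `parse_turkish_date` and `turkish_lower` helpers and the "day month year" input format are assumptions made purely for illustration.

```python
import re
from datetime import date

# Turkish month names mapped to month numbers.
TURKISH_MONTHS = {
    "ocak": 1, "şubat": 2, "mart": 3, "nisan": 4,
    "mayıs": 5, "haziran": 6, "temmuz": 7, "ağustos": 8,
    "eylül": 9, "ekim": 10, "kasım": 11, "aralık": 12,
}


def turkish_lower(text):
    """Lowercase with Turkish casing rules: 'I' -> 'ı' and 'İ' -> 'i'."""
    return text.replace("I", "ı").replace("İ", "i").lower()


def parse_turkish_date(text):
    """Parse strings like '5 Nisan 2022' (assumed format); return None on failure."""
    match = re.fullmatch(r"\s*(\d{1,2})\s+(\S+)\s+(\d{4})\s*", text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = TURKISH_MONTHS.get(turkish_lower(month_name))
    if month is None:
        return None
    try:
        return date(int(year), month, int(day))
    except ValueError:
        # The day does not exist in that month, e.g. '31 Nisan 2022'.
        return None


print(parse_turkish_date("5 Nisan 2022"))
print(parse_turkish_date("3 EKİM 2020"))
```

The separate `turkish_lower` step is there because Python's default `str.lower()` does not follow Turkish casing for the dotted and dotless I, so an all-caps "KASIM" would otherwise fail to match "kasım".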

Fourth, the Utility Bills model is now Generally Available.

For more details, see the full Release notes here.

Enjoy, and get ready for the upcoming 2022.4 release, which will contain some exciting new pre-trained models. But let’s keep the suspense going just a bit longer.

Cheers!

Your friendly Document Understanding team.

Great. Will definitely try. Thank you for sharing.