Alternatives to DU

Hi All,

I've been using DU for a few weeks to test invoices, and I've run multiple training sets.
However, every time I run my code, the confidence for some fields is low (below 50%) most of the time.
I kept training on more sets, but I keep getting the same result.

My question is: is there an alternative to Document Understanding for invoices? Are there libraries or packages I could use that may give better results?

@sacad

May I know what kind of extractors you are using here?

Thanks for your response. I am using the ML Extractor, with the framework/code that's supplied as part of the UiPath DU training.

@sacad

Did you try other extractors, such as the Form-based or RegEx-based ones?

Unfortunately I didn't. Can you please explain how that helps, and whether you had problems with training data?

Hi @sacad

Have you tried plain regex, without using DU?

Regards
Sudharsan

@Lahiru.Fernando can you help with this?

Hi @sacad,

A few things you can try here:

If confidence is low during extraction, try different extractors to see whether the results change.

If confidence is low during classification, try different OCR engines.

Regards
Sonali

@Parth_Doshi - Thanks for the tag :)

Hello @sacad

From the post and comments, I gather that you are using the ML Extractor.
There are several things we can work on to address this.

As you know, the ML models available are retrainable. If you are using endpoints for the ML Extractor and getting low confidence, I suggest you switch to the AI Center ML models, as they are retrainable.

Initially, the confidence may be low. The good thing is that you can use the Data Manager and start training with a good number of initial documents. This should increase the accuracy (confidence), though it may take multiple training runs, depending in part on how many initial documents you provide.

Thereafter, you can keep fine-tuning on the data where confidence is still low. This is possible through the Document Understanding workflow itself.
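
To make the idea of revisiting low-confidence data a bit more concrete, here is a minimal Python sketch. It is not UiPath's API: the field names, values, and the 0.5 threshold are all made up for illustration, and it only assumes that each extracted field comes back with a value and a confidence score. Fields below the threshold are set aside for manual review, and those reviewed documents are the ones worth feeding back into training.

```python
# Hypothetical threshold: fields below this confidence go to manual review.
CONFIDENCE_THRESHOLD = 0.5


def split_by_confidence(fields, threshold=CONFIDENCE_THRESHOLD):
    """Split extracted fields into accepted and needs-review groups.

    `fields` maps a field name to a (value, confidence) tuple.
    """
    accepted = {}
    needs_review = {}
    for name, (value, confidence) in fields.items():
        if confidence >= threshold:
            accepted[name] = value
        else:
            needs_review[name] = value
    return accepted, needs_review


# Made-up extraction result, for illustration only.
sample = {
    "invoice_number": ("INV-1001", 0.93),
    "total": ("1250.00", 0.41),
}
accepted, needs_review = split_by_confidence(sample)
```

In this example only `total` would be routed to review, since its confidence is under the assumed threshold.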

The videos here may give you some tips on training the models for better accuracy…

The tips our friends have given you will help too. There are many methods available to us :)

Let us know how it goes…

Hi @sacad,

If you are looking for a great alternative to DU, then you really have to try Rossum (Rossum.ai).

They are the alpha when it comes to intelligent document parsing. Rossum is easy to set up, and you can build custom connectors entirely in Python. It is also quite straightforward to integrate with any RPA tool.

Docparser is another SaaS worth a mention here.

Can you share the samples? That will help in suggesting something better.

Just processing a training set isn't enough; it needs to be backed by a proper base extraction model.

Depending on the type of invoices, there can be alternatives.

If they're structured and don't have many variants, even simple regex will work without DU.
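
As a rough illustration of the regex route, here is a small Python sketch that runs on text already produced by OCR. The labels ("Invoice No", "Date", "Total"), the date format, and the sample text are assumptions about a hypothetical invoice layout; real patterns would have to be written against your actual documents.

```python
import re

# Hypothetical patterns for one fixed invoice layout; adjust to your documents.
PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"\bDate\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
    # \b keeps "Total" from matching inside "Subtotal".
    "total": re.compile(
        r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return the first match for each field, or None when nothing matches."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Made-up OCR output, for illustration only.
sample_text = """
Invoice No: INV-1001
Date: 2023-01-15
Subtotal: $1,000.00
Total: $1,250.00
"""
fields = extract_fields(sample_text)
```

This only holds up while the layout stays stable. Once vendors start using different labels or formats, the pattern list grows quickly, which is the point where a trained extractor tends to pay off.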

If they have key-value fields to extract, give the form-based extractor a try.

It all depends on the invoice, so it's best to share a sample to get a better suggestion.