Question: Even when a multipage document contains several document types, explicitly ‘splitting’ it into separate files using extract pdf range is still optional, right? You can still extract each classification result separately without splitting into multiple files, so you won’t have to redigitize.
In fact, for this workflow it even might be useful to combine several PDFs into one.
Imagine a scenario where you have to extract data from email attachments. Often the email contains not only the attachment you need but also others that are unnecessary. To know which ones are useful and which aren’t, you need to digitize and analyse them.
As this framework uses one job per input file, the easiest solution is to combine your attachments into a single PDF in the dispatcher (dispatcher reads mail, combines attachments, starts DU Job).
The framework then does the whole job of finding out which documents (pages) are useful (classification) and extracting the needed data.
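To make the idea concrete, here is a minimal Python sketch of the bookkeeping involved. It is not UiPath code and does not touch real PDFs: the `Attachment` and `PageRange` types, the function names, and the sample document types are all hypothetical. It only shows how attachments map to page ranges in the combined file, and how classification results can be filtered down to the useful ranges without splitting the file.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    name: str        # hypothetical attachment file name
    page_count: int  # number of pages in that attachment


@dataclass
class PageRange:
    doc_type: str    # classification label (hypothetical values below)
    first_page: int  # 0-based, inclusive, in the combined PDF
    last_page: int   # 0-based, inclusive, in the combined PDF


def combined_offsets(attachments):
    """Map each attachment name to the page range it occupies after merging."""
    offsets = {}
    next_page = 0
    for att in attachments:
        offsets[att.name] = (next_page, next_page + att.page_count - 1)
        next_page += att.page_count
    return offsets


def useful_ranges(classified, wanted_types):
    """Keep only the classified page ranges whose type we want to extract."""
    return [r for r in classified if r.doc_type in wanted_types]


# Assumed input: three attachments merged by the dispatcher, in this order.
attachments = [
    Attachment("invoice.pdf", 2),
    Attachment("terms.pdf", 5),
    Attachment("receipt.pdf", 1),
]
offsets = combined_offsets(attachments)

# Assumed classifier output for the combined 8-page document.
classified = [
    PageRange("Invoice", 0, 1),
    PageRange("Other", 2, 6),
    PageRange("Receipt", 7, 7),
]
to_extract = useful_ranges(classified, {"Invoice", "Receipt"})
```

In this sketch the "Other" pages are simply skipped at extraction time, which mirrors the point above: the combined file is digitized once and each classified range is handled on its own.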
Thank you, everyone, for your feedback! It is very much appreciated, please keep it coming!
@zell12 - indeed, document splitting is optional. In fact, most processing steps are optional and should only be used if needed. As for the splitting - it’s more of a UX improvement; a bit of a workaround to allow the person doing the data validation to only view the page-range they should be checking. It should make for a better user-experience and make validation a bit less prone to user-error. This optional step will go away once page-range support is implemented in the Validation Station/Validation Action.
@mmcruzRPA Thank you for explaining how to set up the template
@wagner - the new framework is meant to be used only for implementing Document Understanding processes (and only such processes). It should be used in conjunction with the RE-Framework, for any automation process coming before or after the Document Understanding part.
Can you please offer a practical example of how the DU framework would be used in conjunction with queues?
As a quick example: the Dispatcher puts files in Q1, and Q1 has a trigger that starts DU for each item uploaded by the Dispatcher. When DU finishes, it creates a queue item in Q2, which waits for the performer to pick it up.
The upsides of using queues are too many to mention. The downside is that the Q1 items would be abandoned if not validated by the human in the loop (HiL) within 24 hours (if that’s the case). Is there any workaround for this, considering I want queues end to end?
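The two-queue flow in the question can be modelled with a small Python sketch. This is a toy simulation using in-memory deques, not the Orchestrator API: the function names and the item layout are hypothetical, the trigger is reduced to a plain loop, and queue-item abandonment is not modelled.

```python
from collections import deque


def dispatcher(files, q1):
    """Dispatcher: add one Q1 item per input file."""
    for path in files:
        q1.append({"file": path})


def du_job(item):
    """Stand-in for a DU job (digitize, classify, extract, validate)."""
    return {"file": item["file"], "status": "extracted"}


def run_du_for_each_item(q1, q2):
    """Stand-in for the Q1 trigger: run DU per item and post the result to Q2."""
    while q1:
        item = q1.popleft()
        q2.append(du_job(item))


def performer(q2):
    """Performer: consume Q2 items in order and return the processed files."""
    processed = []
    while q2:
        processed.append(q2.popleft()["file"])
    return processed


q1, q2 = deque(), deque()
dispatcher(["a.pdf", "b.pdf"], q1)
run_du_for_each_item(q1, q2)
done = performer(q2)
```

The sketch only shows the hand-off order between the stages; in a real deployment the wait for human validation happens inside the DU step, which is where the abandonment concern arises.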
Agreed, using queues definitely has huge advantages and we all prefer to use that approach.
The bad news: the only workaround at the moment would be to change an Orchestrator config setting that dictates how much time passes before a queue item becomes abandoned. However, we recommend against this approach: the setting affects all queues, and the potential for running into undesirable problems is quite high. Alternatively, you can allow queue items to become abandoned for a while, but this is also undesirable.
The good news: the Orchestrator product team is working on bringing long-running support to the queues. When this feature is introduced, the framework will be updated to support queues, as well.
What do you mean by Robot license as unattended robot?
Has both access in Modern Folder, etc.? I’ve tried creating the Dispatcher as shown in your tutorial, but I keep getting errors on the Action Center part: the job always faults. Can you show us the setup on your Orchestrator?