In Document Understanding, the data we need is sometimes spread across multiple pages, and we use several techniques to capture the different fields. When the document has multiple pages and the data comes from more than one of them, the field-to-location mapping doesn’t work when we open the document in the Validation Station. It works only for single-page documents. I do not want to use AI Center for this mapping. Instead, it would be good to have a feature called ‘Document Merging’, which would treat the whole multi-page document as a single data set and extract data from any page, depending on the extractor type. When we define the field properties in Taxonomy Manager, there would be an additional property called ‘Document Merging’ with value options such as ‘First Page’, ‘First Occurrence’, ‘Last Page’, and ‘Last Occurrence’. Then, when we open the document in the Validation Station, the values would be mapped to the right page.
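To make the idea concrete, here is a rough sketch of how such a ‘Document Merging’ rule might choose between values found on different pages. It is plain Python for illustration only, not UiPath code or an existing API. The names `Candidate` and `merge_field` are hypothetical, and the meaning given to each option is an assumption: ‘First Page’ and ‘Last Page’ only accept a value found on page 1 or on the final page, while ‘First Occurrence’ and ‘Last Occurrence’ take the earliest or latest page on which any value was found.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One extracted value for a field (hypothetical structure)."""
    page: int   # 1-based page number where the value was found
    value: str


def merge_field(candidates: List[Candidate], rule: str, page_count: int) -> Optional[Candidate]:
    """Pick a single candidate for a field according to the assumed rule semantics."""
    ordered = sorted(candidates, key=lambda c: c.page)
    if rule == "First Page":
        # Accept only a value located on page 1.
        return next((c for c in ordered if c.page == 1), None)
    if rule == "Last Page":
        # Accept only a value located on the final page of the document.
        return next((c for c in reversed(ordered) if c.page == page_count), None)
    if rule == "First Occurrence":
        # Earliest page on which the field was found, whichever page that is.
        return ordered[0] if ordered else None
    if rule == "Last Occurrence":
        # Latest page on which the field was found.
        return ordered[-1] if ordered else None
    raise ValueError(f"Unknown rule: {rule}")


# Example: a 4-page document where the field was found on pages 2 and 3.
found = [Candidate(page=3, value="INV-001 (copy)"), Candidate(page=2, value="INV-001")]
first = merge_field(found, "First Occurrence", page_count=4)
last = merge_field(found, "Last Occurrence", page_count=4)
```

Because the chosen candidate keeps its page number, the validation screen would know which page to show for that value.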
Hello @jamesjacobsydney
Let me check that I understand you correctly: when data is spread across multiple pages, you need a way to configure an extractor so that it can pick up data from more than one page? For example, if:
- a table is longer than a page
- a description or contract clause starts on one page and continues to the second one
then you want to capture the entire information.
Is this the problem you are trying to solve? If not, can you please tell me a little more about your use case?
Thanks in advance,
Monica