About the Document Understanding category

Everything you need to know about UiPath Document Understanding.

Document Understanding is the ability to extract and interpret information and meaning from a wide range of document types (e.g., structured, unstructured), storage formats (e.g., images, PDFs, text), and objects (e.g., handwriting, stamps, logos).

Is it still in beta? Can we use it in production?

Hi,

I’m going through the Document Understanding course at the moment.

This question may not be relevant, as it doesn’t relate to images/PDFs/text.

Suppose you have data coming into a process as Excel attachments from a variety of sources, and the files are inconsistent in column positions, the table’s starting row, and header names, with some records spanning multiple rows.

Is there a way to set up rules for finding the data, as you would for non-standard documents in the Document Understanding toolset? Or to use ML models that learn how to extract the data through training, as you can with scanned documents?

Hi, this is an interesting question. Did you figure out an answer? As a last resort, I would suggest saving your spreadsheet as a PDF. I’m not sure this is a generic solution, because some Excel tables will span many pages.

A real and final solution is to work with a data scientist and build your own ML engine for this. It does not sound extremely difficult to accomplish. I stand to be corrected :slight_smile:

regards
Sats

Hi Sats,

I haven’t got a solution (it was an issue I encountered ages ago and at the time this was just one of many issues with the process that made it unsuitable).

I would still like to know if there are any solutions out there. You need a resilient, neat way to store the rules for all the different formats of Excel input data. Maybe an ML model is ideal, but I don’t know how to go about building one.
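For the rule-based route, here is a minimal Python sketch of one way the rules could be stored: a table of header aliases per field, plus a scan that finds the header row wherever it starts. This is not a UiPath feature. The field names, the aliases, and the `min_matches` threshold are all made up for illustration, and it assumes the sheet has already been read into a list of rows. It does not handle records that span multiple rows.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical rule set: canonical field name -> header spellings seen across sources.
HEADER_ALIASES: Dict[str, List[str]] = {
    "invoice_no": ["invoice no", "invoice number", "inv #"],
    "amount": ["amount", "total", "net amount"],
    "date": ["date", "invoice date"],
}


def normalize(cell: object) -> str:
    """Lower-case a cell and collapse whitespace so header variants compare equal."""
    return "" if cell is None else " ".join(str(cell).lower().split())


def map_header(row: Sequence[object], aliases: Dict[str, List[str]]) -> Dict[str, int]:
    """Return {canonical_field: column_index} for each cell matching a known alias."""
    lookup = {alias: name for name, names in aliases.items() for alias in names}
    mapping: Dict[str, int] = {}
    for idx, cell in enumerate(row):
        name: Optional[str] = lookup.get(normalize(cell))
        if name is not None and name not in mapping:
            mapping[name] = idx
    return mapping


def extract_records(
    rows: Sequence[Sequence[object]],
    aliases: Dict[str, List[str]] = HEADER_ALIASES,
    min_matches: int = 2,
) -> List[Dict[str, object]]:
    """Treat the first row matching at least min_matches fields as the header,
    then read every non-empty row below it using the discovered column positions."""
    for row_idx, row in enumerate(rows):
        mapping = map_header(row, aliases)
        if len(mapping) < min_matches:
            continue
        records = []
        for data_row in rows[row_idx + 1:]:
            record = {
                name: (data_row[col] if col < len(data_row) else None)
                for name, col in mapping.items()
            }
            # Skip rows where every mapped cell is blank.
            if any(value not in (None, "") for value in record.values()):
                records.append(record)
        return records
    return []  # no header row recognised
```

Supporting a new source then means adding its header spellings to the alias table rather than changing the workflow.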

Thanks,

Hi Everyone,

I have been working with DU for some time and still can’t figure out a fully unattended way of automating it.

There will always be cases where a document is extracted with a low confidence level. In those cases, we use Validation Station and Action Center.

My question is this: the activities listed below stop the process until someone responds to the action that was created. This leaves idle time during which nothing executes.

  1. Create Validation Action
  2. Wait and Resume Validation Action

Could anyone please tell me if I’m going about this the wrong way?
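To make the idle-time concern concrete, here is a minimal Python sketch of the pattern that avoids it: documents above a confidence threshold are completed straight away, while the rest are parked for human review without blocking the loop. This is an illustration only, not UiPath code. The `Dispatcher` class, the field names, and the 0.80 threshold are assumptions. My understanding, which is worth checking against the documentation, is that a long-running orchestration process achieves something similar by suspending the job while the action is pending.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.80  # assumed cut-off; tune per document type


@dataclass
class ExtractionResult:
    document_id: str
    fields: Dict[str, Tuple[str, float]]  # field name -> (value, confidence)

    def min_confidence(self) -> float:
        """Lowest field confidence; 0.0 when nothing was extracted."""
        return min((conf for _, conf in self.fields.values()), default=0.0)


@dataclass
class Dispatcher:
    """Hypothetical router: completes confident documents, parks the rest."""
    completed: List[str] = field(default_factory=list)
    pending_review: List[str] = field(default_factory=list)

    def route(self, result: ExtractionResult) -> None:
        # Never waits: a low-confidence document is queued and the loop moves on.
        if result.min_confidence() >= CONFIDENCE_THRESHOLD:
            self.completed.append(result.document_id)
        else:
            self.pending_review.append(result.document_id)

    def resolve(self, document_id: str) -> None:
        """Called later, once a reviewer has validated the document."""
        self.pending_review.remove(document_id)
        self.completed.append(document_id)
```

The point of the design is that human review becomes a separate, later step per document, so one slow reviewer does not hold up the documents that needed no review.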