Is it possible to give a data set to train a document understanding module? Or do I have to map?

Hello,

I am exploring the FinancialStatements document understanding ML package.

I understand how to feed it financial statements and map the different accounts to it. However, am I required to map actual PDF documents? Could I instead give it a dataset in Excel that contains the mappings? That would be much faster.

Let me know!

@Asanka

There are a few fields that the pretrained model can already extract. For those you do not need to train anything.

https://du.uipath.com/ie/financial_statements/info/model

But if you need any new fields that it is not picking up, then yes, you have to mark and label the data.

CSV upload is not an option; it is not available. Also, different files might present fields of the same type differently, so for new fields we generally have to indicate them in the documents.

https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/out-of-the-box-pre-trained-ml-packages

Cheers

Oh, that's too bad. I was hoping I could just upload a CSV file that would indicate all the possible mappings…

@Asanka

The model cannot indicate new fields on its own. Once you indicate them, it can train on them and return them for you. That is why you need to perform training for any new fields.

Hope this clears it up.

Cheers

I don’t fully understand what you are saying. But I think you are saying that I will need to manually map over 100 financial statements to make this work, instead of being able to upload a CSV with all possible mappings.

@Asanka

To make it simple:

Yes, that's what I mean, and that is what we have to do. For this model there is no way to upload a CSV and have it understand the mappings.

Cheers

Hi @Asanka,

We might not necessarily need to map all 100 fields, since some of them might already be detected by the FinancialStatements endpoint, as mentioned by @Anil_G.

So we could use this endpoint for prelabelling, which would make our labelling faster and easier. But yes, if only a few values are detected, it is still a time-consuming process.

https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/document-manager#the-user-interface-prelabelling
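
To get a rough feel for how much prelabelling might save, here is a minimal Python sketch that splits the fields you need into those a pretrained model might already return and those that would still need manual labelling. The field names and the `split_fields` helper are hypothetical and made up for illustration; they are not taken from the FinancialStatements package, whose actual field list is on the model page linked earlier in the thread.

```python
def split_fields(required, pretrained):
    """Split required fields into (prelabelled, manual) lists, keeping order."""
    pretrained_set = set(pretrained)
    prelabelled = [f for f in required if f in pretrained_set]
    manual = [f for f in required if f not in pretrained_set]
    return prelabelled, manual


# Hypothetical field names, for illustration only.
pretrained_fields = ["total_assets", "total_liabilities", "net_income"]
required_fields = ["total_assets", "net_income", "deferred_tax", "goodwill"]

prelabelled, manual = split_fields(required_fields, pretrained_fields)
print("Covered by prelabelling:", prelabelled)
print("Still need manual labelling:", manual)
```

The fewer fields that land in the manual list, the more of the labelling work prelabelling is likely to take off your hands.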