Has anyone successfully used the FinancialStatements document understanding package before? I am using it and it is able to capture the statements which include the balance sheet, income statement, and cash flows statement.
However, it is literally just taking those 3 sections of the statements and converting them to 3 data tables. It is a really good start, but what I am struggling with is how to map the account names to those of the internal system that we enter financials into. Here is what I mean. The accounts on the left are extracted through the DU model and the ones on the right are the ones I want these accounts classified as. The ones on the right match our internal system:
I wish it was just a simple mapping. However, each financial statement that DU reads will have slightly different names for the accounts. We need to be able to map these accounts to our internal accounts.
What is the best way to do this? Should I use another ML model to map the accounts, or a fuzzy lookup? I’m not sure what the best approach is here.
I’m sure anyone that has used this DU model has hit the same issue.
Document Understanding is purely a data extraction service - any logic you would like to apply like the mapping you mentioned above will have to be done in post-processing.
What you could do is to maintain an Excel file with the values that you would see from the financial statements in one column, and the values from your system in another column.
Read the Excel file as a DataTable, then upon getting the values from DU, use a Lookup Data Table activity to retrieve the value you need based on the value obtained from the financial statement.
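To make the lookup idea concrete, here is a minimal sketch in Python rather than UiPath activities, purely for illustration. The mapping rows, `normalize`, and `map_account` are hypothetical names and sample values, not part of Document Understanding or of any real mapping file. It lowercases and collapses whitespace before the lookup, so differences in casing or spacing alone do not cause a miss.

```python
# Hypothetical rows, as they might be read from the Excel mapping file:
# (name seen on a financial statement, internal account name)
MAPPING_ROWS = [
    ("Receivables", "Accounts Receivable"),
    ("Trade receivables", "Accounts Receivable"),
    ("Cash and cash equivalents", "Cash"),
]


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


# Build the lookup once, keyed on the normalized statement name.
LOOKUP = {normalize(source): target for source, target in MAPPING_ROWS}


def map_account(extracted_name: str):
    """Return the internal account name, or None if the name is unknown."""
    return LOOKUP.get(normalize(extracted_name))
```

Names that come back as `None` can be collected for manual review and then added to the mapping file, so the table grows as new statement formats show up.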
Yes, I get that. The issue is that the values retrieved from DU will never be an exact match for a table lookup.
Any ideas? It needs to see a value from DU like “receivables” and know that it maps to “accounts receivables”. Basically, any variation should be able to be mapped. That’s why I thought of some sort of word-based AI or a fuzzy lookup.
Sure, fuzzy matching is possible but probably wouldn’t be able to match “any variation” as that might result in false matches.
I don’t have the algorithm myself but I’ve seen VB code for calculating the Levenshtein distance between two strings. You could incorporate this into a UiPath robot using the Invoke Code activity.
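For reference, a rough sketch of the Levenshtein idea, written in Python for illustration rather than as the VB code mentioned above. The function names, the similarity formula (one minus distance divided by the longer length), and the 0.8 threshold are all assumptions to be tuned, not anything UiPath provides.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score between 0.0 and 1.0, ignoring case and outer whitespace."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(extracted: str, candidates: list, threshold: float = 0.8):
    """Return the closest candidate, or None if nothing clears the
    threshold. The threshold is what guards against false matches."""
    best = max(candidates, key=lambda c: similarity(extracted, c), default=None)
    if best is None or similarity(extracted, best) < threshold:
        return None
    return best
```

With this sketch, a misspelling such as "Accounts Recievable" still matches "Accounts Receivable", but plain "Receivables" does not clear the 0.8 threshold against "Accounts Receivable" because a whole word is missing. Edit distance mainly helps with typos and small spelling differences, so a table of known name variants is still likely to be needed alongside it.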