Get line items from Receipts - Document Understanding

Hi,

I’m using the new UiPath.DocumentUnderstanding.ML.Activities package v.2.16.1 (instead of the UiPath.IntelligentOCR.Activities package).

Receipts are read with “Extract Document Data” using the predefined model, and the output is a receipt object: IDocumentData<Receipts>

This lets me get, e.g., the total amount with: extractedResults.Data.TotalAmount.Value

My problem is: how do I get the line items? The line items exist; I can see them with extractedResults.AsText

I tried using a For each loop on extractedResults.Data.Items, but currentItem.Description is nothing.

extractedResults.Data.Items.Count is 3.

Studio 2023.10

Hi @Ferdinand

Can you share the output of extractedResults.AsText ?

Regards.

extractedResults:

Merchant Name: Acme Food
Merchant Address: My street 123
Merchant Phone Number: 222-22-2222
Transaction Date: 10 January 2025
Total Amount: 522.00
Tax Amount: 104.40
Document Number: Not extracted
currency: NOK
expense-type: food_meals

items Table data:
Items - Descriptions | Items - Quantities | Items - Unit Prices | Items - Line Amounts
Food1 | 1.00 | 199.00 | 199.00
Food2 | 1.00 | 25.00 | 25.00
Beer | 2.00 | 149.00 | 298.00

Thanks.

Did you try to get currentItem.Description.Value?

Yes, tried with value. Actually currentItem.Description = nothing, so there’s something weird there.

But, I think the solution might be to set “Generate data type” to Off, in the “Extract Document Data” activity.

The output will then be an IDocumentData<DictionaryData> variable instead of IDocumentData<Receipts>.
I think you can use GetField then, but I have to test more…

OK, you can try what you described, but I’d also suggest using a regex if you cannot get the values from the table.

You can use this regex on extractedResults.AsText to get the descriptions: (?<=\n)[^|\r\n]+?(?=[ \t]*\|)
It also matches the first cell of the header row, so skip the first match.
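
A quick way to try the regex idea outside Studio is the Python sketch below. It uses a variant of the pattern with the pipe escaped and each match confined to a single line, run against part of the AsText output posted earlier. The names as_text, first_cells and descriptions are made up for illustration, and it assumes the table always starts with a header row.

```python
import re

# Part of the AsText output posted earlier in this thread.
as_text = """currency: NOK
expense-type: food_meals

items Table data:
Items - Descriptions | Items - Quantities | Items - Unit Prices | Items - Line Amounts
Food1 | 1.00 | 199.00 | 199.00
Food2 | 1.00 | 25.00 | 25.00
Beer | 2.00 | 149.00 | 298.00
"""

# First cell of every line that contains a pipe. The pipe is escaped so it is
# a literal character, and newlines are excluded so a match stays on one line.
FIRST_CELL = re.compile(r"(?<=\n)[^|\r\n]+?(?=[ \t]*\|)")


def first_cells(text):
    """Return the first cell of each pipe-delimited line, header included."""
    return FIRST_CELL.findall(text)


def descriptions(text):
    """Return the item descriptions, assuming the first match is the header."""
    return first_cells(text)[1:]


for name in descriptions(as_text):
    print(name)
```

The same pattern should carry over to a .NET regex in Studio, but that is untested here.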

Regards.

Here’s the solution. In the extractor, set “Generate data type” to Off.

To get e.g. total:
extractedResults.Data.GetField("total")

To get the lines:
For Each - extractedResults.Data.GetTables.First().Values.First().GetRows

Log message: (rowIndex+1).ToString + " - " + currentRow(0).Values.First().Value + " - " + currentRow(1).Values.First().Value + " - " + currentRow(2).Values.First().Value + " - " + currentRow(3).Values.First().Value

There should always be just one entry in “Values”, so use the first.
The table “items” should always have the headers: description, quantity, unit-price, line-amount.
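
To cross-check what the loop should log, the AsText output posted earlier can be parsed outside Studio. This is only a minimal Python sketch: it assumes the table follows an "items Table data:" line and a header row, and that cells are separated by pipes as in the sample. parse_items is a hypothetical helper, not part of any UiPath package; the column keys are the four headers listed above.

```python
from decimal import Decimal

# Part of the AsText output posted earlier in this thread.
AS_TEXT = """Total Amount: 522.00
Tax Amount: 104.40

items Table data:
Items - Descriptions | Items - Quantities | Items - Unit Prices | Items - Line Amounts
Food1 | 1.00 | 199.00 | 199.00
Food2 | 1.00 | 25.00 | 25.00
Beer | 2.00 | 149.00 | 298.00
"""

# Column order as listed in the post above.
COLUMNS = ("description", "quantity", "unit-price", "line-amount")


def parse_items(text):
    """Parse the pipe-delimited items table into a list of dicts (hypothetical helper)."""
    lines = [line.strip() for line in text.splitlines()]
    # Skip the marker line and the header row (assumed to always be present).
    start = lines.index("items Table data:") + 2
    rows = []
    for line in lines[start:]:
        if "|" not in line:
            break  # end of the table block
        cells = [cell.strip() for cell in line.split("|")]
        row = dict(zip(COLUMNS, cells))
        for key in COLUMNS[1:]:
            row[key] = Decimal(row[key])  # numeric columns
        rows.append(row)
    return rows


items = parse_items(AS_TEXT)
for index, row in enumerate(items, start=1):
    print(index, "-", " - ".join(str(row[col]) for col in COLUMNS))
```

For the sample receipt, quantity times unit-price equals line-amount on every row, and the line amounts add up to the 522.00 total, which makes a handy sanity check on the extraction.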

I personally find this terrible advice since the generated data types are just superior.
The line items do work with the generated data types and don’t require all of the late binding you have to do with your workaround with them off.

You are likely using the object wrong. Is your table called ‘Items’ in the receipt?
It’s super easy to iterate over the tables returned in a generated data object; there is just something small you have gotten wrong or are missing.

Data type ON only works with the Out-Of-The-Box models anyway?
Most of the time you will make your own DU Project and start with an OOTB model where you add and remove fields and retrain on your own data.
The Published model doesn’t support getting the generated data types.

OOTB is mostly useful for making a prototype for your process.

That’s not been my experience at all. I have always used modified models with custom fields. It always made custom data types that reflected the fields I specified.

If it’s not working for you, it’s either a bug or a misunderstanding. Perhaps show the issue?

Is this wrong?
“For each” extractedResults.Data.Items → currentItem.Description

I’d think it should work.

What data type does the activity output when you have ‘Generate Data Type’ set to on?

You can see it by hovering over the field label in the properties pane, or by creating a new variable with Ctrl+K.