How to get all items from this row generated inside the Document Text scope?

Hi everyone,

When a DU (Document Understanding) sequence reaches the ‘Digitize Document’ step, it produces a text output. Inside that output there is a row that contains all the amounts we need, but only the first one (0.00) gets picked up before the workflow moves on to the next step as usual.

image

I hope I explained myself; if not, please ask. Any ideas?

I’m tagging the following, thanks: @Pablito, @codemonkee, @loginerror, @bcorrea, @mateus_cruz

Regards,

Hi @MARIODC,
How are you getting the text from this generated file? A hint: I would read the text from the file and then run it through a regex that matches each number. Finally, you could get each value by selecting the matches (or matching groups) the regex returns.
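
To illustrate the regex idea above, here is a minimal sketch in Python. It is not UiPath code: in Studio the same pattern would presumably go into a Matches activity or a `Regex.Matches` call. The sample text, the pattern, and the `extract_amounts` helper are all assumptions made for illustration, since the real digitized output is not shown in this thread.

```python
import re

# Hypothetical sample of the digitized text; the real OCR output will differ.
sample_text = "Invoice summary\nAmounts 0.00 1,250.50 75.00 1,325.50\nThank you"

# Assumed amount format: digits with optional thousands separators,
# followed by a dot and exactly two decimals.
AMOUNT_PATTERN = re.compile(r"\d{1,3}(?:,\d{3})*\.\d{2}|\d+\.\d{2}")


def extract_amounts(text):
    """Return every amount found in the text, in order of appearance."""
    return [match.group(0) for match in AMOUNT_PATTERN.finditer(text)]


amounts = extract_amounts(sample_text)
print(amounts)
```

The point is to iterate over all matches instead of taking only the first one, which is what seems to be happening when just 0.00 comes back.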

Thanks @Pablito for answering,

We are getting the text as part of the Document Understanding process. Those numbers you see are from the Text variable that is created by default…

Any ideas on how to address this scenario within Document Understanding/OCR?

Thx for the support!

I need to understand this better. Does it mean you have all these numbers in one single string variable? For example, if you use “Write Line” with this variable, will it write all of them on one line?

Good morning @Pablito,

Let me explain better where that variable comes from. The Document Understanding framework has several stages, and stage 2, ‘Digitize Document’, has a Text output (you can name it anything). For some reason, when the OCR read the PDF, the variable from ‘Digitize Document’ contained all those numbers inline among the rest of the text. I was just wondering how to get them, but thank God, who as usual sent the light we needed, we found the value we were looking for.

Regards and thx for the support!
