Read only a specific value from OCR

Hello, I have an OCR bot that reads images (see example image).
I want to extract only the 81128. Is there a rule that says to extract only values that are 4 digits or more and contain only digits?

Hi

You can extract the entire text with OCR and then get the number with 4 or more digits using regex.

That is, if you have the OCR value in a string variable named str_input,

then use this Assign activity to get the number you want:

str_output = System.Text.RegularExpressions.Regex.Match(str_input, "\d{4,}").ToString

Cheers @langsem

Hi @langsem

You can use regex to do this

\d{4,}

Try it with the Matches activity.
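For anyone who wants to try the pattern outside UiPath first, here is a small Python sketch of what \d{4,} picks up. The OCR text is made up for illustration (only 81128 comes from the question), and Python's re module stands in for the .NET regex engine, which treats this simple pattern the same way.

```python
import re

# Hypothetical OCR text; only the value 81128 comes from the question.
ocr_text = "Trip A 12.5 km 81128 km 09/2021"

# \d{4,} matches every run of four or more consecutive digits.
numbers = re.findall(r"\d{4,}", ocr_text)
print(numbers)  # ['81128', '2021']

# Without the comma, \d{4} stops after exactly four digits.
truncated = re.search(r"\d{4}", "81128").group(0)
print(truncated)  # 8112
```

Note that a four-digit year also qualifies, so if the OCR text can contain dates you may need to take a specific match or raise the minimum length.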

thanks

Hm, I can't get the string. It only returns System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match] and not the value.

@langsem if the output variable of the Matches activity is named result, then use
result(0).ToString
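The same thing happens in other languages: the regex call hands back a collection of match objects, and printing the collection shows its type instead of the text. A rough Python analogy, assuming a made-up OCR string, looks like this:

```python
import re

# Hypothetical OCR text for illustration.
ocr_text = "Odometer 81128 km"

# finditer returns an iterator of match objects, not a string.
matches = re.finditer(r"\d{4,}", ocr_text)
print(matches)  # prints the iterator object, not the matched text

# Take the first match and read its text, like result(0).ToString.
first = next(matches, None)
value = first.group(0) if first else ""
print(value)  # 81128
```

The empty-string fallback is a design choice for this sketch; in a workflow you would probably check the match count before indexing with (0).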

Thanks

Genius :smiley: thx!

Just a quick question: I tried to read another image, but got @“1’38580k„, Z
Service 09/2021 OFF 3795.0 km
8 km” as the value. 138580 is the correct number, but the OCR thinks there is a symbol ( ’ ) inside it.

@langsem please try this,

(\d’\d{4,}|\d{5,})

Thanks

Hm, that skips the leading 1 :confused:

Strange, it’s working for me.

That's what I get from OCR: @“1’38580k„, Z
Service 09/2021 OFF 3795.0 km
8 km”
and this is the output after the match: “38580”

This is the match: CastIterator { [38580] }

Just make sure the single quote in the pattern is exactly the same character as the one in the OCR output string.
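A quick way to see why the leading 1 gets dropped is to run the pattern twice, once with the curly quote (U+2019) that the OCR produced and once with a plain ASCII apostrophe. This is a Python sketch using the OCR text quoted in this thread; the variable names are only for illustration.

```python
import re

RIGHT_QUOTE = "\u2019"  # the curly quote that appears in the OCR output

# OCR text as quoted in the thread (without the surrounding quotes).
ocr_text = "1\u201938580k\u201e, Z\nService 09/2021 OFF 3795.0 km\n8 km"

# Pattern with the same curly quote as the OCR text.
curly = re.search(r"(\d" + RIGHT_QUOTE + r"\d{4,}|\d{5,})", ocr_text)
print(curly.group(0))  # 1’38580

# Pattern with a straight apostrophe: the first alternative can no longer
# match, so the regex falls back to \d{5,} and the leading 1 is lost.
straight = re.search(r"(\d'\d{4,}|\d{5,})", ocr_text)
print(straight.group(0))  # 38580
```

Copying the quote character straight from the OCR output into the pattern avoids the mismatch.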

Thanks

Managed to match the ’ now, but it won't exclude the ’ (clean the string so it is just digits).

Replace the single quote with an empty string.

Try this

result(0).ToString.Replace("’", "")
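As a sanity check, here is the match-then-clean step as a Python sketch, again with Python's re module standing in for the .NET expression. It also shows an alternative, which is my own suggestion and not something from this thread: stripping every non-digit character, which would cover other stray symbols the OCR might insert.

```python
import re

RIGHT_QUOTE = "\u2019"  # curly quote produced by the OCR

# Start of the OCR text quoted in the thread.
ocr_text = "1" + RIGHT_QUOTE + "38580k"

match = re.search(r"(\d" + RIGHT_QUOTE + r"\d{4,}|\d{5,})", ocr_text)
raw = match.group(0) if match else ""

# Same idea as .Replace("’", ""): drop the quote from the matched text.
cleaned = raw.replace(RIGHT_QUOTE, "")
print(cleaned)  # 138580

# Alternative: remove anything that is not a digit.
digits_only = re.sub(r"\D", "", raw)
print(digits_only)  # 138580
```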

Thanks

thanks that one worked :wink:

