Suggestion to get

I have the text of an invoice PDF as a string, and I want to capture the address from it. What is the best way to do that, given that the address format keeps changing across countries?

@manoj_verma1
Hi,

Use Read PDF Text. If the PDF is scanned, use Read PDF With OCR.

Then use regex or string manipulation to extract the address.

Thanks

Hi @manoj_verma1
Use the Read PDF Text activity. If that doesn't work, go with an OCR engine.

Hope it helps!!

Hi @manoj_verma1

If it is a plain-text PDF, you can use the Read PDF Text activity.

You might need regex to extract the address, depending on the text you get back.
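
When the address layout varies by country, one option is to anchor on a label that stays constant instead of on the address itself. The sketch below is in Python rather than a UiPath workflow, and the `Bill To:` label, the sample text, and the `extract_address_block` helper are all assumptions made for illustration. It captures the non-blank lines that follow the label, up to the next blank line.

```python
import re

def extract_address_block(text, label="Bill To:"):
    """Return the non-blank lines that follow `label`, up to the next blank line."""
    # The label is assumed to sit on its own line (hypothetical layout).
    label_line = r"^[ \t]*" + re.escape(label) + r"[ \t]*\r?\n"
    # One or more lines that contain at least one non-whitespace character.
    body_lines = r"((?:.*\S.*\n?)+)"
    match = re.search(label_line + body_lines, text, re.MULTILINE)
    if match is None:
        return None
    return [line.strip() for line in match.group(1).splitlines()]

# Invented sample text standing in for the string read from the PDF.
sample = (
    "Invoice 1001\n"
    "Bill To:\n"
    "Acme GmbH\n"
    "Hauptstrasse 5\n"
    "10115 Berlin\n"
    "Germany\n"
    "\n"
    "Total: 100.00\n"
)
address_lines = extract_address_block(sample)
```

This only works if the invoices share a stable label and a blank line after the address; if they do not, the pattern would need to be adapted per layout.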

Thanks,
Srini

Hi @manoj_verma1

Use Read PDF Text for native (text-based) PDFs and Read PDF With OCR for scanned PDFs; you can use any OCR engine inside Read PDF With OCR. The output is stored in a String variable.

Write the String variable to a text file, then use a regular expression to extract the address.
Use the Matches activity to apply the regex.

Hope it helps!!

Hi @manoj_verma1

If it is a scanned document, use Read PDF With OCR; otherwise, use the Read PDF Text activity.
You can then use regex to get the required fields.
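
For the multi-country case, one possible approach is to keep a small table of postal-code patterns keyed by country and pick the pattern once the country is known. This is a Python sketch, not a UiPath activity; the `POSTAL_PATTERNS` table, the `find_postal_code` helper, and the sample addresses are assumptions, and the three patterns are simplified illustrations rather than complete validation rules.

```python
import re

# Illustrative patterns only; real invoices will likely need more countries
# and stricter rules than these.
POSTAL_PATTERNS = {
    "IN": re.compile(r"\b[1-9]\d{5}\b"),        # India: 6-digit PIN code
    "US": re.compile(r"\b\d{5}(?:-\d{4})?\b"),  # United States: ZIP or ZIP+4
    "DE": re.compile(r"\b\d{5}\b"),             # Germany: 5-digit postal code
}

def find_postal_code(address_text, country_code):
    """Return the first postal code found for the given country, or None."""
    pattern = POSTAL_PATTERNS.get(country_code)
    if pattern is None:
        return None  # no pattern registered for this country
    match = pattern.search(address_text)
    return match.group(0) if match else None

india_code = find_postal_code("12 MG Road, Bengaluru 560001", "IN")
us_code = find_postal_code("500 Main St, Springfield, IL 62704-1234", "US")
germany_code = find_postal_code("Hauptstrasse 5, 10115 Berlin", "DE")
```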

Regards,

Hi @manoj_verma1 ,

If your string has a static address format, you can use regex to get the address.

For example, if the address ends with a 6-digit PIN code, use the expression below.

System.Text.RegularExpressions.Regex.Match(yourStringhere,"^.*?\b\d{6}\b").Value

Otherwise, please share the address format if possible.
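
One caveat with the expression above: by default `.` does not cross line breaks and `^` matches only at the start of the string, in .NET as in Python, so on a multi-line invoice it finds a PIN code only if it sits on the first line. Here is a minimal Python sketch of the difference, using invented sample text; in .NET the corresponding option is `RegexOptions.Singleline`.

```python
import re

PATTERN = r"^.*?\b\d{6}\b"

# Invented multi-line text standing in for the string read from the PDF.
text = "Acme Traders\n12 MG Road\nBengaluru 560001\nInvoice No: 42"

# Default behaviour: "." stops at the first newline, so nothing is found.
default_match = re.search(PATTERN, text)

# With DOTALL, "." may span lines, so the match runs up to the PIN code.
dotall_match = re.search(PATTERN, text, re.DOTALL)
address = dotall_match.group(0) if dotall_match else ""
```

Because the match starts at the beginning of the string, this still captures everything before the PIN code, so the text would need to be trimmed to the address block first.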

Thanks!

Hi @manoj_verma1

Please share the actual text and the output you need to extract.
That will give us more information.

@manoj_verma1

Please send the entire input data so we can extract the required data.

Regards

Hi,

Test_SMS_RegalCompany_CountryName_\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}

I hope this will help you

Hi @manoj_verma1

If you want to get the date from this, try the expression below:

(?=\d+).*
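
To see what the two expressions above match, here is a small Python check; the sample string is an assumption, built only to fit the first pattern. The lookahead `(?=\d+)` consumes nothing, so `(?=\d+).*` simply starts the match at the first digit and runs to the end of the line.

```python
import re

# Hypothetical input that fits the longer pattern.
sample = "Test_SMS_RegalCompany_CountryName_01-02-03-04-05-06"

full_pattern = r"Test_SMS_RegalCompany_CountryName_\d{2}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}"
tail_pattern = r"(?=\d+).*"

full_match = re.search(full_pattern, sample)  # matches the whole string
tail_match = re.search(tail_pattern, sample)  # starts at the first digit

whole_text = full_match.group(0)
digits_part = tail_match.group(0)
```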

I hope it helps!!

@Umadevi_Sanjeevi @Srini84 @mkankatala @pravallikapaluri @lrtetala
Is there any website that you recommend for building regex?

@manoj_verma1

@manoj_verma1
Regex101, RegExr

RegExr is the one I prefer. Regex101 may reject lookbehind, depending on the flavor selected.
Open the link below to go to RegExr:

https://regexr.com/

Hope it helps!!
