I have the text of an invoice PDF as a string, and I want to capture the address from it. What is the best way to do that, given that the address format keeps changing from country to country?
Use Read PDF Text; if the PDF is scanned, use Read PDF With OCR.
Then use regex or string manipulation to pull out the address.
If it is a plain-text PDF, you can use the Read PDF Text activity.
You might need regex to extract the address, depending on the text the activity returns.
Use Read PDF Text for native (text-based) PDFs and Read PDF With OCR for scanned PDFs, placing any OCR engine inside the Read PDF With OCR activity. The output is stored in a String variable.
Write the String variable to a text file so you can inspect the exact text, then build a regex expression against it to extract the address.
Use the Matches activity to apply the regex expression.
Hope it helps!!
If it is a scanned document, use Read PDF With OCR; otherwise use the Read PDF Text activity.
Then use regex to extract the required fields.
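Since the address format changes per country, one option is to keep a separate pattern per country and anchor on the postal code. Below is a minimal sketch in Python, not a UiPath workflow; the country keys, the `find_postal_code` helper, and the simplified postal-code patterns are all assumptions for illustration, and real invoices will likely need looser or stricter rules. UiPath uses the .NET regex engine, where these particular patterns should behave the same way.

```python
import re

# Simplified, illustrative postal-code patterns (assumptions, not exhaustive).
POSTAL_PATTERNS = {
    "IN": re.compile(r"\b\d{6}\b"),                 # India: 6-digit PIN code
    "US": re.compile(r"\b\d{5}(?:-\d{4})?\b"),      # US: ZIP or ZIP+4
    "GB": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b"),  # UK postcode
}

def find_postal_code(text, country):
    """Return the first postal code found for the given country, or None."""
    pattern = POSTAL_PATTERNS.get(country)
    if pattern is None:
        return None  # no rule defined for this country
    match = pattern.search(text)
    return match.group(0) if match else None
```

Once the postal code is located, the address lines around it can be cut out with string manipulation, which tends to be easier than one regex that tries to cover every country.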
Hi @manoj_verma1 ,
If the address in your string has a fixed format, you can use regex to extract it.
For example, if the address always ends in a 6-character PIN code, you can anchor the expression on that.
Otherwise, please share the address format if possible.
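As an illustration of the 6-character PIN code idea (this is not the poster's own expression), here is a minimal Python sketch. It assumes the address follows a hypothetical "Bill To:" label and ends in a 6-digit PIN code; the sample text and the `extract_address` helper are made up for the example.

```python
import re

# Hypothetical invoice text; the "Bill To:" label and layout are assumptions.
invoice_text = """Invoice No: INV-001
Bill To: 12 MG Road, Indiranagar
Bengaluru, Karnataka 560038
Total: 1,250.00"""

# Capture everything after the label up to and including the first 6-digit
# number. re.DOTALL lets "." cross the line break inside the address.
ADDRESS_PATTERN = re.compile(r"Bill To:\s*(.*?\b\d{6}\b)", re.DOTALL)

def extract_address(text):
    """Return the address as a single line, or None if nothing matches."""
    match = ADDRESS_PATTERN.search(text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(match.group(1).split())
```

The same pattern can be pasted into a regex activity with the equivalent of the single-line option enabled so that `.` matches line breaks.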
Please share the actual text and the output you need extracted.
That will give us more information.
I hope this helps.
@Umadevi_Sanjeevi @Srini84 @mkankatala @pravallikapaluri @lrtetala
Is there any website you recommend for building regex expressions?
regex101, RegExr
I prefer RegExr. In regex101, the default flavor may reject some lookbehind expressions (for example, variable-length ones) that UiPath's .NET regex engine accepts.
Open the link below to go to RegExr.
Hope it helps!!
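For anyone unsure what a lookbehind does here: it lets you match the text after a label without including the label itself. A minimal sketch in Python follows, assuming a hypothetical "Ship To: " label. Note that Python's `re` module only accepts fixed-width lookbehinds, while the .NET engine used by UiPath also allows variable-length ones, so an expression that is rejected in one tool may still work in the other.

```python
import re

# Hypothetical one-line sample; the "Ship To: " label is an assumption.
text = "Ship To: 221B Baker Street, London NW1 6XE"

# (?<=Ship To: ) is a fixed-width lookbehind: the label must come right
# before the match, but it is not part of the matched text.
match = re.search(r"(?<=Ship To: ).+", text)
address = match.group(0) if match else None
```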