Hello, I need support. I have created a project to extract specific data from a PDF file into a message box, and the result came out as shown in the attached photo. Please advise what I can do to get the expected result.
I am doing this for many PDF invoices, so I am using the Anchor Base activity to capture the same data from all PDF invoices in the same folder.
Check the output of the Get Text activity you used.
The Get Text activity is not returning the proper output, so recheck it.
Or
You need to modify the selector of the Get Text activity, for example by making it dynamic.
Cheers
As you can see, the Anchor Base activity cannot select the whole line as an anchor; it only selects part of the line. The same thing happens with the rest of the data that I need to extract.
This invoice is the official form for all invoices at my workplace. Any advice is welcome, as this is very important to me and I am new to Studio.
Why don't you try the Read PDF Text activity?
Try this:
1. Take the Read PDF Text activity, pass the invoice path to it, and create a variable for its output.
2. After that, you can use string manipulation or regex to get the expected output.
Sorry, but I don't know how to do the second step. That is why I chose the Anchor Base activity, as it seemed easier for me. I am new to RPA.
It is better to use regex or string manipulation to extract the data instead of relying on the Anchor Base activity, which is not always the best option.
Check this to learn regex:
You can write regex code at:
Instead of Anchor Base, you can use string manipulation or regex; both are well suited to your process.
Take the Read PDF Text activity and pass it the file path. The output will be a string, so you can share that text here and we can provide a regex pattern for whatever you want to extract from that particular PDF.
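For anyone new to string manipulation, here is a minimal sketch of the idea in Python (the same logic can be written in VB.NET inside Studio). The sample text, the label, and the helper name `value_after_label` are assumptions made up for illustration; the real input would be the string returned by Read PDF Text.

```python
# Hypothetical text standing in for the string output of Read PDF Text.
pdf_text = "Invoice\nRegistration Number #123456 \nTotal 100.00\n"


def value_after_label(text, label):
    """Return the rest of the first line containing `label`, or None if absent."""
    for line in text.splitlines():
        if label in line:
            # Split once on the label and keep what follows, without padding.
            return line.split(label, 1)[1].strip()
    return None


registration = value_after_label(pdf_text, "Registration Number")
print(registration)
```

This only works when the label and its value sit on the same line of the extracted text, which is worth checking by writing the Read PDF Text output to a message box first.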
Thanks, all, for your support and help. I will study regex so that I can understand and use it.
I have no coding background, so I will try to complete this automation with the guidelines you gave me. I hope I can do it.
Thanks again, all.
Also, if I may ask, could anyone tell me which programming language is best to learn if I continue in RPA?
Learning VB.NET would be beneficial.
As you can see, I have managed to work with regex and to select the registration number from the invoice, but the match also contains the # symbol. How can I remove it?
If anyone can write the code down, I would be grateful.
Hi @mohamed.saty2012 ,
Try the below Expression :
(?<=Registration number\s+#).*
This works when the registration number always has the # at the beginning.
If not, we could perform post-processing to keep only the digits in the extracted value.
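The post-processing step mentioned above could look roughly like this in Python. The function name `digits_only` and the sample values are hypothetical; the idea is simply to delete every character that is not a digit.

```python
import re


def digits_only(value):
    """Remove every non-digit character from the extracted value."""
    return re.sub(r"\D", "", value)


# Hypothetical extracted values, with and without the leading symbol.
print(digits_only("#123456"))
print(digits_only("123456"))
```

In a Studio Assign activity, the equivalent would presumably be the .NET call `System.Text.RegularExpressions.Regex.Replace(value, "\D", "")`. Note that this approach also strips letters and separators, so it only suits registration numbers that are purely numeric.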
Thanks, what you sent gave me an idea. I typed it in and it works.
I just added the symbol that I want to remove, as follows:
(?<=Registration\sNumber\s#)(.*?(?=\s)) and it worked.
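For readers who want to see how this pattern behaves, here is a small sketch using Python's `re` module, which accepts this particular pattern unchanged because the lookbehind has a fixed width. The sample strings are invented, and `ALT_PATTERN` is an assumed alternative, not something proposed in the thread.

```python
import re

# Pattern from the post: lookbehind skips the label and "#",
# then a lazy match runs up to (not including) the next whitespace.
PATTERN = r"(?<=Registration\sNumber\s#)(.*?(?=\s))"

# Assumed alternative: take all non-whitespace characters after the "#".
ALT_PATTERN = r"(?<=Registration\sNumber\s#)\S+"

# Hypothetical invoice lines.
with_trailing_text = "Registration Number #123456 Date 01/01/2024"
at_end_of_text = "Registration Number #123456"

match = re.search(PATTERN, with_trailing_text)
print(match.group(0) if match else None)

# The original pattern needs whitespace after the value, so it finds
# nothing when the number is the very last thing in the text.
print(re.search(PATTERN, at_end_of_text))

alt = re.search(ALT_PATTERN, at_end_of_text)
print(alt.group(0) if alt else None)
```

The pattern is case-sensitive as written, so it expects "Registration Number" exactly as it appears on the invoice.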
Thanks for your support.
Thanks for your support and time. I really appreciate it.
Thanks a million. I really appreciate it.