I have been trying to solve this problem for several hours but still have not found an approach that works in my process. I thought you might be able to help me here on the forum.
What I want to do is open a PDF document with a lot of unstructured text and extract a specific code consisting of 2 letters and 6 digits. Then I'll copy the code and paste it into a web browser. I have tried to use Matches / Regex but have not got it to work.
Hi
Welcome to the UiPath community
That specific text might have a fixed term next to it, like Invoice = INV124, where Invoice is the term next to the text we need
Do we have any such term?
Or kindly share a sample of the text from which we need to extract the code
Actually there are 3 PDFs that I would like to extract a code from. Each of them has a different style of unstructured text. What I would like to do is just use screen scraping on the specific code in each document so I can paste it into a web browser. However, I am not able to do that.
Yes, we can do that as well
– use Start Process and pass the PDF file path as input
– try the Screen Scraping method
– once done, we would get the text as output, and from that we can get the term we want with a regular expression or the Split method
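The last step can be sketched outside UiPath, here in Python, just to show the idea of the two options. The sample text and the "Invoice = " label are made up for illustration, and extract_by_split and extract_by_regex are hypothetical helper names, not UiPath activities.

```python
import re

# Hypothetical text, standing in for what screen scraping would return.
sample_text = "Order notes ... Invoice = INV124 shipped on Monday"


def extract_by_split(text, label):
    """Return the first whitespace-delimited token after a fixed label, or None."""
    parts = text.split(label, 1)
    if len(parts) < 2:
        return None  # the label does not occur in the text
    tokens = parts[1].split()
    return tokens[0] if tokens else None


def extract_by_regex(text, pattern):
    """Return the first substring matching the pattern, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


print(extract_by_split(sample_text, "Invoice = "))
print(extract_by_regex(sample_text, r"INV[0-9]+"))
```

Split only works when a fixed label sits next to the value; a regular expression is the better fit when the value has a recognisable shape but no stable label around it.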
Thank you for the answer, but I really don't get it.
Can you please guide me on this?
First of all I’m going to open a file called pdf1 on my computer.
From this PDF file I would like to extract one specific piece of text out of all the text in the file. The text that I would like to copy is QR322343
After that I would like to copy that specific text and paste it into a search bar on a homepage.
Fantastic
So once you have the text from the PDF in a String variable named str_pdf, use this expression in an Assign activity to get the value
list_output = System.Text.RegularExpressions.Regex.Matches(str_pdf,"[A-Za-z]{2}[0-9]{6}")
this expression would give you every string with two letters followed by six digits in the PDF
where list_output is a variable of type System.Text.RegularExpressions.MatchCollection
– then use a For Each loop, pass the above variable as input, and leave the TypeArgument as Object in the Properties panel of the For Each loop
– inside the loop use a Write Line activity like this: item.ToString
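As a rough sketch of the same find-all-then-loop idea outside UiPath (in Python, for illustration only): the text below is invented, QR322343 is the code mentioned in this thread, AB123456 is a made-up second code, and find_codes is a hypothetical helper name.

```python
import re

# Hypothetical PDF text; AB123456 is invented to show more than one match.
str_pdf = "Ref 2231 ... code QR322343 ... see also AB123456, page 12"

# Two letters followed by six digits.
CODE_PATTERN = re.compile(r"[A-Za-z]{2}[0-9]{6}")


def find_codes(text):
    """Return every non-overlapping match as a plain string."""
    return [m.group(0) for m in CODE_PATTERN.finditer(text)]


# Equivalent of the For Each loop with a Write Line inside it.
for code in find_codes(str_pdf):
    print(code)
```

The point to notice is that the search returns a collection of match objects, and the text of each match has to be read from the individual item inside the loop.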
Or, if there will only be one such text, simply use a single expression in Write Line
System.Text.RegularExpressions.Regex.Match(str_pdf,"[A-Za-z]{2}[0-9]{6}").ToString
and it worked as well
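One possible refinement when taking only the first match, sketched in Python for illustration: a pattern for two letters and six digits can also hit the inside of a longer token, and wrapping it in \b word boundaries avoids that. This assumes the code always stands alone as its own word, which the thread does not confirm. The sample text and the first_code helper are hypothetical.

```python
import re

# Hypothetical text: "XAB1234567" contains two letters + six digits
# inside a longer serial number, before the real standalone code.
str_pdf = "Serial XAB1234567, customer code QR322343."


def first_code(text, pattern):
    """Return the text of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


loose = first_code(str_pdf, r"[A-Za-z]{2}[0-9]{6}")
strict = first_code(str_pdf, r"\b[A-Za-z]{2}[0-9]{6}\b")

print(loose)   # picked out of the middle of the longer serial
print(strict)  # the standalone code
```

The \b anchor behaves the same way in .NET regular expressions, so the stricter pattern could be dropped into the UiPath expression unchanged if the assumption holds for the documents.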
Kindly try this and let me know if you have any queries or need clarification
Cheers @FekkeMalin
After your solution I wanted to check that it's working, so I put a Message Box after the Assign activity.
It says System.Text.RegularExpressions.MatchCollection and not the code. What have I done wrong? I did as you told me. For some reason I don't think that the excel file is being read correctly.
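A likely explanation, going only by the message text: list_output holds the whole collection of matches, and a .NET MatchCollection does not override ToString, so displaying the collection shows its type name instead of the matched text. Reading a single item, for example list_output(0).Value, or looping over the collection as described earlier, should show the code. The Python sketch below is only an analogy for that behaviour; the sample text and the show_container_and_first helper are hypothetical.

```python
import re

# Hypothetical PDF text containing the code from this thread.
str_pdf = "lots of unstructured text, code QR322343, more text"


def show_container_and_first(text):
    """Return (string form of the match container, text of the first match)."""
    matches = re.finditer(r"[A-Za-z]{2}[0-9]{6}", text)
    shown = str(matches)          # describes the container object, not its contents
    first = next(matches, None)   # take one item out of the container
    return shown, (first.group(0) if first else None)


shown, first = show_container_and_first(str_pdf)
print(shown)  # an object description, not the code
print(first)  # the matched text itself
```

If the message still shows no code after reading a single item, the next thing to check would be whether str_pdf actually contains the text of the document.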