Hi guys, I'm having an issue where I need to get text from a PDF. The regex returns the first row of the cell, but the information extends to the second row. I used: System.Text.RegularExpressions.Regex.Match(DataFound, "(?<=Customer:).*(?=Company:)").Value
But for some reason the rest of the information is not extracted.
Please see below a screenshot similar to the PDF file I'm working with. "System Developer" is not being returned by the regex mentioned above.
Thanks @Pratik_Wavhal, however I'm not getting the correct results. Please share the sequence you used to test on your side with the example PDF screenshot.
I can't share the exact PDF because of company policy, but the attached screenshot is similar to the PDF I'm reading. I'm using the Read PDF Text activity. I want to get "John Smith X System Developer", but with System.Text.RegularExpressions.Regex.Match(DataFound, "(?<=Customer:).*(?=Company:)").Value I'm only getting "John Smith X". I hope the information provided is clear.
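A possible explanation, sketched in Python because the real PDF is not available: since the pattern returns "John Smith X", the "Company:" label presumably sits on the same line as the first row, and the wrapped part of the cell ("System Developer") lands on the next line, after the point where the lookahead stops. The sample text below is invented to illustrate that layout and is an assumption, not the actual output of Read PDF Text; `data_found`, `first_row` and `full_value` are hypothetical names. The workaround pattern uses only constructs that .NET regular expressions also accept, but it would need adjusting to whatever the logged value of DataFound really looks like.

```python
import re

# Hypothetical text (an assumption): stands in for what Read PDF Text might
# return when a table cell wraps onto a second row.
data_found = (
    "Customer: John Smith X Company: Example Ltd\n"
    "System Developer\n"
    "Date: 01/01/2020"
)

# Same pattern as in the post. "." does not match a newline by default,
# and the lookahead stops the match at "Company:" on the first line.
original = re.search(r"(?<=Customer:).*(?=Company:)", data_found)
first_row = original.group(0).strip() if original else ""

# One possible workaround: capture the first-row value and also the whole
# line that follows the "Customer: ... Company: ..." row.
wrapped = re.search(r"Customer:(.*?)Company:[^\n]*\n([^\n]*)", data_found)
full_value = ""
if wrapped:
    # Join the non-empty parts with a single space.
    full_value = " ".join(p.strip() for p in wrapped.groups() if p.strip())

print(repr(first_row))
print(repr(full_value))
```

If the real text has a different layout (for example, the wrapped part appearing before "Company:"), this pattern will not fit, so it is worth writing DataFound to the log first and checking where the line breaks fall.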
Actually, you are working with the original PDF, so the format can be preserved when you read it.
But in my case, you only shared an image of the data. When I screen-scrape it with OCR, the data won't be in the same format as in the image. The text gets jumbled and the output comes out on a single line, as I have shown you below.
So I typed the data into the regex editor myself, in the same format as in the image you shared, and then applied the regex to it. That worked for me, as I already showed you in my earlier posts.
That is how I got the output that worked for me. I can only build a workflow if I have the PDF.
I hope you get what I'm saying.
I got what you said. I have recreated a PDF file that is similar to the one I'm working with. I tested it and still get the same results as mentioned before. I tried to upload the file but I'm restricted from doing so. Please use the Google Drive link to get the file.