I am opening a PDF using the Read PDF Text activity and am trying to select a certain string from the last few lines of the page by parsing the pdfstring. What would be a good way to accomplish this?
Regex is usually the most powerful way to handle this. UiPath has the Matches activity, which can do this.
Upload a sample or two of your text and bold what you are trying to obtain.
If there is a pattern to the text then Regex will be possible.
The portion of my PDF I am trying to extract lies in this line in the middle of the page:
\nSignature: Venkategowda, Arpitha (133709) By signing this timesheet you are certifying that hours were incurred on the charge Approval: Zade, Rajesh (49554)\r\nDate: Mar 11, 2020 6:18:04 AM and day specified in accordance with company policies and procedures Date: Apr 10, 2020 9:46:09 AM
Now there are many “Date:” values in this PDF, but I need to get this specific date (Apr 10th) and the approval name, Zade, Rajesh.
Can regex look for Approval, and then can we do string manipulation on it to find the date somehow?
There are several expressions to get what you need, like this:
(^Approval: )(.*)( )
to get the name Zade, Rajesh, from the returned match.
But about the date, I don't understand what characterizes the value you need. Is it always the last one in the text?
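As a rough check of the expression above against the sample line, here is a small sketch. It uses Python's re module as a stand-in for the .NET regex engine behind UiPath, and it assumes the \n and \r\n in the sample are real line breaks. Under those assumptions the ^ anchor stops the match, because "Approval:" sits in the middle of a line; without the anchor, the second group holds the name.

```python
import re

# Sample line from the question; \n and \r\n are assumed to be real line breaks.
text = (
    "\nSignature: Venkategowda, Arpitha (133709) By signing this timesheet you are "
    "certifying that hours were incurred on the charge Approval: Zade, Rajesh (49554)\r\n"
    "Date: Mar 11, 2020 6:18:04 AM and day specified in accordance with company "
    "policies and procedures Date: Apr 10, 2020 9:46:09 AM"
)

# With ^ the pattern only matches at the start of a line, and "Approval:" is mid-line.
anchored = re.search(r"(^Approval: )(.*)( )", text, re.MULTILINE)

# Without ^ the greedy (.*) backtracks to the last space on that line.
unanchored = re.search(r"(Approval: )(.*)( )", text)

print(anchored)             # None
print(unanchored.group(2))  # Zade, Rajesh
```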
Yes, assuming the sample text is constant, then Regex will be possible. Have a look at the below.
For the Date you could try this:
Description: the space before the word Date makes this result unique.
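One way to sketch that idea is shown here. The pattern is an assumption, not necessarily the one originally posted, and Python's re module stands in for the Matches activity. It requires a literal space before "Date:", so the first "Date:" (which follows a line break) is skipped.

```python
import re

# Sample line from the question; \n and \r\n are assumed to be real line breaks.
text = (
    "\nSignature: Venkategowda, Arpitha (133709) By signing this timesheet you are "
    "certifying that hours were incurred on the charge Approval: Zade, Rajesh (49554)\r\n"
    "Date: Mar 11, 2020 6:18:04 AM and day specified in accordance with company "
    "policies and procedures Date: Apr 10, 2020 9:46:09 AM"
)

# A literal space is used instead of \s, because \s would also match the
# newline that precedes the first "Date:".
date_pattern = re.compile(r" Date: (\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)")

match = date_pattern.search(text)
approval_date = match.group(1) if match else None
print(approval_date)  # Apr 10, 2020 9:46:09 AM
```

If other lines in the PDF also contain a space followed by "Date:", this would need a tighter anchor, so it is worth testing against the full document text.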
For the Approval name try this:
Description: This will capture all text between “Approval: ” and “ (49554)”. The number in the brackets can change and be any length (as long as it has at least 1 digit).
More information and samples are always better when working out a Regex pattern.
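A minimal sketch of a pattern matching that description follows. Again, the pattern is an assumption rather than the one originally posted, written for Python's re module. It uses a lookbehind for "Approval: " and a lookahead for a space plus one or more digits in brackets.

```python
import re

# Fragment of the sample line from the question.
text = "on the charge Approval: Zade, Rajesh (49554)\r\nDate: Mar 11, 2020 6:18:04 AM"

# (?<=Approval: )  start right after "Approval: "
# .+?              take as little text as possible
# (?= \(\d+\))     stop before " (" + one or more digits + ")"
name_pattern = re.compile(r"(?<=Approval: ).+?(?= \(\d+\))")

match = name_pattern.search(text)
approver = match.group(0) if match else None
print(approver)  # Zade, Rajesh
```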