Regex Based Extractor Not Extracting Data But Regex Builder Says It'll Work

I am trying to use the Regex Based Extractor on a PDF, but the validation screen says that data is missing for every field. I built the regular expressions in the regex builder, and they match the test text (copied from the actual PDF).

For example, the test text was: Funding Opportunity Number: W81XWH-20-ALSRP-CDA

So the regex I made for it was: ^Funding Opportunity Number: ([A-Z0-9]{6}-[0-9]{2}-[A-Z]+-[A-Z]+)$

I added the $ because it was the only text on that line. Even without the $, it doesn’t extract anything. Am I missing something?

I think I may have done the Digitize Document step incorrectly. I am using the OmniPage OCR. What do I use for the OCR input? For the OCR output, do I make a new variable, or do I pass the variable used for the Digitize Document text output?

Hi Alex,

With Digitize Document you don’t need to worry about the inputs and outputs of the OCR Engine activity (OmniPage OCR in this case), as they are handled automatically by Digitize Document. You only need variables for the text and document object model that come out of Digitize Document.

@Alex_Marasco Welcome to the UiPath Community.
Just remove the $ sign from your regex, then it should work.
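
To illustrate why the anchors can matter, here is a minimal Python sketch. It is only an approximation: the extractor runs on the .NET regex engine, and the sample document text, its \r\n line endings, and the helper name extract are assumptions made up for the illustration, not taken from the actual PDF. It shows a pattern that matches a single-line test string but fails on multi-line digitized text, where ^ is tied to the start of the whole text and $ can be blocked by a trailing carriage return.

```python
import re

# Pattern from the question, with the anchors left off so they can be added per test.
PATTERN_BODY = r"Funding Opportunity Number: ([A-Z0-9]{6}-[0-9]{2}-[A-Z]+-[A-Z]+)"

# Single-line test text, as typed into the regex builder.
test_text = "Funding Opportunity Number: W81XWH-20-ALSRP-CDA"

# Hypothetical digitized text: several lines with Windows-style line endings
# (an assumption for illustration; real OCR output may differ).
document_text = (
    "Department of Defense\r\n"
    "Funding Opportunity Number: W81XWH-20-ALSRP-CDA\r\n"
    "Program Announcement\r\n"
)


def extract(pattern, text, flags=0):
    """Return the first capture group, or None if the pattern does not match."""
    match = re.search(pattern, text, flags)
    return match.group(1) if match else None


# Fully anchored pattern matches the one-line test text.
builder_result = extract("^" + PATTERN_BODY + "$", test_text)

# Without multiline mode, ^ only matches at the very start of the document.
anchored = extract("^" + PATTERN_BODY + "$", document_text)

# With multiline mode, $ still fails because "\r" sits between "CDA" and "\n".
anchored_multiline = extract("^" + PATTERN_BODY + "$", document_text, re.MULTILINE)

# Dropping $ (and keeping multiline mode for ^) lets the match succeed.
no_dollar = extract("^" + PATTERN_BODY, document_text, re.MULTILINE)

# Dropping both anchors works regardless of flags.
unanchored = extract(PATTERN_BODY, document_text)

print(builder_result, anchored, anchored_multiline, no_dollar, unanchored)
```

If removing $ alone does not help, the sketch suggests also trying the pattern without ^, or checking the digitized text for extra whitespace or OCR differences around the value.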

