Matches Activity Works but Regex Based Extractor with Same Expression Not Working

LaHood_AM · August 29, 2023, 8:33pm

I am using Document Understanding. Digitization and classification are working as expected. I am attempting to use a RegEx Based Extractor to get the 4 fields I need. When I run the process, the extractor returns empty values for all of the fields. However, if I copy the Document Text being fed into the Data Extraction Scope and paste it into the “Test Text” section of the RegEx Builder, the expression works as expected. I even added a Matches activity with the exact same expression right before the Data Extraction Scope and passed in the Document Text variable, and it works just fine there, so it seems to be an issue with the Document Understanding Data Extraction Scope. Any ideas what could be causing this? Does the Regex Extractor act on the Document Text variable? If so, I can’t make sense of why it isn’t working.

supermanPunch · August 29, 2023, 8:36pm

Hi @LaHood_AM ,

Could you enclose .*? in parentheses (making it a capture group) and check? Like below:

(.*?)

ppr · August 29, 2023, 8:45pm

Could it be that the value to extract spans a line break?
Since . does not match \n, we can rewrite the expression to:

(?<=Patient:\s)[\s\S]*?(?=\sAddress)
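To illustrate the point about . and \n, here is a minimal sketch in Python. The extractor itself runs on the .NET regex engine, so this only demonstrates the general behaviour, which both engines share. The sample text and the .*? variant of the pattern are assumptions made up for the demonstration; they are not taken from the original poster's document.

```python
import re

# Hypothetical document text: the patient name wraps onto a second line.
text = "Patient: John\nDoe Address: 1234 Address St."

# Assumed "before" pattern: . stops at the line break.
dot_pattern = r"(?<=Patient:\s).*?(?=\sAddress)"
# Pattern from the reply above: [\s\S] matches any character, including \n.
any_char_pattern = r"(?<=Patient:\s)[\s\S]*?(?=\sAddress)"

dot_match = re.search(dot_pattern, text)
any_match = re.search(any_char_pattern, text)
# re.DOTALL (RegexOptions.Singleline in .NET) also lets . cross line breaks.
dotall_match = re.search(dot_pattern, text, re.DOTALL)

print(dot_match)                   # no match: . cannot cross the \n
print(repr(any_match.group(0)))    # value spanning both lines
print(repr(dotall_match.group(0)))
```

With the plain . pattern there is no match at all, which would show up as an empty field in an extractor, while the [\s\S] version returns the value across the line break.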

LaHood_AM · August 29, 2023, 9:01pm

Thanks for the quick reply! This helped for all fields except one, which spans multiple lines. The text from the document looks something like:

Refill Request
1234 Address St.
City, ST 54455
Tel: 555-555-5555

What I currently have is (?<=Request\s\n*)(.*?)(?=\sTel:)

I am trying to extract the middle two lines. I also have the “Multiline” Regex Option selected. Any suggestion on this one?

ppr · August 29, 2023, 9:04pm

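As mentioned above, . does not match \n, so use [\s\S]*? in place of .*? for a value that spans multiple lines.

Also keep in mind that Windows line breaks are \r\n, which we can match defensively with \r?\n.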
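Putting both hints together for the address example, a rough sketch in Python could look like the following. It uses a capture group rather than the lookbehind from the question, because Python's re module rejects the variable-length lookbehind (?<=Request\s\n*) that .NET accepts. The helper name extract_address_lines and the exact pattern are hypothetical and have not been tested against the RegEx Based Extractor itself.

```python
import re

# Capture everything between the "Request" line and the "Tel:" line.
# \r?\n accepts both Unix (\n) and Windows (\r\n) line breaks.
ADDRESS_BLOCK = re.compile(r"Request[ \t]*\r?\n([\s\S]*?)\r?\nTel:")


def extract_address_lines(document_text):
    """Return the lines between 'Refill Request' and 'Tel:' (hypothetical helper)."""
    match = ADDRESS_BLOCK.search(document_text)
    if match is None:
        return []
    # Split the captured block on either kind of line break.
    return re.split(r"\r?\n", match.group(1))


sample_lf = "Refill Request\n1234 Address St.\nCity, ST 54455\nTel: 555-555-5555"
sample_crlf = sample_lf.replace("\n", "\r\n")

print(extract_address_lines(sample_lf))
print(extract_address_lines(sample_crlf))
```

Note that in .NET the “Multiline” option only changes where ^ and $ match; it does not make . match \n. That is what “Singleline” does, so either switch to [\s\S] as above or enable Singleline if the extractor's options expose it.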

system · September 1, 2023, 9:04pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
