Regex to fetch value after certain pattern keyword

Hi Team,

I have dynamic text that follows the pattern shown below, and I need help extracting the following from it.

Required output:

  1. Question number (e.g. Q1, Q3, Q6; not in any specific order).
  2. Variable IDs (e.g. AgeX, sex, scrnind).

Input text:

' RazorLib / KTNAWeb Shell - updated 2019-11-20
' ~~~~~~~ BEGIN SCREENING QUESTIONS:
' Q1
' Age "What is your age?"
' text;
AgeX "What is your age?"
long [1 … 99];

' Q3
sex "What best describes your gender?"
    [
        metatype = "rowpicker",
        answertype = "text"
    ]
categorical [1..1]
{
    _1 "Male",
    _2 "Female",
    _3 "I don’t identify as either" fix
} ran;

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?
<span class='mrInstruct'>(Select all that apply)</span>"
    [
        tabulate = false,
        metatype = "rowpicker",
        answertype = "text"
    ]

End Metadata

Is there a way to extract these values with a regex?

For the question number, the expression can be "[qQ]\d+".
I didn't understand which Variable ID has to be extracted. Could you describe it in more detail?
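As a rough illustration of the suggestion above, here is a minimal Python sketch that applies "[qQ]\d+" to a shortened copy of the input text. The stricter variant, which only accepts a question number sitting alone on a comment line, is an assumption added here for comparison and is not something proposed in the thread.

```python
import re

# Shortened copy of the input text from the question.
text = """' Q1
' Age "What is your age?"
' text;
AgeX "What is your age?"

' Q3
sex "What best describes your gender?"

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?"
"""

# The suggested expression: q or Q followed by digits, anywhere in the text.
loose = re.findall(r"[qQ]\d+", text)

# Stricter variant (assumption): the question number must be the only content
# of a line that starts with a straight or curly single quote.
strict = re.findall(r"^[ \t]*['‘’][ \t]*([qQ]\d+)[ \t]*$", text, flags=re.MULTILINE)

print(loose)   # ['Q1', 'Q3', 'Q6']
print(strict)  # ['Q1', 'Q3', 'Q6']
```

On this sample both give the same result. The loose form would also pick up any q-plus-digits sequence inside question wording, which is why anchoring to the comment line may be safer on real data.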


@nv08 Thanks for your quick response.

Let me explain the Variable ID part in detail.

Every question has one Variable ID associated with it.
For example, 'AgeX' is the Variable ID for Q1, 'sex' is the Variable ID for Q3, and 'scrnind' is the Variable ID for Q6.

In most cases the Variable ID is the first word on the line right after the question number. There are exceptions, though, such as Q1, where two extra lines starting with a single quote come in between.

Basically, the Variable ID is the first word after the question number on a line that does not start with a single or double quote.
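The rule described above can also be applied line by line without a single large regex. The following Python sketch is one possible reading of it, not a tested solution for the full data set. The names SAMPLE, QUESTION_LINE, QUOTE_CHARS and extract_variable_ids are hypothetical, and it assumes each question number sits alone on its own comment line.

```python
import re

# Abbreviated copy of the input text from the question.
SAMPLE = """' RazorLib / KTNAWeb Shell - updated 2019-11-20
' Q1
' Age "What is your age?"
' text;
AgeX "What is your age?"

' Q3
sex "What best describes your gender?"
categorical [1..1]

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?
<span class='mrInstruct'>(Select all that apply)</span>"
"""

# A comment line that holds nothing but a question number (assumption).
QUESTION_LINE = re.compile(r"^\s*['‘’]\s*([qQ]\d+)\s*$")
# Straight and curly quote characters that mark a line to be skipped.
QUOTE_CHARS = ("'", "‘", "’", '"', "“", "”")


def extract_variable_ids(text):
    """Return (question number, Variable ID) pairs in order of appearance."""
    pairs = []
    pending = None  # question number still waiting for its Variable ID
    for line in text.splitlines():
        match = QUESTION_LINE.match(line)
        if match:
            pending = match.group(1)
            continue
        stripped = line.strip()
        # Ignore everything until a question is pending, and skip blank
        # lines and lines that start with a single or double quote.
        if pending is None or not stripped or stripped.startswith(QUOTE_CHARS):
            continue
        pairs.append((pending, stripped.split()[0]))
        pending = None
    return pairs


print(extract_variable_ids(SAMPLE))
# [('Q1', 'AgeX'), ('Q3', 'sex'), ('Q6', 'scrnind')]
```

Keeping the question number and the Variable ID together in one pass avoids having to line up two separate match lists afterwards.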



The expression was built on the following assumptions:

  1. The question number starts with q or Q.
  2. Any number of spaces or newlines may follow the question number.
  3. The first word encountered after the question number is treated as the ID, and the match continues until a double quote is reached.
  4. The match includes the double quote at the end, which can be trimmed afterwards.
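The expression itself is not shown in the thread. Purely as an illustration, the Python sketch below uses a hypothetical pattern that satisfies the assumptions listed above and trims the trailing double quote. It only handles straight double quotes, and the sample is a shortened copy of the input text.

```python
import re

# Shortened copy of the input text from the question.
text = """' Q1
' Age "What is your age?"
' text;
AgeX "What is your age?"

' Q3
sex "What best describes your gender?"

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?"
"""

# Hypothetical pattern: q/Q plus digits, then spaces or newlines, then
# everything up to and including the next double quote.
pattern = re.compile(r'([qQ]\d+)\s+([^"]*")')

# Trim the trailing double quote and surrounding whitespace from each match.
results = [(q, raw.rstrip('"').strip()) for q, raw in pattern.findall(text)]

print(results)
# [('Q1', "' Age"), ('Q3', 'sex'), ('Q6', 'scrnind')]
```

Under these assumptions Q3 and Q6 come out as expected, while Q1 picks up the commented line that follows the question number instead of AgeX.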

@nv08 - I have tested the pattern you shared. It works for two of the scenarios but fails for one, because there can sometimes be one or two commented lines (starting with a single quote) between the question number and the Variable ID.
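One way to cope with such commented lines is to let the regex skip any blank line or line that starts with a quote before capturing the first word. The Python sketch below shows a possible pattern. It is an assumption rather than the expression discussed in this thread, and it has only been checked against the shortened sample shown here. It accepts both straight and curly quotes, since both appear in the posted text.

```python
import re

# Shortened copy of the input text; Q1 uses curly quotes as in the post.
text = """’ Q1
’ Age “What is your age?”
’ text;
AgeX “What is your age?”

' Q3
sex "What best describes your gender?"

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?"
"""

pattern = re.compile(
    # Comment line that holds only the question number.
    r"^[ \t]*['‘’][ \t]*([qQ]\d+)[ \t]*\r?\n"
    # Skip blank lines and lines that start with a single or double quote.
    r"(?:[ \t]*(?:['‘’\"“”][^\n]*)?\r?\n)*"
    # First word of the next remaining line is taken as the Variable ID.
    r"[ \t]*(\w+)",
    re.MULTILINE,
)

pairs = pattern.findall(text)
print(pairs)
# [('Q1', 'AgeX'), ('Q3', 'sex'), ('Q6', 'scrnind')]
```

If a question number is followed only by quoted or blank lines and then a line that does not start with a word character, this pattern produces no match for that question rather than a wrong ID.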

@DewanjeeS - Could you please post your entire text here? We can only get the regex right if we know the full pattern. Without knowing all the different variations, writing the regex is impossible.

@prasath17 - I have posted the input text in the question itself. Below is the text for your reference -

' RazorLib / KTNAWeb Shell - updated 2019-11-20
' ~~~~~~~ BEGIN SCREENING QUESTIONS:
' Q1
' Age "What is your age?"
' text;
AgeX "What is your age?"
long [1 … 99];

' Q3
sex "What best describes your gender?"
    [
        metatype = "rowpicker",
        answertype = "text"
    ]
categorical [1..1]
{
    _1 "Male",
    _2 "Female",
    _3 "I don’t identify as either" fix
} ran;

' Q6
scrnind "Do you or does any member of your household work in any of these occupations?
<span class='mrInstruct'>(Select all that apply)</span>"
    [
        tabulate = false,
        metatype = "rowpicker",
        answertype = "text"
    ]

End Metadata

@DewanjeeS - The portion below is not in the text you provided, right? Please check; that is what I am actually asking about.

The text starts only from Q3…

@prasath17 - This seems strange; I am not sure whether there is a formatting issue here. But I have posted the input text starting from Q1 itself :)

@DewanjeeS - Again, bro, I have already seen what you posted. Please look closely: there are a lot of spaces before AgeX in the screenshot below.

The text you posted in the question does not contain any spaces before AgeX. Do you agree?

[Screenshot: the input text as pasted, with extra spaces before AgeX]

What I am asking is: please share the exact text format from Q1 to Q6, so that it is easy for us to write the regex.

@prasath17 - Got it. I am actually testing the pattern on https://regex101.com/, which seems to automatically add those extra spaces in front; they are not in the original text.

The original text is in the same format as I posted here :)

@DewanjeeS - Your Q3 and Q6 follow the same pattern, whereas Q1 does not: its AgeX value comes 2-3 lines further down.

Let me think…
