Hi all,
I have a scenario. I receive a text file that contains questions, answer options, and the answers. I need to extract only the questions and answers and store them in another text file. How can I achieve this? The solution can work on either a text file or a Word file; either is fine for me. Any suggestions?
A sample text file is attached below for reference.
Sample questions Multiple choice.txt (560 Bytes)
I need output like this.
How did you find out about our product?
Answer : b
What industry are you in?
Answer : d
Yoichi
(Yoichi)
November 9, 2023, 7:32am
2
Hi,
How about the following sample?
mc = System.Text.RegularExpressions.Regex.matches(strData,"(?<Q>\d+\..+)\n[\s\S]+?\n(?<A>Answer.+)")
Sample20231109-5L.zip (3.2 KB)
Regards,
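For readers outside UiPath, here is a minimal Python sketch of what the pattern above does. It is a port, not the attached workflow: .NET's (?<Q>...) named groups are written (?P<Q>...) in Python, and the sample text is hypothetical, since the attached file is not reproduced in the thread. The Q group takes the numbered question line, the lazy [\s\S]+? skips the option lines, and the A group takes the line that starts with Answer.

```python
import re

# Hypothetical sample; the attached file is not reproduced in the thread.
SAMPLE = (
    "1. How did you find out about our product?\n"
    "a) Friend\n"
    "b) Web search\n"
    "c) Advertisement\n"
    "d) Other\n"
    "Answer : b\n"
    "2. What industry are you in?\n"
    "a) Finance\n"
    "b) Healthcare\n"
    "c) Education\n"
    "d) Technology\n"
    "Answer : d\n"
)

# Python port of the posted pattern: (?<Q>...) becomes (?P<Q>...).
PATTERN = re.compile(r"(?P<Q>\d+\..+)\n[\s\S]+?\n(?P<A>Answer.+)")


def extract_pairs(text):
    """Return one (question, answer) tuple per match, in document order."""
    return [(m.group("Q"), m.group("A")) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for question, answer in extract_pairs(SAMPLE):
        print(question)
        print(answer)
```

Each match carries the question and its answer together, so the two values can be written out as a pair without any extra bookkeeping.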
Hi @Yoichi, I will check this and let you know.
rlgandu
(Rajyalakshmi Gandu)
November 9, 2023, 7:41am
5
@Beginner1234
System.Text.RegularExpressions.Regex.matches(strData,"\d+.[A-Z \d a-z]+?|Answer\s:. ")
Use a For Each activity to loop over the matches, and inside it use an Append Line activity to append each match to the output file.
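The loop-and-append step can be sketched in Python as follows. This is only an illustration of the idea, not the UiPath activities themselves: append_matches, demo, and the file name output.txt are hypothetical names, and the match texts are made-up stand-ins for the values of the match collection.

```python
import os
import tempfile


def append_matches(matches, out_path):
    """Append each match as its own line, like Append Line inside For Each."""
    with open(out_path, "a", encoding="utf-8") as out:
        for text in matches:
            out.write(text + "\n")


def demo():
    # Hypothetical match texts standing in for the regex results.
    matches = ["1. What industry are you in?", "Answer : d"]
    with tempfile.TemporaryDirectory() as folder:
        out_path = os.path.join(folder, "output.txt")
        # Two separate calls show that earlier lines are kept, not overwritten.
        append_matches(matches[:1], out_path)
        append_matches(matches[1:], out_path)
        with open(out_path, encoding="utf-8") as result:
            return result.read()


if __name__ == "__main__":
    print(demo(), end="")
```

Because the file is opened in append mode, running the workflow twice against the same output file would add the lines a second time, so the output file may need to be cleared first.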
lrtetala
(Lakshman Reddy)
November 9, 2023, 7:41am
6
Hi @Beginner1234
Try this
"(?=\d+\.\s+).*|(?=Answer).*"
Hope this helps!!
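As a rough check of the pattern above, here is a small Python sketch. The pattern text is unchanged, since lookaheads are written the same way in .NET and Python; the sample text is hypothetical. Unlike a pattern with Q and A groups, this one returns questions and answers as one flat list of lines.

```python
import re

# Hypothetical sample text; the real attachment is not shown in the thread.
SAMPLE = (
    "1. How did you find out about our product?\n"
    "a) Friend\n"
    "b) Web search\n"
    "Answer : b\n"
    "2. What industry are you in?\n"
    "a) Finance\n"
    "d) Technology\n"
    "Answer : d\n"
)

# Same pattern text as in the post.
LINE_PATTERN = re.compile(r"(?=\d+\.\s+).*|(?=Answer).*")


def extract_lines(text):
    """Return question and answer lines as one flat list, in document order."""
    return LINE_PATTERN.findall(text)


if __name__ == "__main__":
    for line in extract_lines(SAMPLE):
        print(line)
```

One thing to watch: the lookaheads are not anchored to the start of a line, so an option that happens to contain the word Answer, or a number followed by a period and a space, would also be picked up from that position onward.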
Hi @Yoichi, I tried this code on another set of questions, but it gives no output. The questions are below.
Hi @lrtetala, if possible, could you share the code?
Hi @Dinesh_Guptil, I am getting the error below.
Yoichi
(Yoichi)
November 9, 2023, 7:53am
10
Hi,
Can you try the following sample?
mc = System.Text.RegularExpressions.Regex.matches(strData,"(?<Q>\d+\..+)\n[\s\S]+?\n(?<A>.*Answer\s*:.+)")
Sample20231109-5L (2).zip (3.7 KB)
Regards,
Thanks @Yoichi, it worked. One small question: can I use the same regular expression for Word documents as well? I receive both .txt files and Word documents.
@Beginner1234
Use Matches instead of Match.
Yoichi
(Yoichi)
November 9, 2023, 8:03am
13
Hi,
How are you planning to extract text from the Word document? (ReadText activity?)
Line breaks differ between text read from a text file and text read from a docx file. Can you share a sample docx file?
Regards,
@Yoichi, for a Word document we need to use the Word document activities, right?
Sample Word Document.docx (12.5 KB)
This is the sample file
Yoichi
(Yoichi)
November 9, 2023, 8:15am
15
Hi,
Which ReadText activity do you use: System-File-WordDocument-ReadText, or ReadText with WordApplicationScope?
If the former, the above regex will work as it is.
If the latter, the regex pattern needs to be modified because of the line breaks.
Regards,
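To illustrate why line breaks matter here, the following Python sketch ports the earlier pattern that separates lines with a literal \n and runs it on the same hypothetical lines joined with different line breaks. The thread does not state which characters each ReadText activity returns, so the \r-only case is an assumption used purely for illustration.

```python
import re

# Python port of the earlier pattern that requires a literal \n between lines.
LF_PATTERN = re.compile(r"(?P<Q>\d+\..+)\n[\s\S]+?\n(?P<A>.*Answer\s*:.+)")

# Hypothetical sample lines.
LINES = [
    "1. What industry are you in?",
    "a) Finance",
    "b) Technology",
    "Answer : b",
]


def count_matches(line_break):
    """Join the sample lines with the given line break and count matches."""
    text = line_break.join(LINES) + line_break
    return len(LF_PATTERN.findall(text))


lf_count = count_matches("\n")  # lines separated by \n
cr_count = count_matches("\r")  # lines separated by \r only (assumed case)

if __name__ == "__main__":
    print("matches with \\n:", lf_count)
    print("matches with \\r:", cr_count)
```

With \r-only line breaks the literal \n in the pattern never finds anything to match, so the question block is not found at all.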
Sure @Yoichi, thanks. I will try this.
Yoichi
(Yoichi)
November 9, 2023, 8:21am
17
Hi,
FYI, the following pattern will work for text from a text file and from both Word activities.
mc = System.Text.RegularExpressions.Regex.matches(strData,"(?<Q>\d+\..+)[\n\r]+[\s\S]+?[\n\r]+(?<A>.*Answer\s*:.+)")
Sample20231109-5L (3).zip (13.9 KB)
Regards,
@Yoichi, this code works fine with the text file, but for the Word file it writes the content of the input file as it is.
Yoichi
(Yoichi)
November 9, 2023, 8:45am
20
Hi,
Sorry, but the above pattern is not very good.
Can you try the following?
mc = System.Text.RegularExpressions.Regex.matches(strData,"(?<Q>\d+\.[^\n\r]+)[\n\r]+[\s\S]+?[\n\r]+(?<A>[^\n\r]*Answer\s*:[^\n\r]+)")
The sample also shows Q and A separately.
Sample20231109-5L (4).zip (14.0 KB)
Regards,
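A likely reason the earlier pattern misbehaved on Word text: in both .NET and Python, . matches every character except \n, so it also matches \r. If the text uses \r alone as the line break (an assumption; the thread does not say which characters the activity returns), .+ can run across lines and swallow most of the document. The sketch below ports both patterns to Python ((?<Q>...) becomes (?P<Q>...)) and compares them on hypothetical sample lines.

```python
import re

# Python ports of the earlier and the final pattern from the thread.
EARLIER = re.compile(r"(?P<Q>\d+\..+)[\n\r]+[\s\S]+?[\n\r]+(?P<A>.*Answer\s*:.+)")
FINAL = re.compile(
    r"(?P<Q>\d+\.[^\n\r]+)[\n\r]+[\s\S]+?[\n\r]+(?P<A>[^\n\r]*Answer\s*:[^\n\r]+)"
)

# Hypothetical sample lines.
LINES = [
    "1. How did you find out about our product?",
    "a) Friend",
    "b) Web search",
    "Answer : b",
    "2. What industry are you in?",
    "c) Education",
    "d) Technology",
    "Answer : d",
]


def build_text(line_break):
    """Join the sample lines with the given line break, with a trailing one."""
    return line_break.join(LINES) + line_break


def extract_pairs(pattern, text):
    """Return (Q, A) tuples so question and answer can be written separately."""
    return [(m.group("Q"), m.group("A")) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for name, line_break in [("LF", "\n"), ("CRLF", "\r\n"), ("CR", "\r")]:
        print(name, extract_pairs(FINAL, build_text(line_break)))
    print("EARLIER on CR:", len(extract_pairs(EARLIER, build_text("\r"))), "match")
```

Replacing . with [^\n\r] keeps each group on a single line regardless of the line-break style, which is why the final pattern returns clean Q and A values for all three variants.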
Thanks @Yoichi, it’s working perfectly. Thanks for your help.