I have a PDF document, and I need to read the file and extract only the text between double quotes.
It would be great if you could help me.
@Gowtham_Srinivasan - Please share some sample text…
You can use the Read PDF Text activity and then extract the text you are looking for with either Regex or string manipulation.
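To illustrate the string manipulation route outside UiPath, here is a minimal Python sketch. It assumes the text uses straight double quotes and that quotes are not nested; the function name `quoted_by_split` and the sample sentence are made up for the example.

```python
def quoted_by_split(text):
    """Return the pieces of text that sit between pairs of straight double quotes."""
    parts = text.split('"')
    # After splitting on the quote character, every odd-indexed piece was inside quotes.
    quoted = parts[1::2]
    # An even number of pieces means an odd number of quotes: the last opener was never closed.
    if len(parts) % 2 == 0:
        quoted = quoted[:-1]
    return quoted

print(quoted_by_split('He said "hi" and then "bye".'))  # ['hi', 'bye']
```

This only works when the opening and closing quote are the same character. With curly quotes, where the two differ, a regex is usually the simpler option.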
Thank You @prasath17 @NIVED_NAMBIAR for your reply.
Test.pdf (16.8 KB)
You can use this PDF as a test. I need to get the text between the double quotes into Excel.
It would be great if you can help me with this.
Thank you
Cheers
@Gowtham_Srinivasan - Please find some starter help here…
Output of Read PDF Text is StrInput
Build Datatable
Regex Matches
Input = StrInput
Pattern = "“(.*?)”"
Assign Activity
Dt = (From m In IEnRegex.Cast(Of Match)
Select Dt.Rows.Add(m.Groups(1).ToString)).CopyToDataTable
That’s it, write your results to Excel.
Output
Hope this helps…
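For anyone who wants to try the pattern outside UiPath, here is a rough Python sketch of the same idea: match lazily between a left and a right curly quote, then turn each match into a single-column row. The function name `extract_quoted` and the sample sentence are assumptions for illustration, not part of the workflow above.

```python
import re

# Same pattern as in the steps above: lazy match between curly quotes.
QUOTED = re.compile(r"“(.*?)”")

def extract_quoted(text):
    """Return the text inside each pair of curly quotes, in order of appearance."""
    return [m.group(1) for m in QUOTED.finditer(text)]

sample = "He said “hello world” and then “goodbye”."  # made-up sample text
rows = [[value] for value in extract_quoted(sample)]   # one single-column row per match
print(rows)  # [['hello world'], ['goodbye']]
```

The list of rows plays the role of Dt here; writing it to a spreadsheet is left out to keep the sketch short.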
Thank you so much @prasath17
I have a question:
What variable type should IEnRegex be?
Thank you,
Cheers
@Gowtham_Srinivasan - press Ctrl + K to create the variable; it will be created automatically with the right variable type.
It's IEnumerable…
Did you actually create the workflow and test it?
Dt is a DataTable, which is the output of your Build Datatable activity.
@prasath17
I have run into one more problem.
If the text between the double quotes is split across two lines, I do not get it in the output.
The match skips that quoted text entirely and picks up the next quoted text that sits on a single line.
Thank You
Cheers.
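A likely cause, shown here as a Python sketch rather than a tested UiPath fix: by default `.` does not match a newline, so a quote that wraps onto a second line never reaches its closing quote and the regex moves on to the next opener. In Python the `re.DOTALL` flag changes that; the .NET counterpart is `RegexOptions.Singleline` (or an inline `(?s)` at the start of the pattern). Whether and how the Matches activity lets you set that option is an assumption to verify in your own workflow. The sample text is made up.

```python
import re

PATTERN = r"“(.*?)”"

# Made-up sample: the first quote wraps onto a second line, the second does not.
text = "“split over\ntwo lines” and “one line”"

# Default behaviour: "." stops at the newline, so the wrapped quote is skipped.
default_matches = re.findall(PATTERN, text)

# With DOTALL, "." also matches newlines, so both quotes are found.
dotall_matches = re.findall(PATTERN, text, flags=re.DOTALL)

# Collapse the line break (and any extra whitespace) into a single space.
cleaned = [" ".join(m.split()) for m in dotall_matches]

print(default_matches)  # ['one line']
print(cleaned)          # ['split over two lines', 'one line']
```

The whitespace cleanup is optional; it just keeps each extracted sentence on one line before it goes into Excel.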