Read PDF and get only the specific value between double quotes

I have a PDF document, and I need to read the file and extract only the text between double quotes.
It would be great if you could help me.

@Gowtham_Srinivasan - Please share some sample text…

You can use the Read PDF Text activity and then either Regex or a string manipulation method to extract the text you are looking for.
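As a rough illustration of the string manipulation route, here is a minimal Python sketch (outside UiPath) that walks the text with `str.find`. The function name `between_quotes` and the sample text are made up for illustration, and the defaults assume straight double quotes; curly quotes can be passed in as different delimiters.

```python
def between_quotes(text, open_q='"', close_q='"'):
    """Collect every substring enclosed by open_q ... close_q."""
    results = []
    pos = 0
    while True:
        start = text.find(open_q, pos)
        if start == -1:
            break  # no more opening quotes
        end = text.find(close_q, start + len(open_q))
        if end == -1:
            break  # unmatched opening quote: stop scanning
        results.append(text[start + len(open_q):end])
        pos = end + len(close_q)  # continue after the closing quote
    return results


sample = 'Name: "Alice" and city: "Paris"'
print(between_quotes(sample))
```

For curly quotes, the call would look like `between_quotes(text, '“', '”')`.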

Hi @Gowtham_Srinivasan
Can you share a sample PDF so we have more information?

Thank You @prasath17 @NIVED_NAMBIAR for your reply.
Test.pdf (16.8 KB)
You can use this PDF as a test. I need to get the text between the double quotes into Excel.

It would be great if you can help me with this.
Thank you

@Gowtham_Srinivasan - Are the results under column $1 what you are looking for?

Yes, @prasath17
You are right.

@Gowtham_Srinivasan - Please find some starter help here…

Output of Read PDF Text is StrInput

Build Datatable

Regex Matches

Input = StrInput

Pattern = "“(.*?)”"

Assign Activity

 Dt = (From m In IEnRegex.Cast(Of Match)
 Select Dt.Rows.Add(m.Groups(1).ToString)).CopyToDataTable

That’s it, write your results to Excel.
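To show what the pattern above does on its own, here is a minimal Python sketch of the same extraction, run outside UiPath. The function name `extract_quoted` and the sample sentence are made up for illustration; the one-value-per-row list only mimics the single-column DataTable.

```python
import re

# Same pattern as in the steps above: lazily capture whatever sits
# between an opening curly quote and the next closing curly quote.
PATTERN = re.compile(r"“(.*?)”")


def extract_quoted(text):
    """Return the captured group (the text inside the quotes) of every match."""
    return [m.group(1) for m in PATTERN.finditer(text)]


sample = "He said “hello” and then “goodbye”."
# One value per row, similar to adding each match to a one-column table.
rows = [[value] for value in extract_quoted(sample)]
print(rows)
```

Only group 1 is kept, so the quote characters themselves are not part of the result.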


Hope this helps…

Thank you so much @prasath17
I have a question:
what variable type should IEnRegex be?

Thank you,

@Gowtham_Srinivasan - Press Ctrl + K to create a variable; it will be created automatically with the right variable type.

It's IEnumerable…

Did you actually create the workflow and test it?

Thank You @prasath17
Got it

I am learning as I build it, @prasath17,
and I am getting this error

Thank You so Much

Dt is a DataTable, which is the output of your Build Datatable activity.

thank you @prasath17
Got it

@Gowtham_Srinivasan - The Build Datatable activity is missing from your workflow.

It did not show up in the screenshot, @prasath17,
but it was there when I ran the program.

@Gowtham_Srinivasan - It's there in my workflow. Please double check…

5 activities in total.



I got another error while running the program.

@Gowtham_Srinivasan - (57.5 KB)


I have run into one more problem.
If the text between the double quotes is split across two lines, I do not get the output.
The match skips that quoted text entirely and picks up the next quoted text that sits on a single line.

Thank You
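
One likely cause, assuming the regex engine's default behavior, is that `.` does not match a line break, so a quote that wraps onto a second line never reaches its closing quote. A pattern built on a negated character class, `“([^”]*)”`, does not have that limitation. The Python sketch below illustrates the difference; the function name and sample text are made up. In UiPath the same pattern may work as-is in the Matches activity, or the Singleline regex option could be tried so that `.` matches line breaks, but treat both as suggestions to verify against the actual PDF text.

```python
import re

# Original pattern: "." stops at a line break, so wrapped quotes are missed.
SINGLE_LINE_ONLY = re.compile(r"“(.*?)”")
# Alternative: match anything except a closing quote, line breaks included.
ACROSS_LINES = re.compile(r"“([^”]*)”")


def extract_quoted_multiline(text):
    """Return quoted values, joining wrapped lines with a single space."""
    values = []
    for m in ACROSS_LINES.finditer(text):
        # split() with no arguments drops the line break and extra spaces
        values.append(" ".join(m.group(1).split()))
    return values


sample = "First “split across\ntwo lines” then “same line”."
print(SINGLE_LINE_ONLY.findall(sample))
print(extract_quoted_multiline(sample))
```

The first print shows only the single-line quote being found, which matches the behavior described above; the second includes the wrapped quote as well.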