Extract information through a repeated word

Darkbrix · April 9, 2018, 4:14pm

Hello everyone. I have a word that is repeated in a document, and I need to extract the information that goes with each occurrence. How can I identify each one and gather all the information?

21367_140455 (1).pdf (13.7 KB)

Example: in this PDF the word “Lapesa” appears several times. Once the PDF has been read, I need the information that goes with each occurrence written to a file.

Regards.

Jacob_Mintzer · April 9, 2018, 6:40pm

So you want each line that mentions the word Lapesa (or any other string)?
To solve this problem, I would first read the PDF, then split the output string on the newline character, and then check each entry in the resulting array. If the entry contains Lapesa (or whatever identifier you want), you can append it to a text file. Hope that helps!
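
The steps above could be sketched in plain VB.NET roughly like this. It is only a sketch under assumptions: the names pdfText, keyword and outputPath and the sample text are made up for illustration, and in a real workflow pdfText would be whatever string your PDF-reading step returns.

```vb
Imports System
Imports System.IO

Module ExtractMatchingLines
    Sub Main()
        ' Hypothetical stand-in for the text returned by the PDF read step.
        Dim pdfText As String = "Header line" & vbCrLf &
                                "Lapesa 100 units" & vbCrLf &
                                "Other supplier 50 units" & vbCrLf &
                                "Lapesa 200 units"
        Dim keyword As String = "Lapesa"          ' identifier to look for
        Dim outputPath As String = "matches.txt"  ' assumed output file name

        ' Split on both Windows (CR+LF) and Unix (LF) line endings,
        ' dropping empty entries.
        Dim lines As String() = pdfText.Split(New String() {vbCrLf, vbLf},
                                              StringSplitOptions.RemoveEmptyEntries)

        For Each line As String In lines
            ' Case-insensitive check for the keyword anywhere in the line.
            If line.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0 Then
                ' Append the matching line to the output file.
                File.AppendAllText(outputPath, line.Trim() & Environment.NewLine)
            End If
        Next
    End Sub
End Module
```

Note that File.AppendAllText keeps adding to the file on every run, so delete or clear the file first if you want a fresh result each time.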

ClaytonM · April 9, 2018, 7:40pm

You can use a For Each loop: For each line In txt.Split(System.Environment.Newline(0))
Then use Write Text File and Append Text, or concatenate the lines into a string and write it once at the end.
That’s basically what Jacob suggested, which works.

If there are thousands of lines I would suggest LINQ expressions.
For example,

filteredArray = txt.Split(System.Environment.Newline(0)).Where(Function(line) line.Trim.ToUpper.StartsWith("LAPESA") ).ToArray

That would give you an array to process in a “for each” if needed. You can also wrap this array in a String.Join to write it to a file.

Write Text File => String.Join(System.Environment.Newline, filteredArray)
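
Put together outside of Studio, the LINQ filter and the Join might look like the sketch below. The sample text and the output file name filtered.txt are assumptions for illustration only. The split here uses both CR+LF and LF as separators, which is a small variation on the expression above rather than a copy of it.

```vb
Imports System
Imports System.IO
Imports System.Linq

Module FilterWithLinq
    Sub Main()
        ' Hypothetical sample text standing in for the PDF contents.
        Dim txt As String = "Lapesa 100 units" & vbCrLf &
                            "Other supplier 50 units" & vbCrLf &
                            "  lapesa 200 units"

        ' Keep only the lines that start with the word, ignoring
        ' leading whitespace and letter case.
        Dim filteredArray As String() =
            txt.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries).
                Where(Function(line) line.Trim().ToUpper().StartsWith("LAPESA")).
                ToArray()

        ' Join the matches with line breaks and write them in one call.
        File.WriteAllText("filtered.txt", String.Join(Environment.NewLine, filteredArray))
    End Sub
End Module
```

StartsWith only keeps lines that begin with the word. If the word can appear anywhere in the line, swapping it for Contains("LAPESA") should do the job.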

Regards.
