Thanks, my friend. I tried this already, but it only gives me the lines starting with CHAIN:. I am trying to segment the groups that sit between each CHAIN: keyword into separate blocks of text. I appreciate the idea though.
If I split on CHAIN:, it gives me what comes after “CHAIN”, but only on the same line. It does not give me the next rows of data. What I am attempting now is to count how many times I find the keyword, put that into an array of Int, and then loop through that array to split the text into groups. Since I do not know how many times the keyword will appear in any given document, I created a counter starting at 1, and in a do-while loop I split the text, look for the data I need, and increment the counter until it reaches the number of times the keyword was found. It is just really code heavy, which I do not like. If anyone has any other ideas, hit me up.
The PDF is highly structured, and I had to read it with Tesseract OCR to get it into one string. There are potentially 200+ lines that I ingest into the code, and I do not want to write an append for each line. I am looking to group the data between occurrences of the keyword CHAIN:, which seems to mark a break between data sets. Hope that helps with understanding.
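
For anyone following along, here is a rough sketch of the grouping I am after, written in Python only for illustration since it is not necessarily what I am using. The idea is to split the whole OCR string on the keyword once, instead of going line by line or counting occurrences first. The function name and the sample text are made up for this example, and it assumes the keyword always appears exactly as CHAIN: in the OCR output.

```python
def split_into_chain_blocks(text, keyword="CHAIN:"):
    """Group the lines that follow each keyword occurrence into one block.

    Hypothetical helper: assumes the OCR output is a single string and
    that the keyword is spelled exactly the same way every time.
    """
    # Splitting the full string (not individual lines) keeps the rows
    # that follow each keyword. Chunk 0 is whatever precedes the first
    # keyword, so it is dropped.
    chunks = text.split(keyword)[1:]
    blocks = []
    for chunk in chunks:
        # Keep non-empty lines only; the first line is the text that
        # shared a line with the keyword.
        lines = [line.strip() for line in chunk.splitlines() if line.strip()]
        blocks.append(lines)
    return blocks


# Made-up sample standing in for the OCR string.
sample = "REPORT HEADER\nCHAIN: A\nrow 1\nrow 2\nCHAIN: B\nrow 3\n"
blocks = split_into_chain_blocks(sample)
for block in blocks:
    print(block)
```

The number of blocks comes out of the split itself, so no separate counter or do-while loop is needed. If the OCR sometimes misreads the keyword, a regular expression split would probably be a better fit than an exact string match.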