Extract Header from Word.
My Word document has headers, bullet points and sub-bullet points. I need to extract the headers and bullet points.
Please provide a sample input and the expected output.
Cheers
Attached is the input. I have highlighted the output lines.
So the output should look like:
X.Y. SPRINT
[EXTRACT]nxnxnxnxnxn
[EXTRACT]nxnxnxnxnxn
X.Y. SPRINT
[EXTRACT]nxnxnxnxnxn
[EXTRACT]nxnxnxnxnxn
If so, then just extend the “If” statement like this:
If paraText.ToLower().Contains("[extract]") OrElse paraText.ToLower().Contains("sprint") Then
Cheers
That condition gives me every header, even when no matching bullet point exists under it. I need to extract a header only when there is a bullet point containing the word “EXTRACT” under that header section.
So try something like this:
If paraText.ToLower().Contains("sprint") Then
    Dim header As Microsoft.Office.Interop.Word.Paragraph = newDoc.Content.Paragraphs.Add()
    header.Range.FormattedText = para.Range.FormattedText
End If
If paraText.ToLower().Contains("[extract]") Then
    Dim bullet As Microsoft.Office.Interop.Word.Paragraph = newDoc.Content.Paragraphs.Add()
    bullet.Range.FormattedText = para.Range.FormattedText
    If Not IsNothing(header) Then
        header.Range.InsertParagraphAfter()
        header = Nothing
    End If
    bullet.Range.InsertParagraphAfter()
End If
Note: I am writing this without a syntax check, so it may contain errors.
It says “header” is not declared. I think the error is thrown at the lines below. I think header is not declared at this level, right?
header.Range.InsertParagraphAfter()
header = nothing
Thanks
Monika
Sure. There might be some syntax errors or wrong variable scoping.
But the basic logic is to remember each “header” as it is found, add it to the new document only when a matching “bullet” follows it, and clear the remembered “header” once it has been added.
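The logic above can be sketched without Word at all, which makes it easier to check before wiring it to the interop objects. This is only a minimal sketch: the module name `PendingHeaderSketch`, the function `ExtractLines`, the variable `pendingHeader` and the sample strings are made up for illustration, and plain strings stand in for Word paragraphs.

```vb
Imports System
Imports System.Collections.Generic

Module PendingHeaderSketch
    ' Hypothetical helper: returns each header followed by its [EXTRACT] bullets.
    ' A header is emitted only if at least one [EXTRACT] bullet comes after it.
    Function ExtractLines(paragraphs As IEnumerable(Of String)) As List(Of String)
        Dim result As New List(Of String)
        Dim pendingHeader As String = Nothing   ' declared outside the loop so it survives between paragraphs

        For Each paraText As String In paragraphs
            Dim lower As String = paraText.ToLower()
            If lower.Contains("sprint") Then
                ' Remember the header, but do not output it yet.
                pendingHeader = paraText
            End If
            If lower.Contains("[extract]") Then
                If pendingHeader IsNot Nothing Then
                    ' First matching bullet under this header: output the header once.
                    result.Add(pendingHeader)
                    pendingHeader = Nothing
                End If
                result.Add(paraText)
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up sample data, not taken from the attached document.
        Dim sample As String() = {
            "1.1. SPRINT", "[EXTRACT] first item", "other bullet",
            "1.2. SPRINT", "no match here",
            "1.3. SPRINT", "[EXTRACT] second item"}
        For Each item As String In ExtractLines(sample)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

With this sample, the "1.2. SPRINT" header is skipped because no [EXTRACT] bullet follows it. Keeping the remembered header in a variable declared outside the loop is also what avoids the "not declared" scoping error.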
Cheers
This may work:
' Requires Imports System.IO (for File.Exists) and a reference to Microsoft.Office.Interop.Word.
' inputPath and outputPath are assumed to be String variables defined elsewhere.
Dim inputObj As Object = CType(inputPath, Object)
Dim outputObj As Object = CType(outputPath, Object)
' Declared outside the loop so the last header seen survives between paragraphs.
Dim headerText As Microsoft.Office.Interop.Word.Range = Nothing
Dim header As Microsoft.Office.Interop.Word.Paragraph
Dim bullet As Microsoft.Office.Interop.Word.Paragraph
If File.Exists(inputPath) Then
    Console.WriteLine("Processing: " & inputPath)
    Dim wordApp As New Microsoft.Office.Interop.Word.Application
    wordApp.Visible = True
    Dim doc As Microsoft.Office.Interop.Word.Document = wordApp.Documents.Open(FileName:=inputObj, ReadOnly:=True)
    Dim newDoc As Microsoft.Office.Interop.Word.Document = wordApp.Documents.Add()
    For Each para As Microsoft.Office.Interop.Word.Paragraph In doc.Paragraphs
        Dim paraText As String = para.Range.Text.Trim()
        ' Remember the header, but do not copy it yet.
        If paraText.ToLower().Contains("sprint") Then
            headerText = para.Range.FormattedText
        End If
        If paraText.ToLower().Contains("[extract]") Then
            ' Copy the remembered header once, before its first matching bullet.
            If Not IsNothing(headerText) Then
                Console.WriteLine(headerText.Text)
                header = newDoc.Content.Paragraphs.Add()
                header.Range.FormattedText = headerText
                header.Range.InsertParagraphAfter()
                headerText = Nothing
            End If
            Console.WriteLine(para.Range.Text)
            bullet = newDoc.Content.Paragraphs.Add()
            bullet.Range.FormattedText = para.Range.FormattedText
            bullet.Range.InsertParagraphAfter()
        End If
    Next
    ' Save the extracted content, then close both documents and quit Word.
    newDoc.SaveAs2(FileName:=outputObj)
    newDoc.Close()
    doc.Close()
    wordApp.Quit()
End If
This is not working for me. Nothing is written to the output file in the first place.
Hi all,
Can someone help here?
I tested the above code with the attached sample file and it works fine.
doc6.doc (32 KB)
That’s working for me too. Sorry, my bad: I didn’t realize you hadn’t written the code to save the document. I added it and it worked. Thank you so much for your help.
Right, sorry about that. I have updated the code in case someone else wants to try it.
Cheers