Hello,
I need to extract only the bold characters from a Word document. Could anyone help?
Thanks in advance.
-Shriharsha H N
Hi
Use a Read Text File activity: pass the path of the text file as input and store the output in a String variable named str_input.
Next, use a Matches activity: pass the string variable as input and set the expression to "\[B\][a-zA-Z0-9._/ ]+\[/B\]"
Store the output in a variable of type System.Collections.Generic.IEnumerable(System.Text.RegularExpressions.Match).
Then use a For Each activity and pass that match variable as input.
Change the TypeArgument to System.Text.RegularExpressions.Match.
Inside the loop, use a Write Line activity with the expression
item.ToString
which will display each bold word on its own.
Cheers @Shriharsha_H_N
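For readers who want to see what the Matches step above would actually pick up, here is a minimal sketch in Python rather than UiPath. It assumes the intended pattern looks for literal [B] and [/B] markers in the text; the function name and the sample strings are made up for illustration. Plain text read from a file only matches if something has already written those markers into it.

```python
import re

# Assumption: bold words are wrapped in literal [B]...[/B] markers.
# Text read from a plain .txt file will not contain such markers on its own.
BOLD_MARKER = re.compile(r"\[B\]([a-zA-Z0-9._/ ]+)\[/B\]")


def marked_bold_words(text: str) -> list[str]:
    """Return the text found between literal [B] and [/B] markers."""
    return [m.group(1) for m in BOLD_MARKER.finditer(text)]


# Hypothetical sample strings, not taken from the attached document.
with_markers = "Q1: [B]Paris[/B] or London? Q2: Rome or [B]Madrid[/B]?"
without_markers = "Q1: Paris or London?"

print(marked_bold_words(with_markers))     # both marked words are found
print(marked_bold_words(without_markers))  # nothing to match without markers
```

The second call returns an empty list, which is the situation you are in when the input is plain text with no formatting information left in it.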
It's not working @Palaniyappan
May I know what error you were facing?
Cheers @Shriharsha_H_N
How will it identify bold text here? Where are you checking for it?
This won’t
Kindly check the updated Regex expression
@lakshman
@Palaniyappan: The new regex is also not working. Please find the screenshot.
Step 1: Save the Word document in .htm format.
Use a Read Text File activity to read the .htm file into a string variable.
Step 2: Use the regex call below to get the required values:
System.Text.RegularExpressions.Regex.Matches(htmOutputVariable, "(?<=<b>).*?(?=</b>)")
Then loop over the matches in a For Each.
Check and let me know if you are still facing any issues.
Thanks
Amar.
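As a rough illustration of the .htm route, here is a sketch in Python (not UiPath) that collects the text inside bold tags with the standard-library HTML parser instead of a regex. The reason for the swap is an assumption about Word's HTML export: bold tags may carry attributes, for example <b style='...'>, which a lookbehind for exactly <b> would miss. The class name, function name, and sample markup are invented for this example, and bold applied only through CSS (font-weight) is not handled.

```python
from html.parser import HTMLParser


class BoldTextCollector(HTMLParser):
    """Collect text that appears inside <b> or <strong> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # number of bold elements currently open
        self.parts = []       # text fragments of the bold run being read
        self.bold_runs = []   # completed bold runs, in document order

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attributes on the tag are ignored.
        if tag in ("b", "strong"):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("b", "strong") and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                text = "".join(self.parts).strip()
                if text:
                    self.bold_runs.append(text)
                self.parts = []

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def extract_bold(html_text):
    """Return the list of bold text runs found in html_text."""
    collector = BoldTextCollector()
    collector.feed(html_text)
    collector.close()
    return collector.bold_runs


# Hypothetical markup; real Word exports are far more verbose.
sample = (
    "<p>1. <b style='mso-bidi-font-weight:normal'>Paris</b> 2. London</p>"
    "<p><b>Rome</b> or <b>Madrid</b></p>"
)
print(extract_bold(sample))
```

Each bold run is returned separately, so two bold words on the same line do not get merged into one match, which is what a greedy .* between <b> and </b> would do.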
Could you share your converted .htm files? Then I will share the code for that.
I have uploaded a sample doc file. In it, I need only the bold characters, which mark the correct answers.
Sample.zip (14.3 KB)
Thanks in advance
Do you have bold letters in the text file? Read Text File might not preserve the formatting; try writing custom code. I'm not sure, because I have never worked on this use case before; I'm replying here because I was tagged.
Regards
Hello @Shriharsha_H_N
You cannot do this with regex after extracting the text with the Read Text File activity, because the extracted text no longer carries any formatting.
It is also difficult if you convert the document to an .html or .htm file: a complex Word document produces complex markup, and extracting elements from it would be a little difficult.
So you have two choices: either create a macro that collects all the bold text, or use a combination of hotkeys and clicks.
I have done it using the second method; check this workflow for a better understanding.
(It will get all the bold text.)
Bold_Letters.xaml (14.8 KB)
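A third route, not used in the workflow above, is to read the formatting straight from the file. This is only a sketch in Python and it assumes the document is saved as .docx (not the older .doc format): a .docx is a zip archive whose main text lives in word/document.xml, and a run is marked bold by a w:b element in its run properties. The function names and the tiny in-memory test document are made up for illustration. Bold inherited from a style, and text in headers or footers, are not covered.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def is_bold(run):
    """True if the run's own properties turn bold on (style inheritance is ignored)."""
    b = run.find("w:rPr/w:b", NS)
    if b is None:
        return False
    # <w:b/> means bold; an explicit w:val of 0/false/off switches it off.
    return b.get(f"{{{W}}}val", "true") not in ("0", "false", "off")


def bold_runs(docx_file):
    """Return the text of every bold run in the main body of a .docx file."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    result = []
    for run in root.iter(f"{{{W}}}r"):
        if is_bold(run):
            text = "".join(t.text or "" for t in run.findall("w:t", NS))
            if text:
                result.append(text)
    return result


# Hypothetical stand-in for a real file: a zip holding only word/document.xml.
# Word itself would not open this, but it is enough to exercise bold_runs.
xml = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:t>A) London </w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>B) Paris</w:t></w:r>'
    '<w:r><w:rPr><w:b w:val="0"/></w:rPr><w:t> C) Rome</w:t></w:r>'
    '</w:p></w:body></w:document>'
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", xml)
buf.seek(0)

print(bold_runs(buf))
```

With a real file you would pass its path, for example bold_runs("Sample.docx"), where the file name is again only a placeholder.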
@Raghavendraprasad : Thanks for the reply.
In a text file I am unable to bold characters in the middle of a line, but in a Word file I can. For that reason I need to extract only the bold characters from the doc file.
Use VBA code:
Open the VBA editor with Alt+F11.
Choose Insert > Module.
Paste the following:
Sub CopyBoldText()
    Dim doc As Document
    Dim rng As Range
    Dim boldText As String
    Dim tempDoc As Document
    Dim char As Range

    Set doc = ActiveDocument
    boldText = ""

    ' Loop through each story range (main text, headers, footers, etc.)
    For Each rng In doc.StoryRanges
        Do
            ' Check each character in the range
            For Each char In rng.Characters
                If char.Font.Bold = True Then
                    boldText = boldText & char.Text
                End If
            Next char
            ' Move to the next linked story of the same type, if any
            Set rng = rng.NextStoryRange
        Loop Until rng Is Nothing
    Next

    ' Check if there is any bold text to copy
    If Len(boldText) > 0 Then
        ' Copy bold text to the clipboard via a temporary document
        Set tempDoc = Documents.Add
        tempDoc.Content.Text = boldText
        tempDoc.Content.Copy
        tempDoc.Close SaveChanges:=False
        MsgBox "Bold text copied to clipboard!"
    Else
        MsgBox "No bold text found in the document."
    End If
End Sub
Press F5 to run the macro.
Then paste the copied text into a Word document.