How to preserve format while reading text from Word
I am using “Word Application Scope” and then “Read Text.” How can I keep the format while performing this?
Hey @Krithi1
You can try saving the Word document as a PDF using the Save Document as PDF activity within the Word Application Scope.
Once the document is converted, use the Read PDF Text activity from the UiPath.PDF.Activities package to extract the text.
Make sure to set the PreserveFormatting property to True in the “Read PDF Text” activity, as this will help retain the original formatting of the document during extraction.
So, there is a step after reading the PDF text: I need to store the text in a variable and send it in an email.
I still don’t see the formatting in the email.
@Krithi1
If that method doesn’t work, you can try saving the Word document as HTML. You can do this by using the user interface (in Microsoft Word) or by using C# code in the Invoke Code activity in UiPath.
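// inputPath and outputPath are expected to be string arguments of the Invoke Code activity.
var wordApp = new Application();
wordApp.Visible = false; // keep Word in the background
var wordDoc = wordApp.Documents.Open(inputPath);
// Save a copy as HTML so the formatting is kept as markup.
wordDoc.SaveAs2(outputPath, WdSaveFormat.wdFormatHTML);
wordDoc.Close();
Marshal.ReleaseComObject(wordDoc);
wordApp.Quit();
// Release the COM objects so no WINWORD.EXE process is left running.
Marshal.ReleaseComObject(wordApp);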
After saving the document as HTML, use the Read Text File activity to read the HTML file and assign it to a variable, for example, htmlContent. Then, use the Send Outlook Mail Message activity, setting the Body property to htmlContent and IsBodyHtml to True. This will ensure that the email content retains the formatting from the original Word document, preserving the HTML styling.
Don’t forget to import these namespaces:
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;
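For reference, the read-and-send step described above can also be sketched in plain C#. This is only an illustration under a few assumptions: it uses System.Net.Mail instead of the Send Outlook Mail Message activity, and the class name HtmlMailSketch, the method BuildHtmlMail, and its parameters are hypothetical names made up for this example. It only builds the message; sending it is left out.

```csharp
using System.IO;
using System.Net.Mail;

public static class HtmlMailSketch
{
    // Hypothetical helper: loads the HTML exported from Word and wraps it in a mail message.
    public static MailMessage BuildHtmlMail(string htmlPath, string from, string to, string subject)
    {
        // Read the whole file as one string; the markup itself carries the formatting.
        string htmlContent = File.ReadAllText(htmlPath);

        var message = new MailMessage(from, to)
        {
            Subject = subject,
            Body = htmlContent,
            // Tells the mail client to render the body as HTML instead of showing raw tags.
            IsBodyHtml = true
        };
        return message;
    }
}
```

The point is the same as in the activities: the HTML file is read as ordinary text, and the formatting only shows up once the body is flagged as HTML. If special characters come out wrong, the exported file may not be UTF-8, in which case passing an explicit encoding to File.ReadAllText may help.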
I hope it’ll help.
There is an activity called “Save Document As” which saves the Word document as HTML, preserving the format.
Now I need to read the text of that HTML document and send it in the email body. My question is: how can I read the HTML document while preserving the format?