Extract multiple questions and answers from a Word document

I need help with a scenario where I need to extract questions (possibly spanning multiple paragraphs) and answers (also multiple paragraphs) from a Word document. Each document contains more than one question and answer, and I need to extract each Q&A.
The next challenge is that the format of the Q&A differs between documents.
Sometimes the document has Comments & Responses. What is the best way to do this?
Is this possible using Document Understanding?

@Aswini_Sundar_Rajan

  1. Read Document: Use the Read Text activity from UiPath's Word activities package to extract text from the Word documents (Read PDF Text applies only if the files are PDFs).
  2. Identify Keywords or Patterns: Apply text analysis techniques to identify keywords or patterns that indicate the start of a question or answer. Use regular expressions or string manipulation to detect different structures.
  3. Handle Different Formats: Create logic to handle variations in document formats. This may include differentiating between paragraphs, headings, or any other indicators that mark the beginning of a question or answer.
  4. Extract Comments and Responses (if applicable): If your documents have comments and responses, develop logic to identify and extract this information. Use regular expressions or custom logic to capture comments and responses associated with questions and answers.
  5. Store Extracted Data: Store the extracted questions and answers in a structured format, such as a DataTable, for further processing.
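
As a rough illustration of steps 2 to 5, here is a minimal Python sketch that works on text already read from the document. The marker spellings (`Q1:`, `Question:`, `A1:`, `Answer:`, `Comment 2:`, `Response 2:`) and the function name `extract_pairs` are assumptions for the example, not something taken from your documents, so adjust them to the formats you actually see. Inside UiPath the same logic would be built with Assign and regex activities rather than Python.

```python
import re

# Hypothetical marker spellings; extend these for each document format you meet.
QUESTION_MARKER = re.compile(
    r"^\s*(?:Q(?:uestion)?\s*\d*|Comment\s*\d*)\s*[:.)-]\s*", re.IGNORECASE
)
ANSWER_MARKER = re.compile(
    r"^\s*(?:A(?:nswer)?\s*\d*|Response\s*\d*)\s*[:.)-]\s*", re.IGNORECASE
)


def extract_pairs(text):
    """Return a list of {"question": ..., "answer": ...} dicts from plain text."""
    pairs = []
    current = None  # the pair currently being filled
    field = None    # which part of the pair continuation lines belong to
    for line in text.splitlines():
        q = QUESTION_MARKER.match(line)
        a = ANSWER_MARKER.match(line)
        if q:
            # A question (or comment) marker starts a new pair.
            current = {"question": [line[q.end():]], "answer": []}
            pairs.append(current)
            field = "question"
        elif a and current is not None:
            # An answer (or response) marker switches to the answer part.
            current["answer"].append(line[a.end():])
            field = "answer"
        elif current is not None:
            # Any other line continues the current part (multi-paragraph text).
            current[field].append(line)
    return [
        {
            "question": "\n".join(p["question"]).strip(),
            "answer": "\n".join(p["answer"]).strip(),
        }
        for p in pairs
    ]
```

Each returned dict corresponds to one row you could add to a DataTable with Question and Answer columns. Text that appears before the first marker is ignored in this sketch.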

Hi @Aswini_Sundar_Rajan

You can use the Word activities to read the document and store its text in a String variable. After that, you can use regular expressions to extract the questions and answers, but the text must contain keywords that the regular expressions can use as reference points.
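
To show what keyword-anchored extraction can look like, here is a small Python sketch. It assumes every question starts with the literal keyword `Question:` and every answer with `Answer:`; those keywords, the pattern, and the name `find_qa` are illustrative only. In a UiPath workflow the regex engine is .NET, so the named-group syntax would need adapting and `RegexOptions.Singleline` would take the place of `re.DOTALL`.

```python
import re

# Assumed keywords: "Question:" opens a question, "Answer:" opens its answer.
# re.DOTALL lets "." match newlines so each part may span several paragraphs.
QA_PATTERN = re.compile(
    r"Question:\s*(?P<question>.*?)\s*"
    r"Answer:\s*(?P<answer>.*?)\s*"
    r"(?=Question:|\Z)",  # stop at the next question or at the end of the text
    re.DOTALL,
)


def find_qa(text):
    """Return (question, answer) tuples for every keyword-delimited pair."""
    return [
        (m.group("question"), m.group("answer"))
        for m in QA_PATTERN.finditer(text)
    ]
```

If a document uses other keywords, such as Comment and Response, a second pattern of the same shape would be needed for that format.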

Hope it helps!!