Partly complete automation: extracting PDF comments and taking screenshots of highlighted text

Hi, could I have some help extracting PDF comments and taking a screenshot of the highlighted text, please? I’m trying to get a PDF comment extractor working for my boyfriend.

I only want to extract the comments from the PDF, not the highlighted text itself.
For the highlighted text, a simple screenshot of each highlighted passage is enough.

I have included some images of what the output CSV file should look like, and I have uploaded an example PDF file full of comments and highlights to test with.
I tried to produce this output myself but could not get it to work.
PDF Extractor.xlsx (584.2 KB)
Extract [CCNY] Math Review for Stanley Ocken.pdf (1.7 MB)

I will be so grateful if someone helps me with a solution to this one.

Hi @Sarahorner

Can you try this:

  1. Install the “UiPath.PDF.Activities” package in your UiPath project if you haven’t already. You can install it from the Manage Packages option in UiPath Studio.
  2. Use the “Read PDF Text” activity to extract the text content from the PDF file.
  3. If the PDF contains scanned images or text that is not selectable, use the “Read PDF with OCR” activity instead.
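  4. Use regular expressions or string manipulation to pull the comments out of the extracted text. The “Matches” activity takes a regex pattern and returns every match, so it works if the comments follow a recognisable pattern or format. Check the extracted text first, because comments stored as PDF annotations may not appear in the page text at all.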
  5. With the PDF open in a viewer, use the “Find Image” activity to locate the highlighted text on screen, then use the “Take Screenshot” activity to capture that specific area.
  6. Save the results in the desired output format. A CSV file cannot hold an image, so save each screenshot as an image file and use the “Write CSV” activity to write the comments, together with the path of the matching screenshot, to the CSV file.
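
If it helps to see the regex and CSV steps (4 and 6) outside of Studio, here is a rough Python sketch of the same idea. It is only an illustration, not UiPath code: the `Comment (page N): ...` line format, the sample text, and the function names `extract_comments` and `comments_to_csv` are all made up for the example, so the pattern would need adjusting to whatever the extracted text really looks like.

```python
import csv
import io
import re

# Hypothetical format: one comment per line, e.g. "Comment (page 3): Check this step".
# Real extracted text will look different, so adjust the pattern to match it.
COMMENT_PATTERN = re.compile(r"^Comment \(page (\d+)\):\s*(.+)$", re.MULTILINE)


def extract_comments(text):
    """Return a list of (page_number, comment_text) tuples found in text."""
    return [(int(page), body.strip()) for page, body in COMMENT_PATTERN.findall(text)]


def comments_to_csv(comments):
    """Serialise the comments to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Page", "Comment"])
    writer.writerows(comments)
    return buffer.getvalue()


# Made-up sample standing in for the output of a text extraction step.
sample_text = (
    "Some body text from the PDF.\n"
    "Comment (page 1): Define the variable first\n"
    "More body text.\n"
    "Comment (page 4): Typo, should be x squared\n"
)

comments = extract_comments(sample_text)
csv_text = comments_to_csv(comments)
print(csv_text)
```

In a workflow, the pattern would go into the “Matches” activity and the rows into a DataTable for “Write CSV”. The `csv` writer quotes any comment that contains a comma, which is worth checking for in the UiPath output as well.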

Thanks!!