I have a PDF with multiple checkbox options, one of which is checked. I need to find the option that is checked and write it to Excel. I am new to PDF automation, so any help will be highly appreciated.
Hi @Ayesha_Ijaz
Read the PDF Document
- Use the Read PDF Text Activity:
- Drag and drop the Read PDF Text activity into your workflow.
- Set the FileName property to the path of your PDF.
- The activity will output the text content of the PDF.
- Extract Text Data:
- Store the output in a variable, for example, pdfText.
Process the Extracted Text
- Use the Assign Activity:
- Create an Assign activity to process and filter the pdfText variable.
- Define a regex pattern to identify checked options. For instance, if the checked options are indicated by a specific string like “✓” or “Checked”, use a regex pattern like “Checked”. Adjust based on how the checkboxes are represented.
checkedOptions = System.Text.RegularExpressions.Regex.Matches(pdfText, "CheckedPattern").Cast(Of System.Text.RegularExpressions.Match).Select(Function(m) m.Value).ToList()
Replace "CheckedPattern" with your actual pattern.
- Checkbox is Checked:
- Text: “Checkbox: Checked”
- Regex Pattern: Checkbox:\s*Checked
- Checkbox is Unchecked:
- Text: “Checkbox: Unchecked”
- Regex Pattern: Checkbox:\s*Unchecked
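To show how such a pattern behaves, here is a minimal sketch in Python (used only to illustrate the regex, not UiPath itself). It assumes the extracted text lists each option as "Label: Checked" or "Label: Unchecked"; the sample text, the option labels, and the function name find_checked_options are hypothetical. Your PDF may represent checkboxes differently, or not expose their state in the text at all, so check the real output of Read PDF Text first.

```python
import re

# Hypothetical text, shaped the way PDF text extraction might return it.
sample_text = """Payment method
Cash: Unchecked
Card: Checked
Cheque: Unchecked
"""

# Capture the label in front of ": Checked". "Unchecked" is not matched,
# because the literal "Checked" must directly follow the colon and spaces.
CHECKED_PATTERN = re.compile(
    r"^(?P<option>[^:\n]+):\s*Checked\s*$", re.MULTILINE
)


def find_checked_options(text):
    """Return the labels of all options marked as checked."""
    return [m.group("option").strip() for m in CHECKED_PATTERN.finditer(text)]


checked = find_checked_options(sample_text)
print(checked)  # ['Card']
```

The same idea should carry over to the Regex.Matches expression above, with the note that .NET writes a named group as (?<option>...) rather than Python's (?P<option>...).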
Thanks,
Thanks, Aditya, for your help. I will try the steps and confirm.
Thanks
Ayesha Ijaz
Hi @Ayesha_Ijaz
Is your problem solved?
Here is my situation: I have almost got it working for one file, but I need to implement this for multiple files. I tried some of the instructions but had no luck, and I am not sure whether I am heading in the right direction.
On page 9 there is a group of checkboxes, and only one of them will be checked. The checked option is the information that needs to be extracted.
I am available on Google Meet for a quick call: https://meet.google.com/mwj-hixm-wzo
I have the session open, so you may connect and guide me.