How to capture a particular checkbox from PDF


#1

How do I capture the checkbox value from a PDF?


#2

Is it a readable (text-based) PDF or a scanned image?

Have you tried using find relative image (or maybe Image Exists)?


#3

I have not tried find relative object, on the assumption that it would not be able to identify which box is ticked. What I did instead was capture the complete text and split it by “Tick” to get the actual value.
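For reference, the split idea can be sketched in Python roughly as follows. This is only an illustration, not the actual workflow: the function name `value_after_marker` and the sample strings are hypothetical, and it assumes the captured text contains the literal word “Tick” immediately before the selected label. If the capture step never emits that word, the function returns `None`.

```python
from typing import Optional


def value_after_marker(text: str, marker: str = "Tick") -> Optional[str]:
    """Return the first word that follows `marker`, or None if the marker is absent."""
    # partition() splits on the first occurrence only; `found` is empty if there is none.
    _before, found, after = text.partition(marker)
    if not found:
        return None
    words = after.split()
    return words[0] if words else None


# Hypothetical captured text; the real Get Text output may look different.
print(value_after_marker("OptionA Tick Conventional OptionB"))
print(value_after_marker("OptionA Conventional OptionB"))
```

The second call shows the failure case: when the tick is not part of the captured text, there is nothing to split on.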


#4

So it is a readable PDF. Glad it worked.


#5

The split approach did not work. It did not capture the “tick”. Please help!


#6

Can you share your code? How exactly are you capturing the data? With the PDF open, or by reading the data into a data table?

Did you try capturing the image in the way shown below (Image Exists)? That way the robot will know which item's check box is checked.


#7

I have the PDF open. For the above image, I used Get Text. The text captured is as displayed.

I was hoping it would capture the “Tick” against “Conventional”.


#8

If you use Get Text, open the selector in UiExplorer: for the check box you should see the “text” attribute. I think you can use that value (checked or unchecked).


#9

It does not identify the check boxes.


#10

Scraping is also not able to capture the “tick”.


#12

It worked using Google OCR. Thanks.


#13

I am not able to read the checkbox value in my PDF, Test.pdf (7.0 KB), with Google OCR. If you are able to, please attach a sample project that works with my PDF (Test.pdf).

Thanks in advance.


#14

One approach is to take two reference images: one of the text with the check box checked, and one of the text with the check box unchecked. During execution, you can load the image from the document and see which reference it matches. Based on that, you can figure out whether the box is checked or not.
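Outside of the automation tool, the same idea can be sketched in plain Python. This is a minimal illustration under stated assumptions: images are represented as nested lists of grayscale values (0 = black, 255 = white), the crop and both references have the same size, and the names `mean_abs_diff` and `classify_checkbox` as well as the toy 5x5 images are hypothetical. A real scan would first need to be cropped and scaled to the reference size.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixels, 0 = black, 255 = white


def mean_abs_diff(a: Image, b: Image) -> float:
    """Average per-pixel absolute difference between two same-sized images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


def classify_checkbox(crop: Image, checked_ref: Image, unchecked_ref: Image) -> str:
    """Label the crop by whichever reference image it differs from least."""
    d_checked = mean_abs_diff(crop, checked_ref)
    d_unchecked = mean_abs_diff(crop, unchecked_ref)
    return "checked" if d_checked < d_unchecked else "unchecked"


# Toy references: a box outline, empty or with an X inside.
W, B = 255, 0
UNCHECKED = [
    [B, B, B, B, B],
    [B, W, W, W, B],
    [B, W, W, W, B],
    [B, W, W, W, B],
    [B, B, B, B, B],
]
CHECKED = [
    [B, B, B, B, B],
    [B, B, W, B, B],
    [B, W, B, W, B],
    [B, B, W, B, B],
    [B, B, B, B, B],
]

# Simulate a slightly washed-out scan of a checked box.
noisy_scan = [[min(255, p + 30) for p in row] for row in CHECKED]
print(classify_checkbox(noisy_scan, CHECKED, UNCHECKED))
```

Comparing against both references, rather than testing for an exact match with one, makes the decision more tolerant of scan noise.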


#15

The main problem is that the text value returned for a checked or unchecked checkbox is not the same in all scenarios, because it is a scanned PDF. Sometimes it gives a value of EI, E, CI, or even null in a few cases.

I am a beginner and not sure whether there is a solid solution to this. Is there a proper way to handle it? I am trying out a similar use case: reading through multiple scanned PDFs where the text adjacent to the checked checkbox has to be captured.
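One idea I am considering, since the OCR text for the box itself is unreliable, is to ignore that text and instead measure how much "ink" sits inside each box. Below is a rough Python sketch of that idea, not a tested solution: it assumes each checkbox has already been cropped into a grid of grayscale values (0 = black, 255 = white), and the function names, the `margin`, `dark_below`, and `min_ratio` values, and the second label are all hypothetical and would need tuning on real scans.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of grayscale pixels, 0 = black, 255 = white


def ink_ratio(crop: Image, margin: int = 1, dark_below: int = 128) -> float:
    """Fraction of dark pixels inside the crop, skipping a border of `margin` pixels."""
    inner = [row[margin:len(row) - margin] for row in crop[margin:len(crop) - margin]]
    pixels = [p for row in inner for p in row]
    if not pixels:
        raise ValueError("crop is too small for the given margin")
    return sum(1 for p in pixels if p < dark_below) / len(pixels)


def is_checked(crop: Image, min_ratio: float = 0.2) -> bool:
    """Treat the box as checked when enough of its interior is dark."""
    return ink_ratio(crop) >= min_ratio


def checked_labels(crops: Dict[str, Image]) -> List[str]:
    """Return the labels whose checkbox crop looks checked."""
    return [label for label, crop in crops.items() if is_checked(crop)]


W, B = 255, 0
EMPTY_BOX = [
    [B, B, B, B, B],
    [B, W, W, W, B],
    [B, W, W, W, B],
    [B, W, W, W, B],
    [B, B, B, B, B],
]
TICKED_BOX = [
    [B, B, B, B, B],
    [B, B, W, B, B],
    [B, W, B, W, B],
    [B, B, W, B, B],
    [B, B, B, B, B],
]

# "Conventional" is the label from earlier in the thread; "Other" is made up.
print(checked_labels({"Conventional": TICKED_BOX, "Other": EMPTY_BOX}))
```

The label text next to each box could still come from OCR; only the checked or unchecked decision would come from the pixel count.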