I have an invoice (PDF) in which the values are digital text but the subjects (field labels) are images. For example, for a field called “Bill No.” with a value of “1234”, “Bill No.” is an image and “1234” is text.

I am using the Intelligent OCR package, and for the extraction step I am using the Form Extractor. On the first page I cannot select the subjects, only the values: I can select “1234” but not “Bill No.”

Also, in the digitization step, the “Digitize Document” activity with an OCR engine returns only the values, without the subjects, whereas using an OCR engine with “Read PDF With OCR” returns both the values and the subjects.

Solved by converting the PDF to JPG.
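For anyone who wants to do the conversion outside the workflow, here is a minimal sketch in Python. It is not what I used in the project, only an illustration. It assumes the third-party pdf2image package (which needs Poppler installed); the function names and the output naming scheme are made up for the example. Presumably this helps because rendering each page to an image leaves no digital text layer, so the OCR engine has to read the whole page, including the subjects.

```python
from pathlib import Path
from typing import List


def page_image_path(pdf_path: str, page_number: int, out_dir: str) -> Path:
    """Build the output file name for one page (hypothetical naming scheme)."""
    return Path(out_dir) / f"{Path(pdf_path).stem}_p{page_number}.jpg"


def pdf_to_jpgs(pdf_path: str, out_dir: str, dpi: int = 300) -> List[Path]:
    """Render every page of the PDF to a JPG and return the written paths."""
    # Assumption: pdf2image is installed and Poppler is available on PATH.
    # Imported here so the helper above works even without the package.
    from pdf2image import convert_from_path

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for number, image in enumerate(convert_from_path(pdf_path, dpi=dpi), start=1):
        target = page_image_path(pdf_path, number, out_dir)
        # JPEG has no alpha channel, so force RGB before saving.
        image.convert("RGB").save(target, "JPEG")
        written.append(target)
    return written
```

Calling something like `pdf_to_jpgs("invoice.pdf", "out")` would write one JPG per page, and those files can then be passed to the digitization step instead of the original PDF.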

