I want to extract some data from a PDF, but the PDF contains images, so I need to extract some of the information from those images. I tried Read PDF With OCR (Tesseract), but it is not extracting the text properly.
Try changing the Scale and Profile settings of the Tesseract OCR engine. The scale starts at 0 and can go up to 5. In Read PDF With OCR, also try changing the ImageDpi.
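To give a rough idea of why ImageDpi and scale matter, here is a small sketch in Python. It does not call UiPath or Tesseract; it only does the arithmetic, based on the fact that a PDF point is 1/72 inch. The function names, page size, and 6 pt label size are made up for illustration. Also note that if the PDF contains a scanned image, a higher DPI only resamples the pixels that are already there, so it may not help if the scan itself is low resolution.

```python
# Illustration only: how render DPI changes the pixel size of a page
# and of the text on it. A PDF point is 1/72 inch.
POINTS_PER_INCH = 72


def rendered_size(width_pt, height_pt, dpi):
    """Pixel dimensions of a page rasterized at the given DPI."""
    return (round(width_pt * dpi / POINTS_PER_INCH),
            round(height_pt * dpi / POINTS_PER_INCH))


def text_height_px(font_size_pt, dpi):
    """Approximate pixel height of text of the given point size at this DPI."""
    return font_size_pt * dpi / POINTS_PER_INCH


if __name__ == "__main__":
    # Hypothetical US Letter page (612 x 792 pt) with small 6 pt labels,
    # as is common on schematics.
    for dpi in (72, 150, 300, 600):
        width, height = rendered_size(612, 792, dpi)
        print(dpi, width, height, text_height_px(6, dpi))
```

At a low DPI the small labels are only a few pixels tall, which is usually too little for OCR to work with. Raising ImageDpi (or the engine scale) makes each character larger in pixels, so it is worth trying a few values and comparing the output.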
RPA_autocadd.pdf (219.1 KB)
Sample PDF attached
The resolution looks really bad.
Also, what do you want to extract from it?
The layout is completely unstructured, so it will be difficult if the image changes every time.
108416-crude-oil_schematic.pdf (251.5 KB)
Hi Anil,
I have added one more sample. I need to extract all the information from the PDF.