Destroyed PDF format when using ExtractPDFRange

We are splitting a PDF into separate PDF files, but some of the resulting files have strange fonts/formatting:


The original looked like this:

No OCR is used.
image

Hello Tobias,

I once had a similar issue, and I used this:

image

(In the properties you can enable Preserve Formatting.)

Then I used regex to separate / extract the data. Hope it helps.
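To illustrate the regex step, here is a minimal Python sketch rather than the actual workflow. It assumes the text was read with formatting preserved, so each field sits on its own line as "Label:   value". The sample text, the field names and the `extract_fields` helper are all made up for this example.

```python
import re

# Hypothetical text, roughly what a layout-preserving PDF read might return.
sample_text = """\
Invoice No:      INV-0042
Date:            2021-03-15
Total:           1,234.50 EUR
"""

# One "Label:   value" pair per line; the label is everything before the first colon.
FIELD_PATTERN = re.compile(
    r"^(?P<label>[^:\n]+):[ \t]+(?P<value>.+?)[ \t]*$",
    re.MULTILINE,
)

def extract_fields(text):
    """Return a dict mapping each label to its value (hypothetical helper)."""
    return {
        match.group("label").strip(): match.group("value")
        for match in FIELD_PATTERN.finditer(text)
    }

fields = extract_fields(sample_text)
print(fields["Invoice No"])
```

Because preserved formatting keeps the column alignment, matching on "label, colon, run of spaces, value" is usually enough. The pattern would need adjusting for values that wrap across lines.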

Hi,
Is the PDF editable?
If so, use the normal Read PDF Text activity.
If not, use Read PDF With OCR with the OmniPage OCR engine to extract the text.

Cheers @TP2B

Thanks for your responses, but I'm currently not reading anything. It's only the “Extract PDF Range” step, so the output is a PDF, which I opened with ACReader (see screenshots).

It was a config error: the user had extracted the page manually by printing it from a PDF to a PDF.
