Reading multiple PDF files and detecting landscape or portrait orientation

I am trying to read several hundred PDF files.
But there is a serious problem: some files are in portrait mode and some are in landscape mode. I need to extract all the data from every PDF file.
I don't think Intelligent OCR is a solution. How can I detect whether a PDF is portrait or landscape? Or should I even do this?
If I try to read a landscape PDF with Omni OCR, the characters may come out corrupted.

Can anybody help?

Hi @sashatheitguy

There have been similar posts on the forum earlier, asking the same question. Here is a comment by the Lead Developer of the Document Understanding module:

The Document Model should give you the information whether it’s a landscape or portrait.
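
If you want to check the orientation yourself outside the Document Understanding activities, here is a minimal sketch in Python. It assumes you already have each page's width, height, and rotation from whatever PDF library you use (in a PDF these normally come from the page's MediaBox and its optional /Rotate entry). The function name `page_orientation` and the sample page data are hypothetical, for illustration only.

```python
def page_orientation(width: float, height: float, rotation: int = 0) -> str:
    """Classify a page as 'portrait', 'landscape', or 'square'.

    width, height: page box size in any consistent unit (e.g. PDF points).
    rotation: page rotation in degrees; assumed to be a multiple of 90.
    """
    if rotation % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    # A 90 or 270 degree rotation swaps the displayed width and height.
    if rotation % 180 == 90:
        width, height = height, width
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "square"


# Hypothetical per-page data: (width, height, rotation)
pages = [(595, 842, 0), (842, 595, 0), (595, 842, 90)]
orientations = [page_orientation(w, h, r) for (w, h, r) in pages]
print(orientations)  # ['portrait', 'landscape', 'landscape']
```

Keep in mind that for scanned PDFs this only tells you the shape of the page. The text inside the scanned image may still be rotated, so the OCR step may need its own orientation handling.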

Thank you. I did not manage to check all the other questions.
Sorry for the duplicate question. I will search for it. I have also found something about ABBYY, and I will look into that as well.

Great. ABBYY is quite advanced and very flexible in configuring these things (Hence the name FlexiCapture :smiley: )
The big plus for IntelligentOCR is that you can configure and use it for free.

Wish you luck with your project.
Cheers.

Thank you for your answer.
Yes, I have tried Intelligent OCR (Omni OCR), but I had a problem because landscape and portrait pages are mixed in one PDF file. :slight_smile:

I have another question. I have contacted the ABBYY team, and they can provide me with a trial version, but they are asking me to choose between two OCR packages:

FineReader and FlexiCapture.

Which one should I choose?
In my case, the PDF files contain both portrait and landscape pages, and I need to extract all the text to plain text without any character errors (for example, saving a .txt file automatically, because I will have hundreds of documents).

I read something about a connector, but I'm not sure about it. How can I integrate ABBYY into UiPath? (Can I integrate FineReader, or can I only use FlexiCapture in UiPath?)

Sorry for asking so many questions. I'm new here.

Stay safe! :slight_smile:

Hi @sashatheitguy
Here is a brief intro to (and a non-technical differentiation between) FlexiCapture and FineReader:
ABBYY FlexiCapture - Smart Data Capture from structured or semi-structured documents like invoices, purchase orders or even application forms (banks, credit cards, you name it)
ABBYY FineReader - High Volume Document Conversion.

While FineReader focuses on just converting hand-printed/scanned documents into editable text for Word or PDF files, FlexiCapture focuses on highly customisable and trainable document templates to extract structured data out of various documents.

So, if you deal with a lot of long, free-flowing text documents, FineReader is the way to go.
FlexiCapture suits scenarios where you need specific information from a document rather than the entire text transcribed for you (for example, finding the label ‘Invoice Number’ and getting the data to the right of or below that label, or extracting invoice line items whether or not they are separated by clearly marked lines).

I haven’t researched whether UiPath integrates with FineReader. UiPath FlexiCapture activities are available and work all right.

Thank you for your reply. :slightly_smiling_face:
