Excellent PDF Digitization with Intelligent OCR Engines (Portrait and Landscape)

Hey all,
Sorry if someone has asked this before.

I have hundreds of different kinds of PDF files, and their pages may be portrait or landscape. (For example, the first page is portrait, and the other pages may or may not be landscape.)
They might look like an invoice, a data table, or something else. I don't know in advance, because my users will upload them.
I am trying to use UiPath and the Intelligent OCR activities for automation. (I also tried ABBYY FineReader and Omni OCR.) When I give an n-page PDF to the flow, it fails. When I give it a single page, it reads it. But if the page is landscape, there may be character errors.

I think I first need to determine whether each page of the PDF is portrait or landscape. Or do you have another idea for how I can solve this? I will develop a web UI for my users, and they are going to upload PDF documents through it. Maybe I can split and rotate the pages at that point, and then pass the split files back to UiPath? Or should I solve it entirely on the UiPath side? What do you think?
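
To make the "split at upload time" idea concrete, here is a minimal sketch in Python of how a web backend might plan one single-page job per page. It assumes the page width, height, and rotation flag have already been read with some PDF library (not shown). `PageJob`, `displayed_size`, `plan_jobs`, and the output file names are all hypothetical, and the actual splitting and rotating is left out.

```python
from dataclasses import dataclass


@dataclass
class PageJob:
    page_number: int   # 1-based page number in the uploaded PDF
    orientation: str   # "portrait" or "landscape", as the page is displayed
    output_name: str   # hypothetical file name for the single-page PDF


def displayed_size(width, height, rotate=0):
    """Return (width, height) as displayed, honoring the page rotation flag.

    A rotation of 90 or 270 degrees swaps the two dimensions.
    """
    if rotate % 180 == 90:
        return height, width
    return width, height


def plan_jobs(stem, pages):
    """Build one PageJob per page from (width, height, rotate) tuples."""
    jobs = []
    for number, (width, height, rotate) in enumerate(pages, start=1):
        shown_w, shown_h = displayed_size(width, height, rotate)
        orientation = "landscape" if shown_w > shown_h else "portrait"
        name = f"{stem}_p{number:03d}_{orientation}.pdf"
        jobs.append(PageJob(number, orientation, name))
    return jobs


# Example: an A4 portrait page, an A4 landscape page, and a portrait page
# that is stored with a 90 degree rotation flag (so it displays as landscape).
jobs = plan_jobs("invoice", [(595, 842, 0), (842, 595, 0), (595, 842, 90)])
```

Each job could then be sent to the OCR flow on its own, which also sidesteps the multi-page failure. Note that a landscape page shape does not by itself mean the text is sideways, so any rotation step would still need to be checked against real documents.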

For now, I think my problem is caused by landscape pages or by multi-page files.
In the end, I want to digitize the whole PDF file and extract all of the text data to a .txt file without any character errors. That's exactly what I need…
I have seen many tutorials, but they extract specific fields or something similar. I want all the data.

So, do you have any idea how I can solve these problems?

Your help is much appreciated.
Stay safe!

Hello there,

I know this is a very late reply, I just stumbled on this thread. Please use Digitize Document and look into the DOM output for each page, something like dom.Pages(i).Size, and see whether the page is wider than it is tall or the other way around. From that you can guess whether it is portrait or landscape.
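
The width-versus-height check itself is simple. Below is a rough sketch in Python (inside a UiPath workflow the same comparison would be written as a VB.NET or C# expression). It assumes the size of each page is available as a width and a height, as dom.Pages(i).Size suggests. `page_orientation`, `orientations`, and the `tolerance` parameter are made-up names for illustration, not part of any UiPath API.

```python
def page_orientation(width, height, tolerance=0.02):
    """Guess the orientation of one page from its width and height.

    Pages whose width/height ratio is within `tolerance` of 1.0 are
    reported as "square" instead of being forced into either class.
    """
    if width <= 0 or height <= 0:
        raise ValueError("page dimensions must be positive")
    ratio = width / height
    if abs(ratio - 1.0) <= tolerance:
        return "square"
    return "landscape" if ratio > 1.0 else "portrait"


def orientations(sizes):
    """Classify every page, given a list of (width, height) pairs."""
    return [page_orientation(w, h) for w, h in sizes]


# Example: sizes as they might be read from the DOM, one pair per page.
result = orientations([(595, 842), (842, 595), (600, 600)])
```

This only tells you the shape of the page. A scanned page can be portrait-shaped and still contain sideways text, so treat the result as a first guess.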