It would be great if you could help me with the scenarios below.
1) Is there a way to check whether a PDF page is in portrait or landscape orientation? In my PDF, some pages are portrait and some are landscape, and I need to read the text in that PDF using OCR. Any suggestions?
2) I tried to extract text from a structured PDF document. I need the text from all the pages, both tabular and non-tabular. Below are the options I tried, but none of them helped. Let me know if this can be achieved some other way.
2a) “Read PDF with OCR” (tried both with and without the inverted option) - returns an empty result.
2b) “Read PDF Text” - output is empty.
2c) Scraping helps. But how do we know the number of pages, and how do we extract text from all the pages?
3) I am trying to extract text from a PDF and then move the file to another folder, but I get “The process cannot access the file because it is being used by another process”. How do we resolve this? The document is not open anywhere else.
UiPath does not have built-in intelligence to check whether a page is in portrait or landscape mode. However, some OCR engines auto-rotate the pages before extracting the data.
Read PDF with OCR finally works.
For the locked file, I copied it to the destination folder and then deleted the original after processing.
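For anyone who wants to see the copy-then-delete workaround outside of a UiPath workflow, here is a minimal Python sketch. It is only an illustration of the idea, not what the workflow above actually runs: the function name `copy_then_delete` and the `retries`/`delay` parameters are made up for this example, and it assumes the lock on the source file is released shortly after processing.

```python
import os
import shutil
import time


def copy_then_delete(src: str, dest_dir: str, retries: int = 5, delay: float = 0.5) -> str:
    """Copy src into dest_dir, then delete src, retrying while it is locked.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Copying only needs read access, so it usually works even when a move fails.
    dest = shutil.copy2(src, dest_dir)
    for attempt in range(retries):
        try:
            os.remove(src)
            break
        except PermissionError:
            # On Windows a file still held by another process raises PermissionError.
            if attempt == retries - 1:
                raise
            time.sleep(delay)
    return dest
```

If the delete still fails after the last retry, the error is raised so the caller can log it and clean up the original later.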
Hello
Can you please explain which OCR engine you used to extract data from the landscape pages?
I am also stuck in the same situation. Can you please help me out, @lissynikkytha?
I also used Read PDF with OCR. It gives the correct result for all pages except the rotated ones.
To get the page rotation and skew angle, please use the Digitize Document activity from the IntelligentOCR 3 activity package. It exposes this information on a page-by-page basis in the DocumentObjectModel output. Please feel free to navigate through the output (you can do it directly in Studio using the newest debug features) to see where to grab that information from.
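Once the page size and rotation are known, deciding between portrait and landscape is a small calculation. Here is a rough Python sketch of that decision. It assumes you have already obtained each page's width, height and a rotation in multiples of 90 degrees from some source; the function name `page_orientation` is hypothetical and not part of any UiPath package.

```python
def page_orientation(width: float, height: float, rotation: int = 0) -> str:
    """Classify a page as 'portrait', 'landscape' or 'square'.

    width, height: unrotated page size, in any unit.
    rotation: page rotation in degrees, assumed to be a multiple of 90.
    """
    if rotation % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    # A rotation of 90 or 270 degrees swaps the visible width and height.
    if (rotation // 90) % 2 == 1:
        width, height = height, width
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "square"
```

For example, an A4 page of 595 x 842 points with no rotation is classified as portrait, and the same page with a rotation of 90 is classified as landscape. Skewed scans (small angles that are not multiples of 90) are outside what this sketch handles.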
Data extraction - I recommend building your own custom activity for data extraction or trying the newly released Regex Based Extractor. It applies the regular expressions you configure for each field to the text version of the document fed into the Data Extraction Scope.
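To illustrate the general idea of per-field regex extraction over the plain text of a document, here is a small Python sketch. It is not the Regex Based Extractor itself and does not reflect how that activity is configured: the field names, the patterns and the `extract_fields` function are all assumptions made up for this example.

```python
import re

# Hypothetical field definitions: one regex per field, with one capture group
# holding the value to extract.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice\s+No\.?\s*:\s*(\w+)",
    "total": r"Total\s*:\s*([\d.,]+)",
}


def extract_fields(text: str, patterns: dict) -> dict:
    """Apply each field's regex to the text; fields with no match map to None."""
    results = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        results[name] = match.group(1) if match else None
    return results


sample_text = "Invoice No: A123\nTotal: 1,250.00"
fields = extract_fields(sample_text, FIELD_PATTERNS)
```

With the sample text above, `fields` holds `"A123"` for `invoice_number` and `"1,250.00"` for `total`. Real documents usually need more tolerant patterns, especially when the text comes from OCR.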
Your answer is very helpful for me. Thank you.
I found something about ABBYY FineReader and FlexiCapture. (I have not yet worked out which product is right for me.)
I have hundreds of PDF files, which may be portrait or landscape. (In some PDFs the first page is portrait and the other pages are landscape.) So, if you know something about ABBYY, which product can be implemented successfully with UiPath?
I found a connector plugin for connecting ABBYY and UiPath, but I think it works only with ABBYY FlexiCapture, not with FineReader. I need to read each PDF file in full and get all the text data into the DOM. FlexiCapture works with fields, whereas FineReader works with the entire PDF. Do you know which one I should use?