Scenario: I have a simple loop in my newest UiPath Robot where I use Google OCR inside a “Read PDF with OCR” activity inside a For Each (row in a Data Table I’ve created) loop to cycle through a folder of PDFs. I scrape each PDF, determine if it contains certain text, and save pages of the PDF to different locations depending on what I find. If at any point my Robot runs into the “Scrape returned empty text” exception (and I attempt to handle this exception using a Try Catch), the “Read PDF with OCR” activity bugs out “eternally” on every loop going forward – it simply stops scraping the page on each pass.
Steps to reproduce:
(1) Create a simple loop similar to the one in the screenshot I will attach below. The loop should use “Google OCR” inside a “Read PDF with OCR” activity to read through a PDF document one page at a time. (Use an incremented counter.)
(2) The “Read PDF with OCR” activity should be inside a Try Catch. The “ArgumentOutOfRangeException” catch is optional (if you want to duplicate what I am doing to determine when the end of the document is), but the “Exception” catch is necessary as this is the only catch that can contain the “Scrape returned empty text” error.
(3) Point the “Read PDF with OCR” activity to a PDF that contains at least one page with no text.
(4) Run the process in Slow Step debug mode and compare how UiPath handles the “Read PDF with OCR”/“Google OCR” activities before and after the first time it runs into the “Scrape returned empty text” exception.
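To make the intended control flow of the loop concrete, here is a rough Python sketch. It is not UiPath code: `read_page_with_ocr`, `EmptyScrapeError`, and `scrape_pdf` are hypothetical stand-ins, `IndexError` plays the role of “ArgumentOutOfRangeException”, and a PDF is modeled as a plain list of page strings. It only shows how the two catches are supposed to behave.

```python
class EmptyScrapeError(Exception):
    """Stand-in for the generic Exception 'Scrape returned empty text'."""


def read_page_with_ocr(pages, page_number):
    """Hypothetical stand-in for 'Read PDF with OCR' on one page (1-based)."""
    if page_number < 1 or page_number > len(pages):
        # Models the ArgumentOutOfRangeException raised past the last page.
        raise IndexError(f"page {page_number} is out of range")
    text = pages[page_number - 1]
    if not text.strip():
        # Models the error raised by the OCR engine on a page with no text.
        raise EmptyScrapeError("Scrape returned empty text")
    return text


def scrape_pdf(pages):
    """Scrape one page at a time until the out-of-range error ends the loop."""
    results = {}
    page_number = 1  # the incremented counter from step (1)
    while True:
        try:
            results[page_number] = read_page_with_ocr(pages, page_number)
        except IndexError:
            break  # past the last page: stop
        except Exception:
            results[page_number] = ""  # empty page: record it and keep going
        page_number += 1
    return results


if __name__ == "__main__":
    # Page 2 has no readable text; pages 1 and 3 should still be scraped.
    print(scrape_pdf(["invoice 123", "   ", "summary"]))
```

In this model the empty page is recorded and the loop carries on to page 3, then stops at the out-of-range error. That is the behavior I expected from the Try Catch.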
Current Behavior:
One aspect of my loop is that I use a Try Catch to determine if I have scraped to the last page in each file. (I scrape each page separately rather than the whole document at once because I may be saving specific pages in each document to different locations depending on what I find, and I found that this was the simplest way to build that out.) I use this Try Catch on the outside of the “Read PDF with OCR” activity, and the exception type it is catching is “ArgumentOutOfRangeException” (exception source: Read PDF with OCR). This part of the Try Catch works completely fine. I have now run the process for thousands of PDFs and my loop/last page catch work flawlessly.
My issue occurs when I run into a different exception, namely “Scrape returned empty text.” This occurs whenever the page I am trying to scrape has no readable text on it. The exception type for this exception is simply “Exception”, and the exception source is “Google OCR” (rather than “Read PDF with OCR”), which I think is where the problem comes in.
To attempt to handle this exception, I have added an additional catch to the Try Catch on the outside of my “Read PDF with OCR” activity that catches any additional generic “Exceptions” that may occur. (The “Scrape returned empty text.” exception is the only other exception I have run into during my extensive testing, so I am fine with this.) I believe the reason this may not work appropriately (as I will explain below) is that the error is really occurring within the “Read PDF with OCR” step (in the “Google OCR” activity), but there is no way for me to put a Try Catch within the “Read PDF with OCR” activity, as the only activities it will accept are OCR activities.
So what happens is this: The Try Catch “successfully” catches the error, in that the robot doesn’t error out and stop running. Instead it continues on with the loop, increments my page counter, and comes back around to the “Read PDF with OCR” step. After running the process in Slow Step debug mode several times, I finally figured out what is happening at this point: the robot completely skips the “Google OCR” step in every iteration of the loop from then on. The UiPath yellow debug highlighting stops at the “Read PDF with OCR” step and does not highlight the “Google OCR” step, nor does it spend enough time on the “Read PDF with OCR” activity to have actually scraped anything. In addition, my Try Catch for finding the last page in the PDF never triggers, so my robot goes into an endless loop and I have to force quit it.
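The symptoms above can be modeled with a short Python sketch. This is only a model of what I observe in the debugger, not a claim about UiPath internals: the `ocr_disabled` flag, the function name, and the `max_iterations` cap are all assumptions added for illustration (the cap exists only so the sketch terminates, since the real robot does not).

```python
def simulate_observed_loop(pages, max_iterations=10):
    """Model the observed behavior: after the first empty-text error,
    the OCR step is skipped, so the out-of-range error is never raised."""
    ocr_disabled = False  # assumption: models the skipped "Google OCR" step
    page_number = 1
    log = []
    for _ in range(max_iterations):  # cap so this sketch terminates
        try:
            if ocr_disabled:
                # Nothing is scraped and nothing is raised.
                log.append((page_number, "skipped"))
            elif page_number > len(pages):
                raise IndexError("page out of range")
            elif not pages[page_number - 1].strip():
                ocr_disabled = True
                raise Exception("Scrape returned empty text")
            else:
                log.append((page_number, "scraped"))
        except IndexError:
            log.append((page_number, "end of document"))
            return log
        except Exception:
            log.append((page_number, "empty text caught"))
        page_number += 1
    log.append((page_number, "gave up (would loop forever)"))
    return log


if __name__ == "__main__":
    for entry in simulate_observed_loop(["text", "   ", "more text"], max_iterations=5):
        print(entry)
```

Once the flag is set, the page counter keeps climbing past the real page count and the “end of document” branch is never reached, which matches the endless loop I see.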
Expected Behavior:
I just need some way for UiPath to catch these “Scrape returned empty text.” errors and continue on with my loop without completely bugging out my program.
Studio/Robot/Orchestrator Version:
Last stable behavior: NA
Last stable version: NA
OS Version: Windows 7 Enterprise
Others if Relevant: (workflow, logs, .net version, service pack, etc):
Loop: