Read PDF Text Changes - No Longer Matching FullText

We have a process that has been running in production for about a year, extracting data from invoices from a very high-volume vendor, but we ran into a problem after a recent update. The Read PDF Text activity now outputs the text in a different order than before, and with the new format certain data points can no longer be reliably extracted. However, if I open the PDF and use Screen Scraping with the FullText method, the output is in the old format.

Before I refactor this process to open each PDF and screen scrape each page, is there any way to get the Read PDF Text activity to produce the same format as Screen Scraping FullText?


Yes, it's possible, but it depends on which region is being scraped with Screen Scraping. The Read PDF Text activity reads the whole PDF, while Screen Scraping only captures the region selected for scraping.

Cheers @jcarr79
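
To tell whether the two outputs really differ only in order, or whether one is just a region of the other as suggested above, a quick check outside the workflow might help. This is a minimal sketch, assuming both strings have been saved from the workflow; the function name `compare_extractions` and the sample strings are hypothetical and not part of any UiPath package.

```python
from collections import Counter


def compare_extractions(read_pdf_text: str, full_text: str) -> str:
    """Classify how two extracted strings differ, ignoring whitespace layout."""
    tokens_a = read_pdf_text.split()
    tokens_b = full_text.split()
    if tokens_a == tokens_b:
        return "identical"
    counts_a = Counter(tokens_a)
    counts_b = Counter(tokens_b)
    if counts_a == counts_b:
        return "reordered"  # same tokens, different sequence
    if not (counts_b - counts_a):
        return "subset"  # every full_text token also occurs in read_pdf_text
    return "different"


# Hypothetical sample strings, not taken from a real invoice.
old_output = "Invoice 1001 Total 250.00"
new_output = "Total 250.00 Invoice 1001"
print(compare_extractions(new_output, old_output))  # reordered
```

A result of "reordered" would point to a change in reading order, while "subset" would suggest the screen scrape only covers part of the page.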

Hi, I’m also having this issue - after upgrading an old process and VM to use the latest release, Read PDF Text is giving a completely different string than it used to (and had reliably been doing for 2 years). Is there a setting we can change to revert to the previous behaviour of this activity?

Did you try going back to the previous package version?
Go to Design tab → Manage Packages → Project Dependencies, then downgrade the PDF activities package to the version you had previously.

Cheers @AllenC