Read PDF Activity for multiple documents with dynamic lengths

Hi,

I am scraping information from a PDF and splitting the resulting string into multiple arguments that are used throughout my build.

Read PDF is the only activity that scrapes the information accurately, as some of it is in tables or highlighted.

The issue is that Read PDF only reads part of the document, for example up to page 10 of a 16-page document. I tried splitting it up: for a 16-page document I used two Read PDF activities, the first scraping pages 1-7 and the second pages 8-16.
This won’t work for all documents, however: a 23-page document won’t be read in full, and a document of only 8 pages throws an out-of-bounds error.
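
To illustrate the fixed-range problem described above, here is a rough Python sketch (not UiPath code) of working out the page ranges from the document's page count instead of hard-coding them. It assumes the page count can be obtained before reading; `page_ranges` and the chunk size of 8 are made-up names and values for illustration only.

```python
def page_ranges(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (start, end) page ranges covering the whole document."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must both be at least 1")
    ranges = []
    for start in range(1, total_pages + 1, chunk_size):
        # Clamp the end of the last range so it never goes past the final page.
        end = min(start + chunk_size - 1, total_pages)
        ranges.append((start, end))
    return ranges


# Hypothetical page counts taken from the examples in this post.
for pages in (8, 16, 23):
    print(pages, [f"{s}-{e}" for s, e in page_ranges(pages, 8)])
```

Because the last range is clamped to the page count, an 8-page document yields a single range and a 23-page document yields three, so neither case reads past the end.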

Ideally I would get all pages in one variable; otherwise my split strings would need to change for each document.

Any ideas?

Hello

This might not help, but there is a Split PDF activity where you can specify the pages to split at.

Split the document into chunks of 8 pages.
Read all the resulting PDF files.

I hope you find a solution :slight_smile:

Yeah, sorry, I can already split the PDF with Read PDF by using multiple activities and setting the range for each one. The issue is that this outputs multiple strings, so each of my variables, such as name, would need to be split from a different output.
Also, the documents can be various lengths, so the page number to split at would need to change each time.
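
One way around the multiple-outputs problem would be to join the chunk outputs back into a single string, in page order, before doing any splitting. A minimal Python sketch of that idea follows (again not UiPath code); `combine_chunks`, `value_after`, and the `Name:` / `Total:` labels are hypothetical and only stand in for whatever split logic the build already uses.

```python
def combine_chunks(chunks: list[str]) -> str:
    """Join the text of each page range, in page order, into one string."""
    return "\n".join(chunks)


def value_after(text: str, label: str) -> str:
    """Return the rest of the line that follows the first occurrence of label."""
    _, found, rest = text.partition(label)
    if not found:
        raise ValueError(f"label not found: {label!r}")
    lines = rest.splitlines()
    return lines[0].strip() if lines else ""


# Made-up outputs of two separate reads, standing in for real PDF text.
chunks = ["Name: Jane Doe\nfirst half of the document", "Total: 42\nsecond half"]
combined = combine_chunks(chunks)
print(value_after(combined, "Name:"))
print(value_after(combined, "Total:"))
```

With everything in one string, the split logic no longer needs to know which read a field came from, regardless of how many ranges the document was read in.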

How big are the files?