Selectable PDF > (output var + separator) > DT

Hi all,

I have a selectable PDF report (~90 pages long), and I want to extract a few tables from it to paste into Excel.

I was able to use the Data Scraping Wizard on the first two tables (pp. 61 and 62); that sequence is included in the attached xaml. After that, it stopped working for the following tables. (I can’t even recreate the Data Scraping Wizard sequence on the same part where it initially worked... weird.)

I’ve had some luck using the Screen Scraper Wizard and highlighting a specific region to scrape (basically just around the PDF data table), but the output comes out as one long string. I’m not sure how to parse it efficiently, since I don’t think the content of the tables is consistent or predictable.

Is there a specific delimiter/separator I can set in the properties of the screen scraping activity (please see the screenshot below)?


PDF dt to Excel_help.zip (14.5 KB)
PDF file: http://coastal.la.gov/wp-content/uploads/2017/04/2017-Coastal-Master-Plan_Web-Book_CFinal-with-Effective-Date-06092017.pdf
I would really appreciate any ideas on how best to go about this.

Thanks!
Shelby
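
For the "one long string" case above, one possible way to think about the parsing step is sketched below in Python rather than in the workflow itself. It is only a sketch under assumptions: it assumes the scraped text keeps its line breaks and that columns are separated by a tab or by at least two spaces, which may not hold for every scraping method or every table in the report. The function name `parse_scraped_table` and the `min_gap` parameter are made up for illustration.

```python
import re


def parse_scraped_table(text, min_gap=2):
    """Split scraped text into rows of cells.

    Assumption: each table row is on its own line, and columns are
    separated by tabs or by at least `min_gap` consecutive spaces.
    A single space is treated as part of the cell text.
    """
    splitter = re.compile(r"\t+| {%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        rows.append([cell.strip() for cell in splitter.split(line)])
    return rows


# Made-up sample text, not taken from the PDF in this thread.
sample = "Name  Qty  Price\nWidget A  3  4.50\n\nWidget B\t10\t12.00\n"
table = parse_scraped_table(sample)
```

Each inner list is one row, so the result can be checked for a consistent column count before it is written anywhere. If the scraped output collapses all whitespace to single spaces, this approach will not work and the column boundaries would have to come from somewhere else.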

Hi @Shelby_Pons,

I also had problems with the project that you attached. It seemed odd so I recreated it in a new file and it works fine. There seems to be something wrong with your current file.

PDF dt to Excel_help_newfile.zip (14.0 KB)

Thank you!! I just tested it on my own machine with a new xaml and newly downloaded PDF, and it seems to be working now. I’ll keep this in mind if I run into similar situations.

Thank you!

Were you able to write the data table output to Excel (or anywhere)?

The Data Scraping Wizard seems to be working for me with my new xaml, but when I try to write the dt variables anywhere for further processing, nothing shows up.

I appreciate your advice! Thanks!
