How to convert PDF files to Excel and save them in one location?

Posting this here as well for anyone else seeking ideas on PDF extraction, since I solved it for him in private.


This almost made my head explode, but I got it working.

I’m not entirely sure it meets all your requirements, but it formats the text into a table with all the headers and places the check number, date, and amount in the filename. Much of it is driven by VB.NET lambda expressions (e.g. .Select, .Where, .Split).
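For anyone who wants to see the filename idea outside the workflow, here is a minimal Python sketch (not the VB.NET from the attached workflow). The labels "Check Number:", "Check Date:" and "Check Amount:", the date format, and the function name `build_output_name` are all assumptions for illustration, since the actual PDF layout is not shown here.

```python
import re


def build_output_name(text: str) -> str:
    """Build a CSV filename from the check number, date, and amount in the text."""
    # Assumed label formats; adjust the patterns to match the real PDF text.
    number = re.search(r"Check Number:\s*(\d+)", text).group(1)
    date = re.search(r"Check Date:\s*(\d{2})/(\d{2})/(\d{4})", text)
    amount = re.search(r"Check Amount:\s*\$?([\d,]+\.\d{2})", text).group(1)

    # Slashes and commas are awkward in filenames, so reformat both pieces.
    month, day, year = date.groups()
    return f"Check_{number}_{year}-{month}-{day}_{amount.replace(',', '')}.csv"


# Made-up sample text standing in for the bottom of the extracted PDF.
sample = "Check Number: 10452\nCheck Date: 03/15/2021\nCheck Amount: $1,250.00\n"
print(build_output_name(sample))
```

If any label is missing, `re.search` returns `None` and the sketch raises an error, so a real version would want a check there.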

I did my best to make it as dynamic as possible, but with limited testing there could still be some glitches.

I created many variables to make maintenance easier for you, and also annotated all the key parts of the code to describe what each one does.

In summary, here are the steps it takes:
— Reads text and fixes any newline characters that will cause manipulation issues
— Extracts all top text with headers
— Removes that top text and headers from the full text, leaving only the data to be added to the table
— Extracts all Account numbers and names since they only appear once
— Extracts Account Totals and removes from text
— Extracts Check Number at bottom
— Extracts Check Date at bottom
— Extracts headers only and outputs it to CSV, then reads back to a data table
— Removes accounts and names from data text to prepare for For Each loop
— Splits data text by page and loops through each page
— Loops through each account and continues on current one if end of page occurs
— Loops through each line
— Adds items to a “|”-delimited string and adds it to the table as an item array
— Moves to next account when Paid amount sum equals Account total
— Once all text has been looped through, writes the table to CSV
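
The core of the loop steps above can be sketched in Python as follows. This is only an illustration under assumptions, not the attached workflow: the line layout "invoice|date|paid", the column names, the sample accounts, and the helper names `build_rows` and `to_csv` are all made up for the example.

```python
import csv
import io
from decimal import Decimal

HEADERS = ["Account", "Name", "Invoice", "Date", "Paid"]  # assumed column names


def build_rows(accounts, data_lines):
    """Assign each data line to the current account.

    accounts: list of (number, name, total) in document order.
    data_lines: "|"-delimited strings "invoice|date|paid" (assumed layout).
    Moves to the next account once the paid sum equals the account total.
    """
    rows = []
    idx = 0
    running = Decimal("0")
    for line in data_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from page breaks
        invoice, date, paid = line.split("|")
        number, name, total = accounts[idx]
        rows.append([number, name, invoice, date, paid])
        running += Decimal(paid)
        # Advance only when the sum matches and another account remains.
        if running == Decimal(total) and idx < len(accounts) - 1:
            idx += 1
            running = Decimal("0")
    return rows


def to_csv(headers, rows):
    """Write the header row and data rows to a CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buf.getvalue()


# Made-up data: the second account continues onto the second page.
accounts = [("1001", "Acme Supply", "150.00"), ("1002", "Birch Ltd", "75.50")]
pages = [
    "INV-1|01/02/2021|100.00\nINV-2|01/05/2021|50.00\nINV-3|01/09/2021|25.50",
    "INV-4|01/12/2021|50.00",
]
lines = [ln for page in pages for ln in page.split("\n")]
rows = build_rows(accounts, lines)
print(to_csv(HEADERS, rows))
```

Using `Decimal` rather than floats keeps the "sum equals total" comparison exact, which matters because that comparison is what decides when to switch accounts.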

PDFTextExtraction.xaml (46.6 KB)

I have attached the workflow and a screenshot of the output results.

Regards!

Clayton.
