Need help putting scrapes inside a loop

I have a really simple project that I am struggling with. I think I have a strong grasp of the various tools for scraping data out of PDFs, but I am struggling to understand how to put scrapes inside a loop.

I have an extremely routine PDF, nearly identical in all instances, from which I need to pull 5 unique data points that appear in the exact same location each time. It’s a really standard invoice. I know how to program the 5 scrapes themselves, but I don’t know how to put them inside a loop.

You could summarize my problem in two questions:

  1. How do you make data scrapes from a PDF more open-ended, so that they run on any file a loop hands them, regardless of the file name?

  2. How do you send the scraped data to some type of database or holding array, so it can later be dropped into an Excel file?

Thank you all so much for your time, sorry to hassle the forum.

New User.

Hi! Can you send a sample invoice?

Hey Jan,

It seems I am unable to upload files as a new user, so I have published it to Google Drive instead (link). This is one of five identical invoices. I am trying to develop a loop that pulls five unique items out of this exact template.

As you can see, this is pretty basic. For most of these, you can convert the PDF to text and pull items based on key terms. Anywhere text won’t work, you can use anchors (I think).

I am looking to pull:

  1. The company name (top left)
  2. The due date
  3. The invoice number
  4. The discount
  5. The balance due

If you are able to describe how to pull just one of these consistently within a looping structure, using some type of scrape such as the anchor tool, it would help me a lot.

Thank you!

Will get back to you :slight_smile:

Hi @Thomas_Marzol,

Here is the file: Main.xaml (22.2 KB)

Just change the folder path.

Hope this helps :slight_smile:


Sorry for the late response. Just so I am clear: does this file still need the scrapes to be dropped in? And how do I make the scrapes properly deposit into the data table?

The scrapes are already added to a data table.
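
In case it helps to see the general pattern behind the two questions (loop over whatever files are in a folder, scrape each one, collect the results as rows, export at the end), here is a rough Python sketch. It is not the contents of Main.xaml, just the same idea written out as an illustration. Everything named here is an assumption: the label wording in `FIELD_PATTERNS` is a guess at what the invoice text looks like, the company name is assumed to be the first non-empty line of the text, and `read_text` is a hypothetical function you would supply yourself to turn one PDF into plain text.

```python
import csv
import re
from pathlib import Path

# Hypothetical label patterns: the real invoice wording may differ.
FIELD_PATTERNS = {
    "due_date": r"Due Date\s*:?\s*(.+)",
    "invoice_number": r"Invoice (?:Number|No\.?|#)\s*:?\s*(\S+)",
    "discount": r"Discount\s*:?\s*(.+)",
    "balance_due": r"Balance Due\s*:?\s*(.+)",
}


def extract_fields(text):
    """Pull the five values out of one invoice's plain text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumption: the company name is the first non-empty line (top left).
    row = {"company": lines[0] if lines else ""}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        row[name] = match.group(1).strip() if match else ""
    return row


def scrape_folder(folder, read_text):
    """Run the same scrapes on every PDF in a folder, whatever it is named.

    read_text is a caller-supplied function: file path in, plain text out.
    """
    rows = []  # the "holding array": one dict per invoice
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        row = extract_fields(read_text(pdf_path))
        row["file"] = pdf_path.name
        rows.append(row)
    return rows


def write_csv(rows, out_path):
    """Dump the collected rows to a CSV file that Excel can open."""
    columns = ["file", "company", *FIELD_PATTERNS]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
```

The loop never mentions a file name: it just takes each path the folder gives it and passes that path to the scrapes, which is all "open-ended" needs to mean here. Each pass appends one row, and the export happens once, after the loop finishes.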