Extract data from fixed rows from a PDF

Hi. I am new to automation and I am trying to extract all the data that appears under three columns named Kreditkonto, KID and Beløpsgrense. The number of rows under these columns varies from PDF to PDF, but every PDF has the same structure. What is the best practice for extracting this data when the number of rows varies but the column headings stay the same?
Is the draft attached here a reasonable solution?

Hi @Haakon.Villalpando

If the format of the PDF remains constant, please try this.

Hope this helps,

Thanks.


Hi @Haakon.Villalpando ,

I would say you are heading in the right direction. The next step would be to check the output of the Read PDF Text activity.

That output will show you the delimiter between the columns. In your case it is most probably a space.

Once you know the delimiter, you can use the Generate Data Table activity with the delimiter set to space and get the output as a DataTable.
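To make the idea concrete outside of UiPath, here is a minimal Python sketch of the same logic: find the header line, then split every following line on whitespace until the table ends. The sample text, the fourth column name `Extra`, and the function name `parse_table` are all made up for illustration; the real output of Read PDF Text may look different.

```python
# Hypothetical sample of what the extracted PDF text might look like.
# The fourth column name "Extra" and all values are invented.
SAMPLE_TEXT = """Invoice overview
Kreditkonto KID Beløpsgrense Extra
12345678901 000111 1500,00 A
98765432109 000222 250,50 B

Page 1 of 1
"""

# Column headings named in the question; used to locate the header line.
REQUIRED = {"Kreditkonto", "KID", "Beløpsgrense"}


def parse_table(text):
    """Return (header, rows) for the whitespace-delimited table in text."""
    lines = text.splitlines()
    # The header is the first line that contains all required column names.
    header_index = next(
        (i for i, line in enumerate(lines) if REQUIRED.issubset(line.split())),
        None,
    )
    if header_index is None:
        return [], []
    header = lines[header_index].split()
    rows = []
    for line in lines[header_index + 1:]:
        cells = line.split()
        # Stop at the first line that does not have one cell per column,
        # which is how a variable number of rows is handled.
        if len(cells) != len(header):
            break
        rows.append(dict(zip(header, cells)))
    return header, rows


header, rows = parse_table(SAMPLE_TEXT)
for row in rows:
    print(row)
```

Note that splitting on whitespace only works if no cell contains a space itself (for example an amount written as `1 500,00`), so it is worth checking the real text output first.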

However, I think you should define 4 columns in your Build Data Table activity, since the input has 4 columns.

You can later keep only the required columns in the DataTable.
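As a rough illustration of that last step, this Python sketch parses all four columns first and then keeps only the three that are needed. The row values, the fourth column name `Extra`, and the helper name `keep_columns` are assumptions for the example, not something taken from the actual PDF.

```python
# Columns asked for in the question.
WANTED = ["Kreditkonto", "KID", "Beløpsgrense"]

# Hypothetical rows as they might look after reading all 4 columns.
# "Extra" stands in for the unnamed fourth column; the values are invented.
all_rows = [
    {"Kreditkonto": "12345678901", "KID": "000111", "Beløpsgrense": "1500,00", "Extra": "A"},
    {"Kreditkonto": "98765432109", "KID": "000222", "Beløpsgrense": "250,50", "Extra": "B"},
]


def keep_columns(rows, wanted=WANTED):
    """Return new rows that contain only the wanted columns, in that order."""
    return [{name: row[name] for name in wanted} for row in rows]


filtered = keep_columns(all_rows)
for row in filtered:
    print(row)
```

Reading every column first and dropping the extra one afterwards keeps the cells aligned with their headings, which is the reason for defining all 4 columns up front.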


Thanks for the advice and good help. :)

Thanks for the video, I will take a further look into it.

Glad that helped.
