Extract data from fixed rows from a PDF

Haakon.Villalpando · April 20, 2022, 10:23am

Hi. I am new to automation and am trying to extract all the data that appears under three columns named Kreditkonto, KID and Beløpsgrense. The number of rows under these columns varies from PDF to PDF, but every PDF has the same structure. What is the best practice for extracting this data when the number of rows varies but the column headings stay the same?
Is the draft attached here a reasonable solution?

suraj.setty · April 20, 2022, 12:37pm

Hi @Haakon.Villalpando

If the format of the PDF remains constant, please try this.

Hope this helps,

Thanks.

supermanPunch · April 20, 2022, 12:47pm

Hi @Haakon.Villalpando,

I would say you are heading in the right direction. The next step would be to check the output of the Read PDF Text activity.

That output will show you the delimiter between the columns. In your case it is most probably a space.

Once you know the delimiter, you can use the Generate Data Table activity with the delimiter set to space to get the output as a DataTable.

However, I think you should define four columns in your Build Data Table activity, since the input has four columns.

We can later keep only the required columns in the DataTable.
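To make the idea concrete outside of UiPath, here is a minimal Python sketch of the same steps: find the header line, split each following row on whitespace, and keep only the required columns. The sample text, the fourth column name ("Navn") and all values are made up for illustration, and the sketch assumes that no cell value contains a space. Check the real Read PDF Text output before relying on any of this.

```python
# Hypothetical sample of what the extracted PDF text might look like.
# The real layout, the fourth column name ("Navn") and all values are assumptions.
SAMPLE_TEXT = """\
Some header text
Kreditkonto KID Beløpsgrense Navn
12345678901 000123 5000 Firma1
10987654321 000456 7500 Firma2

Footer text
"""

REQUIRED = ["Kreditkonto", "KID", "Beløpsgrense"]


def extract_rows(text, required=REQUIRED):
    """Return one dict per table row, keeping only the required columns."""
    lines = text.splitlines()

    # Locate the header: the first line that contains every required heading.
    header_idx = next(
        (i for i, line in enumerate(lines)
         if all(name in line.split() for name in required)),
        None,
    )
    if header_idx is None:
        return []

    header = lines[header_idx].split()
    rows = []
    for line in lines[header_idx + 1:]:
        cells = line.split()
        # Stop at the first line whose cell count differs from the header
        # (blank line, footer, next section). Assumes no spaces inside values.
        if len(cells) != len(header):
            break
        record = dict(zip(header, cells))
        rows.append({name: record[name] for name in required})
    return rows


rows = extract_rows(SAMPLE_TEXT)
for row in rows:
    print(row)
```

Because the loop reads until the first line that does not fit the header, the number of rows can differ from one PDF to the next. If a value such as an amount can contain spaces, a plain whitespace split will not be enough and the delimiter logic needs to be adapted.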

Haakon.Villalpando · April 20, 2022, 1:27pm

Thanks for the advice and good help.

Haakon.Villalpando · April 20, 2022, 1:28pm

Thanks for the video, I will take a further look into it.

suraj.setty · April 20, 2022, 6:01pm

Glad that helped.

