Read PDF, then add data rows and write to Excel

Hi All

I have a flow that reads a PDF, uses RegEx to search for keyword data, and writes the results into Excel. If all the PDFs have the same format, that works fine. But my end goal is for this flow to handle PDFs in different formats. For example, PDF 1 has 3 fields I need: Item Name, Color, and Repeat Size. PDF 2 has 6 fields I need: Item Name, Pattern, Article No, Color, Finish, and Remarks.

PDF 1 has 3 fields.
I create a Data Table to store the values.
Then I create another Data Table for the output, using an array and a For Each Row to write into Excel.

The output Excel looks like this:

PDF 2 has 6 fields.

How can I define all the known fields up front and fill in whichever values are found when reading different PDFs?
When I read both PDF 1 and PDF 2, I want the final output to look like this:

I have also uploaded my project for your reference:
Main.xaml (14.5 KB)
pdf2.pdf (382.0 KB)
pdf1.pdf (481.0 KB)


I would suggest the approach below. Please see if it works for you.

Always read your PDF file first and check whether each of those 6 header texts (like “article no:”) exists. If a header is present, get its value and insert it into the DataTable.
If it doesn’t exist, just skip extracting its value.
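
To make the idea concrete, here is a rough sketch in Python rather than UiPath activities. It assumes each field appears in the PDF text as "<label>: <value>" on a single line, which may not match your actual files. The sample texts and values are made up for illustration, and the column order is just one possible choice. The field names are the ones from your post.

```python
import csv
import io
import re

# Every field that can appear in any of the PDF layouts (names from the question).
# The column order here is an assumption; reorder it to match the desired Excel layout.
KNOWN_FIELDS = ["Item Name", "Pattern", "Article No", "Color",
                "Finish", "Remarks", "Repeat Size"]


def extract_fields(pdf_text):
    """Build one output row: every known field, left blank when its label is missing."""
    row = {}
    for field in KNOWN_FIELDS:
        # Assumed layout: the label starts a line and is followed by ":" and the value.
        pattern = r"^[ \t]*" + re.escape(field) + r"[ \t]*:[ \t]*(.*)$"
        match = re.search(pattern, pdf_text, re.IGNORECASE | re.MULTILINE)
        # Label found: keep its value. Label not found: skip it and leave the cell empty.
        row[field] = match.group(1).strip() if match else ""
    return row


def rows_to_csv(rows):
    """Write all rows under the same fixed header so different PDFs line up."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KNOWN_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical text as it might come out of the two PDFs (not the real file contents).
pdf1_text = "Item Name: Sample A\nColor: Blue\nRepeat Size: 64 cm\n"
pdf2_text = ("Item Name: Sample B\nPattern: Stripe\nArticle No: 12345\n"
             "Color: Red\nFinish: Matte\nRemarks: Sample only\n")

rows = [extract_fields(pdf1_text), extract_fields(pdf2_text)]
csv_text = rows_to_csv(rows)
print(csv_text)
```

In UiPath terms this would roughly correspond to building one Data Table with all the known columns, checking each label with a RegEx match inside an If, and adding a single data row per PDF, with empty cells for the labels that were not found.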

Sruthi YNM

Thanks for the reply.

But I don't have much coding skill; everything I've done so far I learned by following YouTube.
Could I have some more guidance on how to do that?