Extracting specific data from multiple PDFs and storing it in a CSV

Hello, suppose I have multiple PDFs in the same folder. All the PDFs have different formats, but some terms are the same in all of them,
like invoice number and invoice date. I want to extract the invoice number and invoice date from every PDF and store them in a CSV. Please help.


Hi @Sandeep_Kumar2

Refer to this :slight_smile:

Hi Sob, here are the invoices from which I have to extract the Invoice Number and Total Data and store them in a CSV:
wordpress.pdf (42.6 KB)
invoice 2.pdf (171.1 KB)

I tried but failed.


What error are you facing?

I don't know how to follow the exact steps you gave me. I have uploaded 2 invoices here; can you send a XAML file?

Hi @Sandeep_Kumar2

1. Get your PDF files by using Directory.GetFiles("your path", "*.pdf").
2. Use the "Read PDF Text" activity to read each PDF file.
3. Split the resulting string into an array of items (for example, one item per line).
4. Then use a For Each over those items, and inside it check a condition such as item.Contains("INVOICE NO") // your required data
Put that condition inside an If and handle the matching item there.

Do this step by step. If you still face any issue after trying, attach your XAML file.
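For anyone who wants to see the logic of steps 3 and 4 outside of a workflow, here is a rough sketch in Python (not UiPath). It assumes the text of each PDF has already been extracted, which is what "Read PDF Text" does in step 2. The function names, the sample file names and the sample invoice text are all made up for illustration, and real invoices may need different label handling.

```python
import csv
import io
import re


def find_labeled_value(text, label):
    """Return whatever follows `label` on the first line containing it, or ""."""
    # Match the label case-insensitively, skip an optional ':' or '#', keep the rest.
    pattern = re.compile(re.escape(label) + r"\s*[:#]?\s*(.*)", re.IGNORECASE)
    for line in text.splitlines():      # step 3: split the text into lines
        match = pattern.search(line)    # step 4: the "Contains" check
        if match:
            return match.group(1).strip()
    return ""


def build_csv(pdf_texts, labels):
    """pdf_texts maps a file name to its extracted text; returns CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["file"] + labels)  # header row
    for name, text in pdf_texts.items():
        writer.writerow([name] + [find_labeled_value(text, lab) for lab in labels])
    return out.getvalue()


# Hypothetical extracted text standing in for two differently formatted invoices.
sample = {
    "a.pdf": "ACME Ltd\nInvoice No: INV-001\nInvoice Date: 2020-01-15\n",
    "b.pdf": "INVOICE NO # 7788\nTotal 10.00\n",
}
csv_text = build_csv(sample, ["Invoice No", "Invoice Date"])
print(csv_text)
```

A label that is missing from a file simply gives an empty cell, so one odd invoice does not stop the rest from being written.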


Hi Sandeep_Kumar2! Hope you are doing well. I have also faced this situation, but unfortunately I haven't found a solution yet. Can you please share the .xaml file if you solved this issue? Many thanks in advance.

@Sandeep_Kumar2 and @umair_hanif

You might both want to have a look here: How to use the IntelligentOCR Package. With the new developments, in case the issue is still current, you might find the right solution there!


I did as you said, but I can't capture specific data. For example, I want to get the Invoice no: value and the table data for each specific header. I have attached my PDF and XAML files below: bristanGroup.pdf (95.4 KB), Main.xaml (5.7 KB).

Hi @Oyndrila_Chowdhury



Try this. If you are still facing any issues, kindly let me know :slight_smile:


Thanks @Sob. It works fine. But now, how can I get the data under specific table headers and the individual addresses?