Extract data from PDF to Excel issue


#1

Hi all,

I want to extract data from a PDF into Excel. Here is my file: INVOICE 1500354862.pdf (506.0 KB)

I want to extract the following:
tax invoice:1500354862
Port of loading:HKHKG
Port of discharge:SGSIN
total net amount taxable:879.27
total tax amount:7
total invoice/credit amount:886.27

I have a lot of PDFs like this and need to extract their data to Excel.
Can anyone tell me how to do it?


#2

@joyozou

Read the PDF and store its text in a string, strabc.

Split the string on new lines and store the result in an array of strings.
If your format stays the same, you can do it like this:
strArray = strabc.Split({Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)

Then run a For Each loop over the array.
Split each element on ":".
After splitting, take the second element (index 1), which is the value.
The first line will give you the tax invoice number,
the second line will give you the port of loading,
and so on.
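
The steps above can be sketched outside a workflow as follows. This is a minimal sketch in Python rather than UiPath/VB.NET, so treat it as an illustration of the logic only. The sample text is an assumption built from the fields listed in post #1, with one "label:value" pair per line; the text actually read from the PDF may be laid out differently. parse_fields is a hypothetical helper name.

```python
# Hypothetical PDF text: assumes every field sits on its own "label:value" line.
sample_text = """tax invoice:1500354862
Port of loading:HKHKG
Port of discharge:SGSIN

total net amount taxable:879.27
total tax amount:7
total invoice/credit amount:886.27
"""


def parse_fields(text):
    """Split the text into lines, then split each line on the first colon."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip empty lines and lines that carry no label
        label, value = line.split(":", 1)  # index 0 is the label, index 1 the value
        fields[label.strip()] = value.strip()
    return fields


fields = parse_fields(sample_text)
for label, value in fields.items():
    print(label, "=", value)
```

Keying the values by label instead of by line position means a missing or extra line does not shift every later field, which is one weakness of the "first line, second line" approach.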

If the format changes from file to file, you will have to use Regex.
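
A rough sketch of the Regex route, again in Python rather than a UiPath workflow. The label wording, the five-character port codes, the messy sample text, and the names PATTERNS and extract_fields are all assumptions for illustration; the patterns would need adjusting to whatever text the real PDFs produce. The last part writes one row per invoice as CSV, which Excel can open.

```python
import csv
import io
import re

# Assumed label wording and value shapes; adjust to the real PDF text.
AMOUNT = r"(\d+(?:[.,]\d+)*)"
PATTERNS = {
    "tax invoice": r"tax\s+invoice\s*:?\s*(\d+)",
    "port of loading": r"port\s+of\s+loading\s*:?\s*([A-Z]{2}[A-Z0-9]{3})",
    "port of discharge": r"port\s+of\s+discharge\s*:?\s*([A-Z]{2}[A-Z0-9]{3})",
    "total net amount taxable": r"total\s+net\s+amount\s+taxable\s*:?\s*" + AMOUNT,
    "total tax amount": r"total\s+tax\s+amount\s*:?\s*" + AMOUNT,
    "total invoice/credit amount": r"total\s+invoice/credit\s+amount\s*:?\s*" + AMOUNT,
}


def extract_fields(text):
    """Search the whole text for each label; a field that is not found becomes None."""
    row = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        row[name] = match.group(1) if match else None
    return row


# Hypothetical text where spacing, case, and line breaks are inconsistent.
messy_text = (
    "TAX INVOICE : 1500354862\n"
    "Port of Loading HKHKG   Port of Discharge: SGSIN\n"
    "Total net amount taxable 879.27\n"
    "Total tax amount 7\n"
    "Total invoice/credit amount 886.27\n"
)

row = extract_fields(messy_text)

# One row per PDF; with many PDFs, call writerow once per extracted dict.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(PATTERNS))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Because each pattern searches the whole text, the fields no longer have to appear on fixed lines or in a fixed order.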
Regards,
Mahesh


#3

Hi Mahesh, thank you for your reply. I tried many ways, but none of them were successful. Could you share your XAML?


#4

Can you share your XAML for what you explained?