How to convert extracted PDF data to Excel in a specified format

Hi,

We have a PDF file with invoice details, a check number, and a check date. We need to extract these details into Excel, with the invoice details as rows and the check number and check date added to each row.

The output is displayed in a message box as follows:

Please help us with this flow.

Thanks,
Mounika

Here is an idea of how to do this.

  1. Once you read the text, split it into an array so that each array item contains one line.
  2. Find the index of “No.” in arr[0] and use Substring to fetch the check number.
  3. Find the index of “Date:” in arr[1] and use Substring to fetch the check date.
    [Manually add the column headers]
  4. Once you find “Invoice” in the array of strings, your data (the invoice details) starts from the next row.

[Before entering data into Excel, Trim each extracted value to remove unwanted spaces.]
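
As a rough illustration of steps 1 to 3, here is a minimal Python sketch. In UiPath the same logic would use the .NET string methods Split, IndexOf, Substring and Trim. The sample text, its layout, and the helper names `value_after` and `read_check_header` are hypothetical, since the actual PDF text is not shown in this thread.

```python
# Hypothetical extracted text; the real layout of the PDF is an assumption.
SAMPLE_TEXT = """Check No. 004521
Check Date: 01/20/17
Invoice Description  Invoice Date  Gross Amount  Discount  Net Amount Paid
"""


def value_after(line, marker):
    """Return the trimmed text that follows `marker` in `line`."""
    # str.index raises ValueError if the marker is missing.
    start = line.index(marker) + len(marker)
    return line[start:].strip()  # strip() plays the role of Trim


def read_check_header(text):
    """Split the text into lines and pull out check number and check date."""
    arr = text.splitlines()                     # step 1: one item per line
    check_number = value_after(arr[0], "No.")   # step 2
    check_date = value_after(arr[1], "Date:")   # step 3
    return check_number, check_date


check_number, check_date = read_check_header(SAMPLE_TEXT)
print(check_number, check_date)
```

If the check number or date sits on a different line in your file, adjust the array indices accordingly.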

Extracting Invoice Description:

  1. Find Index of “/”

Invoice Description = arr[datarowindex].Substring(0,IndexOfSlash-3) [this leaves out the start of the date, e.g. 01/]

Extracting Invoice Date:

  1. Find Index of “/”
  2. Find Index of first $ sign
  3. Count the number of characters from position slash-3 up to the first $ sign (Count = IndexOfFirst$Sign - (IndexOfSlash-3))

Invoice Date = arr[datarowindex].Substring(IndexOfSlash-3,Count)
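
A minimal Python sketch of the description and date extraction follows. It is not the exact expression above: it assumes the date starts with a two-digit month immediately before the first slash, so it cuts at slash-2 and relies on trimming, which works whether or not a space separates the description from the date. It also assumes the description itself contains no “/”. The function name is hypothetical.

```python
def split_description_and_date(row):
    """Return (invoice_description, invoice_date) from one data row."""
    slash = row.index("/")          # first slash belongs to the date
    first_dollar = row.index("$")   # first $ marks the start of the amounts
    date_start = slash - 2          # assumption: two-digit month before the slash
    description = row[:date_start].strip()
    invoice_date = row[date_start:first_dollar].strip()
    return description, invoice_date


# Sample row taken from later in this thread.
print(split_description_and_date("6083161001/18/17$280.00$0.00$280.00"))
```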

Extracting Gross Amount:

  1. Find Index of the first $ sign
  2. Find Index of the second $ sign
  3. Count the number of characters between the first and second $ signs (Count = indexOfSecond$Sign - indexOfFirst$Sign)

Gross Amount = arr[datarowindex].Substring(indexOfFirst$Sign,Count)

Extracting Discount Amount:

  1. Find Index of the second $ sign
  2. Find Index of the third $ sign
  3. Count the number of characters between the second and third $ signs (Count = indexOfThird$Sign - indexOfSecond$Sign)

Discount Amount = arr[datarowindex].Substring(indexOfSecond$Sign,Count)

Extracting Net Amount Paid:

  1. Find index of the third $ sign

Net Amount Paid = arr[datarowindex].Substring(indexOfThird$Sign) [will fetch till the end of the string]
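
The three amount extractions can be sketched in Python as below. This is only an illustration of the index-based approach, assuming each data row contains exactly three $ signs in the order gross, discount, net; the function names are hypothetical.

```python
def dollar_positions(row):
    """Return the index of every $ sign in the row, left to right."""
    return [i for i, ch in enumerate(row) if ch == "$"]


def split_amounts(row):
    """Return (gross, discount, net) using the positions of the $ signs."""
    # Unpacking raises ValueError if the row has fewer than three $ signs.
    first, second, third = dollar_positions(row)[:3]
    gross = row[first:second].strip()      # first $ up to the second $
    discount = row[second:third].strip()   # second $ up to the third $
    net = row[third:].strip()              # third $ to the end of the string
    return gross, discount, net


# Sample row taken from later in this thread.
print(split_amounts("6083161001/18/17$280.00$0.00$280.00"))
```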

  5. Terminate the loop once you find “******” in the array.
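
The overall loop could look roughly like the Python sketch below. It assumes a single header line containing “Invoice”, data rows directly after it, and a line containing “******” marking the end. The sample lines (including the second data row) and the function name `collect_rows` are hypothetical. Each output row keeps the raw data line and appends the check number and check date; splitting the line into individual columns follows the steps described above.

```python
def collect_rows(arr, check_number, check_date):
    """Collect data lines between the 'Invoice' header and the '******' line."""
    rows = []
    in_data = False
    for line in arr:
        if "******" in line:
            break                      # end marker reached: stop the loop
        if in_data:
            if line.strip():           # skip blank lines
                rows.append([line.strip(), check_number, check_date])
        elif "Invoice" in line:
            in_data = True             # data starts on the next line
    return rows


# Hypothetical lines for illustration only.
lines = [
    "Check No. 004521",
    "Check Date: 01/20/17",
    "Invoice Description  Invoice Date  Gross Amount  Discount  Net Amount Paid",
    "6083161001/18/17$280.00$0.00$280.00",
    "6083161101/19/17$100.00$5.00$95.00",
    "******** TOTAL ********",
    "text after the marker is ignored",
]
rows = collect_rows(lines, "004521", "01/20/17")
for row in rows:
    print(row)
```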

Regards,
Karthik Byggari

Hi @KarthikByggari ,

Thanks for your help.

Could you please provide a sample .xaml file for our reference?

Thanks,
Mounika

Please send the text in a notepad file and attach it here.
I will send you the xaml.

@KarthikByggari Attached Extract Data notepad file.

Thank you.
Extract Data.zip (670 Bytes)

I will send you the xaml file by today!
Happy roboting…

Please find the attached flow.
extract_Data.zip (17.8 KB)

@kirti.iyer
Hi Kirti,

Thanks for your help.

Based on your inputs, we tried to write regular expressions to put the 3 amounts into three columns, but we are unable to extract each individual amount between two dollar symbols.

text: 6083161001/18/17$280.00$0.00$280.00
Expected: We need amounts to be extracted individually with regular expressions.

Could you please help us complete the extraction of data into Excel?

Thanks & Regards,
Mounika Polsani

Just use the Split method with the dollar sign as the separator. While writing the values, add the dollar sign back at the front.
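
For illustration, here is a small Python sketch of that idea, using the sample text from the question. In UiPath the equivalent would be the .NET String.Split method. A regular-expression variant is included as well, since the question asked for one; the pattern is an assumption about how the amounts are formatted (digits, optional commas, optional decimals).

```python
import re

text = "6083161001/18/17$280.00$0.00$280.00"

# Split on the dollar sign; parts[0] is everything before the first amount.
parts = text.split("$")
amounts = ["$" + p.strip() for p in parts[1:]]   # add the dollar back at the front
print(amounts)

# Alternative: match each amount directly with a regular expression.
amounts_re = re.findall(r"\$[\d,]+(?:\.\d+)?", text)
print(amounts_re)
```

Each list item can then be written to its own column (gross, discount, net).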

@kirti.iyer

We tried using that, but it is not working.

Could you please write the complete expression to get a particular amount into one column?

Thanks,
Mounika Polsani

Please find the updated flow
extract_Data.zip (18.9 KB)


Thank you @kirti.iyer
