Find text - 1st occurrence in PDF and get value next to it

Hello,

could you please advise:
I need to get some information from pdf attachments (various customers, so each pdf can vary) like Invoice no, Delivery date, Amount etc.

  • I am able to split the PDF text into lines (with output.Split(Environment.NewLine.ToArray, StringSplitOptions.RemoveEmptyEntries))

  • I can find, for example, the “Delivery date” string and get the delivery date, but my questions are:

  • “Delivery date” might appear 3 times in the PDF. How do I find the 1st occurrence and get the date next to it?

  • The “Delivery date” string and the date itself can be laid out differently in each PDF (due to different customers): the date may be right next to the string, right below it, or at the end of the row. How can I ensure that I always get the date? Do I need to write code for each variation?

Thank you

@Mariansson

Please share a sample input string and the required output string.

Thanks

@Mariansson

Check the attached workflow:

BlankProcess13 (2).zip (18.0 KB)

I tried running OCR on the PDF, but the OCR results were poor, so the extracted values may differ from what is actually on the page.

If you are getting good OCR results, this flow will work as expected.

Mark as solution if this helps

Thanks

Thank you ksrinu. I can use your proposal.
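I also found this solution:
1. Start = myText.IndexOf("Nummer")
2. Write Line: "Nummer: " + myText.Substring(Start + 7, 7)

IndexOf already returns the zero-based position of the first occurrence as an Integer (or -1 if the text is not found), so no Substring/ToInt conversion is needed in between. Step 2 skips the 6 characters of "Nummer" plus one separator character and takes the next 7 characters.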

Thank you for your help!
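
For the layout question (value right next to the label, at the end of the row, or on the line below), one option is to stop relying on fixed offsets and search for a date pattern near the first occurrence of the label. The following is only a minimal sketch in Python, not the flow from the attachment: the function name `first_value_after`, the `DATE_RE` pattern, and the supported date formats (31.12.2023, 31/12/2023, 2023-12-31) are assumptions for illustration. The pattern uses only basic regex syntax, so it should translate to a .NET Regex with little change, but that has not been verified here.

```python
import re

# Assumed date formats: 31.12.2023, 31/12/2023, 31-12-2023, 2023-12-31.
# Extend this pattern for other formats your customers use.
DATE_RE = re.compile(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{2,4}|\d{4}-\d{2}-\d{2})\b")


def first_value_after(text, label, value_re=DATE_RE):
    """Return the value belonging to the FIRST occurrence of `label`.

    Looks on the rest of the label's own line first (value right next to
    the label or at the end of the row), then on the next non-empty line
    (value below the label). Returns None if nothing is found.
    """
    # Drop empty lines, like Split(..., RemoveEmptyEntries) in the question.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    label_re = re.compile(re.escape(label), re.IGNORECASE)

    for i, line in enumerate(lines):
        hit = label_re.search(line)
        if hit is None:
            continue
        # Case 1: value somewhere after the label on the same line.
        m = value_re.search(line, hit.end())
        if m:
            return m.group(0)
        # Case 2: value on the line directly below the label.
        if i + 1 < len(lines):
            m = value_re.search(lines[i + 1])
            if m:
                return m.group(0)
        # Only the first occurrence is considered, so stop here.
        return None
    return None


sample = "Invoice no: 123\nDelivery date: 05.03.2024 Amount 10\nDelivery date 06.03.2024"
print(first_value_after(sample, "Delivery date"))
```

Because the search is driven by the shape of the value rather than its position, one function covers the three layouts mentioned in the question. It still depends on the extracted text being accurate, so poor OCR output will cause misses.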
