How to get text from a PDF when it spans multiple lines

Hi People,

I need to extract a piece of text from a PDF.

For example, the format of the PDF is:

Supplier Name ABCD-123 Invoice Type Pay
TDF
Invoice Number DFG-1234
0000

But when I read the PDF using the “Read PDF Text” activity, I get the string as:

Supplier Name ABCD-123 Invoice Type Pay
TDF
Invoice Number DFG-1234
0000

When I use the regex

"(?<=Supplier Name ).*(?=Invoice Type)"

I get the output “ABCD-123”, but I require the output “ABCD-123TDF”.

It would be great if anyone could help me out with this issue.

Thank you

I assume that “Invoice Type Pay”, which comes from the middle or rightmost part of the document, ends up on the same line as “Supplier Name” after extraction, and that TDF is on the line after “Supplier Name”.

So you already have the regex for the first part of the supplier name. What you can do is extract the data from the next line and then simply join the two strings:

strSecondPart = "(?<=Invoice Type Pay ).*(?=Invoice Number)"
then use + to concatenate the two string variables.

Hi Rahul,

The “Invoice Type” is constant but the “Pay” will keep changing. It may be “Standard” too.

Thanks

If this is the text and the last part, “Pay”, can change, split on the newline and fetch the data from the second line:
strSecondPart = Split(strExtractedText, Environment.NewLine)(1).ToString
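As a rough illustration of the split-and-join idea, here is a small Python sketch. The thread itself uses VB.NET expressions in UiPath; Python is used here only so the logic is easy to run. The sample text is copied from the question, the variable names are made up for this example, and it assumes that TDF really lands on the second line of the extracted string.

```python
import re

# Sample text as posted in the question (assumed to match the real extraction).
extracted_text = (
    "Supplier Name ABCD-123 Invoice Type Pay\n"
    "TDF\n"
    "Invoice Number DFG-1234\n"
    "0000"
)

# First part: everything between "Supplier Name " and "Invoice Type".
match = re.search(r"(?<=Supplier Name ).*(?=Invoice Type)", extracted_text)
first_part = match.group(0).strip() if match else ""

# Second part: the second line (index 1), whatever follows "Invoice Type".
lines = extracted_text.splitlines()
second_part = lines[1].strip() if len(lines) > 1 else ""

# Join the two pieces with plain string concatenation.
supplier_name = first_part + second_part
print(supplier_name)  # ABCD-123TDF
```

Because the second part is taken by line position rather than by matching “Pay”, it should keep working when the invoice type is “Standard” or something else, as long as the line layout stays the same.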


Hi Rahul,

The regex condition you suggested is not fetching anything; the result comes back blank.

Thank you

Since you have the actual file, please check the logic against it. Environment.NewLine is a line break. Check on which line the desired part appears in your actual extracted data. The result will be blank if that line is blank.

Split converts the string into an array of strings; in this case every line becomes one array item. Just see which line holds that data in your extracted text and use the corresponding array element.
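To see which array element actually holds the value, it can help to print every line together with its index. The Python sketch below shows the idea. The sample string is hypothetical: it assumes the extraction contains Windows line endings and one blank line, which is only one possible reason for an empty result.

```python
# Hypothetical extraction result with Windows line endings and a blank line,
# to show why a fixed index can come back empty.
extracted_text = (
    "Supplier Name ABCD-123 Invoice Type Pay\r\n"
    "\r\n"
    "TDF\r\n"
    "Invoice Number DFG-1234\r\n"
    "0000"
)

lines = extracted_text.splitlines()

# Print each line with its index to see where the wanted value sits.
for index, line in enumerate(lines):
    print(index, repr(line))

# Dropping blank lines makes the position of the value predictable again.
non_blank = [line.strip() for line in lines if line.strip()]
second_part = non_blank[1]
```

Here index 1 of the raw split is an empty string, while index 1 of the filtered list is the wanted TDF. In .NET the same check is worth doing, since Environment.NewLine is "\r\n" on Windows and the extracted text may use a different line break.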

Referring to groups:
[screenshot]

Result on prototype:
[screenshot]

"(?<=Supplier Name )(.*)(?:Invoice Type Pay\r?\n)(.*)"
String.Join("", {1,2}.Select(Function(x) myMatch.Groups(x).ToString.Trim))
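For anyone who wants to try the group-based approach outside UiPath, here is a hedged Python sketch of the same idea. It is a variation, not the exact pattern above: instead of the literal “Pay” it skips whatever follows “Invoice Type” on that line, since the question says that value can change. The function name `join_supplier_name` and the sample text are assumptions made for this example.

```python
import re

# Sample text based on the question, with "Standard" instead of "Pay"
# and Windows line endings (both are assumptions for this example).
extracted_text = (
    "Supplier Name ABCD-123 Invoice Type Standard\r\n"
    "TDF\r\n"
    "Invoice Number DFG-1234\r\n"
    "0000"
)

# Group 1: text after "Supplier Name " up to "Invoice Type".
# Non-capturing group: "Invoice Type", the rest of that line, and the line break.
# Group 2: the whole next line.
pattern = re.compile(
    r"(?<=Supplier Name )(.*?)(?:Invoice Type[^\r\n]*\r?\n)([^\r\n]*)"
)

def join_supplier_name(text: str) -> str:
    """Return groups 1 and 2 trimmed and joined, or "" if nothing matches."""
    match = pattern.search(text)
    if match is None:
        return ""
    return "".join(match.group(i).strip() for i in (1, 2))

print(join_supplier_name(extracted_text))  # ABCD-123TDF
```

The trimming step plays the same role as the Trim call in the VB.NET expression: it removes the space left before “Invoice Type” and any stray whitespace around the second line.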

