Extracting text from a PDF using a starts-with match

Hello Everyone,

I’m trying to extract specific fields from a PDF file, and the data is in a tabular format.

The data is like below

The field values are not fixed, so I need to find the index of the field label in the data table and then extract the text in the next column.
Can anyone suggest how to solve this?

Did you try the Read PDF Text or Read PDF with OCR activity? The output is a variable of type String, which we can pass as input to a regex expression to get the value we want.
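To make the regex idea concrete, here is a minimal sketch in Python (only to show the pattern logic; in a workflow the same pattern would go into whatever regex step you use). The sample text, the `value_after_label` helper and the "label : value" layout are assumptions for illustration, not taken from the actual PDF.

```python
import re

# Hypothetical sample of what a PDF text reader might return (not the real file).
pdf_text = "Name : John Smith\nStreet name : 12 Hill Road\nCity : Pune\n"

def value_after_label(text, label):
    """Return the value on the first line whose label starts with `label`, else None.

    Assumes each field is written as "<label> : <value>" on a single line.
    """
    # ^ anchors at a line start (MULTILINE); the label may continue up to the colon.
    pattern = r"^[ \t]*" + re.escape(label) + r"[^:\n]*:[ \t]*(.+)$"
    match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
    return match.group(1).strip() if match else None

print(value_after_label(pdf_text, "street"))
```

Because the match is case-insensitive and allows extra characters before the colon, it should cover labels such as "Street name" as well as "street".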

Cheers @Shivaraju

I tried the Read PDF Text activity, but for some fields the value is split across multiple lines, so I cannot use that activity. That’s why I preferred data scraping.

Yes, data scraping is fine as long as we don't need to scroll down the PDF.
Otherwise Read PDF Text will help, and even if a value spans multiple lines we will still be able to get the value we want.
If possible, can I have a screenshot of the output that we get from the Read PDF activities?
Pass the output variable to a Write Line activity so that we will be able to see the output.
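For the multi-line case, one possible approach is to keep collecting lines until the next field label appears. The sketch below is Python and only illustrates the logic; the sample text and the `multiline_value` helper are hypothetical, and it assumes that every field line contains a colon while continuation lines do not.

```python
import re

# Hypothetical reader output where the street value wraps onto a second line.
sample = "Name : John Smith\nStreet name : 12 Hill Road,\nNear Lake View\nCity : Pune\n"

def multiline_value(text, label):
    """Collect the value for `label`, including wrapped lines, or return None."""
    collected = None
    for line in text.splitlines():
        # Assumption: a line that has "something :" at its start begins a new field.
        is_field = re.match(r"^[ \t]*[^:\n]+:", line) is not None
        if collected is None:
            if is_field and line.strip().lower().startswith(label.lower()):
                collected = [line.split(":", 1)[1].strip()]
        elif is_field:
            break  # the next field starts here, so the value is complete
        else:
            collected.append(line.strip())
    if collected is None:
        return None
    return " ".join(part for part in collected if part)

print(multiline_value(sample, "street"))
```

If a wrapped value can itself contain a colon, this rule would cut it short, so the real output needs to be checked first.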

Cheers @Shivaraju

As of now I am able to get the required text from the PDF using data scraping.
Initially I used the Read PDF Text activity but was unable to get the required text fields.
I’m traveling at the moment; I will share a sample of how the text comes out with Read PDF Text.
Can you tell me how to find the index of a string that starts with “street” in a DataTable?

Do we know which column this value will be in?

The field position varies, so I am not depending on the column name or the column index.
I used row.itemArray.IndexOf("street")>1
That gives me the row and column indexes.
But the problem is that in some cases the field is “Street name” or “street name :”.
So I need to search for a string that starts with “street”.
I hope you understand what my problem is.
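An IndexOf lookup on the item array compares whole cell values, so it only finds a cell that is exactly "street". A starts-with search has to test each cell in turn. Here is a minimal sketch of that loop in Python, with the table modelled as a list of rows; the sample data and the helper names `find_label` and `value_next_to` are hypothetical. In a workflow the per-cell test would presumably be the .NET `String.StartsWith` call with `StringComparison.OrdinalIgnoreCase` on the trimmed cell text.

```python
# Hypothetical scraped table: each inner list is one row of cells.
table = [
    ["Name", "John Smith"],
    ["Street name :", "12 Hill Road"],
    ["City", "Pune"],
]

def find_label(rows, prefix):
    """Return (row_index, column_index) of the first cell starting with `prefix`."""
    prefix = prefix.lower()
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            # Treat empty cells as "" and ignore case and surrounding spaces.
            text = "" if cell is None else str(cell).strip().lower()
            if text.startswith(prefix):
                return r, c
    return None

def value_next_to(rows, prefix):
    """Return the cell in the column right after the matching label, or None."""
    hit = find_label(rows, prefix)
    if hit is None:
        return None
    r, c = hit
    row = rows[r]
    return row[c + 1] if c + 1 < len(row) else None

print(find_label(table, "street"))
print(value_next_to(table, "street"))
```

This matches “Street name” and “street name :” alike, and it returns nothing when the label sits in the last column with no value after it.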