I want to read specific text from a PDF. It has multiple pages, and the pages are dynamic

I want to read a PDF from Excel. It can be a single page or multiple pages. I want to read specific text that appears in the format shown here.
[Image: MicrosoftTeams-image (83)]
[Image: MicrosoftTeams-image (77)]

I want to get this value and write it into Excel in one specified row, one after another.

Can someone share an expression for this?

@Bhushan_Nagaonkar

Can you share the output of Read PDF Text? That would show how the text looks and how it can be extracted. You can remove the original data and keep the headers so we know the format.

Cheers

It's a document whose headers change constantly. Basically, it's a financial document, so its headers are not very similar.

@Bhushan_Nagaonkar

Do you always need the SSN, or other fields as well? Are the fields you need always present, or only sometimes? Without seeing the format of the data, it's difficult to give an answer.

Cheers

I need only the SSN, every time it is available in the two formats shown above.

@Bhushan_Nagaonkar

Then we need to see how the text looks once extracted in order to give you a regex.

Cheers

I will share the text in a few minutes; give me some time.

Please check your DM.

@Bhushan_Nagaonkar

Try this

System.Text.RegularExpressions.Regex.Match(str,"(?<=Social\s+Security\s+Number:).*").Value
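
If it helps to see the whole flow, here is a rough Python sketch of the same idea applied to several pages, collecting the values in order so they can be written one after another. It is not the UiPath expression itself. The `extract_ssns` helper and the sample page text are hypothetical, and only the "Social Security Number:" label comes from the expression above. Python's `re` module requires fixed-width lookbehind, so the sketch uses a capture group instead of `(?<=...)`.

```python
import re

# Label taken from the expression above; whatever follows it on the same
# line is treated as the value (an assumption about the extracted text).
SSN_PATTERN = re.compile(r"Social\s+Security\s+Number:[ \t]*(.*)")


def extract_ssns(page_texts):
    """Return every non-empty value found after the label, in page order."""
    values = []
    for text in page_texts:
        for match in SSN_PATTERN.finditer(text):
            value = match.group(1).strip()  # drop stray spaces or a trailing \r
            if value:
                values.append(value)
    return values


# Hypothetical sample text standing in for the Read PDF Text output.
pages = [
    "Name: A\nSocial Security Number: 000-00-0000\nOther: x",
    "No matching field on this page",
    "Social  Security\nNumber:000-00-0001",
]
print(extract_ssns(pages))
```

In the .NET expression, appending `.Trim()` after `.Value` would similarly remove the space or carriage return that can surround the captured text, and `Regex.Matches` can be used in place of `Regex.Match` if the label may appear more than once in the text.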

Cheers
