Can you share the output of Read PDF Text? That would show how the data looks and how it can be extracted. You can remove the original data and keep just the headers so the format is clear.
Do you always need the SSN, or other fields as well? Are the fields you need always present, or only sometimes? Without seeing the format of the data, it's difficult to give an answer.