I had a similar problem with downloaded PDF files while going through the dev training. If you have trouble scraping partial text from a PDF, there’s a note in the introduction to lesson 10 on how you could solve this issue.
It didn’t work for me, though. As far as I remember, I used OCR scraping and string formatting to get the data. You could also use the Computer Vision activities that shyamm told you about earlier. This technology looks pretty neat, but I have not used it in development yet.
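To give a rough idea of the "OCR scraping plus string formatting" approach, here is a minimal Python sketch. It is not the workflow I actually built; the `extract_field` helper, the sample text, and the `Label: value` layout are all made up for illustration, and it assumes the OCR step has already handed you the page as plain text.

```python
import re
from typing import Optional


def extract_field(ocr_text: str, label: str) -> Optional[str]:
    """Return the value that follows 'label:' in noisy OCR text, or None.

    Hypothetical helper: assumes each field sits on one line as 'Label: value'.
    """
    # OCR output often contains runs of spaces or tabs; collapse them first.
    cleaned = re.sub(r"[ \t]+", " ", ocr_text)
    # Match the label, an optional space around the colon, then the rest of the line.
    pattern = re.escape(label) + r"\s*:\s*(.+)"
    match = re.search(pattern, cleaned, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


# Made-up OCR output with the uneven spacing you typically get from a scan.
sample = "Invoice  No :  INV-00123\nTotal   Amount: 1 250.00 EUR\n"

print(extract_field(sample, "Invoice No"))
print(extract_field(sample, "Total Amount"))
```

Cleaning the whitespace before matching is the part that mattered most for me, since the OCR result rarely has the same spacing twice.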