Getting a certain type of text from a PDF using screen scraping

Hello everybody, I am a newbie to Studio.
I was trying to get problems and their solutions from a text PDF, on which web scraping wouldn't work. I then tried screen scraping, but it scrapes only the text that I indicate, unlike web scraping, which finds a pattern and does the job. Could someone suggest a way to get the data I need out of the PDF?
Any effort is appreciated.

Hi @ShawnShaw

Use the Read PDF With OCR activity to extract the text, do string manipulation on it, and then use the Is Match activity with a regex expression. Based on the match, you can get the appropriate text.
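
As a rough sketch of that idea outside Studio, the Python snippet below runs a regex over text that has already been extracted from the PDF. The `Problem N:` / `Solution:` labels and the sample text are assumptions for illustration only; the real pattern depends on how the book marks its problems.

```python
import re

# Hypothetical text, standing in for the output of a PDF-to-text step.
sample_text = """Chapter 3 Exercises
Problem 1: Find x if 2x + 3 = 11.
Solution: x = 4
Some unrelated footer text
Problem 2: What is the derivative of x^2?
Solution: 2x
"""

# Assumed layout: "Problem <n>: <text>" followed by "Solution: <text>".
# DOTALL lets a problem statement span several lines.
PAIR_PATTERN = re.compile(
    r"Problem\s+(\d+):\s*(.+?)\s*Solution:\s*(.+?)\s*(?=\n|$)",
    re.DOTALL,
)

def extract_pairs(text):
    """Return (number, problem, solution) tuples found in the text."""
    return [
        (int(num), problem.strip(), solution.strip())
        for num, problem, solution in PAIR_PATTERN.findall(text)
    ]

pairs = extract_pairs(sample_text)
for number, problem, solution in pairs:
    print(number, problem, "->", solution)
```

Lines that do not fit the assumed labels, such as the footer line in the sample, are simply skipped.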

Ashwin S

Try converting the PDF to text, then do string manipulation on that text to get the questions and answers, @ShawnShaw.

I know you are looking for something easier than what I suggested, but I am not sure if that is possible.

Thanks for the prompt reply, but is it possible to use regex on an MCQ or a question? I think it would get confused by the other text.
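
One way to keep a regex from being confused by the surrounding text is to anchor it to the parts of an MCQ that are regular: the question number at the start of a line and the option labels. Below is a minimal Python sketch, assuming questions look like `1. ...` and options look like `(a) ...`. Both formats and the sample text are assumptions, not taken from the book in question.

```python
import re

# Hypothetical extracted text with noise around one MCQ.
text = """Page 14    Physics Workbook
1. Which unit measures force?
(a) Joule
(b) Newton
(c) Watt
(d) Pascal
See the appendix for more practice.
"""

# A question starts a line with "<number>. "; MULTILINE makes ^ and $ work per line.
QUESTION = re.compile(r"^(\d+)\.\s+(.+)$", re.MULTILINE)
# An option starts a line with "(<letter>) ".
OPTION = re.compile(r"^\(([a-d])\)\s+(.+)$", re.MULTILINE)

def parse_mcqs(raw):
    """Group each question with the option lines that follow it."""
    mcqs = []
    starts = list(QUESTION.finditer(raw))
    for i, match in enumerate(starts):
        # This question's block runs until the next question or the end of the text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(raw)
        block = raw[match.end():end]
        options = {label: body.strip() for label, body in OPTION.findall(block)}
        mcqs.append({
            "number": int(match.group(1)),
            "question": match.group(2).strip(),
            "options": options,
        })
    return mcqs

mcqs = parse_mcqs(text)
print(mcqs)
```

Because both patterns are tied to the start of a line, the page header and the trailing sentence in the sample are ignored rather than mistaken for part of the question.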

Thanks for the reply. I don't think I can convert the PDF to text, because the book has two halves on each page. Or is that possible? I don't know.

That was a great question, I never thought of that, @ShawnShaw. I don't think we have an option to read both sides so far. I will check and get back to you soon.
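
For a page with two halves, one possible approach (sketched here in Python, not a built-in Studio feature) is to use the position of each word: split the words at the horizontal middle of the page, then read the left half top to bottom before the right half. The `(x, y, text)` tuples and the `read_two_columns` helper below are hypothetical stand-ins for whatever a position-aware extractor returns.

```python
from collections import defaultdict

# Hypothetical word boxes: (x, y, text), x measured from the left edge
# and y from the top of the page.
words = [
    (50, 100, "Q1."), (90, 100, "Define"), (150, 100, "force."),
    (350, 100, "Q3."), (390, 100, "Define"), (450, 100, "power."),
    (50, 120, "Q2."), (90, 120, "Define"), (150, 120, "work."),
    (350, 120, "Q4."), (390, 120, "Define"), (450, 120, "energy."),
]

def read_two_columns(word_boxes, page_width):
    """Return lines in reading order: whole left column, then whole right column."""
    middle = page_width / 2
    columns = {"left": defaultdict(list), "right": defaultdict(list)}
    for x, y, text in word_boxes:
        side = "left" if x < middle else "right"
        columns[side][y].append((x, text))  # group words that share a y value

    lines = []
    for side in ("left", "right"):
        for y in sorted(columns[side]):         # top to bottom
            ordered = sorted(columns[side][y])  # left to right within the line
            lines.append(" ".join(text for _, text in ordered))
    return lines

lines = read_two_columns(words, page_width=600)
print("\n".join(lines))
```

Real extractors rarely give identical y values for every word on a line, so grouping by exact y as done here would likely need a small tolerance in practice.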
