Based on the screenshot below, I have a scanned PDF file. Tax PDFs can run up to 100 pages, so I don’t want to digitize the whole file at once because it would take too long.
Is there a way to loop through each page of a PDF file and check whether it contains a given text? For example, if the bot finds a page that contains the text “U.S. Individual Income Tax Return”, it should stop and return that page number. Any ideas would be a great help. Thank you.
@Jelrey - Please take a look at this. Download this workflow and it will give you an idea of how to loop through pages. Here I have looped through the pages and deleted a page where the text was found. One thing you have to change: instead of Read PDF Text, you have to use Read PDF using OCR.
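For readers who want the loop logic spelled out in text, here is a minimal Python sketch of the idea. It is not the attached workflow and not RPA code, only an illustration of "read one page at a time and stop at the first match". The names `find_first_page_with_text` and `read_page_text` are hypothetical; `read_page_text` stands in for whatever single-page OCR step you use (for example, the Read PDF using OCR step restricted to one page).

```python
from typing import Callable, Optional


def _normalize(text: str) -> str:
    # OCR output often has irregular spacing and line breaks,
    # so collapse whitespace and ignore case before comparing.
    return " ".join(text.split()).casefold()


def find_first_page_with_text(
    page_count: int,
    read_page_text: Callable[[int], str],  # hypothetical single-page OCR step
    needle: str,
) -> Optional[int]:
    """Return the 1-based number of the first page containing needle, else None."""
    target = _normalize(needle)
    for page_number in range(1, page_count + 1):
        # Only this one page is digitized per iteration, so pages after
        # the first match are never sent through OCR.
        if target in _normalize(read_page_text(page_number)):
            return page_number
    return None


if __name__ == "__main__":
    # Fake page texts standing in for OCR results (illustrative data only).
    fake_pages = {
        1: "Cover sheet",
        2: "Form 1040  U.S. Individual\nIncome Tax Return",
        3: "Schedule A",
    }
    print(find_first_page_with_text(3, fake_pages.__getitem__,
                                    "U.S. Individual Income Tax Return"))  # 2
```

Because the function returns as soon as a page matches, a 100-page file where the form sits on page 2 costs only two single-page OCR calls.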
@prasath17, do you have an example that does the opposite of what you did? Instead of deleting the page where the text is found, I want to keep the page that contains the text and delete the other pages that do not contain it. Is that possible?
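A minimal Python sketch of that inverse logic, in case it helps frame the question. This is only an illustration, not an RPA workflow. The names `split_pages_by_text` and `read_page_text` are hypothetical, and `read_page_text` stands in for a single-page OCR read. The function only decides which page numbers to keep and which to delete; the actual page removal is left to whatever PDF tool is in use.

```python
from typing import Callable, List, Tuple


def split_pages_by_text(
    page_count: int,
    read_page_text: Callable[[int], str],  # hypothetical single-page OCR step
    needle: str,
) -> Tuple[List[int], List[int]]:
    """Return (pages_to_keep, pages_to_delete) as 1-based page numbers."""
    # Collapse whitespace and ignore case, since OCR spacing is unreliable.
    target = " ".join(needle.split()).casefold()
    keep: List[int] = []
    delete: List[int] = []
    for page_number in range(1, page_count + 1):
        text = " ".join(read_page_text(page_number).split()).casefold()
        if target in text:
            keep.append(page_number)
        else:
            delete.append(page_number)
    # Delete from the last page backwards so that removing one page
    # does not shift the numbers of the pages still waiting to be removed.
    delete.sort(reverse=True)
    return keep, delete


if __name__ == "__main__":
    # Fake page texts standing in for OCR results (illustrative data only).
    fake_pages = {
        1: "Cover sheet",
        2: "U.S. Individual Income Tax Return",
        3: "Schedule A",
        4: "Worksheet",
    }
    keep, delete = split_pages_by_text(4, fake_pages.__getitem__,
                                       "U.S. Individual Income Tax Return")
    print(keep)    # [2]
    print(delete)  # [4, 3, 1]
```

Unlike a stop-at-first-match search, this version has to read every page, because each one needs a keep-or-delete decision. If only one matching page is expected, it may be cheaper to find that page number first and then extract just that page into a new file.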