How to extract searched text from an online PDF document?

Hi All,

I need to search for text in a PDF file that is hosted online, extract the matches, and then process that data.

If there’s a solution, please let me know as soon as possible.

Hi jitendra_123 Sir,

Yes Sir, this is the link:

http://clists.nic.in/ddir/PDFCauselists/madras/2019/Oct/01331102019.pdf

Did the READ PDF TEXT or READ PDF WITH OCR activity help here?
–That activity gives us a string output, and that output variable can be passed as input to a GENERATE DATA TABLE activity, whose output is a variable of type DataTable, named dt
–dt can then be written to an Excel file with the WRITE RANGE activity
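
For readers who want to see the same three steps outside UiPath, here is a minimal Python sketch (not a UiPath workflow). It assumes the PDF text has already been extracted into a string, that columns are separated by two or more spaces or a tab, and that a CSV file is an acceptable stand-in for the Excel output. The function names and the sample text are hypothetical.

```python
import csv
import io
import re


def text_to_rows(text):
    """Turn extracted text into rows: one row per non-empty line,
    columns split on a tab or on runs of two or more whitespace characters."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the string output of the PDF activity.
sample_text = "SNo  Case No  Party\n1  TCA/796/2019  Example Party\n"
print(rows_to_csv(text_to_rows(sample_text)))
```

Real cause lists rarely have such clean column separators, so the split rule would need adjusting to the actual text layout.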

Cheers @SOURAV_KUMAR_DAS

Hi Palaniyappan Sir,

No Sir, PDF extraction didn’t work while the file was open.

May I know why we have kept it open?
If it is not needed, kindly close the PDF and try once more.
Cheers @SOURAV_KUMAR_DAS

Hi Palaniyappan Sir,

Our requirement is to search the text while the PDF is open. A new PDF is generated every day, so we need to process it online.

Do we have the privilege to download that PDF file, so that we can process it with those activities?
Cheers @SOURAV_KUMAR_DAS

Hi Palaniyappan Sir,

Yes Sir, we can download it, but the text has to be searched across multiple files, so downloading each one is time-consuming and we would need to create more activities.

Fine.
So if we are trying to get only one specific piece of information,
we can try the ANCHOR BASE activity.
Is that so? Are we trying to fetch only one piece of text from that PDF?
@SOURAV_KUMAR_DAS

Hi Palaniyappan Sir,

The specified text has to be searched for in multiple PDF files.

Awesome, that’s fine.
ANCHOR BASE would work with multiple PDF files.
Kindly have a look at this thread

Cheers @SOURAV_KUMAR_DAS

Hi Palaniyappan Sir,

If the search text is found, I need to extract the matching entries and write them to one document. Otherwise it should search another PDF file and retrieve the entries from there.

@SOURAV_KUMAR_DAS

Is what you’re trying to get a word, a number, or a large amount of data?

You can use regex to return an array of results based on your specific pattern.
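
As a rough illustration of the regex idea, here is a short Python sketch (the same pattern syntax should carry over to a regex activity with little change). The identifier shape it looks for, uppercase letters, a slash, digits, a slash, and a four-digit year, is an assumption, and both the sample text and the function name are hypothetical.

```python
import re

# Assumed identifier shape, e.g. "ABC/123/2019". Adjust it to the real data.
CASE_NUMBER = re.compile(r"\b[A-Z]+/\d+/\d{4}\b")


def find_case_numbers(text):
    """Return every substring of `text` that matches the assumed pattern."""
    return CASE_NUMBER.findall(text)


# Hypothetical sample standing in for text extracted from the PDF.
sample = "1 ABC/123/2019 listed together with WP/45/2018"
print(find_case_numbers(sample))  # ['ABC/123/2019', 'WP/45/2018']
```

If only one known identifier is needed, a plain substring search is simpler than a pattern. The regex pays off when every identifier of a given shape has to be collected.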

Hi rmunro Sir,

If my search text is “TCA/796/2019” in the current PDF link

http://clists.nic.in/ddir/PDFCauselists/madras/2019/Oct/01331102019.pdf

then it should extract the matching entries and write them to any document.

So can you explain this with a sample workflow?
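
One possible shape for such a workflow, sketched in Python rather than as UiPath activities: take the text already extracted from the PDF, keep every line that contains the search text plus a little of what follows it, and write the result to a text file. The one-line context window, the function names, and the sample text are all assumptions made for illustration.

```python
from pathlib import Path


def extract_matches(text, needle, context=1):
    """Return one block per line containing `needle`:
    the matching line plus the next `context` lines."""
    lines = text.splitlines()
    blocks = []
    for i, line in enumerate(lines):
        if needle in line:
            blocks.append("\n".join(lines[i:i + 1 + context]))
    return blocks


def write_report(blocks, path):
    """Write the blocks to a UTF-8 text file, separated by blank lines."""
    Path(path).write_text("\n\n".join(blocks) + "\n", encoding="utf-8")


# Hypothetical sample standing in for the extracted PDF text.
sample = "1 WP/1/2019 A\n2 TCA/796/2019 B\n  counsel line\n3 WP/2/2019 C"
for block in extract_matches(sample, "TCA/796/2019"):
    print(block)
```

Calling `write_report(extract_matches(sample, "TCA/796/2019"), "matches.txt")` would then save the same blocks to a file. The file name is only an example.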

Hi All,

Is there any solution? Please let me know.

Hi @SOURAV_KUMAR_DAS,

To extract the text from your PDF, you can use Read PDF with OCR. It gives you all the data as text, which you can then use in further activities.
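
Once each PDF is available as text, the "search this file, otherwise try the next one" requirement from earlier in the thread could look roughly like the Python sketch below. It assumes the texts have already been extracted (for example by the OCR step) and are passed in as (name, text) pairs. The function name and the sample data are hypothetical.

```python
def search_documents(documents, needle):
    """Try each (name, text) pair in order.

    Return (name, matching_lines) for the first document that contains
    `needle`, or (None, []) if no document contains it.
    """
    for name, text in documents:
        hits = [line.strip() for line in text.splitlines() if needle in line]
        if hits:
            return name, hits
    return None, []


# Hypothetical extracted texts standing in for two daily PDFs.
documents = [
    ("list_a.pdf", "1 WP/1/2019 A\n2 WP/2/2019 B"),
    ("list_b.pdf", "1 TCA/796/2019 C\n2 WP/3/2019 D"),
]
print(search_documents(documents, "TCA/796/2019"))
```

Stopping at the first document with a match keeps the run short when many files are involved. To collect matches from every file instead, the loop would accumulate results rather than return early.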

Regards
Sonali