Not able to select attributes from a PDF to scrape the data

I want to scrape data from PDFs, but I am not able to select particular pieces of text; only the whole section gets selected. How should I scrape the data? The Read PDF activity does not give me the desired string either.

Hey @Aishwarya_Bhargava

I hope you are well.

Have you tried using the Computer Vision (CV) functions?

Once the CV Scope has been added, you will be able to see the elements you can scrape. I suspect this might help you.

Please let me know if you have any other questions! Please mark this as the solution if it helps you solve your problem.

Have a great one further!
Kind regards,

I am not able to access the Computer Vision functions. Can you please explain the steps needed to use them?


Instead of scraping, you can try the Read PDF or Read PDF with OCR activities. They give you the text of the PDF directly, and after that you can use a Regex function to get the required string.
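
As a rough illustration of the Regex step, here is a minimal sketch in Python (not a UiPath workflow). The sample text and the labels `Invoice No:` and `Invoice Date:` are made up for the example; the real output of Read PDF will look different, so the patterns would need adjusting.

```python
import re

# Hypothetical text, standing in for whatever the Read PDF activity returns.
pdf_text = """ACME Corp
Invoice No: INV-2041
Invoice Date: 12/03/2021
Total: 450.00
"""

# Capture the first non-space token that follows each label.
number_match = re.search(r"Invoice No:\s*(\S+)", pdf_text)
date_match = re.search(r"Invoice Date:\s*(\S+)", pdf_text)

# Fall back to None when a label is missing from the text.
invoice_number = number_match.group(1) if number_match else None
invoice_date = date_match.group(1) if date_match else None

print(invoice_number, invoice_date)
```

The idea is to anchor on the label next to the value rather than on the value's position in the text.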

If you need any help, please share a sample string so that we can try it out.

Hope this helps you


Read PDF is not working because the string I am trying to obtain does not follow a constant pattern. I am trying to extract information from different PDFs with different layouts, so every time I use the Read PDF activity the string changes its position, and it becomes really hard to extract.
I have to obtain the invoice number and the invoice date, which are represented quite similarly across the different PDFs, but none of the methods are working properly.

I am facing the following problems:

  1. I am not able to select text from the PDF.
  2. Even when I am able to select the text, I am not able to make the selector dynamic: when I omit the parts that make the selector static, it starts selecting some other text.
  3. Read PDF does not give a similar (constant pattern) output for all kinds of PDFs, so that is not working.
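
One possible way around point 3 is to stop relying on where the value sits in the text and instead search for the label that precedes it, allowing several label spellings and date formats. Below is a minimal sketch in Python, not a UiPath workflow. The two sample layouts, the label variants, and the helper names `parse_date` and `extract_invoice_fields` are all assumptions for illustration, and dates such as `12/03/2021` are assumed to be day-first.

```python
import re
from datetime import datetime

# Hypothetical label variants; extend with labels seen in the real invoices.
NUMBER_RE = re.compile(
    r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9/\-]*)",
    re.IGNORECASE,
)
DATE_RE = re.compile(
    r"(?:invoice\s*date|dated)\s*[:\-]?\s*"
    r"(\d{4}-\d{2}-\d{2}|\d{1,2}[./-]\d{1,2}[./-]\d{4}|\d{1,2}\s+[A-Za-z]{3,9}\s+\d{4})",
    re.IGNORECASE,
)
# Assumed date formats, tried in order (day-first for numeric dates).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%d %B %Y", "%d %b %Y")


def parse_date(raw):
    """Return the date as YYYY-MM-DD, or None if no assumed format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def extract_invoice_fields(text):
    """Find the invoice number and date by label, wherever they appear."""
    number = NUMBER_RE.search(text)
    date = DATE_RE.search(text)
    return (
        number.group(1) if number else None,
        parse_date(date.group(1)) if date else None,
    )


# Two made-up layouts with the fields in different places and styles.
layout_a = "Invoice No: INV-2041\nInvoice Date: 12/03/2021\n"
layout_b = "ACME Ltd\nDated 3 March 2021\nSome other line\nInvoice # 88-1207\n"

print(extract_invoice_fields(layout_a))
print(extract_invoice_fields(layout_b))
```

Normalising the date to one format also means the downstream steps do not need to care which layout the value came from. The same patterns should largely carry over to a Regex step in the workflow, though that is not tested here.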

The Computer Vision functions are not usable either, as the scope does not select all kinds of text.