Best way to capture data from a PDF generated on the webpage

Rafaeloneil · June 14, 2019, 3:23pm

Hi guys,
I am very grateful for the help I have received here in the forum; my understanding of the tool grows every day!
I have a workflow that enters some information and, at the end, generates a PDF ticket as shown in the image:

Inspecting the Element I get this code:

I’m trying to get the PDF content via Get Text, but I have not succeeded yet.
Can you point me in the right direction?
get_PDF_File.zip (119.3 KB)

KarthikByggari · June 14, 2019, 3:27pm

If you are trying to get particular information from the PDF, you can use relative scraping or Get all text and then use string manipulations to extract the required information.

Regards,
Karthik Byggari
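
The "get all text, then use string manipulations" idea can be sketched outside of Studio as follows. This is only a minimal illustration in Python, not the workflow from the thread: the sample text, the field labels and the helper name extract_field are all made up for the example, and the real ticket layout may differ.

```python
import re

# Hypothetical text, standing in for what a full-text read of the ticket
# might return. The labels and values are invented for this sketch.
SAMPLE_TEXT = """Payment Ticket
Document Number: 12345-6
Due Date: 30/06/2019
Amount: 150.75
"""


def extract_field(text, label):
    """Return the value that follows 'label:' on its line, or None if absent."""
    pattern = rf"^{re.escape(label)}\s*:\s*(.+)$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


print(extract_field(SAMPLE_TEXT, "Due Date"))
print(extract_field(SAMPLE_TEXT, "Barcode"))
```

Returning None for a missing label makes it easy to check the result before using it, instead of failing later when the value is displayed.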

Rafaeloneil · June 14, 2019, 3:34pm

This is what I’m trying to do:

Using Get Value, I try to store the text in a variable (vlr_DadosPDF), but a System.NullReferenceException occurs when I try to view the value of the variable in a message box.

KarthikByggari · June 14, 2019, 3:37pm

This is because the selector is failing to locate the UI element and failing to assign the output.

And also make sure the PDF document is visible while scraping.

Can you please validate the selector using UI Explorer?

Regards,
Karthik Byggari

Rafaeloneil · June 14, 2019, 8:43pm

I still have not managed to locate the element.
I am uploading a flow (Flowchart.xaml) that illustrates my goal:
get_PDF_File.zip (119.3 KB)

Palaniyappan · June 14, 2019, 9:22pm

Hi buddy @Rafaeloneil

Kindly follow the steps below; they should help you sort this out.
This can be handled in several ways, so let me go through them one by one.
– If you want to extract specific terms and the PDF is a native PDF, meaning the words in it can be selected as individual elements, you can use the Get Text activity. Before using this activity, add a Start Process activity and pass the file path of the PDF to its FileName property. Start Process opens the PDF and brings it to the front of the screen; only then can the bot see the elements and read them with Get Text.
– If you want a specific piece of text but the PDF is not a native PDF (for example, a scanned image), the words cannot be selected as individual elements. In that case, use Screen Scraping with the OCR option from the Design menu of Studio, which locates the text and reads it with an OCR engine. The PDF must be in front of the screen for this as well, so use Start Process before the screen scraping step.
– If you want the whole content of the PDF as text, use the Read PDF Text activity when the text is selectable. If the content is an image (the whole page gets selected when you try to indicate a single word), use the Read PDF With OCR activity. Neither of these two activities needs Start Process, because they read the file without opening the application.

Kindly try this and let me know if you have any queries, buddy.
You were almost done.
Cheers @Rafaeloneil
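
The choice described in this reply depends on two questions: do you need the whole document or one specific value, and can the words be selected as individual elements? A small Python sketch of that decision follows. The function name choose_activities is hypothetical, and the returned strings are only labels for the activities mentioned in the reply, not calls into any UiPath API.

```python
def choose_activities(whole_document, text_selectable):
    """Return the activity names suggested for the given situation.

    whole_document:  True to read the entire PDF as text,
                     False to pick a specific value from the open document.
    text_selectable: True if words can be selected as individual elements,
                     False if the content behaves like an image.
    """
    if whole_document:
        # The file-based activities read the PDF without opening a viewer.
        return ["Read PDF Text"] if text_selectable else ["Read PDF With OCR"]

    # UI-based approaches need the PDF open and in front of the screen first.
    scrape = "Get Text" if text_selectable else "Screen Scraping with OCR"
    return ["Start Process", scrape]


print(choose_activities(whole_document=True, text_selectable=True))
print(choose_activities(whole_document=False, text_selectable=False))
```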

Rafaeloneil · June 17, 2019, 10:58am

Thanks for the great explanation. I solved the problem by saving the PDF file, reading it, and extracting the values with some substrings.
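
The actual substring logic was not posted in the thread. As a rough idea of what such an approach might look like, here is a Python sketch that cuts out the text between two markers; the helper name text_between, the markers and the sample string are assumptions made for illustration only.

```python
def text_between(text, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end].strip()


# Hypothetical content of the saved PDF after reading it as plain text.
pdf_text = "Payer: Example Customer Due Date: 30/06/2019 Amount: 150.75 End"

print(text_between(pdf_text, "Due Date:", "Amount:"))
print(text_between(pdf_text, "Amount:", "End"))
```

Searching for the end marker only after the start marker avoids picking up an earlier occurrence of the same word elsewhere in the document.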

Palaniyappan · June 17, 2019, 11:13am

Fantastic
Cheers @Rafaeloneil

system · June 20, 2019, 11:13am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
