Downloading and scraping data from multiple PDFs

dumua · June 2, 2017, 8:28pm

Hi everyone,

I want to download all the PDF files from a website into one folder, then extract all the data from each PDF, and then reorganise the data table.

To do this, I tried downloading the files with Open Browser → Click the file → Click Save As → Click Save. That works, but only for one file; I can’t automate the process for all the files.

For the data table, I use the For Each File in Folder snippet, which works pretty well, but I’m not able to scrape all the data from each PDF.

As for the third problem, I haven’t tried anything yet.

Sorry if it’s too obvious, but I’m new to RPA.
Thank you

dumua · June 5, 2017, 8:51pm

Hello there,

I need some help

Thanks

a-taniya · June 6, 2017, 12:27am

Hi, for the first issue, I had a similar case.
In my case, I had to download CSV files from a webpage one by one.

First, if you can open the folder in Explorer, I recommend doing that, since it may be easier to copy the files from there.

Anyway, in my case I couldn’t do that because the website doesn’t simply show a folder.

What I did was use UI Explorer to check the available selectors of the 1st, 2nd and last CSV files, and I found they had an incremental ID; for example, the 1st CSV has “ID=0” in the selector and the 2nd has “ID=1”.

Then I created two new variables: an int “csvId” (default 0) and a string “csvSelector”.

Now you’re almost there.

I copied the Selector of the Click activity for the file download, replaced the ID value with csvId, and assigned the result to csvSelector.

e.g.
Selector

""

csvSelector

""

Then I put csvSelector into the Selector of the Click activity.

Then I added an Assign activity to increment csvId (csvId = csvId + 1) and made a loop of “click download 〜 increment csvId”.

This worked in my case.
I hope you can get some hints from it.
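
For readers who find the idea easier to follow as code, here is a rough Python sketch of the same loop. It is not the actual workflow: the selector template, the `click` callback and the file count are hypothetical stand-ins, and the real selector from UI Explorer will look different.

```python
from typing import Callable, List

# Hypothetical selector template (an assumption for illustration only).
# "{csv_id}" marks the part of the selector that changes from file to file.
SELECTOR_TEMPLATE = "<webctrl tag='A' id='ID={csv_id}' />"


def build_selector(csv_id: int, template: str = SELECTOR_TEMPLATE) -> str:
    """Return the selector string for the file with the given ID."""
    return template.format(csv_id=csv_id)


def click_all(file_count: int, click: Callable[[str], None]) -> List[str]:
    """Click the download link of every file, incrementing the ID each time."""
    used = []
    csv_id = 0  # same default as the csvId variable in the post
    while csv_id < file_count:
        csv_selector = build_selector(csv_id)
        click(csv_selector)   # stand-in for the Click activity
        used.append(csv_selector)
        csv_id = csv_id + 1   # the Assign step
    return used


if __name__ == "__main__":
    # Print the selectors instead of clicking anything.
    click_all(3, click=print)
```

In UiPath itself the same thing is a loop containing an Assign for csvSelector, the Click, and an Assign for csvId; the loop's stop condition (here `file_count`) depends on how many files the page lists.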

a-taniya · June 6, 2017, 1:07am

see also

dumua · June 6, 2017, 1:17pm

Thank you for your help,

To keep it simple, I just use Data Scraping to collect all the URLs.

Then I open each URL in the list and download the files one by one with a For Each loop.
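
Outside of UiPath, that download loop could look roughly like the Python sketch below. It assumes the scraped URLs point directly at the files and need no login; the function names and the `file_<n>.pdf` fallback name are made up for illustration.

```python
import os
import urllib.request
from typing import Callable, Iterable, List
from urllib.parse import urlparse


def fetch_bytes(url: str) -> bytes:
    """Download one URL with a plain HTTP GET and return its content."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_all(urls: Iterable[str], folder: str,
                 fetch: Callable[[str], bytes] = fetch_bytes) -> List[str]:
    """Save every URL into `folder` and return the list of saved paths."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Use the last path segment as the file name, with a fallback
        # for URLs that end in "/" and therefore have no file name.
        name = os.path.basename(urlparse(url).path) or f"file_{index}.pdf"
        path = os.path.join(folder, name)
        with open(path, "wb") as handle:
            handle.write(fetch(url))
        saved.append(path)
    return saved
```

The `fetch` parameter is only there so the loop can be tried without a network connection, for example `download_all(urls, "pdfs", fetch=lambda u: b"test")`.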

Finally, I use a For Each loop to scrape each PDF in the folder, but I don’t know how to extract only the data (because each PDF is about 150 pages) or how to extract the whole PDF into a DataTable!

Any ideas for that?

Regards
Antoine

dumua · June 6, 2017, 4:25pm

Hi,

I’m facing a little problem: I can’t loop to download my PDFs from the URLs stored in the Excel file.

Can someone take a look at my process?
Example.xaml (13.1 KB)

Regards

vvaidya · June 6, 2017, 5:28pm

There is no attachment in your post.

dumua · June 6, 2017, 6:06pm

Example.xaml (13.1 KB)

Sorry

vvaidya · June 6, 2017, 6:45pm

Try this
Example.xaml (13.4 KB)

dumua · June 6, 2017, 7:00pm

It works really well, thank you.

Best regards,
Antoine
