PDF Data Scraping

PDcoder · August 2, 2019, 7:12am

Hi,
I want to scrape data from multiple PDFs that contain the data shown in the screenshot. Which process should I follow to scrape the data and put it into an Excel sheet?

kalyanDev · August 2, 2019, 7:27am

Hi @PDcoder, use the Read PDF Text activity and you will get the entire data as a string.
Then use the Split function on that string: first split the data on new lines, then split each line on a space or ":" and assign the resulting values.
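As a rough illustration of the two-stage split described above, here is a minimal Python sketch (not UiPath code). The sample text and the function name `parse_key_values` are made up for the example, since the screenshot from the original post is not available. It splits each line only at the first ":" so that values which themselves contain a colon stay intact.

```python
def parse_key_values(text):
    """Split text into lines, then split each line at the first ':'."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that have no key/value separator
        key, value = line.split(":", 1)  # split once, at the first colon only
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical sample text, standing in for the string read from a PDF.
sample = "Name: Acme Traders\nGSTIN: 22AAAAA0000A1Z5\nInvoice No: 1042"
print(parse_key_values(sample))
```

The same idea should carry over to a UiPath Assign activity using the string Split method, though the exact expression depends on how the PDF text is laid out.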

venkatmalla6 · August 2, 2019, 7:30am

@PDcoder are all the PDFs in this same format? If yes, you can simply go with Scrape Relative:
-First use the Build Data Table activity with the fields you want.
-Then get the files with Directory.GetFiles("path of the folder", "*.pdf").
-Use a For Each activity, give it the array variable returned by Directory.GetFiles as input, and change the TypeArgument to String.
-Inside the body, use image-type recording, go to Scrape Relative, and scrape all the values you want.
-Finally, use the Add Data Row activity: give the variable names inside the array row and give the data table as input.
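The overall shape of these steps (define the columns, loop over every PDF in a folder, extract the values, append one row per file) can be sketched in Python as follows. This is only an outline under assumptions: `extract_fields` is a hypothetical callback standing in for the scraping step, and the output is a CSV file (which Excel can open) rather than a UiPath data table.

```python
import csv
from pathlib import Path


def collect_rows(folder, extract_fields, columns):
    """Build one row per PDF file found in `folder`.

    `extract_fields` is a hypothetical callback: it receives a file path
    and returns a dict mapping column names to scraped values.
    """
    rows = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        fields = extract_fields(pdf_path)
        # Missing fields become empty cells so every row has the same width.
        rows.append([fields.get(column, "") for column in columns])
    return rows


def write_csv(rows, columns, out_path):
    """Write a header line plus the collected rows to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(rows)
```

Keeping the extraction step as a separate callback means the loop stays the same whether the values come from Scrape Relative, from splitting the text, or from a regular expression.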

PDcoder · August 2, 2019, 10:48am

Hi @kalyanDev,
How do I get only the value after “:”?
Can you please explain with an example in UiPath?
Thanks in advance.

AshwinS2 · August 2, 2019, 10:52am

Hi @PDcoder

Use the Matches activity

and give the regular expression as (?<=GSTIN:).*
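To show what that pattern captures, here is a small Python sketch using the same regular expression (the Matches activity uses the .NET regex engine, so this is only an approximation of its behavior). The lookbehind `(?<=GSTIN:)` anchors the match right after the label, and `.*` takes the rest of that line. The sample text and the helper name `value_after_gstin` are invented for the example.

```python
import re

# Same pattern as suggested above: everything on the line after "GSTIN:".
GSTIN_PATTERN = re.compile(r"(?<=GSTIN:).*")


def value_after_gstin(text):
    """Return the trimmed text following 'GSTIN:' or None if it is absent."""
    match = GSTIN_PATTERN.search(text)
    return match.group(0).strip() if match else None


# Hypothetical sample text, standing in for the string read from a PDF.
sample = "Name: Acme Traders\nGSTIN: 22AAAAA0000A1Z5\nTotal: 500"
print(value_after_gstin(sample))
```

Trimming the result is worthwhile because the match keeps any space that follows the colon.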

Check this and let us know
Thanks
Ashwin.S

pattyricarte · August 2, 2019, 10:53am

Hi @AshwinS2

Hope this might be helpful to you: Anchor Base

cheers

Happy learning

kalyanDev · August 2, 2019, 10:53am

Use the Split activity, or variable.split(“:”…ToCharArray)(1). Try this and let me know.
