Extracting PDF to Excel

Hi, I need to extract the data from a PDF, move it into Excel, and then move it from Excel into SAP. The actual problem is that the data might not be structured.

As a first step I tried this in Python, but with no luck: one column is giving me trouble. The attachment is below; in it, I am not able to scrape the SKU description column.


For your information, I used tabula in Python to scrape it.


Have you tried the Read PDF Text activity?

Will all the PDFs be in the same format?

No, they won't be in the same format.

Hi @Karthik_Kulkarni

Try the Read PDF With OCR activity,
and then use the Generate Data Table activity.

Ashwin S

OK, I will try.

Hi, use the Read PDF Text activity.
In the properties, set ‘preserve format’ to True.
The output will still be a string, but the data will keep the same table-like layout as in the PDF.

First, use the Write Text File activity with that string as input, and run the workflow up to this point.
Then copy the text that was written to the text file.

Now add the Generate Data Table activity, double-click it, and paste the copied text into the sample window.
Set the ‘CSV Parsing’ option to True.
Set the delimiter to space or tab, and uncheck the other options such as auto detect columns and consider first row as headers.
Try different filter options and check the preview each time.
Once the preview shows the structure you need, click Save (OK) and assign a DataTable variable as the output.
Finally, use Excel Application Scope with Write Range to write this DataTable variable to Excel.
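
Since you started in Python, the same idea (keep the layout, then split on a delimiter) can be tried there too. This is only a rough sketch, not the tabula approach: it assumes you already have the layout-preserved text as a string, and that columns are separated by a tab or by at least two spaces, so a multi-word value like an SKU description stays in one cell. The names parse_layout_text, rows_to_csv and the sample data are made up for illustration.

```python
import csv
import io
import re


def parse_layout_text(text, min_gap=2):
    """Split layout-preserved text into rows of cells.

    Assumption: columns are separated by a tab or by at least
    `min_gap` whitespace characters; single spaces stay inside a cell.
    """
    splitter = re.compile(r"\t|\s{%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between table rows
        rows.append([cell.strip() for cell in splitter.split(line)])
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the text read from the PDF.
sample = (
    "SKU      SKU Description         Qty\n"
    "A100     Blue cotton shirt       12\n"
    "\n"
    "B200     Steel water bottle      5\n"
)

rows = parse_layout_text(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

If the PDFs come in different layouts, the min_gap value will probably need tuning per layout, in the same way you would try different delimiter options in the Generate Data Table preview.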