Extract table from PDF as it is

How can I extract table-formatted data from a PDF to Excel?

Hi @Suraj_Gaikwad_Nuvama_Grou ,

You can follow these general steps:

  1. Install the UiPath.PDF.Activities package from the Manage Packages option in UiPath Studio.
  2. Use the Read PDF Text activity to extract the text from the PDF file.
  3. Use the Generate Data Table activity to convert the extracted text into a DataTable.
  4. Use the Filter Data Table activity to remove any unwanted rows or columns from the DataTable.
  5. Use the Write Range activity to write the filtered DataTable to an Excel file.
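
For anyone who wants to see the idea behind steps 3 to 5 outside of Studio, here is a minimal Python sketch. It is not what the UiPath activities do internally; it only assumes that the extracted text keeps its layout, so that columns are separated by two or more spaces. The sample text and the function names are made up for illustration.

```python
import csv
import io
import re


def text_to_rows(text, min_columns=2):
    """Split layout-preserved text into table rows.

    Assumption: columns are separated by two or more spaces.
    """
    rows = []
    for line in text.splitlines():
        cells = re.split(r"\s{2,}", line.strip())
        # Keep only lines that look like table rows (the "filter" step).
        if len(cells) >= min_columns and all(cells):
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample of extracted text (not from the original thread).
sample_text = """
Invoice summary

Item        Qty    Price
Widget A    2      10.50
Widget B    1      4.00

Thank you for your business
"""

rows = text_to_rows(sample_text, min_columns=3)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Lines with fewer than `min_columns` cells (headings, footers, blank lines) are dropped, which is roughly the job of the filter in step 4.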

Thanks

Hi @Suraj_Gaikwad_Nuvama_Grou

Please check out the following thread:

Hope this helps,
Best Regards.

I didn’t understand point 4.

Hi @Suraj_Gaikwad_Nuvama_Grou ,

Remove the unwanted rows or columns and keep only the ones you need. If there is no unwanted data, you can skip this step.

Thanks

Hi,
We just had a similar discussion about PDF table extraction.

Have a look at this thread for more details.

Cheers @Suraj_Gaikwad_Nuvama_Grou

@Suraj_Gaikwad_Nuvama_Grou

Check the post below for your reference.

You can use Python code with the Camelot library.
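
A rough sketch of the Camelot route, assuming a text-based (not scanned) PDF and that `camelot-py`, `pandas` and `openpyxl` are installed. The function name, the sheet names and the default arguments are my own choices for illustration, not something from the linked post.

```python
def extract_tables_to_excel(pdf_path, excel_path, pages="1", flavor="lattice"):
    """Write every table Camelot finds to its own sheet in one workbook."""
    # Third-party imports are kept inside the function so that this file can
    # be loaded even where they are not installed:
    #   pip install camelot-py pandas openpyxl
    import camelot
    import pandas as pd

    # flavor="lattice" suits tables with ruled lines; "stream" suits tables
    # whose columns are separated only by whitespace.
    tables = camelot.read_pdf(pdf_path, pages=pages, flavor=flavor)
    if len(tables) == 0:
        return 0  # nothing found, so no workbook is written

    with pd.ExcelWriter(excel_path) as writer:
        for index, table in enumerate(tables, start=1):
            # table.df is a pandas DataFrame holding the extracted cells.
            table.df.to_excel(
                writer, sheet_name=f"Table{index}", index=False, header=False
            )
    return len(tables)
```

It could be called as `extract_tables_to_excel("report.pdf", "report.xlsx")`, where both file names are placeholders. If the result looks wrong with one flavor, trying the other is usually the first thing to check.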

Hope this may help you

Thanks,
Srini

Can you share an example? That would be helpful for me.

You've got some demos on this in the posts above.
Hope they help you build the workflow.

@Suraj_Gaikwad_Nuvama_Grou

Hi @Suraj_Gaikwad_Nuvama_Grou ,

Maybe you could also check whether the post below suits your requirements:

I don't understand.

Please give me more information on how to extract a table from a PDF file as it is.

@Suraj_Gaikwad_Nuvama_Grou ,

We do not know which points you are unable to understand.

It would be better if you could provide us with a sample data file together with the expected extracted data, perhaps formatted in an Excel file. That way we will be able to help you better and suggest a suitable approach.

The scenario is:
1. We have 5 PDF files.
2. Select some specific text data and a table from the PDF files.
3. Put the selected data into the mail body.
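
For step 3 of this scenario, one possible approach is to render the selected text and table rows as HTML and use that as the mail body. Below is a minimal Python sketch using only the standard library. The field names and values are hypothetical stand-ins for whatever is extracted from each PDF, and actually sending the mail is left out.

```python
from email.message import EmailMessage
from html import escape


def build_html_body(fields, table_rows):
    """Render selected text fields and one table as an HTML fragment."""
    parts = []
    for label, value in fields.items():
        parts.append(f"<p><b>{escape(label)}:</b> {escape(value)}</p>")
    parts.append('<table border="1">')
    for row in table_rows:
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)


# Hypothetical values standing in for data extracted from one PDF.
fields = {"Invoice No": "INV-001"}
table_rows = [["Item", "Qty"], ["Widget A", "2"]]

message = EmailMessage()
message["Subject"] = "Extracted PDF data"
message.set_content("This mail contains an HTML table.")  # plain-text fallback
message.add_alternative(build_html_body(fields, table_rows), subtype="html")
```

Escaping every cell keeps characters such as `<` or `&` in the PDF data from breaking the table markup.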

@Suraj_Gaikwad_Nuvama_Grou ,

The highlighted point above is where we need more information. Could you provide some sample data (with the original data masked) after it has been extracted to a text file?

Also keep the PreserveFormat option checked in the Read PDF Text activity.

We need to know what specific text you want to extract. Is there any anchor or keyword that we can refer to in order to extract it, and likewise for the table data?
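
To illustrate what we mean by an anchor or keyword, here is a small Python sketch. It assumes the value sits on the same line as the keyword and that the table lies between two known marker lines. The sample text, the anchors and the function names are invented for the example; the real ones depend on your files.

```python
import re


def value_after_anchor(text, anchor):
    """Return the text that follows `anchor` on the same line, or None."""
    pattern = re.escape(anchor) + r"[ \t]*[:\-]?[ \t]*(.+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


def lines_between(text, start_anchor, end_anchor):
    """Return the non-empty lines strictly between two anchor lines."""
    collected, inside = [], False
    for line in text.splitlines():
        if not inside and start_anchor in line:
            inside = True
            continue
        if inside and end_anchor in line:
            break
        if inside and line.strip():
            collected.append(line)
    return collected


# Hypothetical extracted text (layout preserved).
sample_text = """Invoice No: INV-001
Items
Widget A    2
Widget B    1
Total       3
"""

invoice_no = value_after_anchor(sample_text, "Invoice No")
item_lines = lines_between(sample_text, "Items", "Total")
```

The same idea carries over to a workflow: once the anchors are known, the text between them can be cut out and passed to the table conversion step.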

We did provide some examples above, which also suggest some form of solution.

Do note that, at a generic level, we already have many posts about data extraction from PDF files. For a specific case that has not been encountered before, however, we would need to check the data formats of the inputs.