Hi @ashwin.ashok,
If we assume the tables in your PDF follow a consistent pattern once the text is extracted, there are two possible approaches (CSV format is the savior in both):
Approach 1: Using only PDF activities
Suggested workflow: Main.xaml (12.4 KB)
Results first saved to temp.csv
Approach 2: Open the PDF in Word and extract the specific tables from Word
Yes, you can open PDF files in Word. Some PDFs won't convert well and will lose formatting in Word, but most structured ones will work.
- Read PDF in word.exe
- Manipulate / convert the read text to CSV format (Hurdle! Multi-level headers and multiple values in a single row will lose formatting)
- Handling the formatting
- Write the resulting CSV text string to temp.csv
- Read the CSV and
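
To illustrate the "convert the read text to CSV" step outside of UiPath, here is a minimal Python sketch. It is not taken from the attached workflows. It assumes the extracted text has one table row per line and that columns are separated by a tab or by two or more spaces; the function name `text_to_csv` and the sample data are made up for the example.

```python
import csv
import io
import re


def text_to_csv(extracted_text: str) -> str:
    """Turn whitespace-aligned table text into a CSV string (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in extracted_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        # Assumption: a column break is 2+ whitespace characters or a single tab.
        cells = re.split(r"\s{2,}|\t", line)
        writer.writerow(cells)  # csv.writer quotes cells that contain commas
    return buffer.getvalue()


# Made-up sample text, standing in for what was read from the PDF / Word document.
sample = "Item    Qty  Price\nBlue pen   2   1.50\nNotebook, A4   1   3.20\n"
csv_text = text_to_csv(sample)

# Reading the CSV text back gives one list of cells per table row.
rows = list(csv.reader(io.StringIO(csv_text)))
print(rows)
```

Rows with multi-level headers or several values in one cell will not split cleanly with a rule like this, which is the hurdle mentioned in the list above; those cases need their own handling before the text is written to temp.csv.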
A major part of this solution comes from: How to read table in a Word document - #16 by Puransse, thanks to @vvaidya
Workflow from the above link (with slight modifications): wordTables.xaml (9.6 KB)
Results first saved to wordCsv.csv
Hope this helps!