Extracting Table from PDF

Hi everyone, I have a PDF file (a bank statement) and I want to extract the transaction table from it into Excel using Tesseract OCR. I tried data scraping, but the screen could not be captured. Since Tesseract OCR returns everything as a single string, is it possible to convert that string into a data table? Below is the output from Tesseract OCR:

Your Transaction Details
Date Details Withdrawals Deposits Balance
Apr 8 Opening Balance 5,234.09
Apr 8 Insurance 272.45 5,506.54
Apr 10 ATM 200.00 5,306.54
Apr 12 Internet Transfer 250.00 5,556.54
Apr 12 Payroll 2100.00 7,656.54
Apr 13 Bill payment 135.07 7,521.47
Apr 14 Direct debit 200.00 7,321.47
Apr 14 Deposit 250.00 7.567.87
Apr 15 Bill payment 525.72 7,042.15
Apr 17 Bill payment 327.63 6,714.52
Apr 17 Bill payment 729.96 5,984.56
Apr 18 Bill payment 223.69 5,710.87
Closing Balance $5,710.87

Hello @deepan.b,

Welcome to the community.

There is an activity to extract tables from PDF to Excel. Have a look at it.
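
If the OCR string is all you have to work with, the rows can also be split apart with a regular expression. Below is a rough sketch in Python (my choice of language, not something from your workflow), using the text you posted. The names `parse_statement`, `to_number` and `to_csv` are made up for illustration. It assumes every row starts with a date like "Apr 8", ends with a balance, and that every amount has two decimal places. Because the flattened text no longer says whether an amount sat in the Withdrawals or Deposits column, the sketch guesses from whether the balance went up or down, so please check the result against the original statement.

```python
import csv
import io
import re

# OCR output exactly as posted in the question.
OCR_TEXT = """\
Your Transaction Details
Date Details Withdrawals Deposits Balance
Apr 8 Opening Balance 5,234.09
Apr 8 Insurance 272.45 5,506.54
Apr 10 ATM 200.00 5,306.54
Apr 12 Internet Transfer 250.00 5,556.54
Apr 12 Payroll 2100.00 7,656.54
Apr 13 Bill payment 135.07 7,521.47
Apr 14 Direct debit 200.00 7,321.47
Apr 14 Deposit 250.00 7.567.87
Apr 15 Bill payment 525.72 7,042.15
Apr 17 Bill payment 327.63 6,714.52
Apr 17 Bill payment 729.96 5,984.56
Apr 18 Bill payment 223.69 5,710.87
Closing Balance $5,710.87
"""

# Assumption: an amount is digits with optional separators and two decimals.
AMOUNT = r"\d[\d,.]*[.,]\d{2}"
# Assumption: a row is "<Mon> <day> <details> [<amount>] <balance>".
ROW = re.compile(
    r"^(?P<date>[A-Z][a-z]{2} \d{1,2}) "
    r"(?P<details>.*?) "
    r"(?:(?P<amount>" + AMOUNT + r") )?"
    r"(?P<balance>" + AMOUNT + r")$"
)
COLUMNS = ["Date", "Details", "Withdrawals", "Deposits", "Balance"]


def to_number(text):
    # Drop every separator except the last one, which is taken as the
    # decimal point, so an OCR slip like "7.567.87" is read as 7567.87.
    whole, _, cents = text.replace(",", ".").rpartition(".")
    return float(whole.replace(".", "") + "." + cents)


def parse_statement(text):
    rows, previous = [], None
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # headings and the closing-balance line do not match
        balance = to_number(match["balance"])
        amount = to_number(match["amount"]) if match["amount"] else None
        # Guess the column: a balance that rose means the amount was a deposit.
        is_deposit = (
            amount is not None and previous is not None and balance > previous
        )
        rows.append({
            "Date": match["date"],
            "Details": match["details"],
            "Withdrawals": None if amount is None or is_deposit else amount,
            "Deposits": amount if is_deposit else None,
            "Balance": balance,
        })
        previous = balance
    return rows


def to_csv(rows):
    # CSV text that Excel can open; empty cells are written for None.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_statement(OCR_TEXT)
print(to_csv(rows))
```

The same pattern could be ported to whatever regex tooling your workflow offers. Note that lines which do not fit the assumed shape are silently skipped, so it is worth comparing the row count with the statement.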

Hi @deepan.b, welcome to the forum.

To extract tables from a PDF, you can use the Document Understanding feature of UiPath.

Check this video by @Parth_Doshi, which explains how to extract tables from a PDF using Document Understanding.

Hope it helps.


Nived N :robot:

Happy Automation :slight_smile::slight_smile::slight_smile::slight_smile: