How to extract table data from a PDF with warped tables?

uiautomation
pdf
studio
datascraping

#1

Hi all,
Have a look at this pdf:
http://www.litigationmanagementreport.com/collateral/electronic_documents/invoice.pdf

I have a few PDF files with a similar format where the tables are warped. I’m able to read individual fields such as Order No., Freight Company, etc. The problem comes when I try to read/extract data from the second table. When I use Extract Structured Data, it shows an error saying "This control does not support data extraction". Should the PDF be formatted in a particular way to make it readable? Should the tables be separated in order to be read?


#2

Hi,
I tried it. It’s not able to identify the table.
All I can think of is either the scraper or Read PDF Text, but both will give you string output, which means more work to extract each item from the entire plain text.
You need to use indexing and substring to get each item and then pass it to Excel.
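
To show what I mean by indexing and substring, here is a rough sketch in Python rather than in a UiPath workflow. The sample text, the labels, and the helper names `extract_field` and `extract_rows` are all made up for illustration; the real text from your PDF will look different, and the "two or more spaces between columns" rule is only an assumption about how the plain text comes out.

```python
import re

# Hypothetical plain text, standing in for whatever the PDF text reader returns.
SAMPLE_TEXT = """Order No.: 12345
Freight Company: Acme Freight
Item        Qty    Price
Widget      2      10.00
Gadget      1      25.50
"""


def extract_field(text, label):
    """Return the rest of the line that follows `label`, or None if absent."""
    start = text.find(label)          # index of the label in the full text
    if start == -1:
        return None
    start += len(label)               # move past the label itself
    end = text.find("\n", start)      # the value ends at the end of the line
    if end == -1:
        end = len(text)
    return text[start:end].strip(" :\t")


def extract_rows(text, header_start):
    """Collect the lines after the header line, split on runs of 2+ spaces."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # Skip everything up to and including the header line.
            if line.startswith(header_start):
                in_table = True
            continue
        if not line.strip():
            break                     # a blank line ends the table
        rows.append(re.split(r"\s{2,}", line.strip()))
    return rows


if __name__ == "__main__":
    print(extract_field(SAMPLE_TEXT, "Order No."))
    print(extract_rows(SAMPLE_TEXT, "Item"))
```

Each row comes back as a list of strings, so it can be written to Excel one cell at a time. If a cell value itself contains two or more consecutive spaces, the split rule above will break and would need adjusting.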

PS: I thought of doing copy and paste, but the PDF doesn’t allow copying the text.
If it did, you could have used hotkeys and dumped the table directly into Excel. :frowning:


#3

Hi @ddpadil, thanks for the hints.

I have another PDF in which I’m able to detect elements, but I’m not sharing it due to security concerns. My question is how to read the table data if the table is wrapped. The problem is that the data of the second table is not readable.
Should the PDF be in a certain structure for UiPath to detect and read elements?


#4

Hi,
Can you copy that second table, paste it into Word, and then try the data scraping wizard?
If it works, then your assumption is right.