I'm currently working on a project in which I have to extract data from multiple PDF files. The problem is that the data sits in a table inside the PDF, and the table can vary from file to file (rows may be added or removed).
I want to extract the table data and fill it into an Excel form. I have tried several PDF data extraction methods, but none of them worked for this.
I would appreciate your help with this.
I have attached the PDF files and images of the data table.
1. Enable PDF accessibility mode. Do this in your PDF document: go to Edit > Accessibility > Change Reading Options and choose "Use reading order in raw print stream".
2. Do not skip enabling accessibility mode. It is the most important step; without it you will not be able to select the proper words.
3. Use a Get Text activity and indicate the bold and italic text you need.
4. In the Get Text activity, choose a proper selector and make it dynamic so that it is valid for your files. Store the output in a variable.
5. Use an Assign activity on the same variable, applying Replace or Trim, whichever suits you, to remove unwanted characters.
6. Save the variable as a String only.
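The replace-and-trim cleanup described in the steps above could look roughly like the following. This is only a Python sketch of the idea, not UiPath code; the function name `clean_value` and the choice of `*` as the unwanted marker are assumptions made for illustration.

```python
def clean_value(raw, unwanted=("*",)):
    """Remove unwanted marker characters, then trim surrounding whitespace.

    `unwanted` is a hypothetical list of markers; adjust it to whatever
    stray characters show up in your own extracted text.
    """
    for marker in unwanted:
        raw = raw.replace(marker, "")
    return raw.strip()


# The value stays a plain string, as the last step recommends.
print(clean_value("  *Net Value \r\n"))  # prints "Net Value"
```

In UiPath the same effect would come from calling Replace and Trim on the String variable inside the Assign activity.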
Item
10
20 Material No.
Quantity
12317082
180
12317138
176 Vendor Mat. No
Unit
Case
Case Description
Delivery Date & Time Price/Unit *Net Value
NESTLE CORN FLAKES Cereal 1 8x275g N2 XK
30.03.2019 3,276.00
NESTLE CORN FLAKES Cereal N3 XK
30.03.2019
Total net value excl. tax 1,853.28
5,129.28
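One possible way to pull values out of a raw string like the one above is to match them by shape rather than by position. The sketch below is in Python and rests on assumptions that are not confirmed anywhere in this thread: that material numbers are always 8 digits, dates are always `dd.mm.yyyy`, and amounts always have two decimals with comma thousands separators. The names `extract_fields` and `SAMPLE` are made up for the example, and `SAMPLE` is only an excerpt of the text above.

```python
import re

# Excerpt of the extracted text above, used only to try the patterns.
SAMPLE = """12317082
180
12317138
176 Vendor Mat. No
30.03.2019 3,276.00
30.03.2019
Total net value excl. tax 1,853.28
5,129.28"""

# Assumed shapes; other PDFs may not follow them.
MATERIAL_NO = re.compile(r"\b\d{8}\b")
DATE = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")
# The lookarounds stop a date fragment such as "30.03" from matching as an amount.
AMOUNT = re.compile(r"(?<![\d.])\d{1,3}(?:,\d{3})*\.\d{2}(?![\d.])")


def extract_fields(text):
    """Collect material numbers, dates and amounts found anywhere in text."""
    return {
        "material_numbers": MATERIAL_NO.findall(text),
        "dates": DATE.findall(text),
        "amounts": [float(a.replace(",", "")) for a in AMOUNT.findall(text)],
    }


fields = extract_fields(SAMPLE)
print(fields["material_numbers"])  # ['12317082', '12317138']
print(fields["amounts"])           # [3276.0, 1853.28, 5129.28]
```

Pattern matching alone does not say which amount belongs to which item, so this only helps if the order of values in the string stays the same from file to file.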
@charith_wickramasing Can you mark which values need to be extracted? Also, if I extract the values from this string, will the format be the same across all the other strings you get?