I have a set of PDF files that contain different tables (please find the attached PDF file). The data in the PDFs has to be converted to an Excel file with the same headers. The catch is that I need only selected columns from a particular table. For example, the Allergies table has 6 columns, but I need only 2 of them to be extracted, stored in a DataTable, and written to the Excel file.
Another issue: I’ve used Substring to identify each table based on the index of its header and stored the result in a DataTable, but everything gets stored in a single column. Please help.
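To illustrate the single-column problem, here is a minimal Python sketch of splitting one line of extracted text into separate cells. It assumes the columns in the extracted text are separated by a tab or by two or more spaces, which may not hold for every PDF. `split_row` is a hypothetical helper name, and apart from "Allergy Name" the headers in the sample line are made up.

```python
import re

def split_row(line):
    """Split one line of extracted text into cells.

    Assumption: columns are separated by a tab or by 2+ whitespace
    characters, so a single space inside a value (e.g. "Allergy Name")
    is kept. Empty cells are dropped, so blank columns are not preserved.
    """
    parts = re.split(r"\t|\s{2,}", line.strip())
    return [part.strip() for part in parts if part.strip()]

# Made-up header line, only for illustration.
sample = "Allergy Name    Reaction    Severity"
cells = split_row(sample)
print(cells)
```

Each cell can then go into its own column of a row instead of the whole line landing in one column.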
Thanks for the reply, @arivu96! I’ve tried all of those, but the issue is that my data is dynamic. For example, in the workflow you attached, “Card Total” and “CardNo” are fixed labels, and the data that follows them has to be retrieved. In my case, I need all the data under a particular column. For example, I have a column named Allergy Name, and I need all the names in that column.
When I convert the PDF to a text file and read it into a collection, the index is not static, i.e. the second column sometimes comes first and the third moves to the last position. So it is not possible for me to extract the data based on line number and index.
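What I am after is roughly the following: look up each column by its header name instead of a fixed index, so the order of the columns no longer matters. This is only a Python sketch under the assumption that the table has already been split into rows of cells, with the header as the first row. `extract_columns` is a hypothetical helper, and all sample values and every header except "Allergy Name" are made up.

```python
def extract_columns(rows, wanted):
    """Return only the wanted columns, located by header name.

    rows   -- list of rows, each a list of cell strings; rows[0] is the header
    wanted -- header names to keep, in the desired output order

    Raises ValueError if a wanted header is not present in rows[0].
    """
    header, *body = rows
    # Find where each wanted header sits in THIS table, whatever the order.
    positions = [header.index(name) for name in wanted]
    # Skip a position when a row is shorter than the header.
    return [[row[pos] for pos in positions if pos < len(row)] for row in body]

# Made-up table in which "Allergy Name" is the second column.
table = [
    ["Reaction", "Allergy Name", "Severity"],
    ["Rash", "Penicillin", "Mild"],
    ["Hives", "Peanuts", "Severe"],
]
selected = extract_columns(table, ["Allergy Name", "Severity"])
print(selected)
```

The same call returns the same two columns even if the header order changes between files, because the positions are recomputed from each table's own header row.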