I want to extract multiple tables from a Word document. (The content inside each cell may contain bullet points.)
I have tried various approaches like:
using the BalaReva packages
string manipulation with RegEx (reading the whole Word file into one single string)
trying out the approaches available on the UiPath Forum
But all of these fail to extract all the tables: they either extract them without proper formatting or sometimes do not extract them at all.
Is there any approach where we can use Python packages, or some other language, to extract all the tables from a Word file? Ideally they would all be saved to an Excel sheet.
Out of curiosity, what failed when you were using the BalaReva packages? I assume you tried to use “Read All Tables”, I’m interested in knowing what went wrong. Thanks!
Yes, I tried “Read All Tables”. It generated a table, but not with proper formatting, and most of the cell values were incomplete and not in the proper order.
The Word document's tables had the format {2 COLUMNS, multiple ROWS}, but after extraction all the data in the generated Excel file was in {1 COLUMN, multiple ROWS}.
UPDATE:
I found a library that extracts the tables and generates the expected results.
Using this package, I was able to extract multiple tables from a single Word document, and all the tables were saved with proper formatting and the expected cell values.
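For anyone who wants to try the Python route asked about earlier in the thread: the library mentioned above is not named here, so the following is not that package. It is only a minimal sketch that uses the Python standard library, relying on the fact that a .docx file is a zip archive whose main text lives in word/document.xml. The function names (extract_tables, save_tables_as_csv) and the output file prefix are made up for illustration, and it writes one CSV file per table (which Excel can open) rather than a real .xlsx workbook. It only looks at top-level tables in the document body, so nested tables or tables wrapped in content controls would probably need extra handling.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_tables(docx_path):
    """Return every top-level table as a list of rows; each row is a list of cell strings."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    body = root.find(W + "body")
    tables = []
    for tbl in body.findall(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            row = []
            for tc in tr.findall(W + "tc"):
                # Each paragraph in a cell (a bullet point is also a paragraph)
                # becomes one line of the cell text. Bullet symbols themselves are
                # paragraph formatting, so they do not appear in the text.
                paragraphs = [
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.findall(W + "p")
                ]
                row.append("\n".join(paragraphs))
            rows.append(row)
        tables.append(rows)
    return tables


def save_tables_as_csv(tables, prefix):
    """Write each table to '<prefix>_<n>.csv' and return the list of file paths."""
    paths = []
    for i, rows in enumerate(tables, start=1):
        path = f"{prefix}_{i}.csv"
        with open(path, "w", newline="", encoding="utf-8") as fh:
            csv.writer(fh).writerows(rows)
        paths.append(path)
    return paths
```

A call such as save_tables_as_csv(extract_tables("report.docx"), "table") would then produce table_1.csv, table_2.csv and so on, where report.docx is a placeholder file name. Because every cell is collected separately per row, a {2 COLUMNS, multiple ROWS} table should stay two columns wide instead of collapsing into one.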