How can I extract multiple Tables from a Word Document?

I want to extract multiple tables from a Word document. (The content inside each cell may include bullet points.)
I have tried various approaches like:

  1. using the BalaReva packages
  2. trying string manipulation using RegEx (extracting whole word file in one single string)
  3. trying out approaches available on UiPath Forum

But all of these fail to extract every table: they either extract the tables without proper formatting or sometimes extract nothing at all.

Is there any approach where we can use Python packages or some other language to extract all the tables from a Word file, and maybe save them all in an Excel sheet?

Have you tried the Table Extraction wizard?

Yes, I tried. It only extracts the first table, and the formatting doesn’t stay the same.

You’d have to use the Table Extraction wizard twice, using selectors to differentiate between the tables. But it’s never going to keep the formatting.

Out of curiosity, what failed when you were using the BalaReva packages? I assume you tried to use “Read All Tables”, I’m interested in knowing what went wrong. Thanks!

Yes, I tried “Read All Tables”. It generated a table, but not with proper formatting, and most of the cell values were incomplete and out of order.
The Word document tables had the format {2 COLUMNS, multiple ROWS}, but in the generated Excel file all the data ended up as {1 COLUMN, multiple ROWS}.

Ok, thanks!

UPDATE:
I found a library that extracts the tables and produces the expected results.
Using this package, I was able to extract multiple tables from a single Word document, and all the tables were saved with proper formatting and the expected cell values.

Library: Extract or Update Tables in Word File - RPA Component | UiPath Marketplace

Prerequisites: It requires Studio version 22.10 or above.
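
For the Python route raised in the original question, here is a minimal sketch that might serve as a starting point. It is not the Marketplace component above. It relies only on the standard library and on the fact that a .docx file is a ZIP archive whose main content lives in word/document.xml. The function names (cell_text, extract_tables, save_tables) and the output file prefix are made up for illustration. It writes one CSV file per table, which Excel can open, rather than a real .xlsx workbook.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def cell_text(cell):
    """Join a cell's paragraphs with newlines so each bullet item stays on its own line."""
    paragraphs = []
    for p in cell.findall(f"{W}p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{W}t")))
    return "\n".join(paragraphs)


def extract_tables(docx_path):
    """Return every table in the document as a list of rows (lists of cell strings)."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    tables = []
    for tbl in root.iter(f"{W}tbl"):
        rows = [[cell_text(tc) for tc in tr.findall(f"{W}tc")]
                for tr in tbl.findall(f"{W}tr")]
        tables.append(rows)
    return tables


def save_tables(tables, prefix="table"):
    """Write each table to its own CSV file (hypothetical naming: table_1.csv, ...)."""
    paths = []
    for i, rows in enumerate(tables, start=1):
        path = f"{prefix}_{i}.csv"
        # utf-8-sig helps Excel detect the encoding when opening the file
        with open(path, "w", newline="", encoding="utf-8-sig") as f:
            csv.writer(f).writerows(rows)
        paths.append(path)
    return paths
```

Usage would look like save_tables(extract_tables("report.docx")), where report.docx is a placeholder file name. Known limitations of this sketch: bullet glyphs are not stored as text in the file, so only the text of each bullet item is kept (one item per line inside the cell); merged cells are not expanded; and rows or cells wrapped in content controls are skipped.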
