I need to extract the table of contents from a Word document. The document is about 600 pages long.
Can anyone help me with this? Is there a way to do it?
Check out this thread. It might help you.
Regards
Hi @Sairam_RPA,
Can you share a sample file?
Where is the table of contents? Normally it is at the top of the file.
We can read the file and then split the text.
This is what the format looks like. The table can be anywhere in the first 50 pages.
The table always starts with “Table of contents” as the heading.
I think a regex would work if it searches for “Table of contents” and captures all the entries after it that start with a word, end with a number, and have one or more spaces in between. Can someone help me with this regex? @Yoichi @ppr
Below is a sample of the format:
Table of contents
Cover Page 1
Contents 3
Sites 4
Information 5
Description 6
Narrative 7
Cited 8
Resources 13
Equipment 17
Attachments 20
Animals 25
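The regex idea described above (find the heading, then keep each following line that starts with a word and ends with a number) could be sketched roughly as follows. This is a Python illustration rather than a UiPath expression; `extract_toc`, `ENTRY`, and `SAMPLE` are hypothetical names, and it assumes the document text has already been read into a string and that the table ends at the first blank line or the first line that does not look like an entry.

```python
import re

# Hypothetical sample: text already extracted from the Word document.
SAMPLE = """Some preamble text
Table of contents
Cover Page 1
Contents 3
Sites 4

Body text starts here 2024 and continues.
"""

# One entry: a title starting with a letter, whitespace, then a page number.
ENTRY = re.compile(r"^\s*([A-Za-z].*?)\s+(\d+)\s*$")


def extract_toc(text):
    """Return (title, page) pairs for the lines after the heading."""
    heading = re.search(r"table of contents", text, re.IGNORECASE)
    if heading is None:
        return []
    entries = []
    for line in text[heading.end():].splitlines():
        if not line.strip():
            if entries:
                break  # first blank line after the entries ends the table
            continue  # skip blank lines directly after the heading
        match = ENTRY.match(line)
        if match is None:
            break  # a line that is not "title number" ends the table
        entries.append((match.group(1), int(match.group(2))))
    return entries


print(extract_toc(SAMPLE))
```

Splitting each entry into a title and a page number keeps multi-word titles such as "Cover Page" intact. If the real table continues across a page break without a blank line, the stopping rule would need adjusting.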
Hi @Yoichi
This is the regex I used, with a small change.
System.Text.RegularExpressions.Regex.Match(strData,"(?<=Table Of Contents)[\s\S]+?(?=\r\r)").Value.Trim.ToLower
Works great now. Thanks a lot for your input.
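For readers adapting this pattern: as written it is case-sensitive and stops only at two consecutive carriage returns, so it may miss a heading spelled "Table of contents" or text that uses `\n` or `\r\n` line breaks. A minimal Python sketch of the same block-capture idea, assuming the text is already in a string, is shown below. `toc_block` and `TOC_BLOCK` are hypothetical names; the sketch normalizes line endings first and, unlike the expression above, does not lowercase the result.

```python
import re

# Capture everything after the heading up to the first blank line or end of text.
# IGNORECASE accepts both "Table of contents" and "Table Of Contents".
TOC_BLOCK = re.compile(
    r"table of contents\s*(.+?)(?=\n[ \t]*\n|\Z)",
    re.IGNORECASE | re.DOTALL,
)


def toc_block(text):
    """Return the TOC lines after the heading, or [] if no heading is found."""
    # Normalize \r\n and bare \r to \n so one blank-line test covers all cases.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    match = TOC_BLOCK.search(normalized)
    if match is None:
        return []
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]


# Hypothetical input using bare carriage returns, as the pattern above expects.
text = "Intro\rTable Of Contents\rCover Page 1\rContents 3\r\rBody text"
print(toc_block(text))
```

In a .NET expression, a comparable effect could probably be achieved by passing `RegexOptions.IgnoreCase` to `Regex.Match` and widening the lookahead, but that variant is untested here.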