Is there any method to Extract Numbered Headings with Docx4j? #574
Unanswered
Alex-Victor asked this question in Q&A
Replies: 0 comments
I'm trying to extract text from a docx file using docx4j. So far I can extract all the body text and tables, but I'm having trouble extracting the numbers of numbered headings and lists, such as:
1. Heading 1
text....
1.1 Heading 2
text....
text....
2. Heading 1
text....
2.1 Heading 2
....
After opening the docx file as a zip archive, I found in word/document.xml that all of these headings and list items carry numPr tags, like this:
<w:numPr>
  <w:ilvl w:val="0"/>
  <w:numId w:val="2"/>
</w:numPr>

<w:numPr>
  <w:ilvl w:val="1"/>
  <w:numId w:val="2"/>
</w:numPr>
My question: is there an easy way to get the text of these numbers (1., 1.1, a), and so on)? How can I convert the numPr tags into text?
I'm most appreciative of your help.
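
For illustration, here is a minimal sketch of the counting logic involved, not a docx4j API. The numPr element only stores a list id (numId) and a level (ilvl); the visible label has to be computed by counting paragraphs in document order. The class name NumberingSketch and its method next are hypothetical, and the sketch assumes decimal numbering at every level with level-text patterns such as "%1." and "%1.%2" (in a real file these come from numbering.xml). It ignores start values, other number formats (a, i, bullets), level overrides, and numbering inherited from paragraph styles. docx4j also appears to ship a list-numbering emulator in its org.docx4j.model.listnumbering package, which is probably the better route; its exact usage should be checked against the Javadoc.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: turns (numId, ilvl) pairs into labels like "1." or "1.1". */
public class NumberingSketch {
    // Level-text patterns, index = ilvl. "%1" stands for the level-0 counter,
    // "%2" for the level-1 counter, and so on (assumed to be read from numbering.xml).
    private final String[] levelText;
    // One counter array per numId, index = ilvl.
    private final Map<Integer, int[]> counters = new HashMap<>();

    public NumberingSketch(String[] levelText) {
        this.levelText = levelText.clone();
    }

    /** Call once per numbered paragraph, in document order. */
    public String next(int numId, int ilvl) {
        if (ilvl < 0 || ilvl >= levelText.length) {
            throw new IllegalArgumentException("no pattern for level " + ilvl);
        }
        int[] c = counters.computeIfAbsent(numId, k -> new int[levelText.length]);
        c[ilvl]++;
        // A new item at this level restarts all deeper levels.
        for (int i = ilvl + 1; i < c.length; i++) {
            c[i] = 0;
        }
        // Substitute each %n placeholder with the current counter (decimal only).
        String label = levelText[ilvl];
        for (int i = 0; i <= ilvl; i++) {
            label = label.replace("%" + (i + 1), String.valueOf(c[i]));
        }
        return label;
    }

    public static void main(String[] args) {
        NumberingSketch n = new NumberingSketch(new String[] {"%1.", "%1.%2"});
        // Same sequence as the example document: numId 2, levels 0, 1, 0, 1.
        System.out.println(n.next(2, 0) + " Heading 1");
        System.out.println(n.next(2, 1) + " Heading 2");
        System.out.println(n.next(2, 0) + " Heading 1");
        System.out.println(n.next(2, 1) + " Heading 2");
    }
}
```

With docx4j, the numId and ilvl values would be read from each paragraph's properties while walking the document body, and each pair passed to next in order.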