
Markdown file parser and structured extraction tool (CLI + Python API).
```bash
pip install jostack-mdparse
```

```bash
# Extract as JSON
jostack-mdparse extract README.md -f json

# Extract specific sections
jostack-mdparse extract docs/ -s "Installation,Usage" -f text

# Print table of contents
jostack-mdparse toc README.md

# Print frontmatter metadata
jostack-mdparse meta blog-post.md
```
```python
from jostack_mdparse import extract

# Basic extraction
result = extract("README.md", format="json")

# Filter by heading level
result = extract("docs/guide.md", heading_level="1,2", format="text")

# Extract specific sections
result = extract("README.md", sections="Usage", strip_html=True)
```
| Command | Description |
|---------|-------------|
| `extract` | Parse Markdown and output structured data (JSON, text, HTML) |
| `toc` | Print the table of contents (heading tree) |
| `meta` | Print frontmatter metadata as JSON |
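
As a rough illustration of what a heading tree contains, the following standalone sketch collects ATX headings (`# Title`) from a Markdown string and renders them as an indented list. It is not the implementation used by `jostack-mdparse`; the names `heading_outline` and `render_outline` are hypothetical, and the output of the real `toc` command may be formatted differently.

```python
import re

# ATX headings only ("# Title"); Setext (underlined) headings are ignored here.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*\S)[ \t]*$")
# Opening or closing code fence made of three or more backticks or tildes.
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})")


def heading_outline(markdown_text):
    """Return (level, title) pairs for ATX headings outside fenced code."""
    outline = []
    in_fence = False
    for line in markdown_text.splitlines():
        if FENCE_RE.match(line):
            in_fence = not in_fence  # simplistic toggle, enough for a sketch
            continue
        if in_fence:
            continue  # "# comment" lines inside code are not headings
        match = HEADING_RE.match(line)
        if match:
            outline.append((len(match.group(1)), match.group(2)))
    return outline


def render_outline(outline):
    """Indent each title by two spaces per heading level below the first."""
    return "\n".join("  " * (level - 1) + "- " + title for level, title in outline)


# Hypothetical sample document; tildes mark the fenced block.
SAMPLE = """# Project
## Installation
~~~
# not a heading
pip install example
~~~
## Usage
### CLI
"""

print(render_outline(heading_outline(SAMPLE)))
```

Skipping fenced blocks matters because shell comments inside code would otherwise be picked up as level-1 headings.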
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `output-dir` | str | None | Directory where output files are written |
| `format` | str | "json" | Output format: json, text, html |
| `quiet` | bool | False | Suppress console logging output |
| `heading-level` | str | None | Filter by heading levels (comma-separated) |
| `sections` | str | None | Extract sections by heading text (comma-separated) |
| `include-frontmatter` | bool | True | Include YAML/TOML frontmatter in output |
| `strip-html` | bool | False | Strip inline HTML tags |
| `include-code-blocks` | bool | True | Include fenced code blocks |
| `include-toc` | bool | False | Add generated table of contents |
| `flatten-lists` | bool | False | Flatten nested lists |
| `section-separator` | str | None | Separator between sections in text output |
| `normalize-links` | bool | False | Convert relative links to absolute |
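
To make the comma-separated `sections` and `heading-level` filters more concrete, here is a minimal standalone sketch of how such filtering could work on a Markdown string. It is an assumption-laden illustration, not the library's code: `split_sections` and `filter_sections` are hypothetical helpers, matching is exact and case-sensitive, fenced code is not skipped, and a section's body stops at the next heading of any level. The real tool may behave differently on each of these points.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*\S)[ \t]*$")


def split_sections(markdown_text):
    """Split text into (level, title, body) tuples, one per ATX heading."""
    sections = []
    current = None
    for line in markdown_text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            current = [len(match.group(1)), match.group(2), []]
            sections.append(current)
        elif current is not None:
            current[2].append(line)  # text before the first heading is dropped
    return [(lvl, title, "\n".join(body).strip()) for lvl, title, body in sections]


def filter_sections(markdown_text, sections=None, heading_level=None):
    """Keep sections whose title and level match the comma-separated filters."""
    wanted_titles = None
    if sections is not None:
        wanted_titles = {s.strip() for s in sections.split(",") if s.strip()}
    wanted_levels = None
    if heading_level is not None:
        wanted_levels = {int(p) for p in heading_level.split(",") if p.strip()}

    result = []
    for level, title, body in split_sections(markdown_text):
        if wanted_titles is not None and title not in wanted_titles:
            continue
        if wanted_levels is not None and level not in wanted_levels:
            continue
        result.append({"level": level, "title": title, "body": body})
    return result


# Hypothetical sample document.
SAMPLE = """# Project
Intro text.
## Installation
Run the installer.
## Usage
Call the tool.
### Advanced
More options.
"""

by_title = filter_sections(SAMPLE, sections="Installation,Usage")
by_level = filter_sections(SAMPLE, heading_level="1,2")
print([s["title"] for s in by_title])
print([s["title"] for s in by_level])
```

When both filters are given, this sketch requires a section to satisfy both. Whether the actual tool combines them the same way is not stated here.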
```bash
# Install in development mode
pip install -e .

# Run tests
make test

# Run linting
make lint

# Format code
make format
```
License: MIT