curldown

Fetch a webpage and return clean Markdown for AI workflows.

curldown is CLI-first:

Static mode: fetch HTML -> Cheerio cleanup -> Turndown markdown.
Dynamic mode: headless Chromium (Playwright) -> HTML -> markdown.
--auto tries static first and falls back to dynamic when static output is thin.
Direct markdown responses are passed through (including .md URLs served as text/plain).
--format json emits markdown plus metadata for agent pipelines.
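
The `--auto` decision described above can be pictured as a small check on the static result. The following is only a sketch under stated assumptions, not curldown's actual implementation: `countWords`, `chooseMode`, and the 50-word threshold are hypothetical, since this README does not say how "thin" is measured.

```typescript
// Hypothetical sketch: decide whether static output is good enough.
// None of these names or values come from curldown's source.

type FetchMode = "static" | "dynamic";

// Count whitespace-separated tokens; an assumed proxy for "thin" output.
function countWords(markdown: string): number {
  const trimmed = markdown.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Keep the static result when it has enough words, otherwise
// signal that the page should be re-fetched in dynamic mode.
function chooseMode(staticMarkdown: string, minWords: number = 50): FetchMode {
  return countWords(staticMarkdown) >= minWords ? "static" : "dynamic";
}

// A page that rendered only a placeholder would fall back to dynamic mode.
console.log(chooseMode("# Loading...")); // "dynamic"
```

A word-count threshold is just one plausible heuristic; a real implementation could equally look at content length or missing headings.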

Install

npm install -g @jenslys/curldown

Quick Start

# Print markdown to stdout
curldown https://example.com

# JS-heavy pages
curldown https://example.com --dynamic

# Auto fallback to dynamic when static output looks incomplete
curldown https://example.com --auto

# JSON output for AI pipelines
curldown https://example.com --format json

# Write output to a file
curldown https://example.com --output page.md

CLI

curldown <url> [options]

Options

--auto Try static first and fall back to dynamic when static output is thin.
--dynamic Use Playwright Chromium to render before extraction.
--format <type> Output format: markdown (default) or json.
-o, --output <path> Write output to file instead of stdout.
--timeout-ms <number> Request/render timeout in milliseconds.
--header <key:value> Custom request header (repeatable).
--help Show help.
--version Show version.
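
When calling curldown from a script, the options above map naturally onto an argument array. The helper below is a hypothetical sketch: `CurldownOptions` and `buildCurldownArgs` are illustrative names and not part of curldown; only the flag names are taken from the list above.

```typescript
// Hypothetical helper; only the flag names come from the options list.

interface CurldownOptions {
  auto?: boolean;
  dynamic?: boolean;
  format?: "markdown" | "json";
  output?: string;
  timeoutMs?: number;
  headers?: Record<string, string>;
}

function buildCurldownArgs(url: string, options: CurldownOptions = {}): string[] {
  const args: string[] = [url];
  if (options.auto) args.push("--auto");
  if (options.dynamic) args.push("--dynamic");
  if (options.format) args.push("--format", options.format);
  if (options.output) args.push("--output", options.output);
  if (options.timeoutMs !== undefined) {
    args.push("--timeout-ms", String(options.timeoutMs));
  }
  // --header is repeatable, so emit one flag per entry in key:value form.
  const headers = options.headers || {};
  for (const key of Object.keys(headers)) {
    args.push("--header", key + ":" + headers[key]);
  }
  return args;
}

console.log(buildCurldownArgs("https://example.com", { auto: true, format: "json" }));
```

The resulting array can be passed to a process launcher such as `execFile` from `node:child_process`, which avoids shell quoting issues with header values.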

JSON Output Shape

--format json returns:

url
final_url
title
markdown
content_type
status
fetched_at
word_count
sha256
used_dynamic
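
For downstream TypeScript code, the fields above could be modeled roughly as follows. The field names are taken from the list; the types (for example, `status` as a number and `fetched_at` as a timestamp string) are assumptions, and `CurldownJson` and `parseCurldownJson` are hypothetical names rather than anything curldown exports.

```typescript
// Field names come from the list above; the TypeScript types are assumptions.
interface CurldownJson {
  url: string;
  final_url: string;
  title: string;
  markdown: string;
  content_type: string;
  status: number; // assumed to be the HTTP status code
  fetched_at: string; // assumed to be a timestamp string
  word_count: number;
  sha256: string;
  used_dynamic: boolean;
}

const REQUIRED_FIELDS: string[] = [
  "url", "final_url", "title", "markdown", "content_type",
  "status", "fetched_at", "word_count", "sha256", "used_dynamic",
];

// Parse curldown's JSON output and fail early if a listed field is missing.
function parseCurldownJson(raw: string): CurldownJson {
  const data = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  for (const field of REQUIRED_FIELDS) {
    if (!(field in data)) {
      throw new Error("missing field: " + field);
    }
  }
  return data as CurldownJson;
}
```

Checking only for field presence keeps the sketch tolerant of type details that this README does not specify.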

Local Development

bun install
bun run build
bun run test
node dist/cli.js https://example.com

AGENTS.md Snippet (Optional)

Paste this into your AGENTS.md if you want agents to always use curldown for website content retrieval:

## Website Content Retrieval

- Use `curldown` for website/article page retrieval in agent workflows.
- Do not use `curldown` for raw code files or repository file blobs (for those, fetch the file directly).
- Default command: `curldown <url>`.
- Prefer `curldown <url> --auto` when it is unclear whether the page needs JavaScript rendering.
- Use `curldown <url> --format json` when downstream steps need structured metadata.
- Prefer stdout output unless a task explicitly requires a file (`--output <path>`).
- Do not use ad-hoc HTML scraping or direct browser automation when `curldown` can handle it.
