Convert Office / PDF specification documents into machine-readable EARS (Easy Approach to Requirements Syntax) Markdown.
spec2ears is a Claude Agent Skill that bridges legacy specification documents and spec-driven development. It works in two phases:
- Phase 1 (deterministic conversion, no LLM by default): Microsoft MarkItDown converts the source document to structure-preserving Markdown.
- Phase 2 (rule-based EARS structuring): the intermediate Markdown is parsed into EARS requirements using rules defined in `references/convert-rule.md` (editable without code changes).
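To make the Phase 2 target concrete, here is a minimal sketch of how a sentence might be sorted into the standard EARS patterns by its opening keyword. The function name `classify_ears` and the regular expressions are illustrative assumptions; they are not the rules shipped in `references/convert-rule.md`.

```python
import re

# Keyword cues for the standard EARS patterns (illustrative only, not the
# project's actual rule set). Order matters: the first matching cue wins.
EARS_CUES = [
    ("unwanted", re.compile(r"^\s*if\b.*\bthen\b", re.IGNORECASE)),
    ("event-driven", re.compile(r"^\s*when\b", re.IGNORECASE)),
    ("state-driven", re.compile(r"^\s*while\b", re.IGNORECASE)),
    ("optional", re.compile(r"^\s*where\b", re.IGNORECASE)),
]


def classify_ears(sentence: str) -> str:
    """Return the EARS pattern suggested by the sentence's opening keyword."""
    for name, cue in EARS_CUES:
        if cue.search(sentence):
            return name
    # No leading keyword: "The <system> shall ..." is a ubiquitous requirement.
    if re.search(r"\bshall\b", sentence, re.IGNORECASE):
        return "ubiquitous"
    return "unclassified"
```

For example, `classify_ears("When the user presses Start, the pump shall run.")` yields `"event-driven"`. A keyword-first approach like this stays deterministic, which fits the rule-based design described above.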
- Two-phase pipeline with an intermediate Markdown artifact that is always saved
- EARS output is produced by default and can be suppressed with `--no-ears`
- Rule-based Phase 2: change behavior by editing `references/convert-rule.md`, no code changes required
- Requirement IDs, traceability (source file / heading), and ambiguous-word `要確認` ("needs confirmation") flags
- Embedded images are extracted to `build/intermediate/images/` and Markdown links are rewritten
- Optional LLM mode (opt-in via `.env`) for image description / OCR-like extraction
- Conversion report and failure list per run
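The ID, traceability, and review-flag features in the list above can be pictured with a small sketch. Everything here is hypothetical: the `Requirement` fields, the `REQ-001` ID format, and the word list are assumptions for illustration, since the real behavior is driven by the rule file and the ambiguous-word list.

```python
from dataclasses import dataclass

# Hypothetical ambiguous-word list; the real one is supplied via --ambiguous.
AMBIGUOUS_WORDS = ("appropriate", "as needed", "fast", "etc.")
REVIEW_FLAG = "要確認"  # the "needs confirmation" flag named in the feature list


@dataclass
class Requirement:
    req_id: str
    statement: str
    source: str  # source file / heading, kept for traceability
    review: str  # REVIEW_FLAG when an ambiguous word is present, else ""


def build_requirements(statements, source, prefix="REQ"):
    """Assign sequential IDs and flag statements that contain ambiguous words."""
    result = []
    for number, text in enumerate(statements, start=1):
        lowered = text.lower()
        # Plain substring matching; a real rule set would likely match whole words.
        flagged = any(word in lowered for word in AMBIGUOUS_WORDS)
        result.append(
            Requirement(f"{prefix}-{number:03d}", text, source, REVIEW_FLAG if flagged else "")
        )
    return result
```

Keeping the source on every requirement is what makes a traceability table possible later without re-reading the input document.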
- Python 3.10 or higher
- MarkItDown (with the extras for your input formats)
This project uses uv for environment and dependency management.
git clone https://github.com/potofo/spec2ears.git
cd spec2ears
# Create a virtual environment (.venv) with uv
uv venv
# Activate it
# macOS / Linux:
source .venv/bin/activate
# Windows (PowerShell):
# .venv\Scripts\Activate.ps1
# Install runtime dependencies
uv pip install -r requirements.txt

The dependency lists are:
- `requirements.txt`: runtime libraries (`markitdown[all]`, and `openai` for LLM mode)
- `requirements-dev.txt`: testing and sample-generation tools (`pytest`, `python-docx`, `reportlab`)
For development (tests and sample generation), install the dev set instead:
uv pip install -r requirements-dev.txt

Don't have uv yet? Install it from the official guide, or fall back to plain pip: `pip install -r requirements.txt`.
The skill itself lives in the spec2ears/ folder (it contains SKILL.md, scripts/, and references/). To make it available globally to a Claude Agent Skills host, place that folder in your personal skills directory.
macOS / Linux
mkdir -p ~/.claude/skills
cp -r spec2ears ~/.claude/skills/spec2ears

Windows (PowerShell)
New-Item -ItemType Directory -Force "$env:USERPROFILE\.claude\skills" | Out-Null
Copy-Item -Recurse spec2ears "$env:USERPROFILE\.claude\skills\spec2ears"

Make sure the dependencies above are installed in the Python environment the agent uses. Once installed, the skill activates automatically when your request mentions trigger topics such as Office / Word / Excel / PowerPoint / PDF, Markdown conversion, MarkItDown, EARS, or spec-driven development.
You can then simply ask, for example: "Convert spec.docx into an EARS specification."
Once installed as a global skill, you drive spec2ears with natural language. The agent reads SKILL.md and runs the bundled pipeline for you. Below are common patterns and the equivalent CLI command each one maps to.
| What you say | Equivalent CLI |
|---|---|
| "Convert spec.docx to EARS." | `python -m spec2ears.scripts.pipeline spec.docx --build-dir build` |
| "Turn the Excel spec spec.xlsx into EARS requirements." | `python -m spec2ears.scripts.pipeline spec.xlsx --build-dir build` |
| "Extract EARS requirements from deck.pptx." | `python -m spec2ears.scripts.pipeline deck.pptx --build-dir build` |
| "Convert spec.pdf into an EARS specification." | `python -m spec2ears.scripts.pipeline spec.pdf --build-dir build` |
The result is build/ears/<name>.ears.md (requirement table + traceability), plus the intermediate Markdown and a report.
When you only want the deterministic Markdown conversion and not the EARS structuring:
| What you say | Equivalent CLI |
|---|---|
| "Just convert spec.docx to Markdown, I don't need EARS yet." | `python -m spec2ears.scripts.pipeline spec.docx --no-ears` |
| "Give me the Markdown of spec.pdf to review before EARS." | `python -m spec2ears.scripts.pipeline spec.pdf --no-ears` |
This produces build/intermediate/<name>.md (and a report) only — Phase 2 is skipped.
When the requirements are drawn as text inside images, enable LLM mode first (see LLM mode), then:
| What you say | Equivalent CLI |
|---|---|
| "Read the spec written in diagram.png and convert it to EARS." | `python -m spec2ears.scripts.pipeline diagram.png --build-dir build` |
| "Extract the requirements from the images in slides.pptx." | `python -m spec2ears.scripts.pipeline slides.pptx --build-dir build` |
In the default deterministic mode, text embedded in images is not extracted; LLM mode is required.
| What you say | Equivalent CLI |
|---|---|
| "Convert every spec under samples/ to EARS." | `python -m spec2ears.scripts.pipeline samples/ --build-dir build` |
Each file is processed independently; failures are skipped and recorded in build/failures.json, and a combined build/report.md is produced.
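The skip-and-record behavior for batches can be sketched as follows. This is a minimal illustration, not the project's code: `convert_all`, the `convert_one` callback, and the JSON layout of the failure entries are all assumptions.

```python
import json
from pathlib import Path


def convert_all(paths, convert_one, build_dir):
    """Run convert_one on each path and collect failures instead of aborting."""
    build = Path(build_dir)
    build.mkdir(parents=True, exist_ok=True)
    converted, failures = [], []
    for path in paths:
        try:
            convert_one(path)
            converted.append(str(path))
        except Exception as exc:  # one bad file must not stop the batch
            failures.append({"file": str(path), "error": f"{type(exc).__name__}: {exc}"})
    # Hypothetical schema: the real failures.json layout may differ.
    (build / "failures.json").write_text(
        json.dumps(failures, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return converted, failures
```

Writing the failure list even when it is empty gives every run the same set of artifacts, which simplifies downstream checks.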
| What you say | Equivalent CLI |
|---|---|
| "Convert spec.docx using my rules in my-rule.md." | `python -m spec2ears.scripts.pipeline spec.docx --rule my-rule.md` |
See Customizing the conversion rules for the rule file format.
# Convert a single file (intermediate Markdown + EARS + report into build/)
python -m spec2ears.scripts.pipeline path/to/spec.docx --build-dir build
# Convert a directory recursively
python -m spec2ears.scripts.pipeline path/to/specs/ --build-dir build
# Suppress EARS output (keep intermediate Markdown and report only)
python -m spec2ears.scripts.pipeline path/to/spec.docx --no-ears

| Option | Description |
|---|---|
| `--build-dir <dir>` | Output directory (default: build) |
| `--no-ears` | Skip Phase 2 and EARS output; keep intermediate Markdown and report |
| `--rule <path>` | Path to the Phase 2 rule file (default: spec2ears/references/convert-rule.md) |
| `--ambiguous <path>` | Path to the ambiguous-word list |
| `--env <path>` | Path to the .env file (auto-detected by default) |
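For readers who want to wrap or script the CLI, the documented options map naturally onto `argparse`. The sketch below only mirrors the option names and defaults listed above; the parser layout, the `source` positional name, and the `None` defaults are assumptions rather than the pipeline's actual implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented options; defaults follow the option table."""
    parser = argparse.ArgumentParser(prog="spec2ears-sketch")
    parser.add_argument("source", help="file or directory to convert")
    parser.add_argument("--build-dir", default="build", help="output directory")
    parser.add_argument("--no-ears", action="store_true", help="skip Phase 2")
    parser.add_argument(
        "--rule",
        default="spec2ears/references/convert-rule.md",
        help="Phase 2 rule file",
    )
    parser.add_argument("--ambiguous", default=None, help="ambiguous-word list")
    parser.add_argument("--env", default=None, help=".env file (None means auto-detect)")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["spec.docx", "--no-ears"])
```

Here `args.no_ears` is `True` and `args.build_dir` falls back to `"build"`.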
| Path | Description |
|---|---|
| `build/intermediate/<name>.md` | Intermediate Markdown (always saved) |
| `build/intermediate/images/<name>/` | Images extracted from the source document |
| `build/ears/<name>.ears.md` | EARS Markdown (default output; suppressed by `--no-ears`) |
| `build/report.md` | Conversion report (file counts, pattern breakdown, review count) |
| `build/failures.json` | Phase 1 failure list |
The EARS Markdown contains a requirement table (ID / pattern / statement / review flag) and a traceability table (requirement ID ↔ source).
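As a rough picture of that output, the sketch below renders the two tables from a list of requirement records. The column headings, the dictionary keys, and the function name `render_ears_markdown` are assumptions for illustration; the generated file may use different headings.

```python
def render_ears_markdown(requirements):
    """Render a requirement table and a traceability table as Markdown.

    Each requirement is a dict with the keys id, pattern, statement, review, source.
    """
    lines = ["| ID | Pattern | Statement | Review |", "|---|---|---|---|"]
    for req in requirements:
        # Escape pipes so a statement cannot break the table layout.
        statement = req["statement"].replace("|", "\\|")
        lines.append(f"| {req['id']} | {req['pattern']} | {statement} | {req['review']} |")
    lines += ["", "| ID | Source |", "|---|---|"]
    for req in requirements:
        lines.append(f"| {req['id']} | {req['source']} |")
    return "\n".join(lines)
```

Splitting the output into two tables keeps the requirement statements readable while still letting every ID be traced back to its source heading.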
By default Phase 1 runs without any LLM (deterministic and offline). You can opt in to LLM-assisted image description via a .env file.
# 1) Copy the template
cp .env.example .env # PowerShell: Copy-Item .env.example .env
# 2) Edit .env
# SPEC2EARS_LLM_MODE=on
# SPEC2EARS_LLM_API_KEY=<your key>
# SPEC2EARS_LLM_MODEL=gpt-4o # optional
# SPEC2EARS_LLM_BASE_URL=<endpoint> # optional (OpenAI-compatible)
# SPEC2EARS_LLM_PROMPT=<custom prompt> # optional
# 3) Make sure the OpenAI client is installed (already included in requirements.txt) and run as usual
python -m spec2ears.scripts.pipeline path/to/spec.pptx --build-dir build

Notes:
- `.env` may contain secrets and is git-ignored. Only `.env.example` is committed.
- LLM mode is non-deterministic; the deterministic default is recommended for reproducibility.
- See `.env.example` for full documentation of each setting (Japanese: `.env.ja-JP.example`).
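The opt-in switch can be illustrated with a small stdlib-only reader for the settings shown above. This is a sketch under assumptions: the helper names `read_env` and `llm_enabled` are hypothetical, the parser handles only plain `KEY=VALUE` lines (no inline comments), and requiring an API key in addition to `SPEC2EARS_LLM_MODE=on` is a guess at sensible behavior, not a documented rule.

```python
from pathlib import Path


def read_env(path):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def llm_enabled(settings):
    """Treat LLM mode as opt-in: require both the switch and an API key (assumed)."""
    return settings.get("SPEC2EARS_LLM_MODE", "").lower() == "on" and bool(
        settings.get("SPEC2EARS_LLM_API_KEY")
    )
```

With an empty or missing configuration, `llm_enabled` returns `False`, which matches the deterministic default described above.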
spec2ears runs in one of two modes:
- Standard mode (deterministic, no LLM): the default. Reproducible and offline. Best for text-based documents.
- LLM mode: opt-in via `.env`. Adds image understanding (see LLM mode).
| Category | Extensions | Standard mode | LLM mode |
|---|---|---|---|
| Office / PDF | `.docx` `.pptx` `.xlsx` `.xls` `.pdf` | Text, tables, structure | + describes embedded images |
| Images | `.jpg` `.jpeg` `.png` `.gif` `.bmp` `.tif` `.tiff` `.webp` | Metadata only | Reads/describes image content |
| Audio | `.wav` `.mp3` `.m4a` `.mp4` `.flac` `.ogg` | Metadata / transcription | same |
| Web / text | `.html` `.htm` `.csv` `.tsv` `.json` `.xml` `.txt` `.md` | Structure-preserving | same |
| Other | `.ipynb` `.zip` `.epub` `.msg` | Converted (zip iterates contents) | same |
Text-based formats are extracted faithfully without an LLM: .docx, .pptx, .xlsx, .xls, text-based .pdf, .html/.htm, .csv/.tsv, .json, .xml, .txt, .md, .epub, .zip, .ipynb, .msg. This is the recommended mode when your requirements are written as actual text in the document.
- Text inside images is not extracted. For image files (`.png`, `.jpg`, ...) only metadata is read, so requirements drawn as pixels are not recovered.
- Image-only / scanned PDFs (no text layer) yield little or no text.
- Embedded images in Office documents are extracted as files but not described.
- Audio transcription depends on optional extras and external tools (e.g., ffmpeg).
Enable LLM mode for inputs whose content lives in images:
- Image files (`.png`, `.jpg`, etc.) that contain text or diagrams to read
- Image-only or scanned PDFs
- PowerPoint/Word documents where the specification is drawn inside images, or where you want image descriptions
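The guidance above can be condensed into a quick pre-check by file extension. This is a hypothetical helper, not part of spec2ears: the function name and the three-level answer are assumptions, and an extension alone cannot tell whether a PDF is scanned or whether an Office file carries its specification inside images.

```python
from pathlib import Path

# Image extensions from the format table; standard mode reads only their metadata.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp"}
# Containers that may embed images (the Office / PDF row of the table).
CONTAINER_EXTENSIONS = {".docx", ".pptx", ".xlsx", ".xls", ".pdf"}


def llm_mode_advice(path):
    """Return 'required', 'maybe', or 'not needed' for a given input file."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "required"  # the content lives in pixels
    if suffix in CONTAINER_EXTENSIONS:
        return "maybe"  # only if the spec is scanned or drawn inside images
    return "not needed"
```

A check like this is only a first filter; for the "maybe" cases, running standard mode first and inspecting the intermediate Markdown shows whether any text was recovered.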
URL inputs (e.g., YouTube) are out of scope for safety: only local files are converted (via `convert_local()`).
Phase 2 behavior is defined entirely in spec2ears/references/convert-rule.md (extraction cues, pattern classification, ID naming, splitting conditions, ambiguous-word reference). Edit that file to adapt the tool to your project; changes take effect on the next run. If the file is missing or malformed, Phase 2 stops with a restore hint.
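The fail-early behavior for a missing or malformed rule file might look like the sketch below. The exception class, the wording of the hint, and especially the "no Markdown heading means malformed" test are assumptions made for illustration; the actual validation and hint text are defined by the tool.

```python
from pathlib import Path


class RuleFileError(Exception):
    """Raised when the Phase 2 rule file cannot be used."""


def load_rule_file(path="spec2ears/references/convert-rule.md"):
    """Read the rule file, failing early with a hint on how to restore it."""
    rule_path = Path(path)
    hint = f"Restore it, e.g. with: git checkout -- {rule_path.as_posix()}"
    if not rule_path.is_file():
        raise RuleFileError(f"Rule file not found: {rule_path}. {hint}")
    text = rule_path.read_text(encoding="utf-8")
    # Hypothetical sanity check: treat a file with no Markdown heading as malformed.
    if not any(line.startswith("#") for line in text.splitlines()):
        raise RuleFileError(f"Rule file looks malformed: {rule_path}. {hint}")
    return text
```

Stopping before Phase 2 starts avoids producing EARS output from a partial rule set, which would be harder to notice than an explicit error.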
uv pip install -r requirements-dev.txt
python -m pytest -q

MIT License. See LICENSE.txt.