
feat: add context-audit skill #28

Open
tomiir wants to merge 2 commits into main from tomiir/ai-context-quality

Conversation


@tomiir tomiir commented Feb 17, 2026

Summary

  • Adds new context-audit skill that audits AI context files (CLAUDE.md, .cursorrules, SKILL.md, etc.) against the actual codebase
  • Detects stale references, vague instructions, missing coverage, and structural issues
  • Uses 3 parallel subagents (reference validator, coverage analyzer, quality assessor) with weighted scoring (A-F grade)
  • Includes CHECKLISTS.md reference file with detailed quality criteria
  • Updates README.md with skill documentation
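
The summary above mentions weighted scoring that rolls the three subagent dimensions (Accuracy, Coverage, Clarity) into an overall A-F grade. A minimal sketch of how such a rollup could work — the weights and grade thresholds below are illustrative assumptions, not values stated in this PR:

```python
def overall_grade(accuracy: float, coverage: float, clarity: float) -> str:
    """Combine per-dimension scores (0-100) into a letter grade.

    Weights and thresholds are assumptions for illustration only.
    """
    weights = {"accuracy": 0.4, "coverage": 0.3, "clarity": 0.3}
    score = (accuracy * weights["accuracy"]
             + coverage * weights["coverage"]
             + clarity * weights["clarity"])
    for threshold, grade in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= threshold:
            return grade
    return "F"
```

For example, under these assumed weights, scores of 85/75/70 combine to 77.5, a C.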

Test plan

  • ./scripts/validate.sh passes (11 skills validated)
  • Run /context-audit on a repo with context files — verify discovery, scoring, and report output
  • Run /context-audit on a repo with no context files — verify it suggests what to create
  • Verify activation triggers ("audit context files", "context quality", "context audit")

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings February 17, 2026 11:21
@@ -0,0 +1,110 @@
# AI Context Quality Checklists
Member
sorry not helpful - but can you find a more intuitive name for this skill?

Author
@tomiir tomiir Feb 17, 2026

context-audit or agent-docs-check wyt?
context-healthcheck?

Copilot AI left a comment

Pull request overview

This PR introduces a new ai-context-quality skill that audits AI context files (like CLAUDE.md, .cursorrules, SKILL.md) for accuracy, coverage, and clarity by comparing them against the actual codebase. The skill uses 3 parallel subagents to validate references, analyze coverage, and assess quality, producing a scored report (A-F grade) with actionable remediation.

Changes:

  • Added new ai-context-quality skill with SKILL.md defining the audit workflow and CHECKLISTS.md providing detailed evaluation criteria
  • Updated README.md to document the new skill in the table, details section, and directory structure
  • Skill supports multiple AI tools (Claude Code, Cursor, Copilot, Windsurf) and handles repos with zero context files

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
skills/ai-context-quality/SKILL.md Defines the skill workflow with 3 parallel subagents, weighted scoring system, output format, and grading scale
skills/ai-context-quality/CHECKLISTS.md Provides detailed evaluation criteria for reference validity, coverage, and clarity with scoring rubrics
README.md Adds skill to table, creates detail section with features and examples, updates directory structure



### Dependency References
- [ ] Package names match entries in dependency manifests
- [ ] Referenced CLI tools are available (`npm`, `cargo`, `pnpm`)
Copilot AI Feb 17, 2026

The term "available" is ambiguous. Does this mean checking if the CLI tool is referenced in package.json scripts or CI configs, or checking if it's installed on the system? Given the context of validating references against the codebase, consider clarifying this to "Referenced CLI tools match actual usage in scripts, configs, or documentation" to make it clear this is a reference validation check, not a system dependency check.

Suggested change

```diff
-- [ ] Referenced CLI tools are available (`npm`, `cargo`, `pnpm`)
+- [ ] Referenced CLI tools match actual usage in scripts, configs, or documentation (`npm`, `cargo`, `pnpm`)
```

README.md Outdated
**Example:**
```bash
/ai-context-quality # Full audit of all context files
/ai-context-quality focus on CLAUDE.md only # Audit specific file
Copilot AI Feb 17, 2026

The README example shows /ai-context-quality focus on CLAUDE.md only suggesting users can audit specific files, but the SKILL.md workflow (line 46) always scans "all AI context files" without describing how to handle user requests to focus on specific files. Either the workflow should clarify how $ARGUMENTS is used to filter which files are analyzed, or the README example should be updated to match the actual behavior.

Comment on lines +56 to +65
Launch **3 parallel Task agents** (`subagent_type: general-purpose`), each focused on a different dimension:

#### Agent 1: Reference Validator
- Extract all file paths, directory references, function/class/method names, CLI commands, and package names mentioned in context files
- Verify each reference exists in the codebase using Glob and Grep
- Flag references that don't resolve as **stale references**
- Check that code examples in context files match actual code patterns
- Score: **Accuracy (0-100)**

#### Agent 2: Coverage Analyzer
Copilot AI Feb 17, 2026

The terms "subagent" (line 54) and "agent" (line 56) are used interchangeably to refer to the same entities. For consistency, consider using one term consistently throughout (e.g., "subagent" in the heading and body, or just "agent" throughout).

Suggested change

```diff
-Launch **3 parallel Task agents** (`subagent_type: general-purpose`), each focused on a different dimension:
-#### Agent 1: Reference Validator
-- Extract all file paths, directory references, function/class/method names, CLI commands, and package names mentioned in context files
-- Verify each reference exists in the codebase using Glob and Grep
-- Flag references that don't resolve as **stale references**
-- Check that code examples in context files match actual code patterns
-- Score: **Accuracy (0-100)**
-#### Agent 2: Coverage Analyzer
+Launch **3 parallel Task subagents** (`subagent_type: general-purpose`), each focused on a different dimension:
+#### Subagent 1: Reference Validator
+- Extract all file paths, directory references, function/class/method names, CLI commands, and package names mentioned in context files
+- Verify each reference exists in the codebase using Glob and Grep
+- Flag references that don't resolve as **stale references**
+- Check that code examples in context files match actual code patterns
+- Score: **Accuracy (0-100)**
+#### Subagent 2: Coverage Analyzer
```

README.md Outdated
Comment on lines +66 to +80
#### ai-context-quality
Audits AI context files (CLAUDE.md, .cursorrules, SKILL.md, etc.) against the actual codebase. Detects stale references, vague instructions, missing coverage, and structural issues.

**Features:**
- Discovers all AI context files across tools (Claude Code, Cursor, Copilot, Windsurf)
- Parallel analysis with 3 specialized subagents (reference validity, coverage, clarity)
- Scored quality report (Accuracy, Coverage, Clarity → Overall grade A-F)
- Severity-ranked findings with actionable remediation
- Handles repos with zero context files (suggests what to create)

**Example:**
```bash
/ai-context-quality # Full audit of all context files
/ai-context-quality focus on CLAUDE.md only # Audit specific file
```
Copilot AI Feb 17, 2026

The ordering of skill details sections is inconsistent with the skills table. In the table, ai-context-quality appears as the 6th skill (between code-simplifier and aws-limits), but in the details section, it appears first. The details section should follow the same order as the table for consistency and easier navigation.

Comment on lines +63 to +79
- Score: **Accuracy (0-100)**

#### Agent 2: Coverage Analyzer
- Detect the repo's actual tech stack from: package managers (`package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`), frameworks, CI/CD configs, infrastructure files
- Identify key architectural patterns by scanning source code structure
- Compare what the context files document vs what actually exists
- Flag undocumented areas using the coverage checklist in [CHECKLISTS.md](CHECKLISTS.md)
- Score: **Coverage (0-100)**

#### Agent 3: Quality Assessor
- Evaluate each context file for clarity, specificity, and actionability
- Flag vague instructions (e.g., "handle errors properly" without specifying how)
- Check for inconsistent terminology across files
- Assess length appropriateness (flag files under 20 lines as too sparse, over 500 lines as potentially bloated)
- Check for presence of examples and anti-patterns
- Verify instructions don't contradict each other across files
- Score: **Clarity (0-100)**
Copilot AI Feb 17, 2026

The workflow describes each subagent producing a single score (0-100) for their dimension across all files (lines 63, 70, 79), but the output format shows per-file scores for each dimension (line 127-130). The workflow should clarify whether agents need to provide per-file breakdowns in addition to overall scores, or if the per-file scores are derived/computed somehow from the issues found per file.
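
One way to reconcile the two views — illustrative only, since the aggregation is exactly what this comment flags as unspecified — is to have each subagent report a score per file and derive its single dimension score as the mean across files:

```python
# Sketch of per-file → dimension-score aggregation. The mean is an
# assumption; the SKILL.md under review does not define this rollup.
from statistics import mean

def dimension_score(per_file: dict[str, float]) -> float:
    """Collapse per-file scores (0-100) into one dimension score."""
    return round(mean(per_file.values()), 1) if per_file else 0.0
```

Under this reading, the per-file scores in the output format are the primary data and the workflow's single 0-100 score is derived, which is one way the skill text could resolve the ambiguity.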

## Supported Context Files

| File Pattern | Tool |
|---|---|
Copilot AI Feb 17, 2026

Table separator uses minimal dashes (|---|---|) while other tables in the same file use column-width dashes (e.g., line 91: |-----------|--------|-----------------|). For consistency within the file and with other skills in the repository, consider using column-width separators.

Suggested change

```diff
-|---|---|
+|-------------|------|
```

@tomiir tomiir changed the title from "feat: add ai-context-quality skill" to "feat: add context-audit skill" on Feb 17, 2026
@tomiir tomiir force-pushed the tomiir/ai-context-quality branch from d467ab4 to 253e3b1 on February 27, 2026 at 10:29
tomiir and others added 2 commits March 3, 2026 12:40
…eedback

- Rename skill from ai-context-quality to context-audit for clarity
- Use consistent "subagent" terminology throughout
- Clarify per-file scoring in workflow description
- Add $ARGUMENTS handling for file-specific audits
- Fix CLI tools checklist wording for precision
- Fix table separator formatting consistency
- Reorder README skill details alphabetically

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tomiir tomiir force-pushed the tomiir/ai-context-quality branch from 253e3b1 to 1331529 on March 3, 2026 at 11:41
