Enhance EVAL Mode with Anti-Sycophancy and Objective Metrics by JeremyDev87 · Pull Request #76 · JeremyDev87/codingbuddy

JeremyDev87 · 2025-12-22T01:53:59Z

Enhance EVAL Mode with Anti-Sycophancy and Objective Metrics

Summary

This PR reworks EVAL mode so that code reviews are objective and evidence-based, with no praise or subjective assessments. The changes turn the code reviewer agent into a skeptical third-party auditor that evaluates code output only, not implementation intent.

Problem Statement

Current Issues

Before this enhancement, EVAL mode had several problems:

Subjective praise - Reviews often included phrases like "Great job", "Well done", "Excellent work" which don't add value
Intent-based evaluation - Reviews defended implementation decisions from PLAN/ACT phases instead of evaluating output objectively
Lack of measurable criteria - Findings were often subjective without concrete metrics
Praise-first approach - Strengths were listed before problems, reducing focus on improvements
Missing adversarial analysis - No systematic challenge of assumptions or edge cases
Insufficient impact analysis - Changes weren't evaluated for side effects and dependencies

Solution

Transform EVAL mode to:

Prohibit praise phrases (English and Korean)
Require objective, measurable metrics for all findings
Restructure output to critique-first format
Add Devil's Advocate Analysis section
Add Impact Radius Analysis for dependency tracking
Require minimum 3 improvements per evaluation
Evaluate OUTPUT only, never implementation INTENT

Features

1. Anti-Sycophancy Rules

Philosophy: Evaluate like a skeptical third-party auditor who has never seen this code before.

Key Rules:

Evaluate OUTPUT only, never implementer's INTENT
Assume bugs exist until proven otherwise
Challenge every design decision
Start with problems, not praise
Identify at least 3 improvement areas OR all identified issues

Prohibited Phrases (English + Korean):

English: "Great job", "Well done", "Excellent work", "Good implementation", "Perfect", "Impressive", etc.
Korean: "잘했어", "훌륭해", "완벽해", "깔끔해", "좋아", "멋져", etc.

Required Language:

Findings: "Evidence shows...", "Metric indicates...", "Violation found at...", "Gap identified..."
Neutral observations: "The implementation uses...", "The code contains...", "Measurement shows..."
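As an illustration of how the prohibited-phrase rule could be checked mechanically, here is a minimal sketch. It is not part of this PR, which is rule-based; the function name `find_prohibited_phrases` is hypothetical, and the phrase list only contains the examples quoted above (the real list ends with "etc.", so it is longer).

```python
# Phrases quoted in the PR description. The PR's own list ends with "etc.",
# so this set is illustrative rather than complete.
PROHIBITED_PHRASES = [
    "Great job", "Well done", "Excellent work", "Good implementation",
    "Perfect", "Impressive",
    "잘했어", "훌륭해", "완벽해", "깔끔해", "좋아", "멋져",
]


def find_prohibited_phrases(review_text):
    """Return (line_number, phrase) pairs for each prohibited phrase found.

    Matching is a case-insensitive substring test, so it can over-match
    (for example "Perfect" inside "imperfect"). That is a limitation of
    this sketch, not a rule stated in the PR.
    """
    hits = []
    for line_no, line in enumerate(review_text.splitlines(), start=1):
        lowered = line.casefold()
        for phrase in PROHIBITED_PHRASES:
            if phrase.casefold() in lowered:
                hits.append((line_no, phrase))
    return hits
```

A review draft that returns an empty list passes this particular check; the other self-check items still have to be verified separately.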

2. Objective Metrics Framework

All evaluations must be based on measurable, objective criteria:

Code Metrics:

Test coverage: >=90% target
Type safety: 0 any usages target
Cyclomatic complexity: <=10 per function
Function length: <=20 lines
Nesting depth: <=3 levels
Bundle size delta: <=20KB per feature

Checklist Metrics:

Security: OWASP Top 10 checklist
Accessibility: WCAG 2.1 AA criteria
Performance: Core Web Vitals targets

Documentation Metrics (for non-code changes):

Clarity: ambiguous terms count (target: 0)
Completeness: missing sections count (target: 0)
Consistency: inconsistency count (target: 0)
Actionability: vague instruction count (target: 0)

Output Requirement: Every finding MUST include:

Location (file:line or section)
Measured value
Target value
Gap/delta
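The four required fields map naturally onto a small record type. The sketch below assumes hypothetical metric keys (`test_coverage_pct`, `any_usages`, and so on) and a hypothetical `check_metric` helper; only the threshold values come from the Code Metrics list above.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds copied from the "Code Metrics" list. The dictionary keys are
# hypothetical names chosen for this sketch.
# Each entry is (comparison, target).
CODE_METRIC_TARGETS = {
    "test_coverage_pct": (">=", 90),
    "any_usages": ("<=", 0),
    "cyclomatic_complexity": ("<=", 10),
    "function_length_lines": ("<=", 20),
    "nesting_depth": ("<=", 3),
    "bundle_size_delta_kb": ("<=", 20),
}


@dataclass(frozen=True)
class Finding:
    location: str    # file:line or section name
    metric: str
    measured: float
    target: float
    gap: float       # absolute distance between measured and target


def check_metric(location: str, metric: str, measured: float) -> Optional[Finding]:
    """Return a Finding if the measured value violates its target, else None."""
    comparison, target = CODE_METRIC_TARGETS[metric]
    if comparison == ">=":
        passed = measured >= target
    else:
        passed = measured <= target
    if passed:
        return None
    return Finding(location, metric, measured, target, gap=abs(measured - target))
```

For example, `check_metric("src/a.ts:42", "cyclomatic_complexity", 14)` (a made-up location) yields a finding with target 10 and gap 4, while a coverage value of 92 yields `None`.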

3. Critique-First Output Format

New Structure Order:

Mode indicator
Agent name
Context (Reference Only) - factual summary, no defense
Critical Findings - ALL metric violations FIRST
Devil's Advocate Analysis - challenge assumptions, edge cases, failure modes
Impact Radius Analysis - dependencies, contract changes, side effects
Objective Assessment - PASS/FAIL metrics table
What Works - facts only, NO praise
Improvement Plan - prioritized with evidence
Anti-Sycophancy Verification - self-check
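The ordering rule can be verified against a finished review. The following is a rough sketch that assumes each section is written as a `## ` heading, as the execution flow in this PR describes; the function name `section_order_problems` is hypothetical, and the mode indicator and agent name lines are deliberately not checked here.

```python
# Section names taken from the structure order above (mode indicator and
# agent name are left out of this check).
REQUIRED_SECTION_ORDER = [
    "Context (Reference Only)",
    "Critical Findings",
    "Devil's Advocate Analysis",
    "Impact Radius Analysis",
    "Objective Assessment",
    "What Works",
    "Improvement Plan",
    "Anti-Sycophancy Verification",
]


def section_order_problems(review_text):
    """Return a list of problems; an empty list means the order is as required."""
    headings = [
        line[3:].strip()
        for line in review_text.splitlines()
        if line.startswith("## ")
    ]
    problems = []
    last_index = -1
    for name in REQUIRED_SECTION_ORDER:
        matches = [i for i, h in enumerate(headings) if h.startswith(name)]
        if not matches:
            problems.append(f"missing section: {name}")
            continue
        # A required section that appears before an earlier required one
        # is reported as out of order.
        if matches[0] < last_index:
            problems.append(f"out of order: {name}")
        last_index = max(last_index, matches[0])
    return problems
```

A review that lists What Works ahead of Critical Findings would be reported as `out of order: What Works`.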

4. Devil's Advocate Analysis

Systematic challenge from opposing viewpoints:

Mandatory Questions:

What assumptions might be wrong?
What edge cases are unhandled?
How might this fail under load/scale?
What security vectors are exposed?
Where could this introduce regression?
What happens when dependencies change?

Subsections:

What could go wrong?
Assumptions that might be wrong
Unhandled edge cases

5. Impact Radius Analysis

Analyze side effects and ripple effects beyond modified files:

Direct Dependencies:

Files that directly import/reference changed files
Table format: Changed File | Imported By | Potential Impact
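To make the table format concrete, here is a small sketch that builds the Direct Dependencies rows from an already-resolved import graph. How that graph is obtained is outside the scope of this PR; `direct_dependents`, `render_dependency_table`, and the file names in the usage note are all hypothetical.

```python
def direct_dependents(import_graph, changed_files):
    """Pair each changed file with the files that import it.

    import_graph maps a file path to the list of file paths it imports.
    """
    rows = []
    for changed in changed_files:
        importers = sorted(
            path for path, imports in import_graph.items() if changed in imports
        )
        rows.append((changed, importers))
    return rows


def render_dependency_table(rows):
    """Render rows in the 'Changed File | Imported By | Potential Impact' layout."""
    lines = ["Changed File | Imported By | Potential Impact"]
    for changed, importers in rows:
        imported_by = ", ".join(importers) if importers else "(none found)"
        # Potential impact needs reviewer judgement, so it is left as a placeholder.
        lines.append(f"{changed} | {imported_by} | (to be assessed by reviewer)")
    return "\n".join(lines)
```

With a graph such as `{"app.ts": ["util.ts"], "cli.ts": ["util.ts", "app.ts"]}` and `util.ts` as the changed file, the single data row lists `app.ts, cli.ts` under Imported By.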

Contract Changes:

Function signature changes
Type definition changes
Export changes
Before/After comparison with breaking change assessment

Side Effect Checklist:

Type compatibility
Behavior compatibility
Test coverage
Error handling
State management
Async flow

Breaking Change Criteria:

Definitely breaking: Removed exports, changed required parameters, narrowed return types
Potentially breaking: Added required parameters with defaults, widened return types
Safe changes: Added new exports, added optional parameters, internal refactoring
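The three tiers above can be read as a lookup table. The sketch below encodes them as written; the change-kind keys and the `classify_changes` helper are hypothetical, and treating an unrecognized change kind as potentially breaking is an assumption of this sketch, not something the PR specifies.

```python
from enum import IntEnum


class Severity(IntEnum):
    SAFE = 0
    POTENTIALLY_BREAKING = 1
    DEFINITELY_BREAKING = 2


# Keys are hypothetical identifiers for the change kinds listed in the
# Breaking Change Criteria; the tier assignments follow that list as written.
CHANGE_SEVERITY = {
    "removed_export": Severity.DEFINITELY_BREAKING,
    "changed_required_parameter": Severity.DEFINITELY_BREAKING,
    "narrowed_return_type": Severity.DEFINITELY_BREAKING,
    "added_required_parameter_with_default": Severity.POTENTIALLY_BREAKING,
    "widened_return_type": Severity.POTENTIALLY_BREAKING,
    "added_export": Severity.SAFE,
    "added_optional_parameter": Severity.SAFE,
    "internal_refactoring": Severity.SAFE,
}


def classify_changes(change_kinds):
    """Return the most severe tier among the given change kinds.

    Unknown kinds are treated as potentially breaking (an assumption of
    this sketch); an empty input is reported as SAFE.
    """
    severities = [
        CHANGE_SEVERITY.get(kind, Severity.POTENTIALLY_BREAKING)
        for kind in change_kinds
    ]
    return max(severities, default=Severity.SAFE)
```

Taking the maximum means one removed export is enough to mark the whole change set as definitely breaking, regardless of how many safe changes accompany it.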

Changes

Modified Files

.ai-rules/agents/code-reviewer.json (+393 lines, -41 lines)
- Added anti_sycophancy section with prohibited phrases and required language
- Added objective_metrics section with code, checklist, and documentation metrics
- Added impact_radius_analysis section with dependency analysis framework
- Restructured evaluation_output_format to critique-first order
- Updated persona to "Skeptical third-party auditor"
- Updated execution_order with new 17-step process
- Added 7 new mandatory checklist items
- Updated verification guide with new checks
.ai-rules/rules/core.md (+101 lines, -1 line)
- Added Anti-Sycophancy Rules section to EVAL mode
- Restructured output format to critique-first approach
- Added Critical Findings table format
- Added Devil's Advocate Analysis section
- Added Impact Radius Analysis section
- Added Objective Assessment table
- Changed "Strengths" to "What Works (Evidence Required)"
- Added Anti-Sycophancy Verification checklist
- Added special cases handling (documentation-only, no changes)
docs/tickets/eval-mode-neutrality-implementation.md (new file, 337 lines)
- Complete implementation plan document
- Phase-by-phase breakdown
- Validation criteria
- Test scenarios
- Rollback plan

Execution Flow

New 17-Step Process:

1. Write # Mode: EVAL
2. Write ## Agent : Code Reviewer
3. Write ## Context (Reference Only) - factual summary, no defense
4. Gather objective metrics (run coverage, count any usages, measure complexity)
5. Write ## Critical Findings table - ALL metric violations FIRST
6. Write ## Devil's Advocate Analysis - challenge assumptions, edge cases, failure modes
7. Write ## Impact Radius Analysis - analyze dependencies and side effects
- 7a. Search for files importing changed files
- 7b. List direct dependencies in table format
- 7c. Identify contract changes (signatures, types, exports)
- 7d. Complete Side Effect Checklist
8. Write ## Objective Assessment table - PASS/FAIL for each metric
9. Write ## What Works - facts only, NO praise or positive adjectives
10. For each improvement: Call web_search → Write with evidence
11. Create todo list using todo_write tool (prioritized, all pending)
12. Write ## Improvement Plan
13. Write ## Anti-Sycophancy Verification - self-check
14. Verify: No prohibited phrases used
15. Verify: Minimum 3 improvements identified
16. Verify: All findings have location + metric + target
17. Verify: Impact Radius Analysis completed

Benefits

Objective Evaluations - All findings backed by measurable metrics
No Praise Pollution - Reviews focus on actionable improvements
Adversarial Thinking - Systematic challenge prevents overlooked issues
Impact Awareness - Dependency analysis prevents regression bugs
Consistent Quality - Minimum 3 improvements ensures thorough review
Intent Separation - Evaluates code output, not implementation reasoning

Verification

Self-Check Requirements:

No prohibited phrases used (English + Korean)
At least 3 improvement areas OR all identified issues reported
All findings include objective evidence (location, metric, target)
Devil's Advocate Analysis completed
Impact Radius Analysis completed (dependencies, contract changes, side effects)
Critical Findings section appears before What Works
No defense of implementation decisions

Special Cases

Documentation-only changes:

Use documentation_metrics instead of code metrics
Evaluate: clarity, completeness, consistency, actionability
Critical Findings table references section names instead of file:line

No changes to evaluate:

State "No implementation to evaluate" in Context section
Skip Critical Findings and Objective Assessment tables
Focus Devil's Advocate on the request/plan itself

Test-only changes:

State "Test-only changes - no production impact"
Skip Direct Dependencies, focus on test coverage impact

New file with no dependencies:

State "New file - no existing dependencies"
Evaluate API design for future maintainability
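The special cases amount to a small dispatch table over the kind of change. This is only a sketch of that reading: the `EvalPlan` type, the change-kind keys, and `plan_for` are hypothetical names, and the strings are taken from the cases listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalPlan:
    context_note: str = ""   # statement to place in the review, if any
    skipped: tuple = ()      # sections or subsections to skip
    focus: str = ""          # what the evaluation concentrates on


# Keys are hypothetical labels for the four special cases.
SPECIAL_CASE_PLANS = {
    "documentation_only": EvalPlan(
        focus="documentation_metrics: clarity, completeness, consistency, actionability",
    ),
    "no_changes": EvalPlan(
        context_note="No implementation to evaluate",
        skipped=("Critical Findings", "Objective Assessment"),
        focus="Devil's Advocate on the request/plan itself",
    ),
    "test_only": EvalPlan(
        context_note="Test-only changes - no production impact",
        skipped=("Direct Dependencies",),
        focus="test coverage impact",
    ),
    "new_file_no_dependencies": EvalPlan(
        context_note="New file - no existing dependencies",
        focus="API design for future maintainability",
    ),
}


def plan_for(change_kind):
    """Return the adjustments for a special case, or an empty plan otherwise."""
    # Anything not listed falls through to the full, unmodified flow.
    return SPECIAL_CASE_PLANS.get(change_kind, EvalPlan())
```

An ordinary code change is not in the table, so `plan_for` returns an empty plan and the full execution flow applies.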

Related Issue

Closes #70

Files Changed

.ai-rules/agents/code-reviewer.json - Major update (+393/-41 lines)
.ai-rules/rules/core.md - EVAL mode section restructured (+101/-1 line)
docs/tickets/eval-mode-neutrality-implementation.md - New implementation plan (337 lines)

Statistics

3 files changed
790 insertions, 41 deletions
1 new file added (implementation plan)
2 files significantly updated

Testing

Success Metrics

Zero prohibited phrases in EVAL output
100% of findings include objective metrics
Minimum 3 improvements identified per evaluation
Devil's Advocate section present in every EVAL
"What Works" contains only factual observations (no praise)
Impact Radius Analysis completed for all code changes

Test Scenarios

Simple implementation - Should still find 3+ improvements
Good implementation - Should still challenge assumptions
Poor implementation - Should not over-emphasize negatives (remain balanced)
AI's own code - Should evaluate with same rigor (no self-defense)
Documentation changes - Should use documentation metrics
No changes - Should handle gracefully

Migration Notes

Breaking Changes:

EVAL output format has changed significantly
Old "Strengths" section replaced with "What Works"
New mandatory sections: Critical Findings, Devil's Advocate, Impact Radius Analysis

Backward Compatibility:

Existing EVAL requests will automatically use new format
No user action required
Old format is completely replaced

Rollback Plan

If issues arise:

Revert code-reviewer.json to remove anti_sycophancy and objective_metrics sections
Revert core.md EVAL mode section to original format
Remove implementation plan document

Next Steps

Monitor EVAL outputs for compliance with new rules
Gather feedback on review quality improvements
Adjust prohibited phrases list if needed
Consider extending to other review contexts if successful

Notes

This is a fundamental shift in how EVAL mode operates. The changes ensure that code reviews are:

Objective - Based on measurable criteria, not opinions
Thorough - Minimum 3 improvements ensures comprehensive review
Adversarial - Challenges assumptions and finds edge cases
Impact-aware - Analyzes dependencies and side effects
Neutral - No praise, no criticism, only factual observations

The implementation is rule-based and needs no infrastructure changes: AI assistants follow the documented guidelines to carry out the enhanced evaluation process.

- Add anti-sycophancy rules prohibiting praise phrases
- Add objective metrics framework for measurable evaluations
- Restructure output to critique-first format
- Add Devil's Advocate and Impact Radius Analysis sections
- Require minimum 3 improvements per evaluation
- Update code-reviewer.json and core.md

close #70

JeremyDev87 self-assigned this Dec 22, 2025

JeremyDev87 added the feat label Dec 22, 2025

JeremyDev87 marked this pull request as ready for review December 22, 2025 01:54

JeremyDev87 merged commit ff19738 into master Dec 22, 2025
9 checks passed

JeremyDev87 deleted the feat/70 branch December 22, 2025 01:54

JeremyDev87 merged 1 commit into master from feat/70
