Add claude code game studios to the project

2026-05-15 14:52:29 +08:00
parent dff559462d
commit a16fe4bff7
415 changed files with 78609 additions and 0 deletions
--- a/Framework/skills/analysis/asset-audit.md
+++ b/Framework/skills/analysis/asset-audit.md
@@ -0,0 +1,170 @@
+# Skill Test Spec: /asset-audit
+
+## Skill Summary
+
+`/asset-audit` audits the `assets/` directory for naming convention compliance,
+missing metadata, and format/size issues. It reads asset files against the
+conventions and budgets defined in `technical-preferences.md`. No director gates
+are invoked. The skill does not write without user approval. Verdicts: COMPLIANT,
+WARNINGS, or NON-COMPLIANT.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLIANT, WARNINGS, NON-COMPLIANT
+- [ ] Does NOT require "May I write" language during analysis (read-only); the optional report write still asks for approval
+- [ ] Has a next-step handoff (what to do after audit results)
+
+---
+
+## Director Gate Checks
+
+None. Asset auditing is a read-only analysis skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All assets follow naming conventions
+
+**Fixture:**
+- `technical-preferences.md` specifies naming convention: `snake_case`, e.g., `enemy_grunt_idle.png`
+- `assets/art/characters/` contains: `enemy_grunt_idle.png`, `enemy_sniper_run.png`
+- `assets/audio/sfx/` contains: `sfx_jump_land.ogg`, `sfx_item_pickup.ogg`
+- All files are within size budget (textures ≤2MB, audio ≤500KB)
+
+**Input:** `/asset-audit`
+
+**Expected behavior:**
+1. Skill reads naming conventions and size budgets from `technical-preferences.md`
+2. Skill scans `assets/` recursively
+3. All files match `snake_case` convention; all within budget
+4. Audit table shows all rows PASS
+5. Verdict is COMPLIANT
+
+**Assertions:**
+- [ ] Audit covers both art and audio asset directories
+- [ ] Each file is checked against naming convention and size budget
+- [ ] All rows show PASS when compliant
+- [ ] Verdict is COMPLIANT
+- [ ] No files are written
+
+---
+
+### Case 2: Non-Compliant — Textures exceed size budget
+
+**Fixture:**
+- `assets/art/environment/` contains 5 texture files
+- 3 texture files are 4MB each (budget: ≤2MB)
+- 2 texture files are within budget
+
+**Input:** `/asset-audit`
+
+**Expected behavior:**
+1. Skill reads size budget from `technical-preferences.md` (2MB for textures)
+2. Skill scans `assets/art/environment/` — finds 3 oversized textures
+3. Audit table lists each oversized file with actual size and budget
+4. Verdict is NON-COMPLIANT
+5. Skill recommends compression or resolution reduction for flagged files
+
+**Assertions:**
+- [ ] All 3 oversized files are listed by name with actual size and budget size
+- [ ] Verdict is NON-COMPLIANT when any file exceeds its budget
+- [ ] Optimization recommendation is given for oversized files
+- [ ] Within-budget files are also listed (showing PASS) for completeness
+
+---
+
+### Case 3: Format Issue — Audio in wrong format
+
+**Fixture:**
+- `technical-preferences.md` specifies audio format: OGG
+- `assets/audio/music/theme_main.wav` exists (WAV format)
+- `assets/audio/sfx/sfx_footstep.ogg` exists (correct OGG format)
+
+**Input:** `/asset-audit`
+
+**Expected behavior:**
+1. Skill reads audio format requirement: OGG
+2. Skill scans `assets/audio/` — finds `theme_main.wav` in wrong format
+3. Audit table flags `theme_main.wav` as FORMAT ISSUE (expected OGG, found WAV)
+4. `sfx_footstep.ogg` shows PASS
+5. Verdict is WARNINGS (format issues are correctable)
+
+**Assertions:**
+- [ ] `theme_main.wav` is flagged as FORMAT ISSUE with expected and actual format noted
+- [ ] Verdict is WARNINGS (not NON-COMPLIANT) for format issues, which are correctable
+- [ ] Correct-format assets are shown as PASS
+- [ ] Skill does not modify or convert any asset files
+
+---
+
+### Case 4: Missing Asset — Asset referenced by GDD but absent from assets/
+
+**Fixture:**
+- `design/gdd/enemies.md` references `enemy_boss_idle.png`
+- `assets/art/characters/boss/` directory is empty — file does not exist
+
+**Input:** `/asset-audit`
+
+**Expected behavior:**
+1. Skill reads GDD references to find expected assets (this overlaps with the scope of `/content-audit`)
+2. Skill scans `assets/art/characters/boss/` — file not found
+3. Audit table flags `enemy_boss_idle.png` as MISSING ASSET
+4. Verdict is NON-COMPLIANT (missing critical art asset)
+
+**Assertions:**
+- [ ] Skill checks GDD references to identify expected assets
+- [ ] Missing assets are flagged as MISSING ASSET with the GDD reference noted
+- [ ] Verdict is NON-COMPLIANT when critical assets are missing
+- [ ] Skill does not create or add placeholder assets
+
+---
+
+### Case 5: Gate Compliance — No gate; technical-artist may be consulted separately
+
+**Fixture:**
+- 2 files have naming convention violations (CamelCase instead of snake_case)
+- `review-mode.txt` contains `full`
+
+**Input:** `/asset-audit`
+
+**Expected behavior:**
+1. Skill scans assets and finds 2 naming violations
+2. No director gate is invoked regardless of review mode
+3. Verdict is WARNINGS
+4. Output notes: "Consider having a Technical Artist review naming conventions"
+5. Skill presents findings; offers optional audit report write
+6. If user opts in: "May I write to `production/qa/asset-audit-[date].md`?"
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Technical artist consultation is suggested (not mandated)
+- [ ] Findings table is presented before any write prompt
+- [ ] Optional audit report write asks "May I write" before writing
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads `technical-preferences.md` for naming conventions, formats, and size budgets
+- [ ] Scans `assets/` directory recursively
+- [ ] Audit table shows file name, check type, expected value, actual value, and result
+- [ ] Does not modify any asset files
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: COMPLIANT, WARNINGS, NON-COMPLIANT
+
+---
+
+## Coverage Notes
+
+- Metadata checks (e.g., missing texture import settings in Godot `.import` files)
+  are not explicitly tested here; they follow the same FORMAT ISSUE flagging pattern.
+- The interaction between `/asset-audit` and `/content-audit` (both check GDD
+  references vs. assets) is intentional overlap; `/asset-audit` focuses on
+  compliance while `/content-audit` focuses on completeness.
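
A minimal Python sketch of the per-file checks that Cases 1, 2, and 5 of the `/asset-audit` spec describe. It is illustrative only and not the skill's implementation: the names `audit_asset`, `verdict`, `SIZE_BUDGET_BYTES`, and the issue labels `NAMING` and `OVER_BUDGET` are hypothetical, the budgets are copied from the Case 1 fixture rather than read from `technical-preferences.md`, and "2MB" is assumed to mean 2 × 1024 × 1024 bytes.

```python
import re
from pathlib import PurePosixPath

# Assumed budgets, mirroring the Case 1 fixture (textures <= 2MB, audio <= 500KB).
SIZE_BUDGET_BYTES = {".png": 2 * 1024 * 1024, ".ogg": 500 * 1024}

# snake_case: lowercase alphanumeric words joined by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def audit_asset(path: str, size_bytes: int) -> list:
    """Return the issues found for one asset; an empty list means PASS."""
    p = PurePosixPath(path)
    issues = []
    if not SNAKE_CASE.match(p.stem):
        issues.append("NAMING")
    budget = SIZE_BUDGET_BYTES.get(p.suffix)
    if budget is not None and size_bytes > budget:
        issues.append("OVER_BUDGET")
    return issues


def verdict(all_issues: list) -> str:
    """Map per-file issue lists to a verdict, following Cases 1, 2, and 5."""
    flat = [issue for issues in all_issues for issue in issues]
    if "OVER_BUDGET" in flat:
        return "NON-COMPLIANT"  # Case 2: any file over budget
    if flat:
        return "WARNINGS"  # Case 5: naming violations only
    return "COMPLIANT"  # Case 1: every row passes
```

Format issues (Case 3) and missing assets (Case 4) are left out of this sketch; they would add further issue labels to the same mapping.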
--- a/Framework/skills/analysis/balance-check.md
+++ b/Framework/skills/analysis/balance-check.md
@@ -0,0 +1,172 @@
+# Skill Test Spec: /balance-check
+
+## Skill Summary
+
+`/balance-check` reads balance data files (JSON or YAML in `assets/data/`) and
+checks each value against the design formulas defined in GDDs under `design/gdd/`.
+It produces a findings table with columns: Value → Formula → Deviation → Severity.
+No director gates are invoked (read-only analysis). The skill may optionally write
+a balance report but asks "May I write" before doing so. Verdicts: BALANCED,
+CONCERNS, or OUT OF BALANCE.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: BALANCED, CONCERNS, OUT OF BALANCE
+- [ ] Contains "May I write" language (optional report write)
+- [ ] Has a next-step handoff (what to do after findings are reviewed)
+
+---
+
+## Director Gate Checks
+
+None. Balance check is a read-only analysis skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All balance values within formula tolerances
+
+**Fixture:**
+- `assets/data/combat-balance.json` exists with 6 stat values
+- `design/gdd/combat-system.md` contains formulas for all 6 stats with ±10% tolerance
+- All 6 values fall within tolerance
+
+**Input:** `/balance-check`
+
+**Expected behavior:**
+1. Skill reads all balance data files in `assets/data/`
+2. Skill reads GDD formulas from `design/gdd/`
+3. Skill computes deviation for each value against its formula
+4. All deviations are within ±10% tolerance
+5. Skill outputs findings table with all rows showing PASS
+6. Verdict is BALANCED
+
+**Assertions:**
+- [ ] Findings table is shown for all checked values
+- [ ] Each row shows: stat name, formula target, actual value, deviation percentage
+- [ ] All rows show PASS or equivalent when within tolerance
+- [ ] Verdict is BALANCED
+- [ ] No files are written without user approval
+
+---
+
+### Case 2: Out of Balance — Player damage 40% above formula target
+
+**Fixture:**
+- `assets/data/combat-balance.json` has `player_damage_base: 140`
+- `design/gdd/combat-system.md` formula specifies `player_damage_base = 100` (±10%)
+- All other stats are within tolerance
+
+**Input:** `/balance-check`
+
+**Expected behavior:**
+1. Skill reads combat-balance.json and computes deviation for `player_damage_base`
+2. Deviation is +40% — far outside ±10% tolerance
+3. Skill flags this row as severity HIGH in the findings table
+4. Verdict is OUT OF BALANCE
+5. Skill surfaces the HIGH severity item prominently before the table
+
+**Assertions:**
+- [ ] `player_damage_base` row shows deviation of +40%
+- [ ] Severity is HIGH when the deviation is more than 2× the tolerance (here +40% against ±10%)
+- [ ] Verdict is OUT OF BALANCE when any stat has HIGH severity deviation
+- [ ] The HIGH severity item is called out explicitly, not buried in table rows
+
+---
+
+### Case 3: No GDD Formulas — Cannot validate, guidance given
+
+**Fixture:**
+- `assets/data/economy-balance.yaml` exists with 10 stat values
+- No GDD in `design/gdd/` contains formula definitions for economy stats
+
+**Input:** `/balance-check`
+
+**Expected behavior:**
+1. Skill reads balance data files
+2. Skill searches GDDs for formula definitions — finds none for economy stats
+3. Skill outputs: "Cannot validate economy stats — no formulas defined. Run /design-system first."
+4. No findings table is generated for the economy stats
+5. Verdict is CONCERNS (data exists but cannot be validated)
+
+**Assertions:**
+- [ ] Skill does not fabricate formula targets when none exist in GDDs
+- [ ] Output explicitly names the missing formula source
+- [ ] Output recommends running `/design-system` to define formulas
+- [ ] Verdict is CONCERNS (not BALANCED, since validation was impossible)
+
+---
+
+### Case 4: Orphan Reference — Balance file references an undefined stat
+
+**Fixture:**
+- `assets/data/combat-balance.json` contains a stat `legacy_armor_mult: 1.5`
+- `design/gdd/combat-system.md` has no formula for `legacy_armor_mult`
+- All other stats have formula definitions and pass validation
+
+**Input:** `/balance-check`
+
+**Expected behavior:**
+1. Skill reads all stats from combat-balance.json
+2. Skill cannot find a formula for `legacy_armor_mult` in any GDD
+3. Skill flags `legacy_armor_mult` as ORPHAN REFERENCE in the findings table
+4. Other stats are evaluated normally; those within tolerance show PASS
+5. Verdict is CONCERNS (orphan reference prevents full validation)
+
+**Assertions:**
+- [ ] `legacy_armor_mult` appears in findings table with status ORPHAN REFERENCE
+- [ ] Orphan references are distinguished from formula deviations in the table
+- [ ] Verdict is CONCERNS when any orphan references are found
+- [ ] Skill does not skip orphan stats silently
+
+---
+
+### Case 5: Gate Compliance — Read-only; no gate; optional report requires approval
+
+**Fixture:**
+- Balance data and GDD formulas exist; 1 stat has CONCERNS-level deviation (15% above target)
+- `review-mode.txt` contains `full`
+
+**Input:** `/balance-check`
+
+**Expected behavior:**
+1. Skill reads data and GDDs; generates findings table
+2. Verdict is CONCERNS (one stat slightly out of range)
+3. No director gate is invoked
+4. Skill presents findings table to user
+5. Skill offers to write an optional balance report
+6. If user says yes: skill asks "May I write to `production/qa/balance-report-[date].md`?"
+7. If user says no: skill ends without writing
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Findings table is presented without writing anything automatically
+- [ ] Optional report write is offered but not forced
+- [ ] "May I write" prompt appears only if user opts in to the report
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads both balance data files and GDD formulas before analysis
+- [ ] Findings table shows Value, Formula, Deviation, and Severity columns
+- [ ] Does not write any files without explicit user approval
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: BALANCED, CONCERNS, OUT OF BALANCE
+
+---
+
+## Coverage Notes
+
+- The case where `assets/data/` is entirely empty is not tested; behavior
+  follows the CONCERNS pattern with a message that no data files were found.
+- Tolerance thresholds (±10%, ±20%) are implementation details of the skill;
+  the tests verify that deviations are detected and classified, not the
+  exact threshold values.
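
The deviation and severity logic exercised by the `/balance-check` cases can be sketched in Python as follows. This is an assumption-laden illustration, not the skill's code: `deviation_pct`, `classify`, and `balance_verdict` are hypothetical names, the ±10% tolerance comes from the fixtures, and the 2× multiplier for HIGH follows the Case 2 assertion.

```python
def deviation_pct(actual: float, target: float) -> float:
    """Signed deviation of actual from target, in percent."""
    return (actual - target) * 100.0 / target


def classify(actual: float, target: float, tolerance_pct: float = 10.0) -> str:
    """Classify one stat; the 2x multiplier for HIGH is an assumption."""
    dev = abs(deviation_pct(actual, target))
    if dev <= tolerance_pct:
        return "PASS"
    if dev > 2 * tolerance_pct:
        return "HIGH"
    return "CONCERNS"


def balance_verdict(stats: dict, targets: dict) -> str:
    """Combine per-stat statuses into one verdict, following Cases 1, 2, 4, and 5."""
    statuses = []
    for name, actual in stats.items():
        if name not in targets:
            # No formula in any GDD: flag the stat instead of skipping it (Case 4).
            statuses.append("ORPHAN REFERENCE")
        else:
            statuses.append(classify(actual, targets[name]))
    if "HIGH" in statuses:
        return "OUT OF BALANCE"
    if any(status != "PASS" for status in statuses):
        return "CONCERNS"
    return "BALANCED"
```

With the Case 2 fixture, `deviation_pct(140, 100)` gives `40.0`, which `classify` reports as `HIGH`.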
--- a/Framework/skills/analysis/code-review.md
+++ b/Framework/skills/analysis/code-review.md
@@ -0,0 +1,172 @@
+# Skill Test Spec: /code-review
+
+## Skill Summary
+
+`/code-review` performs an architectural code review of source files in `src/`,
+checking coding standards from `CLAUDE.md` (doc comments on public APIs,
+dependency injection over singletons, data-driven values, testability). Findings
+are advisory. No director gates are invoked. No code edits are made. Verdicts:
+APPROVED, CONCERNS, or NEEDS CHANGES.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, CONCERNS, NEEDS CHANGES
+- [ ] Does NOT require "May I write" language (read-only; findings are advisory output)
+- [ ] Has a next-step handoff (what to do with findings)
+
+---
+
+## Director Gate Checks
+
+None. Code review is a read-only advisory skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Source file follows all coding standards
+
+**Fixture:**
+- `src/gameplay/health_component.gd` exists with:
+  - All public methods have doc comments (`##` notation)
+  - No singletons used; dependencies injected via constructor
+  - No hardcoded values; all constants reference `assets/data/`
+  - ADR reference in file header: `# ADR: docs/architecture/adr-004-health.md`
+  - Referenced ADR has `Status: Accepted`
+
+**Input:** `/code-review src/gameplay/health_component.gd`
+
+**Expected behavior:**
+1. Skill reads the source file
+2. Skill checks all coding standards: doc comments, DI, data-driven, ADR status
+3. All checks pass
+4. Skill outputs findings summary with all checks PASS
+5. Verdict is APPROVED
+
+**Assertions:**
+- [ ] Each coding standard check is listed in the output
+- [ ] All checks show PASS when standards are met
+- [ ] Skill reads referenced ADR to confirm its status
+- [ ] Verdict is APPROVED
+- [ ] No edits are made to any file
+
+---
+
+### Case 2: Needs Changes — Missing doc comment and singleton usage
+
+**Fixture:**
+- `src/ui/inventory_ui.gd` has:
+  - 2 public methods without doc comments
+  - Uses `GameManager.instance` (singleton pattern)
+  - All other standards met
+
+**Input:** `/code-review src/ui/inventory_ui.gd`
+
+**Expected behavior:**
+1. Skill reads the source file
+2. Skill detects: 2 missing doc comments on public methods
+3. Skill detects: singleton usage at specific lines (e.g., line 42, line 87)
+4. Findings list the exact method names and line numbers
+5. Verdict is NEEDS CHANGES
+
+**Assertions:**
+- [ ] Missing doc comments are listed with method names
+- [ ] Singleton usage is flagged with file and line number
+- [ ] Verdict is NEEDS CHANGES when BLOCKING-level standard violations exist
+- [ ] Skill does not edit the file — findings are for the developer to act on
+- [ ] Output suggests replacing singleton with dependency injection
+
+---
+
+### Case 3: Architecture Risk — ADR reference is Proposed, not Accepted
+
+**Fixture:**
+- `src/core/save_system.gd` has a header comment: `# ADR: docs/architecture/adr-010-save.md`
+- `adr-010-save.md` exists but has `Status: Proposed`
+- Code itself follows all other coding standards
+
+**Input:** `/code-review src/core/save_system.gd`
+
+**Expected behavior:**
+1. Skill reads the source file
+2. Skill reads referenced ADR — finds `Status: Proposed`
+3. Skill flags this as ARCHITECTURE RISK (code is implementing an unaccepted ADR)
+4. Other coding standard checks pass
+5. Verdict is CONCERNS (risk flag is advisory, not a hard NEEDS CHANGES)
+
+**Assertions:**
+- [ ] Skill reads referenced ADR file to check its status
+- [ ] ARCHITECTURE RISK is flagged when ADR status is Proposed
+- [ ] Verdict is CONCERNS (not NEEDS CHANGES) for ADR risk — advisory severity
+- [ ] Output recommends resolving the ADR before the code goes to production
+
+---
+
+### Case 4: Edge Case — No source files found at specified path
+
+**Fixture:**
+- User calls `/code-review src/networking/`
+- `src/networking/` directory does not exist
+
+**Input:** `/code-review src/networking/`
+
+**Expected behavior:**
+1. Skill attempts to read files in `src/networking/`
+2. Directory or files not found
+3. Skill outputs an error: "No source files found at `src/networking/`"
+4. Skill suggests checking `src/` for valid directories
+5. No verdict is emitted (nothing was reviewed)
+
+**Assertions:**
+- [ ] Skill does not crash when path does not exist
+- [ ] Output names the attempted path in the error message
+- [ ] Output suggests checking `src/` for valid directories
+- [ ] No verdict is emitted when there is nothing to review
+
+---
+
+### Case 5: Gate Compliance — No gate; LP may be consulted separately
+
+**Fixture:**
+- Source file follows most standards but has 1 CONCERNS-level finding (a magic number)
+- `review-mode.txt` contains `full`
+
+**Input:** `/code-review src/gameplay/loot_system.gd`
+
+**Expected behavior:**
+1. Skill reads and reviews the source file
+2. No director gate is invoked (code review findings are advisory)
+3. Skill presents findings with the CONCERNS verdict
+4. Output notes: "Consider requesting a Lead Programmer review for architecture concerns"
+5. Skill does not invoke any agent automatically
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Lead Programmer (LP) consultation is suggested (not mandated) in the output
+- [ ] No code edits are made
+- [ ] Verdict is CONCERNS for advisory-level findings
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads source file(s) and coding standards before reviewing
+- [ ] Lists each coding standard check in findings output
+- [ ] Does not edit any source files (read-only skill)
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: APPROVED, CONCERNS, NEEDS CHANGES
+
+---
+
+## Coverage Notes
+
+- Batch review of all files in a directory is not explicitly tested; behavior
+  is assumed to apply the same checks file by file and aggregate the verdict.
+- Test coverage checks (verifying corresponding test files exist) are a stretch
+  goal not tested here; that is primarily the domain of `/test-evidence-review`.
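
The ADR status check from Cases 1 and 3 of the `/code-review` spec could look roughly like this in Python. The function names `adr_status` and `adr_finding` are hypothetical, and the `UNKNOWN` result for a missing or unrecognized status is an assumption; the spec only defines behavior for `Accepted` and `Proposed`.

```python
import re
from typing import Optional


def adr_status(adr_text: str) -> Optional[str]:
    """Extract the value of a 'Status: ...' line from ADR text, if present."""
    match = re.search(r"^Status:\s*(\w+)", adr_text, re.MULTILINE)
    return match.group(1) if match else None


def adr_finding(adr_text: str) -> str:
    """Map the ADR status to a review finding."""
    status = adr_status(adr_text)
    if status == "Accepted":
        return "PASS"
    if status == "Proposed":
        return "ARCHITECTURE RISK"  # advisory: leads to CONCERNS, not NEEDS CHANGES
    return "UNKNOWN"  # assumption: the spec does not cover other statuses
```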
--- a/Framework/skills/analysis/consistency-check.md
+++ b/Framework/skills/analysis/consistency-check.md
@@ -0,0 +1,176 @@
+# Skill Test Spec: /consistency-check
+
+## Skill Summary
+
+`/consistency-check` scans all GDDs in `design/gdd/` and checks for internal
+conflicts across documents. It produces a structured findings table with columns:
+System A vs System B, Conflict Type, Severity (HIGH / MEDIUM / LOW). Conflict
+types include: formula mismatch, competing ownership, stale reference, and
+dependency gap.
+
+The skill is read-only during analysis. It has no director gates. An optional
+consistency report can be written to `design/consistency-report-[date].md` if the
+user requests it, but the skill asks "May I write" before doing so.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
+- [ ] Does NOT require "May I write" language during analysis (read-only scan)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents that report writing is optional and requires approval
+
+---
+
+## Director Gate Checks
+
+No director gates — this skill spawns no director gate agents. Consistency
+checking is a mechanical scan; no creative or technical director review is
+required as part of the scan itself.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — 4 GDDs with no conflicts
+
+**Fixture:**
+- `design/gdd/` contains exactly 4 system GDDs
+- All GDDs have consistent formulas (no overlapping variables with different values)
+- No two GDDs claim ownership of the same game entity or mechanic
+- All dependency references point to GDDs that exist
+
+**Input:** `/consistency-check`
+
+**Expected behavior:**
+1. Skill reads all 4 GDDs in `design/gdd/`
+2. Runs cross-GDD consistency checks (formulas, ownership, references)
+3. No conflicts found
+4. Outputs structured findings table showing 0 issues
+5. Verdict: CONSISTENT
+
+**Assertions:**
+- [ ] All 4 GDDs are read before producing output
+- [ ] Findings table is present (even if empty — shows "No conflicts found")
+- [ ] Verdict is CONSISTENT when no conflicts exist
+- [ ] Skill does NOT write any files without user approval
+- [ ] Next-step handoff is present
+
+---
+
+### Case 2: Failure Path — Two GDDs with conflicting damage formulas
+
+**Fixture:**
+- GDD-A defines damage formula: `damage = attack * 1.5`
+- GDD-B defines damage formula: `damage = attack * 2.0` for the same entity type
+- Both GDDs refer to the same "attack" variable
+
+**Input:** `/consistency-check`
+
+**Expected behavior:**
+1. Skill reads all GDDs and detects the formula mismatch
+2. Findings table includes an entry: GDD-A vs GDD-B | Formula Mismatch | HIGH
+3. Specific conflicting formulas are shown (not just "formula conflict exists")
+4. Verdict: CONFLICTS FOUND
+
+**Assertions:**
+- [ ] Verdict is CONFLICTS FOUND (not CONSISTENT)
+- [ ] Conflict entry names both GDD filenames
+- [ ] Conflict type is "Formula Mismatch"
+- [ ] Severity is HIGH for a direct formula contradiction
+- [ ] Both conflicting formulas are shown in the findings table
+- [ ] Skill does NOT auto-resolve the conflict
+
+---
+
+### Case 3: Partial Path — GDD references a system with no GDD
+
+**Fixture:**
+- GDD-A's Dependencies section lists "system-B" as a dependency
+- No GDD for system-B exists in `design/gdd/`
+- All other GDDs are consistent
+
+**Input:** `/consistency-check`
+
+**Expected behavior:**
+1. Skill reads all GDDs and checks dependency references
+2. GDD-A's reference to "system-B" cannot be resolved — no GDD exists for it
+3. Findings table includes: GDD-A vs (missing) | Dependency Gap | MEDIUM
+4. Verdict: DEPENDENCY GAP (not CONSISTENT, not CONFLICTS FOUND)
+
+**Assertions:**
+- [ ] Verdict is DEPENDENCY GAP (distinct from CONSISTENT and CONFLICTS FOUND)
+- [ ] Findings entry names GDD-A and the missing system-B
+- [ ] Severity is MEDIUM for an unresolved dependency reference
+- [ ] Skill suggests running `/design-system system-B` to create the missing GDD
+
+---
+
+### Case 4: Edge Case — No GDDs found
+
+**Fixture:**
+- `design/gdd/` directory is empty or does not exist
+
+**Input:** `/consistency-check`
+
+**Expected behavior:**
+1. Skill attempts to read files in `design/gdd/`
+2. No GDD files found
+3. Skill outputs an error: "No GDDs found in `design/gdd/`. Run `/design-system` to create GDDs first."
+4. No findings table is produced
+5. No verdict is issued
+
+**Assertions:**
+- [ ] Skill outputs a clear error message when no GDDs are found
+- [ ] No verdict is produced (CONSISTENT / CONFLICTS FOUND / DEPENDENCY GAP)
+- [ ] Skill recommends the correct next action (`/design-system`)
+- [ ] Skill does NOT crash or produce a partial report
+
+---
+
+### Case 5: Director Gate — No gate spawned; no review-mode.txt read
+
+**Fixture:**
+- `design/gdd/` contains ≥2 GDDs
+- `production/session-state/review-mode.txt` exists with `full`
+
+**Input:** `/consistency-check`
+
+**Expected behavior:**
+1. Skill reads all GDDs and runs the consistency scan
+2. Skill does NOT read `production/session-state/review-mode.txt`
+3. No director gate agents are spawned at any point
+4. Findings table and verdict are produced normally
+
+**Assertions:**
+- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
+- [ ] Skill does NOT read `production/session-state/review-mode.txt`
+- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
+- [ ] Review mode has no effect on this skill's behavior
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads all GDDs before producing the findings table
+- [ ] Findings table shown in full before any write ask (if report is requested)
+- [ ] Verdict is one of exactly: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
+- [ ] No director gates — no review-mode.txt read
+- [ ] Report writing (if requested) gated by "May I write" approval
+- [ ] Ends with next-step handoff appropriate to verdict
+
+---
+
+## Coverage Notes
+
+- This skill checks for structural consistency between GDDs. Deep design theory
+  analysis (pillar drift, dominant strategies) is handled by `/review-all-gdds`.
+- Formula conflict detection relies on consistent formula notation across GDDs —
+  informal descriptions of the same mechanic may not be detected.
+- The conflict severity rubric (HIGH / MEDIUM / LOW) is defined in the skill body
+  and not re-enumerated here.
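
The dependency-gap detection in Case 3 of the `/consistency-check` spec amounts to finding references that point at no existing GDD. A small Python sketch, assuming the Dependencies sections have already been parsed into a mapping from each GDD name to the systems it lists (the function name and data shape are hypothetical):

```python
def find_dependency_gaps(dependencies: dict) -> list:
    """Return (gdd, missing_dependency) pairs for references with no GDD.

    `dependencies` maps each existing GDD name to the list of systems
    named in its Dependencies section.
    """
    known = set(dependencies)
    return [
        (gdd, dep)
        for gdd, deps in sorted(dependencies.items())
        for dep in deps
        if dep not in known
    ]
```

Each returned pair corresponds to one `Dependency Gap` row in the findings table. How the skill ranks a dependency gap against a formula mismatch when both occur is not stated in the spec, so the sketch stops short of choosing a verdict.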
--- a/Framework/skills/analysis/content-audit.md
+++ b/Framework/skills/analysis/content-audit.md
@@ -0,0 +1,164 @@
+# Skill Test Spec: /content-audit
+
+## Skill Summary
+
+`/content-audit` reads GDDs in `design/gdd/` and checks whether all content
+items specified there (enemies, items, levels, etc.) are accounted for in
+`assets/`. It produces a gap table: Content Type → Specified Count → Found Count
+→ Missing Items. No director gates are invoked. The skill does not write without
+user approval. Verdicts: COMPLETE, GAPS FOUND, or MISSING CRITICAL CONTENT.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT
+- [ ] Does NOT require "May I write" language (read-only output; write is optional report)
+- [ ] Has a next-step handoff (what to do after gap table is reviewed)
+
+---
+
+## Director Gate Checks
+
+None. Content audit is a read-only analysis skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All specified content present
+
+**Fixture:**
+- `design/gdd/enemies.md` specifies 4 enemy types: Grunt, Sniper, Tank, Boss
+- `assets/art/characters/` contains folders: `grunt/`, `sniper/`, `tank/`, `boss/`
+- `design/gdd/items.md` specifies 3 item types; all 3 found in `assets/data/items/`
+
+**Input:** `/content-audit`
+
+**Expected behavior:**
+1. Skill reads all GDDs in `design/gdd/`
+2. Skill scans `assets/` for each specified content item
+3. All 4 enemy types and 3 item types are found
+4. Gap table shows: all rows have Found Count = Specified Count, no missing items
+5. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] Gap table covers all content types found in GDDs
+- [ ] Each row shows Specified Count and Found Count
+- [ ] No missing items when counts match
+- [ ] Verdict is COMPLETE
+- [ ] No files are written
+
+---
+
+### Case 2: Gaps Found — Enemy type missing from assets
+
+**Fixture:**
+- `design/gdd/enemies.md` specifies 3 enemy types: Grunt, Sniper, Boss
+- `assets/art/characters/` contains: `grunt/`, `sniper/` only (Boss folder missing)
+
+**Input:** `/content-audit`
+
+**Expected behavior:**
+1. Skill reads GDD — finds 3 enemy types specified
+2. Skill scans `assets/art/characters/` — finds only 2
+3. Gap table row for enemies: Specified 3, Found 2, Missing: Boss
+4. Verdict is GAPS FOUND
+
+**Assertions:**
+- [ ] Gap table row identifies "Boss" as the missing item by name
+- [ ] Specified Count (3) and Found Count (2) are both shown
+- [ ] Verdict is GAPS FOUND when any content item is missing
+- [ ] Skill does not assume the asset will be added later — it flags it now
+
+---
+
+### Case 3: No GDD Content Specs Found — Guidance given
+
+**Fixture:**
+- `design/gdd/` contains only `core-loop.md` which has no content inventory section
+- No other GDDs exist with content specifications
+
+**Input:** `/content-audit`
+
+**Expected behavior:**
+1. Skill reads all GDDs — finds no content inventory sections
+2. Skill outputs: "No content specifications found in GDDs — run /design-system first to define content lists"
+3. No gap table is produced
+4. Verdict is GAPS FOUND (cannot confirm completeness without specs)
+
+**Assertions:**
+- [ ] Skill does not produce a gap table when no GDD content specs exist
+- [ ] Output recommends running `/design-system`
+- [ ] Verdict is GAPS FOUND, reflecting that completeness cannot be confirmed without specs
+
+---
+
+### Case 4: Edge Case — Asset in wrong format for target platform
+
+**Fixture:**
+- `design/gdd/audio.md` specifies audio assets as OGG format
+- `assets/audio/sfx/jump.wav` exists (WAV format, not OGG)
+- `assets/audio/sfx/land.ogg` exists (correct format)
+- `technical-preferences.md` specifies audio format: OGG
+
+**Input:** `/content-audit`
+
+**Expected behavior:**
+1. Skill reads GDD audio spec and technical preferences for format requirements
+2. Skill finds `jump.wav` — present but in wrong format
+3. Gap table row for audio: Specified 2, Found 2 (by name), but `jump.wav` flagged as FORMAT ISSUE
+4. Verdict is GAPS FOUND (format compliance is part of content completeness)
+
+**Assertions:**
+- [ ] Skill checks asset format against GDD or technical preferences when format is specified
+- [ ] `jump.wav` is flagged as FORMAT ISSUE with expected format (OGG) noted
+- [ ] Format issues are distinct from missing content in the gap table
+- [ ] Verdict is GAPS FOUND when format issues exist
+
+---
+
+### Case 5: Gate Compliance — Read-only; no gate; gap table for human review
+
+**Fixture:**
+- GDDs specify 10 content items; 9 are found in assets; 1 is missing
+- `review-mode.txt` contains `full`
+
+**Input:** `/content-audit`
+
+**Expected behavior:**
+1. Skill reads GDDs and scans assets; produces gap table
+2. No director gate is invoked regardless of review mode
+3. Skill presents gap table to user as read-only output
+4. Verdict is GAPS FOUND
+5. Skill offers to write an audit report but does not write automatically
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Gap table is presented without auto-writing any file
+- [ ] Optional report write is offered but not forced
+- [ ] Skill does not modify any asset files
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads GDDs and asset directory before producing gap table
+- [ ] Gap table shows Content Type, Specified Count, Found Count, Missing Items
+- [ ] Does not write files without explicit user approval
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT
+
+---
+
+## Coverage Notes
+
+- MISSING CRITICAL CONTENT verdict (vs. GAPS FOUND) is triggered when the
+  missing item is tagged as critical in the GDD; this is not explicitly tested
+  but follows the same detection path.
+- The case where `assets/` directory does not exist is not tested; the skill
+  would produce a MISSING CRITICAL CONTENT verdict for all specified items.
--- a/Framework/skills/analysis/estimate.md
+++ b/Framework/skills/analysis/estimate.md
@@ -0,0 +1,168 @@
+# Skill Test Spec: /estimate
+
+## Skill Summary
+
+`/estimate` estimates task or story effort using a relative-size scale (S / M /
+L / XL) based on story complexity, acceptance criteria count, and historical
+sprint velocity from past sprint files. Estimates are advisory and are never
+written automatically. No director gates are invoked. Verdicts are effort ranges,
+not pass/fail — every run produces an estimate.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains size labels: S, M, L, XL (the "verdict" equivalents for this skill)
+- [ ] Does NOT require "May I write" language (advisory output only)
+- [ ] Has a next-step handoff (how to use the estimate in sprint planning)
+
+---
+
+## Director Gate Checks
+
+None. Estimation is an advisory informational skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Clear story with known tech stack
+
+**Fixture:**
+- `production/epics/combat/story-hitbox-detection.md` exists with:
+  - 4 clear Acceptance Criteria
+  - ADR reference (Accepted status)
+  - No "unknown" or "TBD" language in story body
+- `production/sprints/sprint-003.md` through `sprint-005.md` exist with velocity data
+- Tech stack is GDScript (well-understood by team per sprint history)
+
+**Input:** `/estimate production/epics/combat/story-hitbox-detection.md`
+
+**Expected behavior:**
+1. Skill reads the story file — assesses clarity, AC count, tech stack
+2. Skill reads sprint history to determine average velocity
+3. Skill outputs estimate: M (1–2 days) with reasoning
+4. No files are written
+
+**Assertions:**
+- [ ] Estimate is M for a clear, well-scoped story with known tech
+- [ ] Reasoning references AC count, tech stack familiarity, and velocity data
+- [ ] Estimate is presented as a range (e.g., "1–2 days"), not a single point
+- [ ] No files are written
+
+---
+
+### Case 2: High Uncertainty — Unknown system, no ADR yet
+
+**Fixture:**
+- `production/epics/online/story-lobby-matchmaking.md` exists with:
+  - 2 vague Acceptance Criteria (using "should" and "TBD")
+  - No ADR reference — matchmaking architecture not yet decided
+  - References new subsystem ("online/matchmaking") with no existing source files
+
+**Input:** `/estimate production/epics/online/story-lobby-matchmaking.md`
+
+**Expected behavior:**
+1. Skill reads story — finds vague AC, no ADR, no existing source
+2. Skill flags multiple uncertainty factors
+3. Estimate is L–XL with an explicit risk note: "Estimate range is wide due to architectural unknowns"
+4. Skill recommends creating an ADR before development begins
+
+**Assertions:**
+- [ ] Estimate is L or XL (not S or M) when significant unknowns exist
+- [ ] Risk note explains the specific unknowns driving the wide range
+- [ ] Output recommends resolving architectural questions first
+- [ ] No files are written
+
+---
+
+### Case 3: No Sprint Velocity Data — Conservative defaults used
+
+**Fixture:**
+- Story file exists and is well-defined
+- `production/sprints/` is empty — no historical sprints
+
+**Input:** `/estimate production/epics/core/story-save-load.md`
+
+**Expected behavior:**
+1. Skill reads story — assesses complexity
+2. Skill attempts to read sprint velocity data — finds none
+3. Skill notes: "No sprint history found — using conservative defaults for velocity"
+4. Estimate is produced using default assumptions (e.g., 1 story point = 1 day)
+5. No files are written
+
+**Assertions:**
+- [ ] Skill does not error when no sprint history exists
+- [ ] Output explicitly notes that conservative defaults are being used
+- [ ] Estimate is still produced (not blocked by missing velocity)
+- [ ] Conservative defaults produce a higher (not lower) estimate range
+
+---
+
+### Case 4: Multiple Stories — Each estimated individually plus sprint total
+
+**Fixture:**
+- User provides a sprint file: `production/sprints/sprint-007.md` with 4 stories
+- Sprint history exists (3 previous sprints)
+
+**Input:** `/estimate production/sprints/sprint-007.md`
+
+**Expected behavior:**
+1. Skill reads sprint file — identifies 4 stories
+2. Skill estimates each story individually: S, M, M, L
+3. Skill computes sprint total: approximately 6–8 story points
+4. Skill presents per-story estimates followed by sprint total
+5. No files are written
+
+**Assertions:**
+- [ ] Each story receives its own estimate label
+- [ ] Sprint total is presented after individual estimates
+- [ ] Total is a sum range derived from individual ranges
+- [ ] Skill handles sprint files (not just single story files) as input
+
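The "sum range derived from individual ranges" assertion can be illustrated with a short sketch. Only the M range (1 to 2 days) appears in this spec; the S, L, and XL ranges below are hypothetical placeholders, so the resulting total is not meant to reproduce the 6 to 8 figure above.

```python
# Hypothetical size-to-effort mapping in days. Only "M" (1 to 2 days)
# is taken from the spec; the other ranges are assumptions.
SIZE_RANGES_DAYS = {
    "S": (0.5, 1.0),
    "M": (1.0, 2.0),
    "L": (2.0, 4.0),
    "XL": (4.0, 8.0),
}


def sprint_total(labels):
    """Sum the low ends and the high ends of each story's range separately."""
    low = sum(SIZE_RANGES_DAYS[label][0] for label in labels)
    high = sum(SIZE_RANGES_DAYS[label][1] for label in labels)
    return low, high


total = sprint_total(["S", "M", "M", "L"])
```

Summing the bounds independently keeps the total a range rather than collapsing it to a single point, which matches the advisory nature of the estimate.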
+---
+
+### Case 5: Gate Compliance — No gate; estimates are informational
+
+**Fixture:**
+- Story file exists with medium complexity
+- `review-mode.txt` contains `full`
+
+**Input:** `/estimate production/epics/core/story-item-pickup.md`
+
+**Expected behavior:**
+1. Skill reads story and sprint history; computes estimate
+2. No director gate is invoked in any review mode
+3. Estimate is presented as advisory output only
+4. Skill notes: "Use this estimate in /sprint-plan when selecting stories for the next sprint"
+
+**Assertions:**
+- [ ] No director gate is invoked regardless of review mode
+- [ ] Output is purely informational — no approval or write prompt
+- [ ] Next-step recommendation references `/sprint-plan`
+- [ ] Estimate does not change based on review mode
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads story file before estimating
+- [ ] Reads sprint velocity history when available
+- [ ] Produces effort range (S/M/L/XL), not a single number
+- [ ] Does not write any files
+- [ ] No director gates are invoked
+- [ ] Always produces an estimate (never blocked by missing data; uses defaults instead)
+
+---
+
+## Coverage Notes
+
+- The skill does not produce PASS/FAIL verdicts; the "verdict" here is the
+  effort range itself. Test assertions focus on the accuracy of the range
+  and the quality of the reasoning, not a binary outcome.
+- Team-specific velocity calibration (what "M" means for this team) is an
+  implementation detail not tested here; it is configured via sprint history.
--- a/Framework/skills/analysis/perf-profile.md
+++ b/Framework/skills/analysis/perf-profile.md
@@ -0,0 +1,171 @@
+# Skill Test Spec: /perf-profile
+
+## Skill Summary
+
+`/perf-profile` is a structured performance profiling workflow that identifies
+bottlenecks and recommends optimizations. If profiler data or performance logs
+are provided, it analyzes them directly. If not, it guides the user through a
+manual profiling checklist. No director gates are invoked. The skill asks
+"May I write to `production/qa/perf-[date].md`?" before persisting a report.
+Verdicts: WITHIN BUDGET, CONCERNS, or OVER BUDGET.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: WITHIN BUDGET, CONCERNS, OVER BUDGET
+- [ ] Contains "May I write" language (skill writes perf report)
+- [ ] Has a next-step handoff (what to do after performance findings are reviewed)
+
+---
+
+## Director Gate Checks
+
+None. Performance profiling is an advisory analysis skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Frame data provided, draw call spike found
+
+**Fixture:**
+- User provides `production/qa/profiler-export-2026-03-15.json` with frame time data
+- Data shows: average frame time 14ms (within 16.6ms budget), but frames 42–48 spike to 28ms
+- Spike correlates with a scene with 450 draw calls (budget: 200)
+
+**Input:** `/perf-profile production/qa/profiler-export-2026-03-15.json`
+
+**Expected behavior:**
+1. Skill reads profiler data
+2. Skill identifies average frame time is within budget
+3. Skill identifies draw call spike on frames 42–48 (450 calls vs 200 budget)
+4. Verdict is CONCERNS (average OK, but spikes indicate an issue)
+5. Skill recommends batching or culling for the identified scene
+6. Skill asks "May I write to `production/qa/perf-2026-04-06.md`?"
+
+**Assertions:**
+- [ ] Spike frames are identified by frame number
+- [ ] Draw call count and budget are compared explicitly
+- [ ] Verdict is CONCERNS when spikes exceed budget even if average is OK
+- [ ] At least one specific optimization recommendation is given
+- [ ] "May I write" prompt appears before writing report
+
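One way to read the verdict rules in this spec is sketched below. The 16.6ms budget is 1000ms divided by the 60fps target; the "more than half of all frames" threshold for OVER BUDGET is an assumption standing in for the spec's "all or most frames", and the frame data is synthetic.

```python
def frame_budget_ms(target_fps):
    """Per-frame time budget in milliseconds for a target frame rate."""
    return 1000.0 / target_fps


def classify(frame_times_ms, budget_ms):
    """Return (verdict, indices of frames that exceed the budget)."""
    if not frame_times_ms:
        # The spec emits no verdict when there is no data to assess.
        raise ValueError("no frame data to classify")
    over = [i for i, t in enumerate(frame_times_ms) if t > budget_ms]
    if len(over) * 2 > len(frame_times_ms):  # assumed meaning of "most frames"
        return "OVER BUDGET", over
    if over:
        return "CONCERNS", over
    return "WITHIN BUDGET", over


budget = frame_budget_ms(60)
frames = [14.0] * 100          # synthetic data: steady 14ms frames
for i in range(42, 49):        # frames 42 to 48 spike to 28ms
    frames[i] = 28.0
verdict, spikes = classify(frames, budget)
```

Returning the offending frame indices alongside the verdict supports the assertion that spike frames are identified by frame number.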
+---
+
+### Case 2: No Profiler Data — Manual checklist output
+
+**Fixture:**
+- User runs `/perf-profile` with no arguments
+- No profiler data files exist in `production/qa/`
+
+**Input:** `/perf-profile`
+
+**Expected behavior:**
+1. Skill finds no profiler data
+2. Skill outputs a manual profiling checklist for the user to work through:
+   - Enable Godot profiler or target engine's profiler
+   - Record a 60-second play session
+   - Export frame time data
+   - Note any dropped frames or hitches
+3. Skill asks user to provide data once collected before running analysis
+
+**Assertions:**
+- [ ] Skill does not crash when no data is provided
+- [ ] Manual profiling checklist is output (actionable steps, not just an error)
+- [ ] No verdict is emitted (there is nothing to assess yet)
+- [ ] No files are written
+
+---
+
+### Case 3: Over Budget — Frame budget exceeded for target platform
+
+**Fixture:**
+- Profiler data shows consistent 22ms frame times (target: 16.6ms for 60fps)
+- All frames exceed budget; no single spike — systemic issue
+- `technical-preferences.md` specifies target platform: PC, 60fps
+
+**Input:** `/perf-profile production/qa/profiler-export-2026-03-20.json`
+
+**Expected behavior:**
+1. Skill reads profiler data and technical preferences for performance budget
+2. All frames are over the 16.6ms budget
+3. Verdict is OVER BUDGET
+4. Skill outputs a prioritized optimization list (e.g., LOD system, shader complexity, physics tick rate)
+5. Skill asks "May I write" before writing report
+
+**Assertions:**
+- [ ] Verdict is OVER BUDGET when all or most frames exceed budget
+- [ ] Target frame budget is read from `technical-preferences.md` (not hardcoded)
+- [ ] Optimization priority list is provided, not just the raw verdict
+- [ ] "May I write" prompt appears before report write
+
+---
+
+### Case 4: Previous Perf Report Exists — Delta comparison
+
+**Fixture:**
+- `production/qa/perf-2026-03-28.md` exists with prior results (avg 15ms, max 19ms)
+- New profiler export shows: avg 13ms, max 17ms
+- Both reports are for the same scene
+
+**Input:** `/perf-profile production/qa/profiler-export-2026-04-05.json`
+
+**Expected behavior:**
+1. Skill reads new profiler data
+2. Skill detects prior report for the same scene
+3. Skill computes deltas: avg improved 2ms, max improved 2ms
+4. Skill presents regression check: no regressions detected
+5. Verdict is WITHIN BUDGET; report notes improvement since last profile
+
+**Assertions:**
+- [ ] Skill checks `production/qa/` for prior perf reports before writing
+- [ ] Delta comparison is shown (prior vs. current for key metrics)
+- [ ] Verdict is WITHIN BUDGET when current metrics are within budget
+- [ ] Improvement trend is noted positively in the report
+
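The delta comparison in this case reduces to a per-metric subtraction. The sketch below uses the fixture's numbers; the metric keys `avg_ms` and `max_ms` and the helper name are assumptions for illustration.

```python
def metric_deltas(prior, current):
    """Current minus prior for every metric present in both reports.

    For frame times a negative delta is an improvement.
    """
    return {key: current[key] - prior[key] for key in prior if key in current}


prior = {"avg_ms": 15.0, "max_ms": 19.0}    # from perf-2026-03-28.md
current = {"avg_ms": 13.0, "max_ms": 17.0}  # from the new profiler export
deltas = metric_deltas(prior, current)

# A metric regressed if its frame time went up since the prior report.
regressions = [key for key, delta in deltas.items() if delta > 0]
```

Here both deltas are -2.0 and the regression list is empty, which matches the "no regressions detected" step.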
+---
+
+### Case 5: Gate Compliance — No gate; performance-analyst separate
+
+**Fixture:**
+- Profiler data shows CONCERNS-level findings (some spikes)
+- `review-mode.txt` contains `full`
+
+**Input:** `/perf-profile production/qa/profiler-export-2026-04-01.json`
+
+**Expected behavior:**
+1. Skill analyzes profiler data; verdict is CONCERNS
+2. No director gate is invoked regardless of review mode
+3. Output notes: "For in-depth analysis, consider running `/perf-profile` with the performance-analyst agent"
+4. Skill asks "May I write" and writes report on user approval
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Performance-analyst consultation is suggested (not mandated)
+- [ ] "May I write" prompt appears before report write
+- [ ] Verdict is CONCERNS for spike-based findings
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads profiler data when provided; outputs checklist when not
+- [ ] Reads `technical-preferences.md` for target platform frame budget
+- [ ] Checks for prior perf reports to enable delta comparison
+- [ ] Always asks "May I write" before writing report
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: WITHIN BUDGET, CONCERNS, OVER BUDGET
+
+---
+
+## Coverage Notes
+
+- Platform-specific profiling workflows (console, mobile) are not tested here;
+  the checklist output in Case 2 would be platform-specific in practice.
+- The delta comparison in Case 4 assumes reports cover the same scene; cross-scene
+  comparisons are not explicitly handled.
--- a/Framework/skills/analysis/scope-check.md
+++ b/Framework/skills/analysis/scope-check.md
@@ -0,0 +1,168 @@
+# Skill Test Spec: /scope-check
+
+## Skill Summary
+
+`/scope-check` is a Haiku-tier read-only skill that analyzes a feature, sprint,
+or story for scope creep risk. It reads sprint and story files and compares them
+against the active milestone goals. It is designed for fast, low-cost checks
+before or during planning. No director gates are invoked. No files are written.
+Verdicts: ON SCOPE, CONCERNS, or SCOPE CREEP DETECTED.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED
+- [ ] Does NOT require "May I write" language (read-only skill)
+- [ ] Has a next-step handoff (what to do based on verdict)
+
+---
+
+## Director Gate Checks
+
+None. Scope check is a read-only advisory skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Sprint stories align with milestone goals
+
+**Fixture:**
+- `production/milestones/milestone-03.md` lists 3 goals: combat system, enemy AI, level loading
+- `production/sprints/sprint-006.md` contains 5 stories, all tagged to one of the 3 goals
+- `production/session-state/active.md` references milestone-03 as the active milestone
+
+**Input:** `/scope-check`
+
+**Expected behavior:**
+1. Skill reads active milestone goals from milestone-03
+2. Skill reads sprint-006 stories and checks each against milestone goals
+3. All 5 stories map to one of the 3 goals
+4. Skill outputs a mapping table: story → milestone goal
+5. Verdict is ON SCOPE
+
+**Assertions:**
+- [ ] Each story is mapped to a milestone goal in the output
+- [ ] Verdict is ON SCOPE when all stories map to milestone goals
+- [ ] No files are written
+- [ ] Skill does not modify sprint or milestone files
+
+---
+
+### Case 2: Scope Creep Detected — Stories introducing systems not in milestone
+
+**Fixture:**
+- `production/milestones/milestone-03.md` goals: combat, enemy AI, level loading
+- `production/sprints/sprint-006.md` contains 5 stories:
+  - 3 stories map to milestone goals
+  - 2 stories reference "online leaderboard" and "achievement system" (not in milestone-03)
+
+**Input:** `/scope-check`
+
+**Expected behavior:**
+1. Skill reads milestone goals and sprint stories
+2. Skill identifies 2 stories with no matching milestone goal
+3. Skill names the out-of-scope stories: "Online Leaderboard Feature", "Achievement System Setup"
+4. Verdict is SCOPE CREEP DETECTED
+
+**Assertions:**
+- [ ] Out-of-scope stories are named explicitly in the output
+- [ ] Verdict is SCOPE CREEP DETECTED when any story has no milestone goal match
+- [ ] Skill does not automatically remove the stories — findings are advisory
+- [ ] Output recommends deferring the out-of-scope stories to a later milestone
+
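The story-to-goal mapping can be sketched as a set membership test. This assumes each story carries a single goal tag, as the Case 1 fixture suggests; the in-scope story titles and the `(title, tag)` tuple shape are hypothetical.

```python
def check_scope(stories, milestone_goals):
    """Return (verdict, titles of stories whose tag matches no milestone goal).

    `stories` is a list of (title, goal_tag) pairs; matching ignores case.
    """
    goals = {goal.lower() for goal in milestone_goals}
    unmapped = [title for title, tag in stories if tag.lower() not in goals]
    verdict = "SCOPE CREEP DETECTED" if unmapped else "ON SCOPE"
    return verdict, unmapped


goals = ["combat", "enemy AI", "level loading"]
stories = [
    ("Hitbox Detection", "combat"),            # hypothetical title
    ("Patrol Behaviour", "enemy AI"),          # hypothetical title
    ("Async Level Streaming", "level loading"),  # hypothetical title
    ("Online Leaderboard Feature", "online leaderboard"),
    ("Achievement System Setup", "achievement system"),
]
verdict, unmapped = check_scope(stories, goals)
```

The function only reports unmapped stories and never edits the sprint, in line with the assertion that findings are advisory.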
+---
+
+### Case 3: No Milestone Defined — CONCERNS; scope cannot be validated
+
+**Fixture:**
+- `production/session-state/active.md` has no milestone reference
+- `production/milestones/` directory exists but is empty
+- `production/sprints/sprint-006.md` has 4 stories
+
+**Input:** `/scope-check`
+
+**Expected behavior:**
+1. Skill reads active.md — finds no milestone reference
+2. Skill checks `production/milestones/` — no milestone files found
+3. Skill outputs: "No active milestone defined — scope cannot be validated"
+4. Verdict is CONCERNS
+
+**Assertions:**
+- [ ] Skill does not error when no milestone is defined
+- [ ] Output explicitly states that scope validation requires a milestone reference
+- [ ] Verdict is CONCERNS (not ON SCOPE or SCOPE CREEP DETECTED without data)
+- [ ] Output suggests running `/milestone-review` or creating a milestone
+
+---
+
+### Case 4: Single Story Check — Evaluated against its parent epic
+
+**Fixture:**
+- User targets a single story: `production/epics/combat/story-parry-timing.md`
+- Story references parent epic: `epic-combat.md`
+- `production/epics/combat/epic-combat.md` has scope: "melee combat mechanics"
+- Story title: "Implement parry timing window" — matches epic scope
+
+**Input:** `/scope-check production/epics/combat/story-parry-timing.md`
+
+**Expected behavior:**
+1. Skill reads the specified story file
+2. Skill reads the parent epic to get scope definition
+3. Skill evaluates story against epic scope — "parry timing" matches "melee combat"
+4. Verdict is ON SCOPE
+
+**Assertions:**
+- [ ] Single-file argument is accepted (story path, not sprint)
+- [ ] Skill reads the parent epic referenced in the story file
+- [ ] Story is evaluated against epic scope (not milestone scope) in single-story mode
+- [ ] Verdict is ON SCOPE when story matches epic scope
+
+---
+
+### Case 5: Gate Compliance — No gate; Producer may be consulted separately
+
+**Fixture:**
+- Sprint has 2 SCOPE CREEP stories and 3 ON SCOPE stories
+- `review-mode.txt` contains `full`
+
+**Input:** `/scope-check`
+
+**Expected behavior:**
+1. Skill reads milestone and sprint; identifies 2 scope creep items
+2. No director gate is invoked regardless of review mode
+3. Skill presents findings with SCOPE CREEP DETECTED verdict
+4. Output notes: "Consider raising scope concerns with the Producer before sprint begins"
+5. Skill ends without writing any files
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Producer consultation is suggested (not mandated)
+- [ ] No files are written
+- [ ] Verdict is SCOPE CREEP DETECTED
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads milestone goals and sprint/story files before analysis
+- [ ] Maps each story to a milestone goal (or flags as unmapped)
+- [ ] Does not write any files
+- [ ] No director gates are invoked
+- [ ] Runs on Haiku model tier (fast, low-cost)
+- [ ] Verdict is one of: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED
+
+---
+
+## Coverage Notes
+
+- The case where the sprint file itself does not exist is not tested; the
+  skill would output a CONCERNS verdict with a message about missing sprint data.
+- Partial scope overlap (story touches a milestone goal but also introduces
+  new scope) is not explicitly tested; implementation may classify this as
+  CONCERNS rather than SCOPE CREEP DETECTED.
--- a/Framework/skills/analysis/security-audit.md
+++ b/Framework/skills/analysis/security-audit.md
@@ -0,0 +1,167 @@
+# Skill Test Spec: /security-audit
+
+## Skill Summary
+
+`/security-audit` audits the game for security risks including save data
+integrity, network communication, anti-cheat exposure, and data privacy. It
+reads source files in `src/` for security patterns and checks whether sensitive
+data is handled correctly. No director gates are invoked. The skill does not
+write files (findings report only). Verdicts: SECURE, CONCERNS, or
+VULNERABILITIES FOUND.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: SECURE, CONCERNS, VULNERABILITIES FOUND
+- [ ] Does NOT require "May I write" language (read-only; findings report only)
+- [ ] Has a next-step handoff (what to do with findings)
+
+---
+
+## Director Gate Checks
+
+None. Security audit is a read-only advisory skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Save data encrypted, no hardcoded credentials
+
+**Fixture:**
+- `src/core/save_system.gd` uses `Crypto` class to encrypt save data before writing
+- No hardcoded API keys, passwords, or credentials in any `src/` file
+- No version numbers or internal build IDs exposed in client-facing output
+
+**Input:** `/security-audit`
+
+**Expected behavior:**
+1. Skill scans `src/` for security patterns: encryption usage, hardcoded credentials, exposed internals
+2. All checks pass: save data encrypted, no credentials found, no exposed internals
+3. Findings report shows all checks PASS
+4. Verdict is SECURE
+
+**Assertions:**
+- [ ] Skill checks save data handling for encryption usage
+- [ ] Skill scans for hardcoded credentials (API keys, passwords, tokens)
+- [ ] Skill checks for version/build numbers exposed to players
+- [ ] All checks shown in findings report
+- [ ] Verdict is SECURE when all checks pass
+
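The hardcoded-credential scan can be approximated with a line-by-line regular expression. This is a heuristic sketch under stated assumptions: the pattern, the list of suspicious names, and the helper name are all illustrative, and a real audit would need more than string matching.

```python
import re

# Assumed heuristic: a credential-like identifier assigned a quoted literal.
CREDENTIAL_PATTERN = re.compile(
    r"""(?i)\b(api_?key|password|secret|token)\b\s*[:=]\s*["'][^"']+["']"""
)


def find_hardcoded_credentials(source_text):
    """Return the 1-based line numbers that look like hardcoded credentials."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if CREDENTIAL_PATTERN.search(line):
            hits.append(number)
    return hits


# Hypothetical GDScript snippet used only to exercise the pattern.
sample = 'extends Node\nvar api_key = "abc123"\nvar player_name = "hero"\n'
hits = find_hardcoded_credentials(sample)
```

Reporting line numbers rather than a bare boolean fits the later assertion that findings include the file and an approximate line.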
+---
+
+### Case 2: Vulnerabilities Found — Unencrypted save data and exposed version
+
+**Fixture:**
+- `src/core/save_system.gd` writes save data as plain JSON (no encryption)
+- `src/ui/debug_overlay.gd` contains: `label.text = "Build: " + ProjectSettings.get("application/config/version")`
+  (exposes internal build version to player)
+
+**Input:** `/security-audit`
+
+**Expected behavior:**
+1. Skill scans `src/` — finds unencrypted save write in `save_system.gd`
+2. Skill finds exposed version string in `debug_overlay.gd`
+3. Both findings are flagged as VULNERABILITIES
+4. Verdict is VULNERABILITIES FOUND
+5. Skill provides remediation recommendations for each vulnerability
+
+**Assertions:**
+- [ ] Unencrypted save data is flagged as a vulnerability with file and approximate line
+- [ ] Exposed version string is flagged as a vulnerability
+- [ ] Remediation suggestion is given for each vulnerability
+- [ ] Verdict is VULNERABILITIES FOUND when any vulnerability is detected
+- [ ] No files are written or modified
+
+---
+
+### Case 3: Online Features Without Authentication — CONCERNS
+
+**Fixture:**
+- `src/networking/lobby.gd` exists with functions: `join_lobby()`, `send_chat()`
+- No authentication check is found before `send_chat()` — players can call it without being verified
+- Game has online multiplayer features (inferred from file presence)
+
+**Input:** `/security-audit`
+
+**Expected behavior:**
+1. Skill scans `src/networking/` — detects online feature code
+2. Skill checks for authentication guard before network calls — finds none on `send_chat()`
+3. Flags: "Online feature without authentication check — CONCERNS"
+4. Verdict is CONCERNS (not VULNERABILITIES FOUND, as this is a missing control, not an exploit)
+
+**Assertions:**
+- [ ] Skill detects online features by scanning for networking source files
+- [ ] Missing authentication checks before network operations are flagged
+- [ ] Verdict is CONCERNS (advisory severity) for missing authentication guards
+- [ ] Output recommends adding authentication before network calls
+
+---
+
+### Case 4: Edge Case — No Source Files to Analyze
+
+**Fixture:**
+- `src/` directory does not exist or is completely empty
+
+**Input:** `/security-audit`
+
+**Expected behavior:**
+1. Skill attempts to scan `src/` — no files found
+2. Skill outputs an error: "No source files found in `src/` — nothing to audit"
+3. No findings report is generated
+4. No verdict is emitted
+
+**Assertions:**
+- [ ] Skill does not crash when `src/` is empty or absent
+- [ ] Output clearly states that no source files were found
+- [ ] No verdict is emitted (there is nothing to assess)
+- [ ] Skill suggests verifying the `src/` directory path
+
+---
+
+### Case 5: Gate Compliance — No gate; security-engineer invoked separately
+
+**Fixture:**
+- Source files exist; 1 CONCERNS-level finding detected (debug logging enabled in release build)
+- `review-mode.txt` contains `full`
+
+**Input:** `/security-audit`
+
+**Expected behavior:**
+1. Skill scans source; finds debug logging active in release path
+2. No director gate is invoked regardless of review mode
+3. Verdict is CONCERNS
+4. Output notes: "For formal security review, consider engaging a security-engineer agent"
+5. Findings are presented as a read-only report; no files written
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Security-engineer consultation is suggested (not mandated)
+- [ ] No files are written
+- [ ] Verdict is CONCERNS for advisory-level security findings
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads source files in `src/` before auditing
+- [ ] Checks save data encryption, hardcoded credentials, exposed internals, auth guards
+- [ ] Provides remediation recommendations for each finding
+- [ ] Does not write any files (read-only skill)
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: SECURE, CONCERNS, VULNERABILITIES FOUND
+
+---
+
+## Coverage Notes
+
+- Anti-cheat analysis (client-side value validation, server authority) is not
+  explicitly tested here; it follows the CONCERNS or VULNERABILITIES pattern
+  depending on severity.
+- Data privacy compliance (GDPR, COPPA) is out of scope for this spec; those
+  require legal review beyond code scanning.
--- a/Framework/skills/analysis/tech-debt.md
+++ b/Framework/skills/analysis/tech-debt.md
@@ -0,0 +1,171 @@
+# Skill Test Spec: /tech-debt
+
+## Skill Summary
+
+`/tech-debt` tracks, categorizes, and prioritizes technical debt across the
+codebase. It reads `docs/tech-debt-register.md` for the existing debt register
+and scans source files in `src/` for inline `TODO` and `FIXME` comments. It
+merges and sorts items by severity. No director gates are invoked. The skill
+asks "May I write to `docs/tech-debt-register.md`?" before updating. Verdicts:
+REGISTER UPDATED or NO NEW DEBT FOUND.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: REGISTER UPDATED, NO NEW DEBT FOUND
+- [ ] Contains "May I write" language (skill writes to debt register)
+- [ ] Has a next-step handoff (what to do after register is updated)
+
+---
+
+## Director Gate Checks
+
+None. Tech debt tracking is an internal codebase analysis skill; no gates are
+invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Inline TODOs plus existing register items merged
+
+**Fixture:**
+- `docs/tech-debt-register.md` exists with 2 items (LOW and MEDIUM severity)
+- `src/gameplay/combat.gd` has 2 `# TODO` comments and 1 `# FIXME` comment
+- `src/ui/hud.gd` has 0 inline debt comments
+
+**Input:** `/tech-debt`
+
+**Expected behavior:**
+1. Skill reads `docs/tech-debt-register.md` — finds 2 existing items
+2. Skill scans `src/` — finds 3 inline comments (2 TODOs, 1 FIXME)
+3. Skill checks whether inline comments already exist in the register (deduplication)
+4. Skill presents combined list sorted by severity (FIXME before TODO by default)
+5. Skill asks "May I write to `docs/tech-debt-register.md`?"
+6. User approves; register updated; verdict REGISTER UPDATED
+
+**Assertions:**
+- [ ] Inline comments are found by scanning `src/` recursively
+- [ ] Existing register items are not duplicated
+- [ ] Combined list is sorted by severity
+- [ ] "May I write" prompt appears before any write
+- [ ] Verdict is REGISTER UPDATED
+
+---
+
+### Case 2: Register Doesn't Exist — Offered to create it
+
+**Fixture:**
+- `docs/tech-debt-register.md` does NOT exist
+- `src/` contains 4 inline TODO/FIXME comments
+
+**Input:** `/tech-debt`
+
+**Expected behavior:**
+1. Skill attempts to read `docs/tech-debt-register.md` — not found
+2. Skill informs user: "No tech-debt-register.md found"
+3. Skill offers to create the register with the inline items it found
+4. Skill asks "May I write to `docs/tech-debt-register.md`?" (create)
+5. User approves; register created with 4 items; verdict REGISTER UPDATED
+
+**Assertions:**
+- [ ] Skill does not crash when register file is absent
+- [ ] User is offered register creation (not silently skipping)
+- [ ] "May I write" prompt reflects file creation (not update)
+- [ ] Verdict is REGISTER UPDATED after creation
+
+---
+
+### Case 3: Resolved Item Detected — Marked resolved in register
+
+**Fixture:**
+- `docs/tech-debt-register.md` has 3 items; item 2 references `src/gameplay/legacy_input.gd`
+- `src/gameplay/legacy_input.gd` has been deleted (refactored away)
+- The referenced TODO comment no longer exists in source
+
+**Input:** `/tech-debt`
+
+**Expected behavior:**
+1. Skill reads register — finds 3 items
+2. Skill scans `src/` — does not find the source location referenced by item 2
+3. Skill flags item 2 as RESOLVED (source is gone)
+4. Skill presents the resolved item to user for confirmation
+5. On approval, register is updated with item 2 marked `Status: Resolved`
+
+**Assertions:**
+- [ ] Skill checks whether each register item's source reference still exists
+- [ ] Missing source locations result in items being flagged as RESOLVED
+- [ ] User confirms before resolved items are written
+- [ ] RESOLVED items are kept in the register (not deleted) for audit history
+
+---
+
+### Case 4: Edge Case — CRITICAL debt item surfaces prominently
+
+**Fixture:**
+- `src/core/network_sync.gd` has a comment: `# FIXME(CRITICAL): race condition in sync buffer — can corrupt save data`
+- `docs/tech-debt-register.md` exists with 5 lower-severity items
+
+**Input:** `/tech-debt`
+
+**Expected behavior:**
+1. Skill scans source and finds the CRITICAL-tagged FIXME
+2. Skill presents the CRITICAL item at the top of the output — before the full table
+3. Skill asks user to acknowledge the critical item before proceeding
+4. After acknowledgment, skill presents full debt table and asks to write
+5. Register is updated with CRITICAL item at top; verdict REGISTER UPDATED
+
+**Assertions:**
+- [ ] CRITICAL items appear at the top of the output, not buried in the table
+- [ ] Skill surfaces CRITICAL items before asking to write
+- [ ] User acknowledgment of the CRITICAL item is requested
+- [ ] CRITICAL severity is preserved in the written register entry
+
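The inline scan and severity sort used across these cases can be sketched as follows. The `FIXME(CRITICAL)` tag syntax and the LOW default for untagged TODOs come from this spec; the MEDIUM default for untagged FIXMEs, the severity ordering table, and the record fields are assumptions.

```python
import re

# Matches "# TODO: ..." and "# FIXME(SEVERITY): ..." style comments.
DEBT_PATTERN = re.compile(r"#\s*(TODO|FIXME)(?:\((\w+)\))?:?\s*(.*)")
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
# Untagged TODOs default to LOW per the spec; MEDIUM for FIXME is assumed
# so that FIXME sorts before TODO by default.
DEFAULT_SEVERITY = {"FIXME": "MEDIUM", "TODO": "LOW"}


def scan_debt(path, text):
    """Collect one record per inline TODO/FIXME comment in `text`."""
    items = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = DEBT_PATTERN.search(line)
        if match:
            kind, tag, note = match.groups()
            severity = (tag or DEFAULT_SEVERITY[kind]).upper()
            items.append({"file": path, "line": number, "kind": kind,
                          "severity": severity, "note": note.strip()})
    return items


def sort_by_severity(items):
    """Most severe first; unknown severity tags sort last."""
    return sorted(items, key=lambda item: SEVERITY_ORDER.get(
        item["severity"], len(SEVERITY_ORDER)))


source = ("# TODO: tidy up\n"
          "var x = 1\n"
          "# FIXME(CRITICAL): race condition in sync buffer\n"
          "# FIXME: off by one\n")
ordered = sort_by_severity(scan_debt("src/core/network_sync.gd", source))
```

Because `sorted` is stable, items of equal severity keep their source order, which keeps the presented table predictable between runs.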
+---
+
+### Case 5: Gate Compliance — No gate; register updated only with approval
+
+**Fixture:**
+- Inline scan finds 2 new TODOs; register has 3 existing items
+- `review-mode.txt` contains `full`
+
+**Input:** `/tech-debt`
+
+**Expected behavior:**
+1. Skill scans source and reads register; compiles combined debt list
+2. No director gate is invoked regardless of review mode
+3. Skill presents sorted debt table to user
+4. Skill asks "May I write to `docs/tech-debt-register.md`?"
+5. User approves; register updated; verdict REGISTER UPDATED
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Debt table is presented before any write prompt
+- [ ] "May I write" prompt appears before file update
+- [ ] Write only occurs with explicit user approval
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads `docs/tech-debt-register.md` and scans `src/` before compiling
+- [ ] Deduplicates inline comments against existing register items
+- [ ] Sorts combined list by severity
+- [ ] Always asks "May I write" before updating register
+- [ ] No director gates are invoked
+- [ ] Verdict is REGISTER UPDATED or NO NEW DEBT FOUND
+
+---
+
+## Coverage Notes
+
+- The case where `src/` is empty or absent is not tested; behavior follows
+  the NO NEW DEBT FOUND path for the inline scan, but register items would
+  still be read and presented.
+- TODO comments without severity tags are treated as LOW severity by default;
+  this classification detail is an implementation concern, not tested here.
--- a/Framework/skills/analysis/test-evidence-review.md
+++ b/Framework/skills/analysis/test-evidence-review.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /test-evidence-review
+
+## Skill Summary
+
+`/test-evidence-review` performs a quality review of test files in `tests/`,
+checking naming conventions, Arrange/Act/Assert structure, determinism,
+isolation, and absence of hardcoded magic numbers against the test standards in
+`coding-standards.md`. Findings may be flagged for qa-lead review. No director
+gates are invoked. The skill does not write without user approval. Verdicts:
+PASS, WARNINGS, or FAIL.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: PASS, WARNINGS, FAIL
+- [ ] Does NOT require "May I write" language (read-only; the optional flagging report requires approval)
+- [ ] Has a next-step handoff (what to do after findings are reviewed)
+
+---
+
+## Director Gate Checks
+
+None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate
+is a separate skill invocation and is NOT triggered here.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Tests follow all standards
+
+**Fixture:**
+- `tests/unit/combat/health_system_take_damage_test.gd` exists with:
+  - Naming: `test_health_system_take_damage_reduces_health()` (follows `test_[system]_[scenario]_[expected]`)
+  - Arrange/Act/Assert structure present
+  - No `sleep()`, `await` with time values, or unseeded random calls
+  - No calls to external APIs or file I/O
+  - No inline magic numbers (uses constants from `tests/unit/combat/fixtures/`)
+
+**Input:** `/test-evidence-review tests/unit/combat/`
+
+**Expected behavior:**
+1. Skill reads test standards from `coding-standards.md`
+2. Skill reads the test file; checks all 5 standards
+3. All checks pass: naming, structure, determinism, isolation, no hardcoded data
+4. Verdict is PASS
+
+**Assertions:**
+- [ ] Each of the 5 test standards is checked and reported
+- [ ] All checks show PASS when standards are met
+- [ ] Verdict is PASS
+- [ ] No files are written
+
+---
+
+### Case 2: Fail — Timing dependency detected
+
+**Fixture:**
+- `tests/unit/ui/hud_update_test.gd` contains:
+  ```gdscript
+  await get_tree().create_timer(1.0).timeout
+  assert_eq(label.text, "Ready")
+  ```
+- Real-time wait of 1 second used instead of mock or signal-based assertion
+
+**Input:** `/test-evidence-review tests/unit/ui/hud_update_test.gd`
+
+**Expected behavior:**
+1. Skill reads the test file
+2. Skill detects real-time wait (`create_timer(1.0)`) — non-deterministic timing dependency
+3. Skill flags this as a FAIL-level finding
+4. Verdict is FAIL
+5. Skill recommends replacing the timer with a signal-based assertion or mock
+
+**Assertions:**
+- [ ] Real-time wait usage is detected as a non-deterministic timing dependency
+- [ ] Finding is classified as FAIL severity (blocking — violates determinism standard)
+- [ ] Verdict is FAIL
+- [ ] Remediation suggestion references signal-based or mock-based approach
+- [ ] Skill does not edit the test file
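+
+A minimal sketch of the signal-based remediation, assuming the GUT test
+framework (`GutTest`, `watch_signals`, `assert_signal_emitted`). `FakeHud` and
+`status_changed` are hypothetical stand-ins, not names from the project:
+
+```gdscript
+extends GutTest
+
+# Hypothetical HUD stand-in: updates its text and emits a signal, no timer involved.
+class FakeHud extends RefCounted:
+    signal status_changed(new_text: String)
+    var text := ""
+
+    func set_status(new_text: String) -> void:
+        text = new_text
+        status_changed.emit(new_text)
+
+func test_hud_set_status_shows_ready() -> void:
+    # Arrange
+    var hud := FakeHud.new()
+    watch_signals(hud)
+
+    # Act: drive the state change directly instead of waiting on wall-clock time
+    hud.set_status("Ready")
+
+    # Assert
+    assert_signal_emitted(hud, "status_changed")
+    assert_eq(hud.text, "Ready")
+```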
+
+---
+
+### Case 3: Fail — Test calls external API directly
+
+**Fixture:**
+- `tests/unit/networking/auth_test.gd` contains:
+  ```gdscript
+  var result = HTTPRequest.new().request("https://api.example.com/auth")
+  ```
+- Direct HTTP call to external API without a mock
+
+**Input:** `/test-evidence-review tests/unit/networking/auth_test.gd`
+
+**Expected behavior:**
+1. Skill reads the test file
+2. Skill detects direct external API call (HTTPRequest to live URL)
+3. Skill flags this as a FAIL-level finding — violates isolation standard
+4. Verdict is FAIL
+5. Skill recommends injecting a mock HTTP client
+
+**Assertions:**
+- [ ] Direct external API call is detected and flagged
+- [ ] Finding is classified as FAIL severity (violates isolation standard)
+- [ ] Verdict is FAIL
+- [ ] Remediation references dependency injection with a mock HTTP client
+- [ ] Skill does not modify the test file
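+
+One possible shape for the recommended remediation, assuming the GUT test
+framework. `FakeHttpClient`, `AuthService`, and `post_auth` are hypothetical
+names used only to illustrate injecting a mock client:
+
+```gdscript
+extends GutTest
+
+# Hypothetical fake client: returns a canned status instead of using the network.
+class FakeHttpClient extends RefCounted:
+    var canned_status := 200
+
+    func post_auth(_url: String) -> int:
+        return canned_status
+
+# Hypothetical service under test: it receives its client rather than creating one.
+class AuthService extends RefCounted:
+    var _client
+
+    func _init(client) -> void:
+        _client = client
+
+    func login() -> bool:
+        return _client.post_auth("https://api.example.com/auth") == 200
+
+func test_auth_service_login_returns_true_on_200() -> void:
+    var service := AuthService.new(FakeHttpClient.new())
+    assert_true(service.login())
+```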
+
+---
+
+### Case 4: Edge Case — No Test Files Found
+
+**Fixture:**
+- User calls `/test-evidence-review tests/unit/audio/`
+- `tests/unit/audio/` directory does not exist
+
+**Input:** `/test-evidence-review tests/unit/audio/`
+
+**Expected behavior:**
+1. Skill attempts to read files in `tests/unit/audio/` — not found
+2. Skill outputs: "No test files found at `tests/unit/audio/` — run `/test-setup` to scaffold test directories"
+3. No verdict is emitted
+
+**Assertions:**
+- [ ] Skill does not crash when path does not exist
+- [ ] Output names the attempted path in the message
+- [ ] Output recommends `/test-setup` for scaffolding
+- [ ] No verdict is emitted when there is nothing to review
+
+---
+
+### Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill
+
+**Fixture:**
+- Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
+- `review-mode.txt` contains `full`
+
+**Input:** `/test-evidence-review tests/unit/combat/`
+
+**Expected behavior:**
+1. Skill reviews tests; finds 1 WARNINGS-level finding
+2. No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
+3. Verdict is WARNINGS
+4. Output notes: "For full test coverage gate, run `/gate-check` which invokes QL-TEST-COVERAGE"
+5. Skill offers optional report write; asks "May I write" if user opts in
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
+- [ ] Optional report requires "May I write" before writing
+- [ ] Verdict is WARNINGS for advisory-level test quality issues
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads `coding-standards.md` test standards before reviewing test files
+- [ ] Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
+- [ ] Does not edit any test files (read-only skill)
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: PASS, WARNINGS, FAIL
+
+---
+
+## Coverage Notes
+
+- Batch review of all test files in `tests/` is not explicitly tested; behavior
+  is assumed to apply the same checks file by file and aggregate the verdict.
+- The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is
+  a separate concern and is intentionally NOT invoked by this skill.
--- a/Framework/skills/analysis/test-flakiness.md
+++ b/Framework/skills/analysis/test-flakiness.md
@@ -0,0 +1,177 @@
+# Skill Test Spec: /test-flakiness
+
+## Skill Summary
+
+`/test-flakiness` detects non-deterministic tests by analyzing test history logs
+(if available) or scanning test source code for common flakiness patterns (random
+numbers without seeds, real-time waits, external I/O). No director gates are
+invoked. The skill does not write without user approval. Verdicts: NO FLAKINESS,
+SUSPECT TESTS FOUND, or CONFIRMED FLAKY.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY
+- [ ] Does NOT require "May I write" language (read-only; optional report requires approval)
+- [ ] Has a next-step handoff (what to do after flakiness findings)
+
+---
+
+## Director Gate Checks
+
+None. Flakiness detection is an advisory quality skill for the QA lead; no gates
+are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Clean test history, no flakiness
+
+**Fixture:**
+- `production/qa/test-history/` contains logs for 10 test runs
+- All tests pass consistently across all 10 runs (100% pass rate per test)
+- No test has a failure pattern
+
+**Input:** `/test-flakiness`
+
+**Expected behavior:**
+1. Skill reads test history logs from `production/qa/test-history/`
+2. Skill computes per-test pass rate across 10 runs
+3. All tests pass all 10 runs — no inconsistency detected
+4. Verdict is NO FLAKINESS
+
+**Assertions:**
+- [ ] Skill reads test history logs when available
+- [ ] Per-test pass rate is computed across all available runs
+- [ ] Verdict is NO FLAKINESS when all tests pass consistently
+- [ ] No files are written
+
+---
+
+### Case 2: Suspect Tests Found — Test fails intermittently in history
+
+**Fixture:**
+- `production/qa/test-history/` contains logs for 10 test runs
+- `test_combat_damage_applies_crit_multiplier` passes 7 times, fails 3 times
+- Failure messages differ (sometimes timeout, sometimes wrong value)
+
+**Input:** `/test-flakiness`
+
+**Expected behavior:**
+1. Skill reads test history logs — computes pass rates
+2. `test_combat_damage_applies_crit_multiplier` has 70% pass rate (threshold: 95%)
+3. Skill flags it as SUSPECT with pass rate (7/10) and failure pattern noted
+4. Verdict is SUSPECT TESTS FOUND
+5. Skill recommends investigating the test for timing or state dependencies
+
+**Assertions:**
+- [ ] Tests below the pass-rate threshold are flagged by name
+- [ ] Pass rate (fraction and percentage) is shown for each suspect test
+- [ ] Failure pattern (e.g., inconsistent error messages) is noted if detectable
+- [ ] Verdict is SUSPECT TESTS FOUND
+- [ ] Skill recommends investigation steps
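+
+The pass-rate check in this case can be sketched as follows, assuming the GUT
+test framework and at least one recorded run per test. `SUSPECT_THRESHOLD`,
+`pass_rate`, and `is_suspect` are hypothetical names, and 0.95 is only the
+value this spec suggests:
+
+```gdscript
+extends GutTest
+
+# Assumed threshold: the 95% pass rate suggested in this spec.
+const SUSPECT_THRESHOLD := 0.95
+
+# Fraction of recorded runs in which the test passed (assumes runs > 0).
+func pass_rate(passes: int, runs: int) -> float:
+    return float(passes) / float(runs)
+
+# A test is suspect when its pass rate falls below the threshold.
+func is_suspect(passes: int, runs: int) -> bool:
+    return pass_rate(passes, runs) < SUSPECT_THRESHOLD
+
+func test_flakiness_seven_of_ten_is_suspect() -> void:
+    assert_true(is_suspect(7, 10))    # 70% pass rate, below the threshold
+    assert_false(is_suspect(10, 10))  # 100% pass rate, consistent
+```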
+
+---
+
+### Case 3: Source Pattern — Random number used without seed
+
+**Fixture:**
+- No test history logs exist
+- `tests/unit/loot/loot_drop_test.gd` contains:
+  ```gdscript
+  var roll = randf()  # unseeded random — non-deterministic
+  assert_gt(roll, 0.5, "Loot should drop above 50%")
+  ```
+
+**Input:** `/test-flakiness`
+
+**Expected behavior:**
+1. Skill finds no test history logs
+2. Skill falls back to source code analysis
+3. Skill detects `randf()` call without a preceding `seed()` call
+4. Skill flags the test as FLAKINESS RISK (source pattern, not confirmed)
+5. Verdict is SUSPECT TESTS FOUND (pattern detected, not confirmed by history)
+6. Skill recommends seeding random before the call or mocking the random function
+
+**Assertions:**
+- [ ] Source code analysis is used as fallback when no history logs exist
+- [ ] Unseeded random number usage is detected as a flakiness risk
+- [ ] Verdict is SUSPECT TESTS FOUND (not CONFIRMED FLAKY — no history to confirm)
+- [ ] Remediation recommends seeding or mocking
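+
+A hedged sketch of the two remediations, assuming the GUT test framework.
+`should_drop` is a hypothetical rule standing in for the loot logic; the first
+test passes the roll in directly, the second shows that a seeded
+`RandomNumberGenerator` repeats its sequence:
+
+```gdscript
+extends GutTest
+
+# Hypothetical rule under test: loot drops when the roll exceeds the threshold.
+func should_drop(roll: float, threshold: float) -> bool:
+    return roll > threshold
+
+func test_loot_drop_roll_above_threshold_drops() -> void:
+    # The roll is a fixed input, so the outcome never varies between runs.
+    assert_true(should_drop(0.75, 0.5))
+
+func test_loot_drop_seeded_rng_repeats_sequence() -> void:
+    var first := RandomNumberGenerator.new()
+    first.seed = 12345
+    var second := RandomNumberGenerator.new()
+    second.seed = 12345
+    # Two generators with the same seed produce the same first value.
+    assert_eq(first.randf(), second.randf())
+```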
+
+---
+
+### Case 4: No Test History — Source-only analysis with common patterns
+
+**Fixture:**
+- `production/qa/test-history/` does not exist
+- `tests/` contains 15 test files
+- Scan finds 2 tests using `OS.get_ticks_msec()` for timing assertions
+- No other flakiness patterns found
+
+**Input:** `/test-flakiness`
+
+**Expected behavior:**
+1. Skill checks for test history — not found
+2. Skill notes: "No test history available — analyzing source code for flakiness patterns only"
+3. Skill scans all test files for known patterns: unseeded random, real-time waits, system clock usage, external I/O
+4. Finds 2 tests using `OS.get_ticks_msec()` — flags as FLAKINESS RISK
+5. Verdict is SUSPECT TESTS FOUND
+
+**Assertions:**
+- [ ] Skill notes clearly that source-only analysis is being performed (no history)
+- [ ] Common flakiness patterns are scanned: unseeded random, real-time waits, system clock usage, external I/O
+- [ ] `OS.get_ticks_msec()` usage for assertions is flagged as a flakiness risk
+- [ ] Verdict is SUSPECT TESTS FOUND when source patterns are found
+
+---
+
+### Case 5: Gate Compliance — No gate; flakiness report is advisory
+
+**Fixture:**
+- Test history shows 1 CONFIRMED FLAKY test (fails 6 out of 10 runs)
+- `review-mode.txt` contains `full`
+
+**Input:** `/test-flakiness`
+
+**Expected behavior:**
+1. Skill analyzes test history; identifies 1 confirmed flaky test
+2. No director gate is invoked regardless of review mode
+3. Verdict is CONFIRMED FLAKY
+4. Skill presents findings and offers optional written report
+5. If user opts in: "May I write to `production/qa/flakiness-report-[date].md`?"
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] CONFIRMED FLAKY verdict requires history-based evidence (not just source patterns)
+- [ ] Optional report requires "May I write" before writing
+- [ ] Flakiness report is advisory for qa-lead; skill does not auto-disable tests
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads test history logs when available; falls back to source analysis when not
+- [ ] Notes clearly which analysis mode is being used (history vs. source-only)
+- [ ] Flakiness threshold (e.g., 95% pass rate) is used for SUSPECT classification
+- [ ] CONFIRMED FLAKY requires history evidence; SUSPECT covers source patterns only
+- [ ] Does not disable or modify any test files
+- [ ] No director gates are invoked
+- [ ] Verdict is one of: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY
+
+---
+
+## Coverage Notes
+
+- The pass-rate threshold for SUSPECT classification (95% suggested above) is an
+  implementation detail; the tests verify that intermittent failures are flagged,
+  not the exact threshold value.
+- Failures caused by environment issues (missing assets, wrong platform) are not
+  flakiness. The skill distinguishes environment failures from non-determinism
+  in the test itself, but this distinction is not explicitly tested here.
--- a/Framework/skills/authoring/architecture-decision.md
+++ b/Framework/skills/authoring/architecture-decision.md
@@ -0,0 +1,197 @@
+# Skill Test Spec: /architecture-decision
+
+## Skill Summary
+
+`/architecture-decision` guides the user through section-by-section authoring of
+a new Architecture Decision Record (ADR). Required sections are: Status, Context,
+Decision, Consequences, Alternatives, and Related ADRs. The skill also stamps the
+engine version reference from `docs/engine-reference/` into the ADR for traceability.
+
+In `full` review mode, TD-ADR (technical-director) and LP-FEASIBILITY
+(lead-programmer) gate agents spawn after the draft is complete. If both gates
+return APPROVED, the ADR status is set to Accepted. In `lean` or `solo` mode,
+both gates are skipped and the ADR is written with Status: Proposed. The skill
+asks "May I write" per section during authoring. ADRs are written to
+`docs/architecture/adr-NNN-[name].md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: ACCEPTED, PROPOSED, CONCERNS
+- [ ] Contains "May I write" collaborative protocol language (per-section approval)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents gate behavior: TD-ADR + LP-FEASIBILITY in full mode; skipped in lean/solo
+- [ ] Documents that ADR status is Accepted (full, gates approve) or Proposed (otherwise)
+- [ ] Mentions engine version stamp from `docs/engine-reference/`
+
+---
+
+## Director Gate Checks
+
+In `full` mode: TD-ADR (technical-director) and LP-FEASIBILITY (lead-programmer)
+spawn after the ADR draft is complete. If both return APPROVED, ADR Status is set
+to Accepted. If either returns CONCERNS or FAIL, ADR stays Proposed.
+
+In `lean` mode: both gates are skipped. ADR is written with Status: Proposed.
+Output notes: "TD-ADR skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".
+
+In `solo` mode: both gates are skipped with equivalent notes. ADR is written with Status: Proposed.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — New ADR for rendering approach, full mode, gates approve
+
+**Fixture:**
+- `docs/architecture/` exists with no existing ADR for rendering
+- `docs/engine-reference/[engine]/VERSION.md` exists
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/architecture-decision rendering-approach`
+
+**Expected behavior:**
+1. Skill guides user through each required section (Status, Context, Decision, Consequences, Alternatives, Related ADRs)
+2. Engine version is stamped into the ADR from `docs/engine-reference/`
+3. For each section: draft shown, "May I write this section?" asked, approved
+4. After all sections: TD-ADR and LP-FEASIBILITY gates spawn in parallel
+5. Both gates return APPROVED
+6. ADR Status is set to Accepted
+7. Skill writes `docs/architecture/adr-NNN-rendering-approach.md`
+8. `docs/architecture/tr-registry.yaml` updated if new TR-IDs are defined
+
+**Assertions:**
+- [ ] All 6 required sections are authored and written
+- [ ] Engine version reference is stamped in the ADR
+- [ ] TD-ADR and LP-FEASIBILITY spawn in parallel (not sequentially)
+- [ ] ADR Status is Accepted when both gates return APPROVED in full mode
+- [ ] "May I write" is asked per section during authoring
+- [ ] File is written to `docs/architecture/adr-NNN-[name].md`
+
+---
+
+### Case 2: Failure Path — TD-ADR returns CONCERNS
+
+**Fixture:**
+- ADR draft is complete (all sections filled)
+- `production/session-state/review-mode.txt` contains `full`
+- TD-ADR gate returns CONCERNS: "The decision does not address [specific concern]"
+
+**Input:** `/architecture-decision [topic]`
+
+**Expected behavior:**
+1. TD-ADR gate spawns and returns CONCERNS with specific feedback
+2. Skill surfaces the concerns to the user
+3. ADR Status remains Proposed (not Accepted)
+4. User is asked: revise the decision to address concerns, or accept as Proposed
+5. ADR is written with Status: Proposed if concerns are not resolved
+
+**Assertions:**
+- [ ] TD-ADR concerns are shown to the user verbatim
+- [ ] ADR Status is Proposed (not Accepted) when TD-ADR returns CONCERNS
+- [ ] Skill does NOT set Status: Accepted while CONCERNS are unresolved
+- [ ] User is given the option to revise and re-run the gate
+
+---
+
+### Case 3: Lean Mode — Both gates skipped; ADR written as Proposed
+
+**Fixture:**
+- `production/session-state/review-mode.txt` contains `lean`
+- ADR draft is authored for a new technical decision
+
+**Input:** `/architecture-decision [topic]`
+
+**Expected behavior:**
+1. Skill guides user through all 6 sections
+2. After draft is complete: both TD-ADR and LP-FEASIBILITY are skipped
+3. Output notes: "TD-ADR skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode"
+4. ADR is written with Status: Proposed (not Accepted, since gates did not approve)
+5. "May I write" is still asked before the final file write
+
+**Assertions:**
+- [ ] Both gate skip notes appear in output
+- [ ] ADR Status is Proposed (not Accepted) in lean mode
+- [ ] "May I write" is still asked before writing the file
+- [ ] Skill writes the ADR after user approval
+
+---
+
+### Case 4: Edge Case — ADR already exists for this topic
+
+**Fixture:**
+- `docs/architecture/` contains an existing ADR covering the same topic
+- The existing ADR has Status: Accepted
+
+**Input:** `/architecture-decision [same-topic]`
+
+**Expected behavior:**
+1. Skill detects an existing ADR covering the same topic
+2. Skill asks: "An ADR for [topic] already exists ([filename]). Update it, or create a new superseding ADR?"
+3. User selects update or supersede
+4. Skill does NOT silently create a duplicate ADR
+
+**Assertions:**
+- [ ] Skill detects the existing ADR before authoring begins
+- [ ] User is offered update or supersede options — no silent duplicate
+- [ ] If update: skill opens the existing ADR for section-by-section revision
+- [ ] If supersede: new ADR references the superseded one in Related ADRs section
+
+---
+
+### Case 5: Director Gate — Status set correctly based on mode and gate outcome
+
+**Fixture:**
+- ADR draft is complete
+- Three scenarios: (a) full mode, both gates APPROVED; (b) full mode, one gate CONCERNS; (c) lean or solo mode, gates skipped
+
+**Full mode, both APPROVED:**
+- ADR Status is set to Accepted
+
+**Assertions (both approved):**
+- [ ] ADR frontmatter/header shows `Status: Accepted`
+- [ ] Both TD-ADR and LP-FEASIBILITY appear as APPROVED in output
+
+**Full mode, one gate returns CONCERNS:**
+- ADR Status stays Proposed
+
+**Assertions (CONCERNS):**
+- [ ] ADR frontmatter/header shows `Status: Proposed`
+- [ ] Concerns are listed in output
+- [ ] Skill does NOT set Status: Accepted when any gate returns CONCERNS
+
+**Lean/solo mode:**
+- ADR Status is always Proposed regardless of content quality
+
+**Assertions (lean/solo):**
+- [ ] ADR Status is Proposed in lean mode
+- [ ] ADR Status is Proposed in solo mode
+- [ ] No gate verdicts appear in lean or solo mode (only the skip notes)
+
+---
+
+## Protocol Compliance
+
+- [ ] All 6 required sections authored before gate review
+- [ ] Engine version stamped in ADR from `docs/engine-reference/`
+- [ ] "May I write" asked per section during authoring
+- [ ] TD-ADR and LP-FEASIBILITY spawn in parallel in full mode
+- [ ] Skipped gates noted by name and mode in lean/solo output
+- [ ] ADR Status: Accepted only when full mode AND both gates APPROVED
+- [ ] Ends with next-step handoff: `/architecture-review` or `/create-control-manifest`
+
+---
+
+## Coverage Notes
+
+- ADR numbering (auto-incrementing NNN) is not independently fixture-tested —
+  the skill reads existing ADR filenames to assign the next number.
+- Related ADRs section linking (supersedes / related-to) is tested structurally
+  via Case 4 but not all link types are individually verified.
+- The TR-registry update (when new TR-IDs are defined in the ADR) is part of the
+  write phase — tested implicitly via Case 1.
--- a/Framework/skills/authoring/art-bible.md
+++ b/Framework/skills/authoring/art-bible.md
@@ -0,0 +1,185 @@
+# Skill Test Spec: /art-bible
+
+## Skill Summary
+
+`/art-bible` is a guided, section-by-section art bible authoring skill. It
+produces a comprehensive visual direction document covering: Visual Style overview,
+Color Palette, Typography, Character Design Rules, Environment Style, and UI
+Visual Language. The skill follows the skeleton-first pattern: creates the file
+with all section headers immediately, then fills each section through discussion
+and writes each to disk after user approval.
+
+In `full` review mode, the AD-ART-BIBLE director gate (art director) runs after
+the draft is complete and before any section is written. In `lean` and `solo`
+modes, AD-ART-BIBLE is skipped and only user approval is required. The verdict
+is COMPLETE when all sections are written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" language per section
+- [ ] Documents the AD-ART-BIBLE director gate and its mode behavior
+- [ ] Has a next-step handoff (e.g., `/asset-spec` or `/design-system`)
+
+---
+
+## Director Gate Checks
+
+| Gate ID      | Trigger condition       | Mode guard                |
+|--------------|-------------------------|---------------------------|
+| AD-ART-BIBLE | After draft is complete | full only (not lean/solo) |
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Full mode, art bible drafted, AD-ART-BIBLE approves
+
+**Fixture:**
+- No existing `design/art-bible.md`
+- `production/session-state/review-mode.txt` contains `full`
+- `design/gdd/game-concept.md` exists with visual tone described
+
+**Input:** `/art-bible`
+
+**Expected behavior:**
+1. Skill creates skeleton `design/art-bible.md` with all section headers
+2. Skill discusses and drafts each section with user collaboration
+3. After all sections are drafted, AD-ART-BIBLE gate is invoked (art director review)
+4. AD-ART-BIBLE returns APPROVED
+5. Skill asks "May I write section [N] to `design/art-bible.md`?" per section
+6. All sections written after approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Skeleton file is created first (before any section content is written)
+- [ ] AD-ART-BIBLE gate is invoked in full mode after draft is complete
+- [ ] Gate approval precedes the "May I write" section asks
+- [ ] All sections are present in the final file
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: AD-ART-BIBLE Returns CONCERNS — Section revised before writing
+
+**Fixture:**
+- Art bible draft complete
+- `production/session-state/review-mode.txt` contains `full`
+- AD-ART-BIBLE gate returns CONCERNS: "Color palette clashes with the dark
+  atmospheric tone described in the game concept"
+
+**Input:** `/art-bible`
+
+**Expected behavior:**
+1. AD-ART-BIBLE gate returns CONCERNS with specific feedback about palette
+2. Skill surfaces feedback to user: "Art director has concerns about the color palette"
+3. Skill returns to the Color Palette section for revision
+4. User and skill revise the palette to align with game concept tone
+5. AD-ART-BIBLE is not re-invoked (user decides to proceed after revision)
+6. Revised section is written after "May I write" approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] CONCERNS are shown to user before any section is written
+- [ ] Skill returns to the affected section for revision (not all sections)
+- [ ] Revised content (not original) is written to file
+- [ ] Verdict is COMPLETE after revision and approval
+
+---
+
+### Case 3: Lean Mode — AD-ART-BIBLE Skipped, Written With User Approval Only
+
+**Fixture:**
+- No existing art bible
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/art-bible`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `lean`
+2. Skill drafts all sections with user collaboration
+3. AD-ART-BIBLE gate is skipped: output notes "[AD-ART-BIBLE] skipped — lean mode"
+4. Skill asks user for direct approval of each section
+5. Sections are written after user confirmation; verdict is COMPLETE
+
+**Assertions:**
+- [ ] AD-ART-BIBLE gate is NOT invoked in lean mode
+- [ ] Skip is explicitly noted: "[AD-ART-BIBLE] skipped — lean mode"
+- [ ] User approval is still required per section (gate skip ≠ approval skip)
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: Existing Art Bible — Retrofit Mode
+
+**Fixture:**
+- `design/art-bible.md` already exists with all sections populated
+- User wants to update the Character Design Rules section
+
+**Input:** `/art-bible`
+
+**Expected behavior:**
+1. Skill reads existing art bible and detects all sections populated
+2. Skill offers retrofit: "Art bible exists — which section would you like to update?"
+3. User selects Character Design Rules
+4. Skill drafts updated content; in full mode, AD-ART-BIBLE is invoked for the
+   revised section before writing
+5. Skill asks "May I write Character Design Rules to `design/art-bible.md`?"
+6. Only that section is updated; other sections preserved; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Existing art bible is detected and retrofit is offered
+- [ ] Only the selected section is updated
+- [ ] In full mode: AD-ART-BIBLE gate runs even for single-section retrofit
+- [ ] Other sections are preserved
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Solo Mode — AD-ART-BIBLE Skipped, Noted in Output
+
+**Fixture:**
+- No existing art bible
+- `production/session-state/review-mode.txt` contains `solo`
+
+**Input:** `/art-bible`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `solo`
+2. Art bible is drafted and written with only user approval
+3. AD-ART-BIBLE gate is skipped: output notes "[AD-ART-BIBLE] skipped — solo mode"
+4. No director agents are spawned
+5. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] AD-ART-BIBLE gate is NOT invoked in solo mode
+- [ ] Skip is explicitly noted with "solo mode" label
+- [ ] No director agents of any kind are spawned
+- [ ] Verdict is COMPLETE
+
+---
+
+## Protocol Compliance
+
+- [ ] Creates skeleton file immediately with all section headers
+- [ ] Discusses and drafts one section at a time
+- [ ] AD-ART-BIBLE gate runs in full mode after all sections are drafted
+- [ ] AD-ART-BIBLE is skipped in lean and solo modes — noted by name
+- [ ] Asks "May I write section [N]" per section
+- [ ] Verdict is COMPLETE when all sections are written
+
+---
+
+## Coverage Notes
+
+- The case where AD-ART-BIBLE returns REJECT (not just CONCERNS) is not
+  separately tested; the skill would block writing and ask the user how to
+  proceed (revise or override).
+- The Typography section is listed as a required art bible section but its
+  specific content requirements are not assertion-tested here.
+- The art bible feeds into `/asset-spec` — this relationship is noted in the
+  handoff but not tested as part of this skill's spec.
--- a/Framework/skills/authoring/create-architecture.md
+++ b/Framework/skills/authoring/create-architecture.md
@@ -0,0 +1,187 @@
+# Skill Test Spec: /create-architecture
+
+## Skill Summary
+
+`/create-architecture` guides the user through section-by-section authoring of a
+technical architecture document. It uses a skeleton-first approach — the file is
+created with all required section headers before any content is filled. Each
+section is discussed, drafted, and written individually after user approval. If an
+architecture document already exists, the skill offers retrofit mode to update
+specific sections.
+
+In `full` review mode, TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
+(lead-programmer) spawn after the complete draft is finished. In `lean` or `solo`
+mode, both gates are skipped. The skill writes to `docs/architecture/architecture.md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
+- [ ] Contains "May I write" collaborative protocol language (per-section approval)
+- [ ] Has a next-step handoff at the end (`/architecture-review` or `/create-control-manifest`)
+- [ ] Documents skeleton-first approach
+- [ ] Documents gate behavior: TD-ARCHITECTURE + LP-FEASIBILITY in full mode; skipped in lean/solo
+- [ ] Documents retrofit mode for existing architecture documents
+
+---
+
+## Director Gate Checks
+
+In `full` mode: TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
+(lead-programmer) spawn in parallel after all sections are drafted and before
+any final approval write.
+
+In `lean` mode: both gates are skipped. Output notes:
+"TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".
+
+In `solo` mode: both gates are skipped with equivalent notes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — New architecture doc, skeleton-first, full mode gates approve
+
+**Fixture:**
+- No existing `docs/architecture/architecture.md`
+- `docs/architecture/` contains Accepted ADRs for reference
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/create-architecture`
+
+**Expected behavior:**
+1. Skill creates skeleton `docs/architecture/architecture.md` with all required section headers
+2. For each section: drafts content, shows draft, asks "May I write [section]?", writes after approval
+3. After all sections are drafted: TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel
+4. Both gates return APPROVED
+5. Final "May I confirm architecture is complete?" asked
+6. Session state updated
+
+**Assertions:**
+- [ ] Skeleton file is created with all section headers before any content is written
+- [ ] "May I write [section]?" asked per section during authoring
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel (not sequentially)
+- [ ] Both gates complete before the final completion confirmation
+- [ ] Verdict is APPROVED when both gates return APPROVED
+- [ ] Next-step handoff to `/architecture-review` or `/create-control-manifest` is present
+
+---
+
+### Case 2: Failure Path — TD-ARCHITECTURE returns MAJOR REVISION
+
+**Fixture:**
+- Architecture doc is fully drafted (all sections)
+- `production/session-state/review-mode.txt` contains `full`
+- TD-ARCHITECTURE gate returns MAJOR REVISION: "[specific structural issue]"
+
+**Input:** `/create-architecture`
+
+**Expected behavior:**
+1. All sections are drafted and written
+2. TD-ARCHITECTURE gate runs and returns MAJOR REVISION with specific feedback
+3. Skill surfaces the feedback to the user
+4. Architecture is NOT marked as finalized
+5. User is asked: revise the flagged sections, or accept the document as a draft
+
+**Assertions:**
+- [ ] Architecture is NOT marked finalized when TD-ARCHITECTURE returns MAJOR REVISION
+- [ ] Gate feedback is shown to the user with specific issue descriptions
+- [ ] User is given the option to revise specific sections
+- [ ] Skill does NOT auto-finalize despite MAJOR REVISION feedback
+
+---
+
+### Case 3: Lean Mode — Both gates skipped; architecture written with user approval only
+
+**Fixture:**
+- No existing architecture doc
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/create-architecture`
+
+**Expected behavior:**
+1. Skeleton file is created
+2. All sections are authored and written per-section with user approval
+3. After completion: TD-ARCHITECTURE and LP-FEASIBILITY are skipped
+4. Output notes: "TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode"
+5. Architecture is considered complete based on user approval alone
+
+**Assertions:**
+- [ ] Both gate skip notes appear in output
+- [ ] Architecture document is written with only user approval in lean mode
+- [ ] Skill does NOT block completion because gates were skipped
+- [ ] Next-step handoff is still present
+
+---
+
+### Case 4: Retrofit Mode — Existing architecture doc, user updates a section
+
+**Fixture:**
+- `docs/architecture/architecture.md` already exists with all sections populated
+
+**Input:** `/create-architecture`
+
+**Expected behavior:**
+1. Skill detects existing architecture doc and reads its current content
+2. Skill offers retrofit mode: "Architecture doc already exists. Which section would you like to update?"
+3. User selects a section
+4. Skill authors only that section, asks "May I write [section]?"
+5. Only the selected section is updated — other sections unchanged
+
+**Assertions:**
+- [ ] Skill detects and reads the existing architecture doc before offering retrofit
+- [ ] User is asked which section to update — not asked to rewrite the whole document
+- [ ] Only the selected section is updated
+- [ ] Other sections are not modified during a retrofit session
+
+---
+
+### Case 5: Director Gate — Architecture references a Proposed ADR; flagged as risk
+
+**Fixture:**
+- Architecture doc is being authored
+- One section references or depends on an ADR that has `Status: Proposed`
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/create-architecture`
+
+**Expected behavior:**
+1. Skill authors all sections
+2. During authoring, skill detects a reference to a Proposed ADR
+3. Skill flags: "Note: [section] references ADR-NNN which is Proposed — this is a risk until the ADR is accepted"
+4. Risk flag is embedded in the relevant section's content
+5. TD-ARCHITECTURE and LP-FEASIBILITY still run — they are informed of the Proposed ADR risk
+
+**Assertions:**
+- [ ] Proposed ADR reference is detected and flagged during section authoring
+- [ ] Risk note is embedded in the architecture document section
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY still spawn (the risk does not block the gates)
+- [ ] Risk flag names the specific ADR number and title
+
+---
+
+## Protocol Compliance
+
+- [ ] Skeleton file created with all section headers before any content is written
+- [ ] "May I write [section]?" asked per section during authoring
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel in full mode
+- [ ] Skipped gates noted by name and mode in lean/solo output
+- [ ] Proposed ADR references flagged as risks in the document
+- [ ] Ends with next-step handoff: `/architecture-review` or `/create-control-manifest`
+
+---
+
+## Coverage Notes
+
+- The required section list for architecture documents is defined in the skill
+  body and in the `/architecture-review` skill — not re-enumerated here.
+- Engine version stamping in the architecture doc (parallel to ADR stamping)
+  is part of the authoring workflow — tested implicitly via Case 1.
+- The retrofit mode for updating multiple sections in one session follows the
+  same per-section approval pattern — not independently tested for multi-section
+  retrofits.
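Reviewer sketch for Case 5 of the `/create-architecture` spec: one way the Proposed-ADR check could be expressed. This is a hedged illustration only. The `ADR-<digits>` reference format, the function name `find_proposed_adr_risks`, and the in-memory `sections` / `adr_status` dictionaries are assumptions made for the example, not the skill's actual implementation.

```python
import re

# Assumed reference format (hypothetical): "ADR-" followed by digits, e.g. "ADR-004".
ADR_REF = re.compile(r"\bADR-\d+\b")


def find_proposed_adr_risks(sections, adr_status):
    """Return (section_name, adr_id) pairs where a section cites a Proposed ADR.

    sections:   maps section name -> drafted section text
    adr_status: maps ADR id -> status string such as "Accepted" or "Proposed"
    """
    risks = []
    for name, body in sections.items():
        # sorted(set(...)) reports each ADR once per section, in a stable order.
        for adr_id in sorted(set(ADR_REF.findall(body))):
            if adr_status.get(adr_id) == "Proposed":
                risks.append((name, adr_id))
    return risks


sections = {
    "Rendering": "Follows ADR-001 and depends on ADR-004.",
    "Networking": "No ADR dependencies yet.",
}
adr_status = {"ADR-001": "Accepted", "ADR-004": "Proposed"}

print(find_proposed_adr_risks(sections, adr_status))
```

Each returned pair names the section and the ADR, which is the information the risk note in Case 5 needs; the gates would still run regardless of the result.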
--- a/Framework/skills/authoring/design-system.md
+++ b/Framework/skills/authoring/design-system.md
@@ -0,0 +1,192 @@
+# Skill Test Spec: /design-system
+
+## Skill Summary
+
+`/design-system` guides the user through section-by-section authoring of a Game
+Design Document (GDD) for a single game system. All 8 required sections must be
+authored: Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases,
+Dependencies, Tuning Knobs, and Acceptance Criteria. The skill uses a
+skeleton-first approach — it creates the GDD file with all 8 section headers
+before filling any content — and writes each section individually after approval.
+
+The CD-GDD-ALIGN gate (creative-director) runs in both `full` AND `lean` modes.
+It is only skipped in `solo` mode. If an existing GDD file is found, the skill
+offers a retrofit mode to update specific sections rather than rewriting the whole
+document.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION
+- [ ] Contains "May I write" collaborative protocol language (per-section approval)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents skeleton-first approach (file created with headers before content)
+- [ ] Documents CD-GDD-ALIGN gate: active in full AND lean mode; skipped in solo only
+- [ ] Documents retrofit mode for existing GDD files
+
+---
+
+## Director Gate Checks
+
+In `full` mode: CD-GDD-ALIGN (creative-director) gate runs after each section is
+drafted, before writing. If MAJOR REVISION is returned, the section must be
+rewritten before proceeding.
+
+In `lean` mode: CD-GDD-ALIGN still runs (this gate is NOT skipped in lean mode —
+it runs in both full and lean). Only solo mode skips it.
+
+In `solo` mode: CD-GDD-ALIGN is skipped. Output notes:
+"CD-GDD-ALIGN skipped — solo mode". Sections are written with only user approval.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — New GDD, skeleton-first, CD-GDD-ALIGN in lean mode
+
+**Fixture:**
+- No existing GDD for the target system in `design/gdd/`
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/design-system [system-name]`
+
+**Expected behavior:**
+1. Skill creates skeleton file `design/gdd/[system-name].md` with all 8 section headers (empty bodies)
+2. For each section: discusses with user, drafts content, shows draft
+3. CD-GDD-ALIGN gate runs on each section draft (lean mode — gate is active)
+4. Gate returns APPROVED for each section
+5. "May I write [section]?" asked after gate approval
+6. Section written to file after user approval
+7. Process repeats for all 8 sections
+
+**Assertions:**
+- [ ] Skeleton file is created with all 8 section headers before any content is written
+- [ ] CD-GDD-ALIGN runs on each section in lean mode (not skipped)
+- [ ] "May I write" is asked per section (not once for all sections)
+- [ ] Each section is written individually after gate + user approval
+- [ ] All 8 sections are present in the final GDD file
+
+---
+
+### Case 2: Retrofit Mode — Existing GDD, update specific section
+
+**Fixture:**
+- `design/gdd/[system-name].md` already exists with all 8 sections populated
+
+**Input:** `/design-system [system-name]`
+
+**Expected behavior:**
+1. Skill detects existing GDD file and reads its current content
+2. Skill offers retrofit mode: "GDD already exists. Which section would you like to update?"
+3. User selects a specific section (e.g., Formulas)
+4. Skill authors only that section, runs CD-GDD-ALIGN, asks "May I write?"
+5. Only the selected section is updated — other sections are not modified
+
+**Assertions:**
+- [ ] Skill detects and reads existing GDD before offering retrofit mode
+- [ ] User is asked which section to update — not asked to rewrite the whole document
+- [ ] Only the selected section is rewritten — others remain unchanged
+- [ ] CD-GDD-ALIGN still runs on the updated section
+- [ ] "May I write" is asked before updating the section
+
+---
+
+### Case 3: Director Gate — CD-GDD-ALIGN returns MAJOR REVISION
+
+**Fixture:**
+- New GDD being authored
+- `production/session-state/review-mode.txt` contains `lean`
+- CD-GDD-ALIGN gate returns MAJOR REVISION on the Player Fantasy section
+
+**Input:** `/design-system [system-name]`
+
+**Expected behavior:**
+1. Player Fantasy section is drafted
+2. CD-GDD-ALIGN gate runs and returns MAJOR REVISION with specific feedback
+3. Skill surfaces the feedback to the user
+4. Section is NOT written to file while MAJOR REVISION is unresolved
+5. User rewrites the section in collaboration with the skill
+6. CD-GDD-ALIGN runs again on the revised section
+7. If revised section passes, "May I write?" is asked and section is written
+
+**Assertions:**
+- [ ] Section is NOT written when CD-GDD-ALIGN returns MAJOR REVISION
+- [ ] Gate feedback is shown to the user before requesting revision
+- [ ] CD-GDD-ALIGN runs again after the section is revised
+- [ ] Skill does NOT auto-proceed to the next section while MAJOR REVISION is unresolved
+
+---
+
+### Case 4: Solo Mode — CD-GDD-ALIGN skipped; sections written with user approval only
+
+**Fixture:**
+- New GDD being authored
+- `production/session-state/review-mode.txt` contains `solo`
+
+**Input:** `/design-system [system-name]`
+
+**Expected behavior:**
+1. Skeleton file is created with 8 section headers
+2. For each section: drafted, shown to user
+3. CD-GDD-ALIGN is skipped — noted per section: "CD-GDD-ALIGN skipped — solo mode"
+4. "May I write [section]?" asked after user reviews draft
+5. Section written after user approval
+6. No gate review at any stage
+
+**Assertions:**
+- [ ] "CD-GDD-ALIGN skipped — solo mode" noted for each section
+- [ ] Sections are written after user approval alone (no gate required)
+- [ ] Skill does NOT spawn any CD-GDD-ALIGN gate in solo mode
+- [ ] Full GDD is written with only user approval in solo mode
+
+---
+
+### Case 5: Director Gate — Empty sections not written to file
+
+**Fixture:**
+- GDD authoring in progress
+- User and skill discuss one section but do not produce any approved content
+  (e.g., discussion ends without a decision, or user says "skip for now")
+
+**Input:** `/design-system [system-name]`
+
+**Expected behavior:**
+1. Section discussion produces no approved content
+2. Skill does NOT write an empty or placeholder body to the section
+3. The section header remains in the skeleton file but the body stays empty
+4. Skill moves to the next section without writing the empty one
+5. At the end, incomplete sections are listed and user is reminded to return to them
+
+**Assertions:**
+- [ ] Empty or unapproved sections are NOT written to the file
+- [ ] Skeleton section header remains (preserves structure)
+- [ ] Skill tracks and lists incomplete sections at the end of the session
+- [ ] Skill does NOT write "TBD" or placeholder content without user approval
+
+---
+
+## Protocol Compliance
+
+- [ ] Skeleton file created with all 8 headers before any content is written
+- [ ] CD-GDD-ALIGN runs in both full AND lean mode (not just full)
+- [ ] CD-GDD-ALIGN skipped only in solo mode — noted per section
+- [ ] "May I write [section]?" asked per section (not once for the whole document)
+- [ ] MAJOR REVISION from CD-GDD-ALIGN blocks section write until resolved
+- [ ] Only approved, non-empty sections are written to the file
+- [ ] Ends with next-step handoff: `/review-all-gdds` or `/map-systems next`
+
+---
+
+## Coverage Notes
+
+- The 8 required sections are validated against the project's design document
+  standards defined in `CLAUDE.md` — not re-enumerated here.
+- The skill's internal section-ordering logic (which section to author first) is
+  not independently tested — the order follows the standard GDD template.
+- Pillar alignment checking within CD-GDD-ALIGN is evaluated holistically by
+  the gate agent — specific pillar checks are not fixture-tested here.
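Reviewer sketch of the review-mode rule that the `/design-system` spec stresses: CD-GDD-ALIGN runs in `full` and `lean` and is skipped only in `solo`, whereas TD-ARCHITECTURE and LP-FEASIBILITY (from the `/create-architecture` spec) run only in `full`. The table layout and the name `plan_gates` are assumptions for illustration; the real skills may encode this differently.

```python
# Modes in which each gate runs, as described in the specs.
GATE_ACTIVE_MODES = {
    "TD-ARCHITECTURE": {"full"},
    "LP-FEASIBILITY": {"full"},
    "CD-GDD-ALIGN": {"full", "lean"},
}

VALID_MODES = {"full", "lean", "solo"}


def plan_gates(gates, mode):
    """Split the requested gates into (spawn, skipped) for a review mode.

    mode is the text read from review-mode.txt; surrounding whitespace
    and letter case are normalized before the lookup.
    """
    mode = mode.strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"unknown review mode: {mode!r}")
    spawn = [g for g in gates if mode in GATE_ACTIVE_MODES[g]]
    skipped = [g for g in gates if g not in spawn]
    return spawn, skipped


print(plan_gates(["CD-GDD-ALIGN"], "lean"))
print(plan_gates(["CD-GDD-ALIGN"], "solo"))
print(plan_gates(["TD-ARCHITECTURE", "LP-FEASIBILITY"], "lean"))
```

Keeping the rule in one table makes the lean-mode difference between the two skills visible at a glance, which is the distinction the Protocol Compliance checklist guards.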
--- a/Framework/skills/authoring/quick-design.md
+++ b/Framework/skills/authoring/quick-design.md
@@ -0,0 +1,176 @@
+# Skill Test Spec: /quick-design
+
+## Skill Summary
+
+`/quick-design` produces a lightweight design spec for features too small to
+warrant a full 8-section GDD. The target scope is under 4 hours of design time
+for a single-system feature. Instead of the full 8-section GDD format, the
+quick-design spec uses a streamlined 3-section format: Overview, Rules, and
+Acceptance Criteria.
+
+The skill has no director gates — adding gate overhead would defeat the purpose
+of a lightweight design tool. The skill asks "May I write" before writing the
+design note to `design/quick-notes/[name].md`. If the feature scope is too large
+for a quick-design, the skill redirects to `/design-system` instead.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: CREATED, BLOCKED, REDIRECTED
+- [ ] Contains "May I write" collaborative protocol language (for quick-note file)
+- [ ] Has a next-step handoff at the end
+- [ ] Explicitly notes: no director gates (lightweight skill by design)
+- [ ] Mentions scope check: redirects to `/design-system` if scope exceeds the sub-4h, single-system threshold
+
+---
+
+## Director Gate Checks
+
+No director gates — this skill spawns no director gate agents. The lightweight
+nature of quick-design means director gate overhead is intentionally absent.
+Full GDD review is not needed for sub-4-hour single-system features.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Small UI change produces a 3-section spec
+
+**Fixture:**
+- No existing quick-note for the target feature
+- Feature is clearly scoped: a single UI element change with no cross-system impact
+
+**Input:** `/quick-design [feature-name]`
+
+**Expected behavior:**
+1. Skill asks scoping questions: what system, what change, what is the acceptance signal
+2. Skill determines scope is within the sub-4h threshold
+3. Skill drafts a 3-section spec: Overview, Rules, Acceptance Criteria
+4. Draft is shown to user
+5. "May I write `design/quick-notes/[name].md`?" is asked
+6. File is written after approval
+
+**Assertions:**
+- [ ] Spec contains exactly 3 sections: Overview, Rules, Acceptance Criteria
+- [ ] Draft is shown to user before "May I write" ask
+- [ ] "May I write `design/quick-notes/[name].md`?" is asked before writing
+- [ ] File is written to the correct path: `design/quick-notes/[name].md`
+- [ ] Verdict is CREATED after successful write
+
+---
+
+### Case 2: Failure Path — Scope check fails; redirected to /design-system
+
+**Fixture:**
+- Feature described spans multiple systems or would take more than 4 hours of design time
+  (e.g., "redesign the entire combat system" or "new progression mechanic affecting all classes")
+
+**Input:** `/quick-design [large-feature]`
+
+**Expected behavior:**
+1. Skill asks scoping questions
+2. Skill determines scope exceeds the sub-4h / single-system threshold
+3. Skill outputs: "This feature is too large for a quick-design. Use `/design-system [name]` for a full GDD."
+4. Skill does NOT write a quick-note file
+5. Verdict is REDIRECTED
+
+**Assertions:**
+- [ ] Skill detects the scope excess and stops before drafting
+- [ ] Message explicitly names `/design-system` as the correct alternative
+- [ ] No quick-note file is written
+- [ ] Verdict is REDIRECTED (not CREATED or BLOCKED)
+
+---
+
+### Case 3: Edge Case — File already exists; offered to update
+
+**Fixture:**
+- `design/quick-notes/[name].md` already exists from a previous session
+
+**Input:** `/quick-design [name]`
+
+**Expected behavior:**
+1. Skill detects existing quick-note file and reads its current content
+2. Skill asks: "[name].md already exists. Update it, or create a new version?"
+3. User selects update
+4. Skill shows the existing spec and asks which section to revise
+5. Updated spec is shown, "May I write?" asked, file updated after approval
+
+**Assertions:**
+- [ ] Skill detects and reads the existing file before offering to update
+- [ ] User is offered update or create-new options — not auto-overwritten
+- [ ] Only the revised section is updated (or the whole spec if user chooses full rewrite)
+- [ ] "May I write" is asked before overwriting the existing file
+
+---
+
+### Case 4: Edge Case — No argument provided
+
+**Fixture:**
+- `design/quick-notes/` directory may or may not exist
+
+**Input:** `/quick-design` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Skill outputs a usage error: "No feature name specified. Usage: /quick-design [feature-name]"
+3. Skill provides an example: `/quick-design pause-menu-settings`
+4. No file is created
+
+**Assertions:**
+- [ ] Skill outputs a usage error when no argument is given
+- [ ] A usage example is shown with the correct format
+- [ ] No quick-note file is written
+- [ ] Skill does NOT silently pick a feature name or default to any action
+
+---
+
+### Case 5: Director Gate — No gate spawned; explicitly noted for sub-4h features
+
+**Fixture:**
+- Feature is within scope for quick-design
+- `production/session-state/review-mode.txt` exists with `full`
+
+**Input:** `/quick-design [feature-name]`
+
+**Expected behavior:**
+1. Skill asks scoping questions and determines scope is within threshold
+2. Skill does NOT read `production/session-state/review-mode.txt`
+3. Skill does NOT spawn any director gate agent
+4. Spec is drafted, "May I write" asked, file written after approval
+5. Output explicitly notes: "No director gate review — quick-design is for sub-4h features"
+
+**Assertions:**
+- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
+- [ ] Skill does NOT read `production/session-state/review-mode.txt`
+- [ ] Output contains a note explaining why no gate review is needed
+- [ ] Review mode has no effect on this skill's behavior
+- [ ] Full GDD review path (`/design-system`) is mentioned as the alternative for larger features
+
+---
+
+## Protocol Compliance
+
+- [ ] Scope check runs before drafting (redirects to `/design-system` if scope too large)
+- [ ] 3-section format used (Overview, Rules, Acceptance Criteria) — NOT the 8-section GDD format
+- [ ] Draft shown to user before "May I write" ask
+- [ ] "May I write `design/quick-notes/[name].md`?" asked before writing
+- [ ] No director gates — no review-mode.txt read
+- [ ] Ends with next-step handoff (e.g., proceed to implementation or `/dev-story`)
+
+---
+
+## Coverage Notes
+
+- The scope threshold heuristic (sub-4h, single-system) is a judgment call —
+  the skill's internal check is the authoritative definition and is not
+  independently tested by counting hours.
+- The `design/quick-notes/` directory is created automatically if it does not
+  exist — this filesystem behavior is not independently tested here.
+- Integration with the story pipeline (can a quick-design generate a story
+  directly?) is out of scope for this spec — quick-designs are standalone.
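Reviewer sketch of the `/quick-design` scope check from Cases 1 and 2. The Coverage Notes call the threshold a judgment call, so the numeric inputs here (`estimated_hours`, `systems_touched`) and the function name `route_quick_design` are assumptions used only to make the rule concrete. The "CREATED" outcome assumes the user later approves the write.

```python
MAX_DESIGN_HOURS = 4  # spec: "under 4 hours of design time"
MAX_SYSTEMS = 1       # spec: "single-system feature"


def route_quick_design(name, estimated_hours, systems_touched):
    """Return (verdict, target) for a feature request.

    REDIRECTED: scope is too large; target is the /design-system command.
    CREATED:    scope fits; target is the quick-note path to write
                (assuming the user approves the "May I write" ask).
    """
    too_long = estimated_hours >= MAX_DESIGN_HOURS
    too_broad = systems_touched > MAX_SYSTEMS
    if too_long or too_broad:
        return "REDIRECTED", f"/design-system {name}"
    return "CREATED", f"design/quick-notes/{name}.md"


print(route_quick_design("pause-menu-settings", 2, 1))
print(route_quick_design("combat-redesign", 12, 3))
```

Note that review mode is deliberately not an input: per Case 5, the skill never reads `review-mode.txt`.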
--- a/Framework/skills/authoring/ux-design.md
+++ b/Framework/skills/authoring/ux-design.md
@@ -0,0 +1,176 @@
+# Skill Test Spec: /ux-design
+
+## Skill Summary
+
+`/ux-design` is a guided, section-by-section UX spec authoring skill. It produces
+user flow diagrams (described textually), interaction state definitions, wireframe
+descriptions, and accessibility notes for a specified screen or HUD element. The
+skill follows the skeleton-first pattern: it creates the file with all section
+headers immediately, then fills each section through discussion and writes each
+section to disk after user approval.
+
+The skill has no inline director gates — `/ux-review` is the separate review step.
+Each section requires a "May I write section [N] to [filepath]?" ask. If a UX spec
+already exists for the named screen, the skill offers to retrofit individual sections
+rather than replace. Verdict is COMPLETE when all sections are written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" language per section
+- [ ] Has a next-step handoff (e.g., `/ux-review` to validate the completed spec)
+
+---
+
+## Director Gate Checks
+
+None. `/ux-design` has no inline director gates. `/ux-review` is the separate
+review skill invoked after this skill completes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — New HUD spec, all sections authored and written
+
+**Fixture:**
+- No existing HUD UX spec in `design/ux/`
+- Engine and rendering preferences configured
+
+**Input:** `/ux-design hud`
+
+**Expected behavior:**
+1. Skill creates a skeleton file `design/ux/hud.md` with all section headers
+2. Skill discusses and drafts each section: User Flows, Interaction States
+   (normal/hover/focus/disabled/error), Wireframe Description, Accessibility Notes
+3. After each section is drafted and user confirms, skill asks "May I write
+   section [N] to `design/ux/hud.md`?"
+4. Each section is written in sequence after approval
+5. After all sections are written, verdict is COMPLETE
+6. Skill suggests running `/ux-review` as the next step
+
+**Assertions:**
+- [ ] Skeleton file is created first (with empty section bodies)
+- [ ] "May I write section [N]" is asked per section (not once at the end)
+- [ ] All required sections are present: User Flows, Interaction States,
+     Wireframe Description, Accessibility Notes
+- [ ] Handoff to `/ux-review` is at the end
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Existing UX Spec — Retrofit: user picks section to update
+
+**Fixture:**
+- `design/ux/hud.md` already exists with all sections populated
+- User wants to update only the Accessibility Notes section
+
+**Input:** `/ux-design hud`
+
+**Expected behavior:**
+1. Skill reads existing `design/ux/hud.md` and detects all sections are populated
+2. Skill reports: "UX spec already exists for HUD — offering to retrofit"
+3. Skill lists all sections and asks which to update
+4. User selects Accessibility Notes
+5. Skill drafts updated accessibility content and asks "May I write section
+   Accessibility Notes to `design/ux/hud.md`?"
+6. Only that section is updated; other sections are preserved; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Existing spec is detected and retrofit is offered
+- [ ] User selects which section(s) to update
+- [ ] Only the selected section is updated — other sections unchanged
+- [ ] "May I write" is asked for the updated section
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 3: Dependency Gap — Spec references a system with no design doc
+
+**Fixture:**
+- User is authoring a UX spec for the inventory screen
+- `design/gdd/inventory.md` does not exist
+
+**Input:** `/ux-design inventory-screen`
+
+**Expected behavior:**
+1. Skill begins authoring the inventory screen UX spec
+2. During the User Flows section, skill attempts to reference inventory system rules
+3. Skill detects: "No GDD found for inventory system — UX spec has a DEPENDENCY GAP"
+4. The dependency gap is flagged in the spec (noted inline: "DEPENDENCY GAP: inventory GDD")
+5. Skill continues authoring with placeholder notes for the missing rules
+6. Verdict is COMPLETE with advisory note about the dependency gap
+
+**Assertions:**
+- [ ] DEPENDENCY GAP label appears in the spec for the missing system doc
+- [ ] Skill does NOT block on the missing GDD — it continues with placeholders
+- [ ] Dependency gap is also noted in the skill output (not just in the file)
+- [ ] Handoff suggests both `/ux-review` and writing the missing GDD
+
+---
+
+### Case 4: No Argument Provided — Usage error
+
+**Fixture:**
+- No argument provided with the skill invocation
+
+**Input:** `/ux-design`
+
+**Expected behavior:**
+1. Skill detects no screen name or argument provided
+2. Skill outputs a usage error: "Screen name required. Usage: `/ux-design [screen-name]`"
+3. Skill provides examples: `/ux-design hud`, `/ux-design main-menu`, `/ux-design inventory`
+4. No file is created; no "May I write" is asked
+
+**Assertions:**
+- [ ] Usage error is clearly stated
+- [ ] Example invocations are provided
+- [ ] No file is created
+- [ ] Skill does not attempt to proceed without an argument
+
+---
+
+### Case 5: Director Gate Check — No gate; ux-review is the separate review skill
+
+**Fixture:**
+- New screen spec with argument provided
+
+**Input:** `/ux-design settings-menu`
+
+**Expected behavior:**
+1. Skill authors all sections of the settings menu UX spec
+2. No director agents are spawned
+3. No gate IDs appear in output during authoring
+
+**Assertions:**
+- [ ] No director gate is invoked during ux-design
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Creates skeleton file with all section headers before discussing content
+- [ ] Discusses and drafts one section at a time
+- [ ] Asks "May I write section [N]" after each section is approved
+- [ ] Detects existing spec and offers retrofit path
+- [ ] Ends with handoff to `/ux-review`
+- [ ] Verdict is COMPLETE when all sections are written
+
+---
+
+## Coverage Notes
+
+- Interaction state enumeration (normal/hover/focus/disabled/error) is a core
+  requirement of each spec; the `/ux-review` skill checks for completeness.
+- Wireframe descriptions are text-only (no images); image references may be
+  added manually by a designer after the fact.
+- Responsive layout concerns (different screen sizes) are noted as optional
+  content and not assertion-tested here.
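Reviewer sketch of the skeleton-first pattern that the `/ux-design` spec asserts: all section headers are written before any content, and sections without content can be listed afterwards. The `##` heading level, the `# UX Spec:` title line, and both function names are assumptions for illustration, not the skill's actual file template.

```python
UX_SECTIONS = [
    "User Flows",
    "Interaction States",
    "Wireframe Description",
    "Accessibility Notes",
]


def build_skeleton(screen_name):
    """Return Markdown with every required header and empty bodies."""
    lines = [f"# UX Spec: {screen_name}", ""]  # title format is hypothetical
    for section in UX_SECTIONS:
        lines += [f"## {section}", ""]
    return "\n".join(lines)


def empty_sections(markdown):
    """Return required section names whose body has no non-blank lines."""
    bodies = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            bodies[current] = []
        elif current is not None:
            bodies[current].append(line)
    return [
        name
        for name in UX_SECTIONS
        if not any(text.strip() for text in bodies.get(name, []))
    ]


skeleton = build_skeleton("hud")
print(empty_sections(skeleton))
```

A check like `empty_sections` could back the "Skeleton file is created first (with empty section bodies)" assertion, and the same idea applies to the 8-section GDD skeleton in `/design-system`.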
--- a/Framework/skills/authoring/ux-review.md
+++ b/Framework/skills/authoring/ux-review.md
@@ -0,0 +1,176 @@
+# Skill Test Spec: /ux-review
+
+## Skill Summary
+
+`/ux-review` validates an existing UX spec or HUD design document against
+accessibility and interaction standards. It checks for required sections
+(User Flows, Interaction States, Wireframe Description, Accessibility Notes),
+completeness of interaction state definitions (normal, hover, focus, disabled, error),
+accessibility compliance (keyboard navigation, color contrast notes, screen
+reader considerations), and consistency with the art bible or design system
+if those documents exist.
+
+The skill is read-only — it produces no file writes. Verdicts: APPROVED
+(all checks pass), NEEDS REVISION (fixable issues found), or MAJOR REVISION
+NEEDED (structural or accessibility failures). No director gates apply —
+`/ux-review` IS the review gate for UX specs.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
+- [ ] Does NOT contain "May I write" language (skill is read-only)
+- [ ] Has a next-step handoff (e.g., back to `/ux-design` for revision, or proceed to implementation)
+
+---
+
+## Director Gate Checks
+
+None. `/ux-review` is itself the review gate for UX specs. No additional director
+gates are invoked within this skill.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Complete UX spec with all required sections, APPROVED
+
+**Fixture:**
+- `design/ux/hud.md` exists with all required sections populated:
+  - User Flows: complete player flow diagrams
+  - Interaction States: normal, hover, focus, disabled, error all defined
+  - Wireframe Description: layout described
+  - Accessibility Notes: keyboard nav, contrast ratios, screen reader notes
+
+**Input:** `/ux-review hud`
+
+**Expected behavior:**
+1. Skill reads `design/ux/hud.md`
+2. Skill checks all 4 required sections — all present and non-empty
+3. Skill checks interaction states — all 5 states defined
+4. Skill checks accessibility notes — keyboard, contrast, and screen reader covered
+5. Skill outputs: checklist of all passed checks
+6. Verdict is APPROVED
+
+**Assertions:**
+- [ ] All 4 required sections are checked
+- [ ] All 5 interaction states are verified present
+- [ ] Verdict is APPROVED
+- [ ] No files are written
+
+---
+
+### Case 2: Missing Accessibility Section — NEEDS REVISION
+
+**Fixture:**
+- `design/ux/hud.md` exists but the Accessibility Notes section is empty
+- All other sections are fully populated
+
+**Input:** `/ux-review hud`
+
+**Expected behavior:**
+1. Skill reads the file and checks all sections
+2. Accessibility Notes section is empty — check fails
+3. Skill outputs: "NEEDS REVISION — Accessibility Notes section is empty"
+4. Skill lists specific items to add: keyboard navigation, color contrast ratios,
+   screen reader labels
+5. Verdict is NEEDS REVISION
+6. Handoff suggests returning to `/ux-design hud` to fill in the section
+
+**Assertions:**
+- [ ] NEEDS REVISION verdict is returned (not APPROVED or MAJOR REVISION NEEDED)
+- [ ] Specific missing content items are listed
+- [ ] Handoff points back to `/ux-design hud` for revision
+- [ ] No files are written
+
+---
+
+### Case 3: Interaction States Incomplete — NEEDS REVISION
+
+**Fixture:**
+- `design/ux/settings-menu.md` exists
+- Interaction States section only defines: normal and hover
+- Missing: focus, disabled, error states
+
+**Input:** `/ux-review settings-menu`
+
+**Expected behavior:**
+1. Skill reads the file and checks interaction states
+2. Only 2 of 5 required states are defined
+3. Skill reports: "NEEDS REVISION — Interaction states incomplete: missing focus, disabled, error"
+4. Verdict is NEEDS REVISION with specific missing states named
+
+**Assertions:**
+- [ ] NEEDS REVISION verdict returned
+- [ ] All 3 missing states are named explicitly in the output
+- [ ] Skill does not return MAJOR REVISION NEEDED for a fixable gap
+- [ ] Handoff suggests returning to `/ux-design settings-menu`
+
+---
+
+### Case 4: File Not Found — Error with remediation
+
+**Fixture:**
+- `design/ux/inventory-screen.md` does not exist
+
+**Input:** `/ux-review inventory-screen`
+
+**Expected behavior:**
+1. Skill attempts to read `design/ux/inventory-screen.md` — file not found
+2. Skill outputs: "UX spec not found: design/ux/inventory-screen.md"
+3. Skill suggests running `/ux-design inventory-screen` to create the spec first
+4. No review is performed; no verdict is issued
+
+**Assertions:**
+- [ ] Error message names the missing file with full path
+- [ ] `/ux-design inventory-screen` is suggested as the remediation
+- [ ] No review checklist is produced
+- [ ] No verdict is issued (error state, not APPROVED/NEEDS REVISION)
+
+---
+
+### Case 5: Director Gate Check — No gate; ux-review is itself the review
+
+**Fixture:**
+- `design/ux/hud.md` exists and is a valid UX spec file
+
+**Input:** `/ux-review hud`
+
+**Expected behavior:**
+1. Skill performs the review and issues a verdict
+2. No additional director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Checks all 4 required sections (User Flows, Interaction States, Wireframe,
+     Accessibility Notes)
+- [ ] Checks all 5 interaction states (normal, hover, focus, disabled, error)
+- [ ] Checks accessibility coverage (keyboard nav, contrast, screen reader)
+- [ ] Does not write any files
+- [ ] Issues specific, actionable feedback when verdict is not APPROVED
+- [ ] Ends with next-step handoff to `/ux-design` for revision or implementation
+
+---
+
+## Coverage Notes
+
+- MAJOR REVISION NEEDED is triggered when structural sections are absent
+  altogether (not just empty) or when fundamental interaction flows are
+  missing; this verdict is not tested with a separate fixture here.
+- Art bible / design system consistency check (color palette alignment) is
+  mentioned as a capability but not separately fixture-tested.
+- The case where an existing spec was written for a now-renamed screen is
+  not tested; the skill would review the file by path regardless of the name.
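Reviewer sketch for the `/ux-review` spec above: the verdict rules in Cases 2 and 3 and in the Coverage Notes can be written as a small checker. This is only an illustration, not the skill's implementation. It assumes that spec sections are `## ` headings and that a state counts as defined when its name appears in the Interaction States body; `split_sections` and `review_ux_spec` are hypothetical names.

```python
import re

REQUIRED_SECTIONS = ["User Flows", "Interaction States", "Wireframe", "Accessibility Notes"]
REQUIRED_STATES = ["normal", "hover", "focus", "disabled", "error"]


def split_sections(markdown: str) -> dict[str, str]:
    """Map each '## Heading' to the text beneath it (heading level is an assumption)."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections


def review_ux_spec(markdown: str) -> tuple[str, list[str]]:
    """Return (verdict, findings) following the rules described in the spec."""
    sections = split_sections(markdown)

    # A section that is absent altogether is a structural problem.
    absent = [name for name in REQUIRED_SECTIONS if name not in sections]
    if absent:
        return "MAJOR REVISION NEEDED", [f"Section absent: {name}" for name in absent]

    # A section that is present but empty is a fixable gap.
    findings = [
        f"{name} section is empty"
        for name in REQUIRED_SECTIONS
        if not sections[name].strip()
    ]

    # Name every missing interaction state explicitly.
    body = sections["Interaction States"].lower()
    missing = [s for s in REQUIRED_STATES if not re.search(rf"\b{s}\b", body)]
    if missing and body.strip():
        findings.append("Interaction states incomplete: missing " + ", ".join(missing))

    return ("NEEDS REVISION" if findings else "APPROVED"), findings
```

Run against a fixture like Case 3 (only normal and hover defined), the sketch names all three missing states, which is what the Case 3 assertions ask for.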
--- a/Framework/skills/gate/gate-check.md
+++ b/Framework/skills/gate/gate-check.md
@@ -0,0 +1,200 @@
+# Skill Test Spec: /gate-check
+
+## Skill Summary
+
+`/gate-check` validates whether the project is ready to advance to the next
+development phase. It checks for required artifacts, runs quality checks, asks
+the user about unverifiable items, and produces a PASS/CONCERNS/FAIL verdict.
+On PASS with user confirmation, it writes the new stage name to
+`production/stage.txt`. It governs all 6 phase transitions and is the most
+critical gate-keeping skill in the pipeline.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings (numbered Phase N or ## sections)
+- [ ] Contains verdict keywords: PASS, CONCERNS, FAIL
+- [ ] Contains "May I write" collaborative protocol language
+- [ ] Has a next-step handoff at the end (Follow-Up Actions section)
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All Concept artifacts present, advancing to Systems Design
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists, has content including all required sections
+- `design/gdd/game-pillars.md` exists (or pillars defined within concept doc)
+- No systems index yet (which is correct for this stage)
+
+**Input:** `/gate-check systems-design`
+
+**Expected behavior:**
+1. Skill reads `design/gdd/game-concept.md` and verifies it has content
+2. Skill checks for game pillars (in concept or separate file)
+3. Skill checks quality items (core loop described, target audience identified)
+4. Skill outputs structured checklist with all items marked
+5. Skill presents PASS/CONCERNS/FAIL verdict
+6. If PASS: skill asks "May I update `production/stage.txt` to 'Systems Design'?"
+
+**Assertions:**
+- [ ] Skill uses Glob or Read to verify `design/gdd/game-concept.md` exists before marking it checked
+- [ ] Output includes a "Required Artifacts" section with check status per item
+- [ ] Output includes a "Quality Checks" section with check status per item
+- [ ] Output includes a "Verdict" line with one of PASS / CONCERNS / FAIL
+- [ ] Skill asks about unverifiable quality items (e.g., "Has this been reviewed?") rather than assuming PASS
+- [ ] Skill asks "May I write" before updating `production/stage.txt`
+- [ ] Skill does NOT write `production/stage.txt` without explicit user confirmation
+
+---
+
+### Case 2: Failure Path — Missing required artifacts for Concept → Systems Design
+
+**Fixture:**
+- `design/gdd/game-concept.md` does NOT exist
+- No game pillars document exists
+- `design/gdd/` directory is empty or absent
+
+**Input:** `/gate-check systems-design`
+
+**Expected behavior:**
+1. Skill attempts to read `design/gdd/game-concept.md` — file not found
+2. Skill marks required artifact as missing (not present)
+3. Skill outputs FAIL verdict
+4. Skill lists blocker: "No game concept document found"
+5. Skill suggests remediation: run `/brainstorm` to create one
+
+**Assertions:**
+- [ ] Verdict is FAIL (not PASS or CONCERNS) when required artifacts are missing
+- [ ] Output explicitly names `design/gdd/game-concept.md` as missing
+- [ ] Output includes a "Blockers" section with at least 1 item
+- [ ] Output recommends `/brainstorm` as the remediation action
+- [ ] Skill does NOT write `production/stage.txt` when verdict is FAIL
+
+---
+
+### Case 3: No Argument — Auto-detect current stage
+
+**Fixture:**
+- `production/stage.txt` contains `Concept`
+- `design/gdd/game-concept.md` exists with content
+- No systems index yet
+
+**Input:** `/gate-check` (no argument)
+
+**Expected behavior:**
+1. Skill reads `production/stage.txt` to determine current stage
+2. Skill determines the next gate is Concept → Systems Design
+3. Skill proceeds with the Systems Design gate checks
+4. Output clearly states which transition is being validated
+
+**Assertions:**
+- [ ] Skill reads `production/stage.txt` (or uses project-stage-detect heuristics) to determine current stage
+- [ ] Output header names both current and target phases (e.g., "Gate Check: Concept → Systems Design")
+- [ ] Skill does not ask the user which gate to check if current stage is determinable
+
+---
+
+### Case 4: Edge Case — Manual check items flagged correctly
+
+**Fixture:**
+- All required artifacts for Concept → Systems Design are present
+- No playtest or review record exists (can't auto-verify quality checks)
+
+**Input:** `/gate-check systems-design`
+
+**Expected behavior:**
+1. Skill verifies all artifact files exist
+2. Skill encounters quality check: "Game concept reviewed (not MAJOR REVISION NEEDED)"
+3. Since no review record exists, skill marks item as MANUAL CHECK NEEDED
+4. Skill asks the user: "Has the game concept been reviewed for design quality?"
+5. Skill waits for user input before finalizing verdict
+
+**Assertions:**
+- [ ] Items that cannot be auto-verified are marked `[?] MANUAL CHECK NEEDED` rather than assumed PASS
+- [ ] Skill uses a question to the user for at least one unverifiable quality item
+- [ ] Skill does not mark unverifiable items as PASS by default
+
+---
+
+---
+
+### Case 5: Director Gate — full vs solo mode (lean mode not exercised)
+
+**Fixture:**
+- `production/session-state/review-mode.txt` exists (or equivalent state file)
+- All required artifacts for the target gate are present
+- `design/gdd/game-concept.md` exists
+
+**Case 5a — full mode:**
+- `review-mode.txt` contains `full`
+
+**Input:** `/gate-check systems-design` (with full mode active)
+
+**Expected behavior:**
+1. Skill reads review mode — determines `full`
+2. Skill spawns all 4 PHASE-GATE director prompts in parallel:
+   - CD-PHASE-GATE (creative-director)
+   - TD-PHASE-GATE (technical-director)
+   - PR-PHASE-GATE (producer)
+   - AD-PHASE-GATE (art-director)
+3. If one director returns CONCERNS → overall gate verdict is at minimum CONCERNS
+4. All 4 verdicts are collected before producing final output
+
+**Assertions (5a):**
+- [ ] Skill reads review-mode before deciding which directors to spawn
+- [ ] All 4 PHASE-GATE director prompts are spawned (not just 1 or 2)
+- [ ] Directors are spawned in parallel (simultaneous, not sequential)
+- [ ] A CONCERNS verdict from any one director propagates to overall verdict
+- [ ] Verdict is NOT auto-PASS if any director returns CONCERNS or REJECT
+
+**Case 5b — solo mode:**
+- `review-mode.txt` contains `solo`
+
+**Input:** `/gate-check systems-design` (with solo mode active)
+
+**Expected behavior:**
+1. Skill reads review mode — determines `solo`
+2. Each director is noted as skipped: "[CD-PHASE-GATE] skipped — Solo mode"
+3. Gate verdict is derived from artifact/quality checks only
+4. No director gates spawn
+
+**Assertions (5b):**
+- [ ] No director gates are spawned in solo mode
+- [ ] Each skipped gate is explicitly noted in output: "[GATE-ID] skipped — Solo mode"
+- [ ] Verdict is based on artifact and quality checks only
+
+**Note on Case 3:**
+The Case 3 assertion "Skill does not ask the user which gate to check if current
+stage is determinable" is correct as written. The skill does, however, use
+AskUserQuestion to confirm the auto-detected transition before running full checks.
+That is a confirmation step, not a gate selection, so the Case 3 assertions should
+not treat this confirmation as a failure.
+
+---
+
+## Protocol Compliance
+
+- [ ] Uses "May I write" before updating `production/stage.txt`
+- [ ] Presents the full checklist report before asking for write approval
+- [ ] Ends with a "Follow-Up Actions" section listing next steps per verdict
+- [ ] Never advances the stage without explicit user confirmation
+- [ ] Never creates `production/stage.txt` without asking, even when the file does not exist yet
+
+---
+
+## Coverage Notes
+
+- The Production → Polish and Polish → Release gates are not covered here
+  because they require complex multi-artifact setups (sprint plans, playtest
+  data, QA sign-off); these are deferred to dedicated follow-up specs.
+- The "CONCERNS" verdict path (minor gaps, not blocking) is not explicitly
+  tested here; it falls between Case 1 and Case 2 and follows the same pattern.
+- The Vertical Slice validation block (Pre-Production → Production gate) is not
+  covered because it requires a playable build context that cannot be expressed
+  as a document fixture.
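Reviewer sketch for the `/gate-check` spec above: Case 5a says a CONCERNS from any one director must propagate, and Case 5b says solo mode spawns no directors. One way to express that is a worst-verdict-wins fold. The severity ordering, the mapping of a director's REJECT to FAIL, and the function names are assumptions made for illustration; the spec only requires that the result is never an automatic PASS in those situations.

```python
SEVERITY = {"PASS": 0, "CONCERNS": 1, "FAIL": 2}
DIRECTOR_GATES = ("CD-PHASE-GATE", "TD-PHASE-GATE", "PR-PHASE-GATE", "AD-PHASE-GATE")


def gates_to_spawn(review_mode: str) -> tuple[str, ...]:
    """Full mode spawns all four director gates; solo mode spawns none."""
    if review_mode == "full":
        return DIRECTOR_GATES
    if review_mode == "solo":
        return ()
    # The spec above does not describe other modes for this skill.
    raise ValueError(f"review mode not covered by this sketch: {review_mode}")


def overall_verdict(checks_verdict: str, director_verdicts: dict[str, str]) -> str:
    """Combine the artifact/quality verdict with director verdicts; the worst one wins."""
    worst = checks_verdict
    for verdict in director_verdicts.values():
        # Assumption: a director's REJECT is treated like FAIL.
        normalized = "FAIL" if verdict == "REJECT" else verdict
        if SEVERITY[normalized] > SEVERITY[worst]:
            worst = normalized
    return worst
```

With an empty `director_verdicts` mapping (solo mode), the result is just the artifact and quality verdict, matching the last assertion of Case 5b.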
--- a/Framework/skills/pipeline/create-control-manifest.md
+++ b/Framework/skills/pipeline/create-control-manifest.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /create-control-manifest
+
+## Skill Summary
+
+`/create-control-manifest` reads all Accepted ADRs from `docs/architecture/` and
+generates a control manifest — a summary document that captures all architectural
+constraints, required patterns, and forbidden patterns in one place. The manifest
+is the reference document that story authors use when writing story files, ensuring
+stories inherit the correct architectural rules without having to read all ADRs
+individually.
+
+The skill only includes Accepted ADRs; Proposed ADRs are excluded and noted. It
+has no director gates. The skill asks "May I write" before writing
+`docs/architecture/control-manifest.md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: CREATED, BLOCKED
+- [ ] Contains "May I write" collaborative protocol language (for control-manifest.md)
+- [ ] Has a next-step handoff at the end (`/create-epics` or `/create-stories`)
+- [ ] Documents that only Accepted ADRs are included (not Proposed)
+
+---
+
+## Director Gate Checks
+
+No director gates — this skill spawns no director gate agents. The control
+manifest is a mechanical extraction from Accepted ADRs; no creative or technical
+review gate is needed.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — 4 Accepted ADRs create a correct manifest
+
+**Fixture:**
+- `docs/architecture/` contains 4 ADR files, all with `Status: Accepted`
+- Each ADR has a "Required Patterns" and/or "Forbidden Patterns" section
+- No existing `docs/architecture/control-manifest.md`
+
+**Input:** `/create-control-manifest`
+
+**Expected behavior:**
+1. Skill reads all ADR files in `docs/architecture/`
+2. Extracts Required Patterns, Forbidden Patterns, and key constraints from each
+3. Drafts the manifest with correct section structure
+4. Shows the draft manifest to the user
+5. Asks "May I write `docs/architecture/control-manifest.md`?"
+6. Writes the manifest after approval
+
+**Assertions:**
+- [ ] All 4 Accepted ADRs are represented in the manifest
+- [ ] Manifest includes distinct sections for Required Patterns and Forbidden Patterns
+- [ ] Manifest includes the source ADR number for each constraint
+- [ ] "May I write" is asked before writing
+- [ ] Skill does NOT write without approval
+- [ ] Verdict is CREATED after writing
+
+---
+
+### Case 2: Failure Path — No ADRs found
+
+**Fixture:**
+- `docs/architecture/` directory exists but contains no ADR files
+
+**Input:** `/create-control-manifest`
+
+**Expected behavior:**
+1. Skill reads `docs/architecture/` and finds no ADR files
+2. Skill outputs: "No ADRs found. Run `/architecture-decision` to create ADRs before generating the control manifest."
+3. Skill exits without creating any file
+4. Verdict is BLOCKED
+
+**Assertions:**
+- [ ] Skill outputs a clear error when no ADRs are found
+- [ ] No control manifest file is written
+- [ ] Skill recommends `/architecture-decision` as the next action
+- [ ] Verdict is BLOCKED (not an error crash)
+
+---
+
+### Case 3: Mixed ADR Statuses — Only Accepted ADRs included
+
+**Fixture:**
+- `docs/architecture/` contains 3 Accepted ADRs and 2 Proposed ADRs
+
+**Input:** `/create-control-manifest`
+
+**Expected behavior:**
+1. Skill reads all ADR files and filters by Status: Accepted
+2. Manifest is drafted from the 3 Accepted ADRs only
+3. Output notes: "2 Proposed ADRs were excluded: [adr-NNN-name, adr-NNN-name]"
+4. User sees which ADRs were excluded before approving the write
+5. Asks "May I write `docs/architecture/control-manifest.md`?"
+
+**Assertions:**
+- [ ] Only the 3 Accepted ADRs appear in the manifest content
+- [ ] Excluded Proposed ADRs are listed by name in the output
+- [ ] User sees the exclusion list before approving the write
+- [ ] Skill does NOT silently omit Proposed ADRs without noting them
+
+---
+
+### Case 4: Edge Case — Manifest already exists
+
+**Fixture:**
+- `docs/architecture/control-manifest.md` already exists (version 1, dated last week)
+- `docs/architecture/` contains Accepted ADRs (some new since last manifest)
+
+**Input:** `/create-control-manifest`
+
+**Expected behavior:**
+1. Skill detects existing manifest and reads its version number / date
+2. Skill offers to regenerate: "control-manifest.md already exists (v1, [date]). Regenerate with current ADRs?"
+3. If user confirms: skill drafts updated manifest, increments version number
+4. Asks "May I write `docs/architecture/control-manifest.md`?" (overwrite)
+5. Writes updated manifest after approval
+
+**Assertions:**
+- [ ] Skill reads and reports the existing manifest version before offering to regenerate
+- [ ] User is offered a regenerate/skip choice — not auto-overwritten
+- [ ] Updated manifest has an incremented version number
+- [ ] "May I write" is asked before overwriting the existing file
+
+---
+
+### Case 5: Director Gate — No gate spawned; no review-mode.txt read
+
+**Fixture:**
+- 4 Accepted ADRs exist
+- `production/session-state/review-mode.txt` exists with `full`
+
+**Input:** `/create-control-manifest`
+
+**Expected behavior:**
+1. Skill reads ADRs and drafts manifest
+2. Skill does NOT read `production/session-state/review-mode.txt`
+3. No director gate agents are spawned at any point
+4. Skill proceeds directly to "May I write" after drafting
+5. Review mode setting has no effect on this skill's behavior
+
+**Assertions:**
+- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
+- [ ] Skill does NOT read `production/session-state/review-mode.txt`
+- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
+- [ ] The manifest is generated from ADRs alone, with no external gate review
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads all ADR files before drafting manifest
+- [ ] Only Accepted ADRs included — Proposed ones noted as excluded
+- [ ] Manifest draft shown to user before "May I write" ask
+- [ ] "May I write `docs/architecture/control-manifest.md`?" asked before writing
+- [ ] No director gates — no review-mode.txt read
+- [ ] Ends with next-step handoff: `/create-epics` or `/create-stories`
+
+---
+
+## Coverage Notes
+
+- The exact section structure of the generated manifest (constraint tables, pattern
+  lists) is defined by the skill body and not re-enumerated in test assertions.
+- The `version` field incrementing logic (v1 → v2) is tested via Case 4 but exact
+  version numbering format is not fixture-locked.
+- ADR parsing (extracting Required/Forbidden Patterns) depends on consistent ADR
+  structure — tested implicitly via Case 1's fixture.
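Reviewer sketch for the `/create-control-manifest` spec above: Case 3 requires that only Accepted ADRs feed the manifest and that excluded Proposed ADRs are listed by name. A minimal version of that filter might look like the following. It assumes each ADR carries a line of the form `Status: Accepted` (as in the Case 1 fixture); the ADR names in the usage note and the helper names are hypothetical.

```python
import re


def adr_status(text: str) -> str:
    """Read the first 'Status: <word>' line; 'Unknown' if there is none."""
    match = re.search(r"^Status:\s*(\w+)", text, flags=re.MULTILINE)
    return match.group(1) if match else "Unknown"


def partition_adrs(adrs: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split ADR names into (accepted, proposed), each sorted by name."""
    accepted = sorted(name for name, text in adrs.items() if adr_status(text) == "Accepted")
    proposed = sorted(name for name, text in adrs.items() if adr_status(text) == "Proposed")
    return accepted, proposed


def exclusion_note(proposed: list[str]) -> str:
    """Format the note shown to the user before the write is approved."""
    return f"{len(proposed)} Proposed ADRs were excluded: [{', '.join(proposed)}]"
```

For example, `partition_adrs({"adr-001-input": "Status: Accepted", "adr-002-save": "Status: Proposed"})` yields one accepted and one proposed name, and the note is built from the second list.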
--- a/Framework/skills/pipeline/create-epics.md
+++ b/Framework/skills/pipeline/create-epics.md
@@ -0,0 +1,190 @@
+# Skill Test Spec: /create-epics
+
+## Skill Summary
+
+`/create-epics` reads all approved GDDs and translates them into EPIC.md files,
+one per system. Epics are organized by layer (Foundation → Core → Feature →
+Presentation) and processed in priority order within each layer. Each EPIC.md
+includes scope, governing ADRs, GDD requirements, engine risk level, and a
+Definition of Done. The skill asks "May I write" before creating each EPIC file.
+
+In `full` review mode, a PR-EPIC gate (producer) runs after drafting epics and
+before writing any files. In `lean` or `solo` mode, PR-EPIC is skipped and noted.
+Epics are written to `production/epics/[layer]/EPIC-[name].md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: CREATED, BLOCKED
+- [ ] Contains "May I write" collaborative protocol language (per-epic approval)
+- [ ] Has a next-step handoff at the end (`/create-stories`)
+- [ ] Documents PR-EPIC gate behavior: runs in full mode; skipped in lean/solo
+
+---
+
+## Director Gate Checks
+
+In `full` mode: PR-EPIC (producer) gate runs after epics are drafted and before
+any epic file is written. If PR-EPIC returns CONCERNS, epics are revised before
+the "May I write" ask.
+
+In `lean` mode: PR-EPIC is skipped. Output notes: "PR-EPIC skipped — lean mode".
+
+In `solo` mode: PR-EPIC is skipped. Output notes: "PR-EPIC skipped — solo mode".
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Two approved GDDs create two EPIC files
+
+**Fixture:**
+- `design/gdd/systems-index.md` exists with 2 systems listed
+- Both systems have approved GDDs in `design/gdd/`
+- `docs/architecture/architecture.md` exists with matching modules
+- At least one Accepted ADR exists for each system
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/create-epics`
+
+**Expected behavior:**
+1. Skill reads systems index and both GDDs
+2. Drafts 2 EPIC definitions (layer, GDD path, ADRs, requirements, engine risk)
+3. PR-EPIC gate is skipped (lean mode) — noted in output
+4. For each epic: asks "May I write `production/epics/[layer]/EPIC-[name].md`?"
+5. After approval: writes both EPIC files
+6. Creates or updates `production/epics/index.md`
+
+**Assertions:**
+- [ ] Epic summary is shown before any write ask
+- [ ] "May I write" is asked per-epic (not once for all epics together)
+- [ ] Each EPIC.md contains: layer, GDD path, governing ADRs, requirements table, Definition of Done
+- [ ] PR-EPIC skip is noted in output
+- [ ] `production/epics/index.md` is updated after writing
+- [ ] Skill does NOT write EPIC files without per-epic approval
+
+---
+
+### Case 2: Failure Path — No approved GDDs found
+
+**Fixture:**
+- `design/gdd/systems-index.md` exists
+- No GDDs in `design/gdd/` have approved status (all are Draft or In Progress)
+
+**Input:** `/create-epics`
+
+**Expected behavior:**
+1. Skill reads systems index and attempts to find approved GDDs
+2. No approved GDDs found
+3. Skill outputs: "No approved GDDs to convert. GDDs must be Approved before creating epics."
+4. Skill suggests running `/design-system` and completing GDD approval first
+5. Skill exits without creating any EPIC files
+
+**Assertions:**
+- [ ] Skill stops cleanly with a clear message when no approved GDDs exist
+- [ ] No EPIC files are written
+- [ ] Skill recommends the correct next action
+- [ ] Verdict is BLOCKED
+
+---
+
+### Case 3: Director Gate — Full mode spawns PR-EPIC before writing
+
+**Fixture (full mode):**
+- 2 approved GDDs exist
+- `production/session-state/review-mode.txt` contains `full`
+
+**Full mode expected behavior:**
+1. Skill drafts both epics
+2. PR-EPIC gate spawns and reviews the epic drafts
+3. If PR-EPIC returns APPROVED: "May I write" ask proceeds normally
+4. Epic files are written after approval
+
+**Assertions (full mode):**
+- [ ] PR-EPIC gate appears in output as an active gate
+- [ ] PR-EPIC runs before any "May I write" ask
+- [ ] Epic files are NOT written before PR-EPIC completes
+
+**Fixture (lean mode):**
+- Same GDDs
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Lean mode expected behavior:**
+1. Epics are drafted
+2. PR-EPIC is skipped — noted in output
+3. "May I write" ask proceeds directly
+
+**Assertions (lean mode):**
+- [ ] "PR-EPIC skipped — lean mode" appears in output
+- [ ] Skill proceeds to "May I write" without waiting for PR-EPIC
+
+---
+
+### Case 4: Edge Case — Epic already exists for a GDD
+
+**Fixture:**
+- `production/epics/[layer]/EPIC-[name].md` already exists for one of the approved GDDs
+- The other GDD has no existing EPIC file
+
+**Input:** `/create-epics`
+
+**Expected behavior:**
+1. Skill detects the existing EPIC file for the first system
+2. Skill offers to update rather than overwrite: "EPIC-[name].md already exists. Update it, or skip?"
+3. For the second system (no existing file): proceeds normally with "May I write"
+
+**Assertions:**
+- [ ] Skill detects existing EPIC files before writing
+- [ ] User is offered "update" or "skip" options — not auto-overwritten
+- [ ] The new system's EPIC is created normally without conflict
+
+---
+
+### Case 5: Director Gate — PR-EPIC returns CONCERNS
+
+**Fixture:**
+- 2 approved GDDs exist
+- `production/session-state/review-mode.txt` contains `full`
+- PR-EPIC gate returns CONCERNS (e.g., scope of one epic is too large)
+
+**Input:** `/create-epics`
+
+**Expected behavior:**
+1. PR-EPIC gate spawns and returns CONCERNS with specific feedback
+2. Skill surfaces the concerns to the user before any write ask
+3. User is given options: revise epics, accept concerns and proceed, or stop
+4. If user revises: updated epic drafts are shown before the "May I write" ask
+5. Skill does NOT write epics until the user has chosen how to handle the CONCERNS
+
+**Assertions:**
+- [ ] CONCERNS from PR-EPIC are shown to the user before writing
+- [ ] Skill does NOT auto-write epics when CONCERNS are returned
+- [ ] User is given a clear choice to revise, proceed, or stop
+- [ ] Revised epic drafts are re-shown after revision before final approval
+
+---
+
+## Protocol Compliance
+
+- [ ] Epic drafts shown to user before any "May I write" ask
+- [ ] "May I write" asked per-epic, not once for the entire batch
+- [ ] PR-EPIC gate (if active) runs before write asks — not after
+- [ ] Skipped gates noted by name and mode in output
+- [ ] EPIC.md content sourced only from GDDs, ADRs, and architecture docs — nothing invented
+- [ ] Ends with next-step handoff: `/create-stories [epic-slug]` per created epic
+
+---
+
+## Coverage Notes
+
+- Processing of Core, Feature, and Presentation layers follows the same per-epic
+  pattern as Foundation — layer-specific ordering is not independently tested.
+- Engine risk level assignment (LOW/MEDIUM/HIGH) from governing ADRs is
+  validated implicitly via Case 1's fixture structure.
+- The `layer: [name]` and `[system-name]` argument modes follow the same approval
+  pattern as the default (all systems) mode.
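Reviewer sketch for the `/create-epics` spec above: the ordering rules (gate or skip first, then one write ask per epic, layers processed Foundation to Presentation) can be captured as a plan of steps. This is a hedged illustration only. The `EpicDraft` type, the lowercase layer directory names, and the numeric `priority` field (lower value first) are assumptions, not taken from the skill.

```python
from dataclasses import dataclass

LAYER_ORDER = ["foundation", "core", "feature", "presentation"]


@dataclass
class EpicDraft:
    name: str
    layer: str
    priority: int  # assumption: lower number means higher priority


def epic_path(epic: EpicDraft) -> str:
    """Target path following production/epics/[layer]/EPIC-[name].md."""
    return f"production/epics/{epic.layer}/EPIC-{epic.name}.md"


def plan_steps(review_mode: str, drafts: list[EpicDraft]) -> list[tuple]:
    """Return the ordered steps: gate (or skip note) first, then one ask per epic."""
    ordered = sorted(drafts, key=lambda e: (LAYER_ORDER.index(e.layer), e.priority))
    if review_mode == "full":
        steps: list[tuple] = [("gate", "PR-EPIC")]
    else:
        # lean and solo both skip the gate; the mode is kept so the skip can be noted.
        steps = [("skip", "PR-EPIC", review_mode)]
    steps += [("ask_write", epic_path(e)) for e in ordered]
    return steps
```

Because the gate step always precedes every `ask_write` step, a plan built this way cannot ask to write an epic before PR-EPIC has run or been noted as skipped.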
--- a/Framework/skills/pipeline/create-stories.md
+++ b/Framework/skills/pipeline/create-stories.md
@@ -0,0 +1,191 @@
+# Skill Test Spec: /create-stories
+
+## Skill Summary
+
+`/create-stories` breaks a single epic into developer-ready story files. It reads
+the EPIC.md, the corresponding GDD, governing ADRs, the control manifest, and the
+TR registry. Each story gets structured frontmatter including: Title, Epic, Layer,
+Priority, Status, TR-ID, ADR references, Acceptance Criteria, and Definition of
+Done. Stories are classified by type (Logic / Integration / Visual/Feel / UI /
+Config/Data) which determines the required test evidence path.
+
+In `full` review mode, a QL-STORY-READY check runs per story after drafting. In
+`lean` or `solo` mode, QL-STORY-READY is skipped. The skill asks "May I write"
+before writing each story file. Stories are written to
+`production/epics/[layer]/story-[name].md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED, NEEDS WORK
+- [ ] Contains "May I write" collaborative protocol language (per-story approval)
+- [ ] Has a next-step handoff at the end (`/story-readiness`, `/dev-story`)
+- [ ] Documents story Status: Blocked when governing ADR is Proposed
+- [ ] Documents QL-STORY-READY gate: active in full mode, skipped in lean/solo
+
+---
+
+## Director Gate Checks
+
+In `full` mode: QL-STORY-READY check runs per story after drafting. Stories that
+fail the check are noted as NEEDS WORK before the "May I write" ask.
+
+In `lean` mode: QL-STORY-READY is skipped. Output notes:
+"QL-STORY-READY skipped — lean mode" per story.
+
+In `solo` mode: QL-STORY-READY is skipped with equivalent notes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Epic with 3 stories, all ADRs Accepted
+
+**Fixture:**
+- `production/epics/[layer]/EPIC-[name].md` exists with 3 GDD requirements
+- Corresponding GDD exists with matching acceptance criteria
+- All governing ADRs have `Status: Accepted`
+- `docs/architecture/control-manifest.md` exists
+- `docs/architecture/tr-registry.yaml` has TR-IDs for all 3 requirements
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/create-stories [epic-name]`
+
+**Expected behavior:**
+1. Skill reads EPIC.md, GDD, governing ADRs, control manifest, and TR registry
+2. Classifies each requirement into a story type (Logic / Integration / Visual/Feel / UI / Config/Data)
+3. Drafts 3 story files with correct frontmatter schema
+4. QL-STORY-READY is skipped (lean mode) — noted in output
+5. Asks "May I write" before writing each story file
+6. Writes all 3 story files after approval
+
+**Assertions:**
+- [ ] Each story's frontmatter contains: Title, Epic, Layer, Priority, Status, TR-ID, ADR reference, Acceptance Criteria, DoD
+- [ ] Story types are correctly classified (at least one Logic type in fixture)
+- [ ] "May I write" is asked per story (not once for the entire batch)
+- [ ] QL-STORY-READY skip is noted in output
+- [ ] All 3 story files are written with correct naming: `story-[name].md`
+- [ ] Skill does NOT start implementation
+
+---
+
+### Case 2: Failure Path — No epic file found
+
+**Fixture:**
+- The epic path provided does not exist in `production/epics/`
+
+**Input:** `/create-stories nonexistent-epic`
+
+**Expected behavior:**
+1. Skill attempts to read the EPIC.md file
+2. File not found
+3. Skill outputs a clear error with the path it searched
+4. Skill suggests checking `production/epics/` or running `/create-epics` first
+5. No story files are created
+
+**Assertions:**
+- [ ] Skill outputs a clear error naming the missing file path
+- [ ] No story files are written
+- [ ] Skill recommends the correct next action (`/create-epics`)
+- [ ] Skill does NOT create stories without a valid EPIC.md
+
+---
+
+### Case 3: Blocked Story — ADR is Proposed
+
+**Fixture:**
+- EPIC.md exists with 2 requirements
+- Requirement 1 is covered by an Accepted ADR
+- Requirement 2 is covered by an ADR with `Status: Proposed`
+
+**Input:** `/create-stories [epic-name]`
+
+**Expected behavior:**
+1. Skill reads the ADR for Requirement 2 and finds Status: Proposed
+2. Story for Requirement 2 is drafted with `Status: Blocked`
+3. Blocking note references the specific ADR: "BLOCKED: ADR-NNN is Proposed"
+4. Story for Requirement 1 is drafted normally with `Status: Ready`
+5. Both stories are shown in the draft, and the user is asked "May I write" for each
+
+**Assertions:**
+- [ ] Story 2 has `Status: Blocked` in its frontmatter
+- [ ] Blocking note names the specific ADR number and recommends `/architecture-decision`
+- [ ] Story 1 has `Status: Ready` — blocked status does not affect non-blocked stories
+- [ ] Blocked status is shown in the draft preview before writing
+- [ ] Both story files are written (blocked stories are still written — just flagged)
+
+---
+
+### Case 4: Edge Case — No argument provided
+
+**Fixture:**
+- `production/epics/` directory exists with ≥2 epic subdirectories
+
+**Input:** `/create-stories` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Outputs a usage error: "No epic specified. Usage: /create-stories [epic-name]"
+3. Skill lists available epics from `production/epics/`
+4. No story files are created
+
+**Assertions:**
+- [ ] Skill outputs a usage error when no argument is given
+- [ ] Skill lists available epics to help the user choose
+- [ ] No story files are written
+- [ ] Skill does NOT silently pick an epic without user input
+
+---
+
+### Case 5: Director Gate — Full mode runs QL-STORY-READY; stories failing noted as NEEDS WORK
+
+**Fixture:**
+- EPIC.md exists with 2 requirements
+- Both governing ADRs are Accepted
+- `production/session-state/review-mode.txt` contains `full`
+- QL-STORY-READY check finds one story has ambiguous acceptance criteria
+
+**Input:** `/create-stories [epic-name]`
+
+**Expected behavior:**
+1. Both stories are drafted
+2. QL-STORY-READY check runs for each story
+3. Story 1 passes QL-STORY-READY
+4. Story 2 fails QL-STORY-READY — noted as NEEDS WORK with specific feedback
+5. Both stories are shown to user with pass/fail status before "May I write"
+6. User can proceed (story written as-is with NEEDS WORK note) or revise first
+
+**Assertions:**
+- [ ] QL-STORY-READY results appear per story in the output
+- [ ] Story 2 is flagged as NEEDS WORK with the specific failing criteria
+- [ ] Story 1 shows as passing QL-STORY-READY
+- [ ] User is given the choice to proceed or revise before writing
+- [ ] Skill does NOT block stories that fail QL-STORY-READY from being written; the user decides
+
+---
+
+## Protocol Compliance
+
+- [ ] All context (EPIC, GDD, ADRs, manifest, TR registry) loaded before drafting stories
+- [ ] Story drafts shown in full before any "May I write" ask
+- [ ] "May I write" asked per story (not once for the entire batch)
+- [ ] Blocked stories flagged before write approval — not discovered after writing
+- [ ] TR-IDs reference the registry — requirement text is not embedded inline in story files
+- [ ] Control manifest rules quoted per-story from the manifest, not invented
+- [ ] Ends with next-step handoff: `/story-readiness` → `/dev-story`
+
+---
+
+## Coverage Notes
+
+- Integration story test evidence (playtest doc alternative) follows the same
+  approval pattern as Logic stories — not independently fixture-tested.
+- Story ordering (foundational first, UI last) is validated implicitly via
+  Case 1's multi-story fixture.
+- The story sizing rule (splitting large requirement groups) is not tested here
+  — it is addressed in the `/create-stories` skill's internal logic.
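Reviewer sketch for Case 5 of the `/create-stories` spec above: a minimal Python illustration of showing a per-story QL-STORY-READY result and then writing only the stories the user approved individually. `StoryDraft`, `readiness_report`, `stories_to_write`, and the `PASS` label are hypothetical names chosen for illustration; none of them comes from the skill itself.

```python
from dataclasses import dataclass, field


@dataclass
class StoryDraft:
    """Hypothetical in-memory story draft; not the skill's real data model."""
    title: str
    gate_failures: list[str] = field(default_factory=list)

    @property
    def readiness(self) -> str:
        # A draft with any failing check is flagged NEEDS WORK, otherwise PASS.
        return "NEEDS WORK" if self.gate_failures else "PASS"


def readiness_report(drafts):
    """Build one status line per story, including the specific failing criteria."""
    lines = []
    for draft in drafts:
        line = f"{draft.title}: QL-STORY-READY {draft.readiness}"
        if draft.gate_failures:
            line += " (" + "; ".join(draft.gate_failures) + ")"
        lines.append(line)
    return lines


def stories_to_write(drafts, approvals):
    """Keep only drafts the user approved one by one.

    `approvals` maps a story title to the user's answer to that story's
    "May I write" question. A NEEDS WORK story is still written if approved.
    """
    return [draft for draft in drafts if approvals.get(draft.title, False)]
```

A failing gate does not remove a story from the candidate list in this sketch; only the per-story approval decides what is written, which mirrors the "does NOT auto-block" assertion.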
--- a/Framework/skills/pipeline/dev-story.md
+++ b/Framework/skills/pipeline/dev-story.md
@@ -0,0 +1,205 @@
+# Skill Test Spec: /dev-story
+
+## Skill Summary
+
+`/dev-story` reads a story file, loads all required context (referenced ADR,
+TR-ID from the registry, control manifest, engine preferences), implements the
+story, verifies that all acceptance criteria are met, and marks the story
+Complete. The skill routes implementation to the correct specialist agent based
+on the engine and file type — it does not write source code directly.
+
+In `full` review mode, an LP-CODE-REVIEW gate runs before marking the story
+Complete. In `lean` or `solo` mode, LP-CODE-REVIEW is skipped and the story is
+marked Complete after the user confirms all criteria are met. The skill asks
+"May I write" before updating story status and before writing code files.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED, IN PROGRESS, NEEDS CHANGES
+- [ ] Contains "May I write" collaborative protocol language (story status + code files)
+- [ ] Has a next-step handoff at the end (`/story-done`)
+- [ ] Documents LP-CODE-REVIEW gate: active in full mode, skipped in lean/solo
+- [ ] Notes that implementation is delegated to specialist agents (not done directly)
+
+---
+
+## Director Gate Checks
+
+In `full` mode: LP-CODE-REVIEW gate runs after implementation is complete and all
+criteria are verified, before marking the story Complete.
+
+In `lean` mode: LP-CODE-REVIEW is skipped. Output notes:
+"LP-CODE-REVIEW skipped — lean mode". Story is marked Complete after user confirms.
+
+In `solo` mode: LP-CODE-REVIEW is skipped with equivalent notes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Story implemented and marked Complete (full mode)
+
+**Fixture:**
+- A story file exists at `production/epics/[layer]/story-[name].md` with:
+  - `Status: Ready`
+  - A TR-ID referencing a registered requirement
+  - At least 2 Given-When-Then acceptance criteria
+  - A test evidence path
+- Referenced ADR has `Status: Accepted`
+- `docs/architecture/control-manifest.md` exists
+- `.claude/docs/technical-preferences.md` has engine and language configured
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/dev-story production/epics/[layer]/story-[name].md`
+
+**Expected behavior:**
+1. Skill reads the story file and all referenced context
+2. Skill verifies the ADR is Accepted (no block)
+3. Skill routes implementation to the correct specialist agent
+4. All acceptance criteria are verified as met
+5. LP-CODE-REVIEW gate spawns and returns APPROVED
+6. Skill asks "May I update story status to Complete?"
+7. Story status is updated to Complete
+
+**Assertions:**
+- [ ] Skill reads story before spawning any agent
+- [ ] ADR status is checked before implementation begins
+- [ ] Implementation is delegated to a specialist agent (not done inline)
+- [ ] All acceptance criteria are confirmed before LP-CODE-REVIEW
+- [ ] LP-CODE-REVIEW appears in output as a completed gate
+- [ ] Story status is updated to Complete only after gate approval and user consent
+- [ ] Test file is written as part of implementation (not deferred)
+
+---
+
+### Case 2: Failure Path — Referenced ADR is Proposed
+
+**Fixture:**
+- A story file exists with `Status: Ready`
+- The story's TR-ID points to a requirement covered by an ADR with `Status: Proposed`
+
+**Input:** `/dev-story production/epics/[layer]/story-[name].md`
+
+**Expected behavior:**
+1. Skill reads the story file
+2. Skill resolves the TR-ID and reads the governing ADR
+3. ADR status is Proposed — skill outputs a BLOCKED message
+4. Skill names the specific ADR blocking the story
+5. Skill recommends running `/architecture-decision` to advance the ADR
+6. Implementation does NOT begin
+
+**Assertions:**
+- [ ] Skill does NOT begin implementation with a Proposed ADR
+- [ ] BLOCKED message names the specific ADR number and title
+- [ ] Skill recommends `/architecture-decision` as the next action
+- [ ] Story status remains unchanged (not set to In Progress or Complete)
+
+---
+
+### Case 3: Ambiguous Acceptance Criteria — Skill asks for clarification
+
+**Fixture:**
+- A story file exists with `Status: Ready`
+- Referenced ADR is Accepted
+- One acceptance criterion is ambiguous (not Given-When-Then; uses subjective language like "feels responsive")
+
+**Input:** `/dev-story production/epics/[layer]/story-[name].md`
+
+**Expected behavior:**
+1. Skill reads the story and identifies the ambiguous criterion
+2. Before routing to the specialist, skill asks the user to clarify the criterion
+3. User provides a concrete, testable restatement
+4. Skill proceeds with implementation using the clarified criterion
+5. Skill does NOT guess at the intended behavior
+
+**Assertions:**
+- [ ] Skill surfaces the ambiguous criterion before implementation starts
+- [ ] Skill asks for user clarification (not auto-interpretation)
+- [ ] Implementation begins only after clarification is provided
+- [ ] Clarified criterion is used in the test (not the original vague version)
+
+---
+
+### Case 4: Edge Case — No argument; reads from session state
+
+**Fixture:**
+- No argument is provided
+- `production/session-state/active.md` references an active story file
+- That story file exists with `Status: In Progress`
+
+**Input:** `/dev-story` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Skill reads `production/session-state/active.md`
+3. Skill finds the active story reference
+4. Skill confirms with user: "Continuing work on [story title] — is that correct?"
+5. After confirmation, skill proceeds with that story
+
+**Assertions:**
+- [ ] Skill reads session state when no argument is provided
+- [ ] Skill confirms the active story with the user before proceeding
+- [ ] Skill does NOT silently assume the active story without confirmation
+- [ ] If session state has no active story, skill asks which story to implement
+
+---
+
+### Case 5: Director Gate — LP-CODE-REVIEW returns NEEDS CHANGES; lean mode skips gate
+
+**Fixture (full mode):**
+- Story is implemented and all criteria appear met
+- `production/session-state/review-mode.txt` contains `full`
+- LP-CODE-REVIEW gate returns NEEDS CHANGES with specific feedback
+
+**Full mode expected behavior:**
+1. LP-CODE-REVIEW gate spawns after implementation
+2. Gate returns NEEDS CHANGES with 2 specific issues
+3. Story status remains In Progress — NOT marked Complete
+4. User is shown the gate feedback and asked how to proceed
+
+**Assertions (full mode):**
+- [ ] Story is NOT marked Complete when LP-CODE-REVIEW returns NEEDS CHANGES
+- [ ] Gate feedback is shown to the user verbatim
+- [ ] Story status stays In Progress until issues are resolved and gate passes
+
+**Fixture (lean mode):**
+- Same story, `production/session-state/review-mode.txt` contains `lean`
+
+**Lean mode expected behavior:**
+1. Implementation completes
+2. LP-CODE-REVIEW gate is skipped — noted in output
+3. User is asked to confirm all criteria are met
+4. Story is marked Complete after user confirmation
+
+**Assertions (lean mode):**
+- [ ] "LP-CODE-REVIEW skipped — lean mode" appears in output
+- [ ] Story is marked Complete after user confirms criteria (no gate required)
+- [ ] Skill does NOT block on a gate that is skipped
+
+---
+
+## Protocol Compliance
+
+- [ ] Does NOT write source code directly — delegates to specialist agents
+- [ ] Reads all context (story, TR-ID, ADR, manifest, engine prefs) before implementation
+- [ ] "May I write" asked before updating story status and before writing code files
+- [ ] Skipped gates noted by name and mode in output
+- [ ] Updates `production/session-state/active.md` after story completion
+- [ ] Ends with next-step handoff: `/story-done`
+
+---
+
+## Coverage Notes
+
+- Engine routing logic (Godot vs Unity vs Unreal) is not tested per engine —
+  the routing pattern is consistent; engine selection is a config fact.
+- Visual/Feel and UI story types (no automated test required) have different
+  evidence requirements and are not covered in these cases.
+- Integration story type follows the same pattern as Logic but with a different
+  evidence path — not independently fixture-tested.
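Reviewer sketch for the `/dev-story` spec above: the completion rule in Cases 1 and 5 can be read as a small decision function. This is a minimal Python illustration under stated assumptions, not the skill's implementation; `story_status_after_review` and its parameters are hypothetical names.

```python
from typing import Optional


def story_status_after_review(mode: str, gate_verdict: Optional[str],
                              user_confirmed: bool) -> str:
    """Return the story status once implementation has finished.

    mode           contents of review-mode.txt ("full", "lean", or "solo")
    gate_verdict   LP-CODE-REVIEW result in full mode, None when the gate is skipped
    user_confirmed whether the user approved the status update
    """
    mode = mode.strip().lower()
    if mode == "full":
        # Full mode: the gate must return APPROVED before completion.
        gate_ok = gate_verdict == "APPROVED"
    elif mode in ("lean", "solo"):
        # Lean and solo modes skip the gate, so it never blocks.
        gate_ok = True
    else:
        raise ValueError(f"unknown review mode: {mode!r}")
    # User consent is required in every mode.
    return "Complete" if gate_ok and user_confirmed else "In Progress"
```

Treating an unknown mode as an error is a choice made for this sketch; the spec does not say how the skill handles an unrecognized value.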
--- a/Framework/skills/pipeline/map-systems.md
+++ b/Framework/skills/pipeline/map-systems.md
@@ -0,0 +1,196 @@
+# Skill Test Spec: /map-systems
+
+## Skill Summary
+
+`/map-systems` decomposes a game concept into a systems index. It reads the
+approved game concept and pillars, enumerates both explicit and implicit systems,
+maps dependencies between systems, assigns priority tiers (MVP / Vertical Slice /
+Alpha / Full Vision), and organizes systems into a layered design order
+(Foundation → Core → Feature → Presentation). The output is written to
+`design/systems-index.md` after user approval.
+
+This skill is required between game concept approval and per-system GDD creation
+— it is a mandatory gate in the pipeline. In `full` review mode, CD-SYSTEMS
+(creative-director) and TD-SYSTEM-BOUNDARY (technical-director) spawn in parallel
+after the decomposition is drafted. In `lean` or `solo` mode, both gates are
+skipped.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" collaborative protocol language (for systems-index.md)
+- [ ] Has a next-step handoff at the end (`/design-system`)
+- [ ] Documents gate behavior: CD-SYSTEMS + TD-SYSTEM-BOUNDARY in parallel in full mode
+
+---
+
+## Director Gate Checks
+
+In `full` mode: CD-SYSTEMS (creative-director) and TD-SYSTEM-BOUNDARY
+(technical-director) spawn in parallel after the systems decomposition is drafted
+and before `design/systems-index.md` is written.
+
+In `lean` mode: both gates are skipped. Output notes:
+"CD-SYSTEMS skipped — lean mode" and "TD-SYSTEM-BOUNDARY skipped — lean mode".
+
+In `solo` mode: both gates are skipped with equivalent notes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Game concept exists, 5-8 systems identified
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists with Core Mechanics and MVP Definition sections
+- `design/gdd/game-pillars.md` exists with ≥1 pillar defined
+- No `design/systems-index.md` exists yet
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/map-systems`
+
+**Expected behavior:**
+1. Skill reads game-concept.md and game-pillars.md
+2. Identifies 5-8 systems (explicit + implicit)
+3. Maps dependencies between systems and assigns layers
+4. CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel and return APPROVED
+5. Asks "May I write `design/systems-index.md`?"
+6. Writes systems-index.md after approval
+7. Updates `production/session-state/active.md`
+
+**Assertions:**
+- [ ] Between 5 and 8 systems are identified (not fewer, not more without explanation)
+- [ ] CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel (not sequentially)
+- [ ] Both gates complete before the "May I write" ask
+- [ ] "May I write `design/systems-index.md`?" is asked before writing
+- [ ] systems-index.md is NOT written without approval
+- [ ] Session state is updated after writing
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Failure Path — No game concept found
+
+**Fixture:**
+- `design/gdd/game-concept.md` does NOT exist
+- `design/gdd/` directory may be empty or absent
+
+**Input:** `/map-systems`
+
+**Expected behavior:**
+1. Skill attempts to read `design/gdd/game-concept.md`
+2. File not found
+3. Skill outputs: "No game concept found. Run `/brainstorm` to create one, then return to `/map-systems`."
+4. Skill exits without creating systems-index.md
+
+**Assertions:**
+- [ ] Skill outputs a clear error naming the missing file path
+- [ ] Skill recommends `/brainstorm` as the next action
+- [ ] No systems-index.md is created
+- [ ] Verdict is BLOCKED
+
+---
+
+### Case 3: Director Gate — CD-SYSTEMS returns CONCERNS (missing core system)
+
+**Fixture:**
+- Game concept exists
+- `production/session-state/review-mode.txt` contains `full`
+- CD-SYSTEMS gate returns CONCERNS: "The [core-system] is implied by the concept but not identified"
+
+**Input:** `/map-systems`
+
+**Expected behavior:**
+1. Systems are drafted (5-8 initial systems identified)
+2. CD-SYSTEMS gate returns CONCERNS naming the missing core system
+3. TD-SYSTEM-BOUNDARY returns APPROVED
+4. Skill surfaces CD-SYSTEMS concerns to user
+5. User is asked: revise systems list to add the missing system, or proceed as-is
+6. If revised: updated systems list shown before "May I write" ask
+
+**Assertions:**
+- [ ] CD-SYSTEMS concerns are shown to the user before writing
+- [ ] Skill does NOT auto-write systems-index.md while CONCERNS are unresolved
+- [ ] User is given the option to revise or proceed
+- [ ] Revised systems list is re-shown after revision before final "May I write"
+
+---
+
+### Case 4: Edge Case — systems-index.md already exists
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists
+- `design/systems-index.md` already exists with N systems
+
+**Input:** `/map-systems`
+
+**Expected behavior:**
+1. Skill reads the existing systems-index.md and presents its current state
+2. Skill asks: "systems-index.md already exists with [N] systems. Update with new systems, or review and revise priorities?"
+3. User chooses an action
+4. Skill does NOT silently overwrite the existing index
+
+**Assertions:**
+- [ ] Skill detects and reads the existing systems-index.md before proceeding
+- [ ] User is offered update/review options — not auto-overwritten
+- [ ] Existing system count is presented to the user
+- [ ] Skill does NOT proceed with a full re-decomposition without user choosing to do so
+
+---
+
+### Case 5: Director Gate — Lean mode and solo mode both skip gates, noted
+
+**Fixture (lean mode):**
+- Game concept exists
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Lean mode expected behavior:**
+1. Systems are decomposed and drafted
+2. Both CD-SYSTEMS and TD-SYSTEM-BOUNDARY are skipped
+3. Output notes: "CD-SYSTEMS skipped — lean mode" and "TD-SYSTEM-BOUNDARY skipped — lean mode"
+4. "May I write" ask proceeds directly
+
+**Assertions (lean mode):**
+- [ ] Both gate skip notes appear in output
+- [ ] Skill proceeds to "May I write" without gate approval
+- [ ] systems-index.md is written after user approval
+
+**Fixture (solo mode):**
+- Same game concept, `production/session-state/review-mode.txt` contains `solo`
+
+**Solo mode expected behavior:**
+1. Same decomposition workflow
+2. Both gates skipped — noted in output with "solo mode"
+3. "May I write" ask proceeds
+
+**Assertions (solo mode):**
+- [ ] Both skip notes appear with "solo mode" label
+- [ ] Behavior is otherwise identical to lean mode for this skill
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads game-concept.md and game-pillars.md before any decomposition
+- [ ] "May I write `design/systems-index.md`?" asked before writing
+- [ ] systems-index.md is NOT written without user approval
+- [ ] CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel in full mode
+- [ ] Skipped gates noted by name and mode in lean/solo output
+- [ ] Ends with next-step handoff: `/design-system [next-system]`
+
+---
+
+## Coverage Notes
+
+- Circular dependency detection (System A depends on System B which depends on A)
+  is part of the dependency mapping phase — not independently fixture-tested here.
+- Priority tier assignment (MVP heuristics) is evaluated as part of the Case 1
+  collaborative workflow rather than independently.
+- The `next` argument mode (handing off the highest-priority undesigned system to
+  `/design-system`) is not tested here — it is a post-index-creation convenience.
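Reviewer sketch for the `/map-systems` spec above: the Coverage Notes mention circular dependency detection during dependency mapping. One way to illustrate it, assuming the dependency map is available as a plain dictionary, is Python's standard `graphlib` module. The function name `design_order` and the system names in the usage note are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def design_order(dependencies):
    """Order systems so that every system comes after the systems it depends on.

    `dependencies` maps a system name to the set of systems it depends on.
    Raises ValueError naming the cycle if the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args is the list of nodes forming the cycle.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"circular dependency: {cycle}") from exc
```

For example, `design_order({"inventory": {"save"}, "hud": {"inventory"}, "save": set()})` places `save` before `inventory` and `inventory` before `hud`, which matches the idea of designing foundational systems first.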
--- a/Framework/skills/pipeline/propagate-design-change.md
+++ b/Framework/skills/pipeline/propagate-design-change.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /propagate-design-change
+
+## Skill Summary
+
+`/propagate-design-change` handles GDD revision cascades. When a GDD is updated,
+the skill traces all downstream artifacts that reference it: ADRs, TR-registry
+entries, stories, and epics. It produces a structured impact report showing what
+needs to change and why. The skill does NOT automatically apply changes — it
+proposes edits for each affected artifact and asks "May I write" per artifact
+before making any modification.
+
+The skill is read-only during analysis and write-gated per artifact during the
+update phase. It has no director gates — the analysis itself is mechanical
+tracing, not a creative review.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED, NO IMPACT
+- [ ] Contains "May I write" collaborative protocol language (per-artifact approval)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents that changes are proposed, not applied automatically
+
+---
+
+## Director Gate Checks
+
+No director gates — this skill spawns no director gate agents during analysis.
+The impact report is a mechanical tracing operation; no creative or technical
+director review is required at the analysis stage.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — GDD revision affects 2 stories and 1 epic
+
+**Fixture:**
+- `design/gdd/[system].md` exists and has been recently revised (git diff shows changes)
+- `production/epics/[layer]/EPIC-[system].md` references this GDD
+- 2 story files reference TR-IDs from this GDD
+- The changed GDD section affects the acceptance criteria of both stories
+
+**Input:** `/propagate-design-change design/gdd/[system].md`
+
+**Expected behavior:**
+1. Skill reads the revised GDD and identifies what changed (git diff or content comparison)
+2. Skill scans ADRs, TR-registry, epics, and stories for references to this GDD
+3. Skill produces an impact report: 1 epic affected, 2 stories affected
+4. Skill shows the proposed change for each artifact
+5. For each artifact: asks "May I update [filepath]?" separately
+6. Applies changes only after per-artifact approval
+
+**Assertions:**
+- [ ] Impact report identifies all 3 affected artifacts (1 epic + 2 stories)
+- [ ] Each affected artifact's proposed change is shown before asking to write
+- [ ] "May I write" is asked per artifact (not once for all artifacts)
+- [ ] Skill does NOT apply any changes without per-artifact approval
+- [ ] Verdict is COMPLETE after all approved changes are applied
+
+---
+
+### Case 2: No Impact — Changed GDD has no downstream references
+
+**Fixture:**
+- `design/gdd/[system].md` exists and has been revised
+- No ADRs, stories, or epics reference this GDD's TR-IDs or file path
+
+**Input:** `/propagate-design-change design/gdd/[system].md`
+
+**Expected behavior:**
+1. Skill reads the revised GDD
+2. Skill scans all ADRs, stories, and epics for references
+3. No references found
+4. Skill outputs: "No downstream impact found for [system].md — no artifacts reference this GDD."
+5. No write operations are performed
+
+**Assertions:**
+- [ ] Skill outputs the "No downstream impact found" message
+- [ ] Verdict is NO IMPACT
+- [ ] No "May I write" asks are issued (nothing to update)
+- [ ] Skill does NOT error or crash when no references are found
+
+---
+
+### Case 3: In-Progress Story Warning — Referenced story is currently being developed
+
+**Fixture:**
+- A story referencing this GDD has `Status: In Progress`
+- The developer has already started implementing this story
+
+**Input:** `/propagate-design-change design/gdd/[system].md`
+
+**Expected behavior:**
+1. Skill identifies the In Progress story as an affected artifact
+2. Skill outputs an elevated warning: "CAUTION: [story-file] is currently In Progress — a developer may be working on this. Coordinate before updating."
+3. The warning appears in the impact report before the "May I write" ask for that story
+4. User can still approve or skip the update for that story
+
+**Assertions:**
+- [ ] In Progress story is flagged with an elevated warning (distinct from regular affected-artifact entries)
+- [ ] Warning appears before the "May I write" ask for that story
+- [ ] Skill still offers to update the story — the warning does not block the option
+- [ ] Other (non-In-Progress) artifacts are not affected by this warning
+
+---
+
+### Case 4: Edge Case — No argument provided
+
+**Fixture:**
+- Multiple GDDs exist in `design/gdd/`
+
+**Input:** `/propagate-design-change` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Skill outputs a usage error: "No GDD specified. Usage: /propagate-design-change design/gdd/[system].md"
+3. Skill lists recently modified GDDs as suggestions (git log)
+4. No analysis is performed
+
+**Assertions:**
+- [ ] Skill outputs a usage error when no argument is given
+- [ ] Usage example is shown with the correct path format
+- [ ] No impact analysis is performed without a target GDD
+- [ ] Skill does NOT silently pick a GDD without user input
+
+---
+
+### Case 5: Director Gate — No gate spawned regardless of review mode
+
+**Fixture:**
+- A GDD has been revised with downstream references
+- `production/session-state/review-mode.txt` exists with `full`
+
+**Input:** `/propagate-design-change design/gdd/[system].md`
+
+**Expected behavior:**
+1. Skill reads the GDD and traces downstream references
+2. Skill does NOT read `production/session-state/review-mode.txt`
+3. No director gate agents are spawned at any point
+4. Impact report is produced and per-artifact approval proceeds normally
+
+**Assertions:**
+- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
+- [ ] Skill does NOT read `production/session-state/review-mode.txt`
+- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
+- [ ] Review mode has no effect on this skill's behavior
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads revised GDD and all potentially affected artifacts before producing impact report
+- [ ] Impact report shown in full before any "May I write" ask
+- [ ] "May I write" asked per artifact — never for the entire set at once
+- [ ] In Progress stories flagged with elevated warning before their approval ask
+- [ ] No director gates — no review-mode.txt read
+- [ ] Ends with next-step handoff appropriate to verdict (COMPLETE or NO IMPACT)
+
+---
+
+## Coverage Notes
+
+- ADR impact (when a GDD change requires an ADR update or new ADR) follows the
+  same per-artifact approval pattern as story/epic updates — not independently
+  fixture-tested.
+- TR-registry impact (when changed GDD requires new or updated TR-IDs) is part
+  of the analysis phase but not independently fixture-tested.
+- The git diff comparison method (detecting what changed in the GDD) is a runtime
+  concern — fixtures use pre-arranged content differences.
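Reviewer sketch for the `/propagate-design-change` spec above: the tracing step in Cases 1 to 3 amounts to finding artifacts that mention the GDD path or one of its TR-IDs, and flagging those that are In Progress. The Python below is a simplified illustration that uses plain substring matching on pre-loaded text; `trace_impact`, `analysis_verdict`, and the sample paths in the usage note are hypothetical.

```python
def trace_impact(gdd_path, tr_ids, artifacts):
    """Return affected artifacts, sorted by path.

    `artifacts` maps an artifact path to its text content. An artifact is
    affected if it mentions the GDD path or any of the GDD's TR-IDs.
    Substring matching is an assumption of this sketch; a real tracer
    would likely parse the reference fields instead.
    """
    needles = [gdd_path, *tr_ids]
    affected = []
    for path, text in sorted(artifacts.items()):
        if any(needle in text for needle in needles):
            affected.append({
                "path": path,
                # In Progress stories get an elevated warning before approval.
                "caution": "Status: In Progress" in text,
            })
    return affected


def analysis_verdict(affected):
    """NO IMPACT when nothing references the GDD; otherwise approvals are pending."""
    return "NO IMPACT" if not affected else "PENDING APPROVAL"
```

`PENDING APPROVAL` is a placeholder label for this sketch only. The spec's COMPLETE verdict applies after the per-artifact approvals have been handled, which this function does not model.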
--- a/Framework/skills/readiness/story-done.md
+++ b/Framework/skills/readiness/story-done.md
@@ -0,0 +1,209 @@
+# Skill Test Spec: /story-done
+
+## Skill Summary
+
+`/story-done` closes the loop between design and implementation. Run at the
+end of implementing a story, it reads the story file and verifies each
+acceptance criterion against the implementation. It checks for GDD and ADR
+deviations, prompts a code review, updates the story status to `Complete`,
+logs any tech debt, and surfaces the next ready story from the sprint. It
+produces a COMPLETE / COMPLETE WITH NOTES / BLOCKED verdict and writes to
+the story file and optionally to `docs/tech-debt-register.md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥5 phase headings (complex skill warranting `context: fork` if applicable)
+- [ ] Contains verdict keywords: COMPLETE, COMPLETE WITH NOTES, BLOCKED
+- [ ] Contains "May I write" collaborative protocol language (writes to story file and tech-debt register)
+- [ ] Has a next-step handoff (surfaces next story from sprint)
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All acceptance criteria met, no deviations
+
+**Fixture:**
+- Story file at `production/epics/core/story-light-pickup.md` with:
+  - 3 acceptance criteria, all implemented as described
+  - `TR-ID: TR-light-001` referencing a GDD requirement
+  - `ADR: docs/architecture/adr-003-inventory.md` (Accepted)
+  - `Status: In Progress`
+- Implementation files listed in story exist in `src/`
+- GDD requirement text at TR-light-001 matches how the feature was implemented
+- ADR guidance was followed (no deviations)
+
+**Input:** `/story-done production/epics/core/story-light-pickup.md`
+
+**Expected behavior:**
+1. Skill reads the story file and extracts all key fields
+2. Skill reads the GDD requirement fresh from `tr-registry.yaml` (not from story's quoted text)
+3. Skill reads the referenced ADR to understand implementation constraints
+4. Skill evaluates each acceptance criterion (auto where possible, manual prompt where not)
+5. Skill checks for GDD requirement deviations
+6. Skill checks for ADR guideline deviations
+7. Skill prompts user: "Please provide the code review outcome for this story"
+8. Skill presents COMPLETE verdict
+9. Skill asks "May I update story Status to Complete and add Completion Notes?"
+10. If yes: skill updates the story file
+11. Skill surfaces the next `Ready for Dev` story from the sprint
+
+**Assertions:**
+- [ ] Skill reads `docs/architecture/tr-registry.yaml` for TR-ID requirement text (not just story)
+- [ ] Skill reads the referenced ADR file (not just the story reference)
+- [ ] Each acceptance criterion is listed with VERIFIED / DEFERRED / FAILED status
+- [ ] Skill prompts the user for code review outcome (does not skip this step)
+- [ ] Verdict is COMPLETE when all criteria are verified and no deviations exist
+- [ ] Skill asks "May I write" before updating the story file
+- [ ] Skill does NOT auto-update story status without user confirmation
+- [ ] After completion, skill surfaces the next ready story from `production/sprints/`
+
+---
+
+### Case 2: Blocked Path — Acceptance criterion cannot be verified
+
+**Fixture:**
+- Story file has an acceptance criterion: "Player sees correct animation on pickup"
+- No automated test for this criterion exists
+- Manual verification has not been performed
+- All other criteria are met
+
+**Input:** `/story-done production/epics/core/story-light-pickup.md`
+
+**Expected behavior:**
+1. Skill processes all acceptance criteria
+2. Reaches the animation criterion — cannot auto-verify
+3. Skill asks the user: "Acceptance criterion 'Player sees correct animation on
+   pickup' cannot be auto-verified. Has this been manually tested?"
+4. If user says No: criterion is marked DEFERRED, verdict becomes COMPLETE WITH NOTES
+5. Skill records the deferred criterion in completion notes
+6. Asks "May I write updated story with deferred criterion noted?"
+
+**Assertions:**
+- [ ] Skill asks the user about unverifiable criteria rather than assuming PASS
+- [ ] Deferred criteria result in COMPLETE WITH NOTES (not COMPLETE or BLOCKED)
+- [ ] The deferred criterion is explicitly named in the completion notes
+- [ ] Skill still asks "May I write" before updating the story file
+
+---
+
+### Case 3: Blocked Path — GDD deviation detected
+
+**Fixture:**
+- Story TR-ID points to requirement: "Player can carry max 3 light sources"
+- Implementation in `src/` uses a variable `MAX_CARRIED_LIGHTS = 5`
+- This is a deliberate deviation from the GDD
+
+**Input:** `/story-done production/epics/core/story-light-pickup.md`
+
+**Expected behavior:**
+1. Skill reads the GDD requirement text (max 3)
+2. Skill detects discrepancy between requirement and implementation value (5)
+3. Skill flags this as a GDD deviation and asks the user to classify it:
+   - INTENTIONAL: document the deviation and reason
+   - ERROR: implementation must be fixed before story can be marked Complete
+   - OUT OF SCOPE: requirement changed and GDD needs updating
+4. If INTENTIONAL: skill records deviation in completion notes, verdict is COMPLETE WITH NOTES
+5. If ERROR: verdict is BLOCKED until implementation is corrected
+
+**Assertions:**
+- [ ] Skill detects the mismatch between GDD requirement and implementation value
+- [ ] Skill asks the user to classify the deviation (does not auto-assume either way)
+- [ ] INTENTIONAL deviation → COMPLETE WITH NOTES (not BLOCKED)
+- [ ] ERROR deviation → BLOCKED verdict until fixed
+- [ ] Detected deviations are recorded in completion notes or tech debt register
+
+---
+
+### Case 4: Edge Case — No argument, auto-detect current story
+
+**Fixture:**
+- `production/session-state/active.md` contains a reference to
+  `production/epics/core/story-oxygen-drain.md` as the active story
+- That story file exists with `Status: In Progress`
+
+**Input:** `/story-done` (no argument)
+
+**Expected behavior:**
+1. Skill reads `production/session-state/active.md`
+2. Skill finds the active story reference
+3. Skill reads that story file and proceeds normally
+4. Output confirms which story was auto-detected
+
+**Assertions:**
+- [ ] Skill reads `production/session-state/active.md` when no argument is given
+- [ ] Skill identifies and confirms the auto-detected story before proceeding
+- [ ] If no story is found in session state, skill asks the user to provide a path
+
+---
+
+---
+
+### Case 5: Director Gate — LP-CODE-REVIEW behavior across review modes
+
+**Fixture:**
+- Story file at `production/epics/core/story-light-pickup.md`
+- All acceptance criteria verified, no GDD deviations
+- `production/session-state/review-mode.txt` exists
+
+**Case 5a — full mode:**
+- `review-mode.txt` contains `full`
+
+**Input:** `/story-done production/epics/core/story-light-pickup.md` (full mode)
+
+**Expected behavior:**
+1. Skill reads review mode — determines `full`
+2. After implementation verification, skill invokes LP-CODE-REVIEW gate
+3. Lead programmer reviews the implementation
+4. If LP verdict is NEEDS CHANGES → story cannot be marked Complete
+5. If LP verdict is APPROVED → skill proceeds to mark story Complete
+
+**Assertions (5a):**
+- [ ] Skill reads review mode before deciding whether to invoke LP-CODE-REVIEW
+- [ ] LP-CODE-REVIEW gate is invoked in full mode after implementation check
+- [ ] An LP NEEDS CHANGES verdict prevents story from being marked Complete
+- [ ] Gate result is noted in output: "Gate: LP-CODE-REVIEW — [result]"
+- [ ] Skill still asks "May I write" before updating story status even if LP approved
+
+**Case 5b — lean or solo mode:**
+- `review-mode.txt` contains `lean` or `solo`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `lean` or `solo`
+2. LP-CODE-REVIEW gate is SKIPPED
+3. Output notes the skip: "[LP-CODE-REVIEW] skipped — Lean/Solo mode"
+4. Story completion proceeds based on acceptance criteria check only
+
+**Assertions (5b):**
+- [ ] LP-CODE-REVIEW gate does NOT spawn in lean or solo mode
+- [ ] Skip is explicitly noted in output
+- [ ] Skill still requires "May I write" approval before marking story Complete
+
+---
+
+## Protocol Compliance
+
+- [ ] Uses "May I write" before updating the story file
+- [ ] Uses "May I write" before adding entries to `docs/tech-debt-register.md`
+- [ ] Presents complete findings (criteria check, deviation check) before asking approval
+- [ ] Ends by surfacing the next ready story from the sprint plan
+- [ ] Does not mark a story Complete if any criteria are in ERROR state
+- [ ] Does not skip the code review prompt
+
+---
+
+## Coverage Notes
+
+- The full 8-phase flow of the skill is exercised across Cases 1-3; not all
+  edge cases within each phase are covered.
+- Tech debt logging (deferred items written to `docs/tech-debt-register.md`)
+  is mentioned in Case 2 but not the primary assertion focus; dedicated
+  coverage deferred.
+- The `sprint-status.yaml` update (Phase 7 in the skill) is implied by Case 1
+  but not the primary assertion; assumed to follow the same "May I write" pattern.
+- Stories with multiple TR-IDs or multiple ADRs are not explicitly tested.
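
Reviewer note on the `/story-done` spec above: the review-mode branching in Case 5 can be sketched as a small decision function. This is a hypothetical Python illustration, not the skill's implementation. The names `GateOutcome` and `resolve_lp_code_review` are invented here; only the mode names, the gate ID, and the verdict strings come from the spec.

```python
from dataclasses import dataclass
from typing import Optional

GATE_ID = "LP-CODE-REVIEW"  # gate ID taken from the spec


@dataclass(frozen=True)
class GateOutcome:
    invoked: bool       # True only in full mode
    may_complete: bool  # False when the lead programmer asks for changes
    note: str           # hypothetical summary of what happened to the gate


def resolve_lp_code_review(review_mode: str,
                           lp_verdict: Optional[str] = None) -> GateOutcome:
    """Hypothetical sketch of Case 5: decide whether the gate runs and blocks."""
    mode = review_mode.strip().lower()
    if mode in ("lean", "solo"):
        # Case 5b: the gate is skipped; completion rests on the criteria check.
        return GateOutcome(False, True, f"{GATE_ID} skipped ({mode} mode)")
    if mode != "full":
        raise ValueError(f"unknown review mode: {review_mode!r}")
    if lp_verdict not in ("APPROVED", "NEEDS CHANGES"):
        raise ValueError("full mode requires an LP verdict")
    # Case 5a: NEEDS CHANGES prevents the story from being marked Complete.
    return GateOutcome(True, lp_verdict == "APPROVED", f"{GATE_ID}: {lp_verdict}")
```

Even when `may_complete` is true, the spec still requires the "May I write" prompt before the story file is updated; that approval step is outside this sketch.
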
--- a/Framework/skills/readiness/story-readiness.md
+++ b/Framework/skills/readiness/story-readiness.md
@@ -0,0 +1,195 @@
+# Skill Test Spec: /story-readiness
+
+## Skill Summary
+
+`/story-readiness` validates that a story file is ready for a developer to
+pick up and implement. It checks four dimensions: Design (embedded GDD
+requirements), Architecture (ADR references and status), Scope (clear
+boundaries and DoD), and Definition of Done (testable criteria). It produces
+a READY / NEEDS WORK / BLOCKED verdict. It is a read-only skill and runs
+before any developer picks up a story.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings or numbered check sections
+- [ ] Contains verdict keywords: READY, NEEDS WORK, BLOCKED
+- [ ] Does NOT require "May I write" language (read-only skill)
+- [ ] Has a next-step handoff (what to do after verdict)
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Fully ready story
+
+**Fixture:**
+- Story file exists at `production/epics/core/story-light-pickup.md`
+- Story contains:
+  - `TR-ID: TR-light-001` (GDD requirement reference)
+  - `ADR: docs/architecture/adr-003-inventory.md`
+  - Referenced ADR exists and has status `Accepted`
+  - Referenced TR-ID exists in `docs/architecture/tr-registry.yaml`
+  - Story has `## Acceptance Criteria` with ≥3 testable items
+  - Story has `## Definition of Done` section
+  - Story has `Status: Ready for Dev`
+  - Manifest version in story header matches current `docs/architecture/control-manifest.md`
+
+**Input:** `/story-readiness production/epics/core/story-light-pickup.md`
+
+**Expected behavior:**
+1. Skill reads the story file
+2. Skill reads the referenced ADR — verifies status is `Accepted`
+3. Skill reads `docs/architecture/tr-registry.yaml` — verifies TR-ID exists
+4. Skill reads `docs/architecture/control-manifest.md` — verifies manifest version matches
+5. Skill evaluates all 4 dimensions (Design, Architecture, Scope, DoD)
+6. Skill outputs READY verdict with all checks passing
+
+**Assertions:**
+- [ ] Skill reads the referenced ADR file (not just the story)
+- [ ] Skill verifies ADR status is `Accepted` (not `Proposed`)
+- [ ] Skill reads `tr-registry.yaml` to verify TR-ID exists
+- [ ] Output includes check results for all 4 dimensions
+- [ ] Verdict is READY when all checks pass
+- [ ] Skill does not write any files
+
+---
+
+### Case 2: Blocked Path — Referenced ADR is Proposed (not Accepted)
+
+**Fixture:**
+- Story file exists with `ADR: docs/architecture/adr-005-light-system.md`
+- `adr-005-light-system.md` exists but has `Status: Proposed`
+- All other story content is otherwise complete
+
+**Input:** `/story-readiness production/epics/core/story-light-system.md`
+
+**Expected behavior:**
+1. Skill reads the story
+2. Skill reads `adr-005-light-system.md` — finds `Status: Proposed`
+3. Skill flags this as a BLOCKING issue (cannot implement against unaccepted ADR)
+4. Skill outputs BLOCKED verdict
+5. Skill recommends: accept or reject the ADR before picking up the story
+
+**Assertions:**
+- [ ] Verdict is BLOCKED (not NEEDS WORK or READY) when ADR is Proposed
+- [ ] Output explicitly names the Proposed ADR as the blocker
+- [ ] Output recommends resolving ADR status before proceeding
+- [ ] Skill does not output READY regardless of other checks passing
+
+---
+
+### Case 3: Needs Work — Missing Acceptance Criteria
+
+**Fixture:**
+- Story file exists but has no `## Acceptance Criteria` section
+- ADR reference exists and is `Accepted`
+- TR-ID exists in registry
+- Manifest version matches
+
+**Input:** `/story-readiness production/epics/core/story-oxygen-drain.md`
+
+**Expected behavior:**
+1. Skill reads the story
+2. Skill finds no Acceptance Criteria section
+3. Skill flags this as a NEEDS WORK issue (story is incomplete, not blocked)
+4. Skill outputs NEEDS WORK verdict
+5. Skill names the missing section and suggests adding measurable criteria
+
+**Assertions:**
+- [ ] Verdict is NEEDS WORK (not BLOCKED or READY) when Acceptance Criteria section is absent
+- [ ] Output identifies the missing Acceptance Criteria section specifically
+- [ ] Output suggests adding testable/measurable criteria
+- [ ] Skill distinguishes NEEDS WORK (fixable without external dependencies) from BLOCKED (requires outside action)
+
+---
+
+### Case 4: Edge Case — Stale manifest version
+
+**Fixture:**
+- Story file has `Manifest Version: 2026-01-15` in its header
+- `docs/architecture/control-manifest.md` has `Manifest Version: 2026-03-10`
+- Versions do not match (story was created before manifest was updated)
+
+**Input:** `/story-readiness production/epics/core/story-mirror-rotation.md`
+
+**Expected behavior:**
+1. Skill reads the story and extracts manifest version `2026-01-15`
+2. Skill reads control manifest header and extracts current version `2026-03-10`
+3. Skill detects version mismatch
+4. Skill flags this as an ADVISORY issue (not blocking, but worth noting)
+5. Verdict is NEEDS WORK with manifest staleness noted
+
+**Assertions:**
+- [ ] Skill reads `docs/architecture/control-manifest.md` to get current version
+- [ ] Skill compares story's embedded manifest version against current manifest version
+- [ ] Stale manifest version results in NEEDS WORK (not BLOCKED, not READY)
+- [ ] Output explains that the story's embedded guidance may be outdated
+
+---
+
+---
+
+### Case 5: Director Gate — QL-STORY-READY behavior across review modes
+
+**Fixture:**
+- Story file exists and is READY (all 4 dimensions pass, ADR Accepted, criteria present)
+- `production/session-state/review-mode.txt` exists
+
+**Case 5a — full mode:**
+- `review-mode.txt` contains `full`
+
+**Input:** `/story-readiness production/epics/core/story-light-pickup.md` (full mode)
+
+**Expected behavior:**
+1. Skill reads review mode — determines `full`
+2. After completing its own 4-dimension check, skill invokes QL-STORY-READY gate
+3. QA lead reviews the story for readiness
+4. If QA lead verdict is INADEQUATE → story verdict is BLOCKED regardless of 4-dimension result
+5. If QA lead verdict is ADEQUATE → verdict proceeds normally
+
+**Assertions (5a):**
+- [ ] Skill reads review mode before deciding whether to invoke QL-STORY-READY
+- [ ] QL-STORY-READY gate is invoked in full mode after the 4-dimension check completes
+- [ ] A QA lead INADEQUATE verdict overrides a READY 4-dimension result → final verdict BLOCKED
+- [ ] Gate invocation is noted in output: "Gate: QL-STORY-READY — [result]"
+
+**Case 5b — lean or solo mode:**
+- `review-mode.txt` contains `lean` or `solo`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `lean` or `solo`
+2. QL-STORY-READY gate is SKIPPED
+3. Output notes the skip: "[QL-STORY-READY] skipped — Lean/Solo mode"
+4. Verdict is based on 4-dimension check only
+
+**Assertions (5b):**
+- [ ] QL-STORY-READY gate does NOT spawn in lean or solo mode
+- [ ] Skip is explicitly noted in output
+- [ ] Verdict is based on 4-dimension check alone
+
+---
+
+## Protocol Compliance
+
+- [ ] Does NOT use Write or Edit tools (read-only skill)
+- [ ] Presents complete check results before verdict
+- [ ] Does not ask for approval (no file writes)
+- [ ] Ends with recommended next step (fix issues or proceed to implementation)
+- [ ] Distinguishes three verdict levels clearly (READY vs NEEDS WORK vs BLOCKED)
+
+---
+
+## Coverage Notes
+
+- Case where TR-ID is missing from the registry entirely is not explicitly
+  tested here; it follows the same NEEDS WORK pattern as Case 3.
+- The "no argument" path (skill auto-detecting the current story) is not
+  tested because it depends on `production/session-state/active.md` content,
+  which is hard to fixture reliably.
+- Stories with multiple ADR references are not tested; behavior is assumed to
+  be additive (all ADRs must be Accepted for READY verdict).
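
Reviewer note on the `/story-readiness` spec above: the verdict rules exercised by Cases 1 to 5 can be summarized in a few lines. The sketch below is an assumption-laden illustration in Python. `StoryFacts` and `readiness_verdict` are hypothetical names, and treating every ADR status other than `Accepted` as blocking is an assumption (the spec only covers `Proposed`).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoryFacts:
    adr_status: str                # e.g. "Accepted" or "Proposed"
    has_acceptance_criteria: bool
    story_manifest_version: str    # version embedded in the story header
    current_manifest_version: str  # version in control-manifest.md


def readiness_verdict(facts: StoryFacts,
                      qa_gate: Optional[str] = None) -> Tuple[str, List[str]]:
    """Return (verdict, reasons). qa_gate is None when the gate was skipped."""
    blockers: List[str] = []
    fixes: List[str] = []
    if facts.adr_status != "Accepted":
        # Case 2: an unaccepted ADR needs outside action, so it blocks.
        blockers.append(f"ADR status is {facts.adr_status}")
    if not facts.has_acceptance_criteria:
        # Case 3: fixable inside the story itself.
        fixes.append("Acceptance Criteria section is missing")
    if facts.story_manifest_version != facts.current_manifest_version:
        # Case 4: stale manifest version is advisory but yields NEEDS WORK.
        fixes.append("manifest version is stale")
    if qa_gate == "INADEQUATE":
        # Case 5a: the QA lead overrides a passing 4-dimension check.
        blockers.append("QL-STORY-READY gate returned INADEQUATE")
    if blockers:
        return "BLOCKED", blockers + fixes
    if fixes:
        return "NEEDS WORK", fixes
    return "READY", []
```

The ordering matters: blockers are checked before fixable issues, so a story that is both stale and tied to a Proposed ADR reports BLOCKED, which matches the spec's distinction between outside action and in-story fixes.
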
--- a/Framework/skills/review/architecture-review.md
+++ b/Framework/skills/review/architecture-review.md
@@ -0,0 +1,192 @@
+# Skill Test Spec: /architecture-review
+
+## Skill Summary
+
+`/architecture-review` is an Opus-tier skill that validates a technical architecture
+document against the project's 8 required architecture sections and checks that it
+is internally consistent, non-contradictory with existing ADRs, and correctly
+targeting the pinned engine version. It produces a verdict of APPROVED /
+NEEDS REVISION / MAJOR REVISION NEEDED.
+
+In `full` review mode, the skill spawns two director gate agents in parallel:
+TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY (lead-programmer). In
+`lean` or `solo` mode, both gates are skipped and noted. The skill is read-only —
+no files are written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
+- [ ] Does NOT require "May I write" language (read-only skill)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents gate behavior: TD-ARCHITECTURE + LP-FEASIBILITY in full mode; skipped in lean/solo
+
+---
+
+## Director Gate Checks
+
+In `full` mode: TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
+(lead-programmer) are spawned in parallel after the skill reads the architecture doc.
+
+In `lean` mode: both gates are skipped. Output notes:
+"TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".
+
+In `solo` mode: both gates are skipped with equivalent notes.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Complete architecture doc in full mode
+
+**Fixture:**
+- `docs/architecture/architecture.md` exists with all 8 required sections populated
+- All sections reference the correct engine version from `docs/engine-reference/`
+- No contradictions with existing Accepted ADRs in `docs/architecture/`
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/architecture-review docs/architecture/architecture.md`
+
+**Expected behavior:**
+1. Skill reads the architecture document
+2. Skill reads existing ADRs for cross-reference
+3. Skill reads engine version reference
+4. TD-ARCHITECTURE and LP-FEASIBILITY gate agents spawn in parallel
+5. Both gates return APPROVED
+6. Skill outputs section-by-section completeness check (8/8 sections present)
+7. Verdict: APPROVED
+
+**Assertions:**
+- [ ] All 8 required sections are checked and reported
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel (not sequentially)
+- [ ] Verdict is APPROVED when all sections are present and no conflicts exist
+- [ ] Skill does NOT write any files
+- [ ] Next-step handoff to `/create-control-manifest` or `/create-epics` is present
+
+---
+
+### Case 2: Failure Path — Missing required sections
+
+**Fixture:**
+- `docs/architecture/architecture.md` exists but is missing at least 2 required sections
+  (e.g., no data model section, no error handling section)
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/architecture-review docs/architecture/architecture.md`
+
+**Expected behavior:**
+1. Skill reads the document and identifies missing sections
+2. Section completeness shows fewer than 8/8 sections present
+3. Missing sections are listed by name with specific remediation guidance
+4. Verdict: MAJOR REVISION NEEDED (≥2 missing sections)
+
+**Assertions:**
+- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) for ≥2 missing sections
+- [ ] Each missing section is named explicitly in the output
+- [ ] Remediation guidance is specific (what to add, not just "add missing sections")
+- [ ] Skill does NOT pass a document missing required sections
+
+---
+
+### Case 3: Partial Path — Architecture contradicts an existing ADR
+
+**Fixture:**
+- `docs/architecture/architecture.md` exists with all 8 sections present
+- One Accepted ADR in `docs/architecture/` establishes a constraint that the architecture doc contradicts
+  (e.g., ADR-001 mandates ECS pattern; architecture.md describes a different pattern for the same system)
+
+**Input:** `/architecture-review docs/architecture/architecture.md`
+
+**Expected behavior:**
+1. Skill reads the architecture doc and all existing ADRs
+2. Conflict is detected between the architecture doc and the named ADR
+3. Conflict entry names: the ADR number/title, the contradicting sections, and impact
+4. Verdict: NEEDS REVISION (conflict exists but structure is otherwise sound)
+
+**Assertions:**
+- [ ] Verdict is NEEDS REVISION (not MAJOR REVISION NEEDED for a single contradiction)
+- [ ] The specific ADR number and title are named in the conflict entry
+- [ ] The contradicting sections in both documents are identified
+- [ ] Skill does NOT auto-resolve the contradiction
+
+---
+
+### Case 4: Edge Case — File not found
+
+**Fixture:**
+- The path provided does not exist in the project
+
+**Input:** `/architecture-review docs/architecture/nonexistent.md`
+
+**Expected behavior:**
+1. Skill attempts to read the file
+2. File not found
+3. Skill outputs a clear error naming the missing file
+4. Skill suggests checking `docs/architecture/` or running `/create-architecture`
+5. Skill does NOT produce a verdict
+
+**Assertions:**
+- [ ] Skill outputs a clear error when the file is not found
+- [ ] No verdict is produced (APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED)
+- [ ] Skill suggests a corrective action
+- [ ] Skill does NOT crash or produce a partial report
+
+---
+
+### Case 5: Director Gate — Full mode spawns both gates; solo mode skips both
+
+**Fixture (full mode):**
+- `docs/architecture/architecture.md` exists with all 8 sections
+- `production/session-state/review-mode.txt` contains `full`
+
+**Full mode expected behavior:**
+1. TD-ARCHITECTURE gate spawns
+2. LP-FEASIBILITY gate spawns in parallel with TD-ARCHITECTURE
+3. Both gates complete before verdict is issued
+
+**Assertions (full mode):**
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY both appear in the output as completed gates
+- [ ] Both gates spawn in parallel (not one after the other)
+- [ ] Verdict reflects gate feedback
+
+**Fixture (solo mode):**
+- Same architecture doc
+- `production/session-state/review-mode.txt` contains `solo`
+
+**Solo mode expected behavior:**
+1. Skill reads the architecture doc
+2. Gates are NOT spawned
+3. Output notes: "TD-ARCHITECTURE skipped — solo mode" and "LP-FEASIBILITY skipped — solo mode"
+4. Verdict is based on structural checks only
+
+**Assertions (solo mode):**
+- [ ] Neither TD-ARCHITECTURE nor LP-FEASIBILITY appears as an active gate
+- [ ] Both skipped gates are noted in the output
+- [ ] Verdict is still produced based on the structural check alone
+
+---
+
+## Protocol Compliance
+
+- [ ] Does NOT write any files (read-only skill)
+- [ ] Presents section completeness check before issuing verdict
+- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel in full mode
+- [ ] Skipped gates are noted by name and mode in lean/solo output
+- [ ] Verdict is one of exactly: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
+- [ ] Ends with next-step handoff appropriate to verdict
+
+---
+
+## Coverage Notes
+
+- The 8 required architecture sections are project-specific; tests use the
+  section list defined in the skill body — not re-enumerated here.
+- Engine version compatibility checking (cross-referencing `docs/engine-reference/`)
+  is part of Case 1's happy path but not independently fixture-tested.
+- RTM (requirement traceability matrix) output belongs to the skill's separate
+  `rtm` argument mode, which is not tested here.
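
Reviewer note on the `/architecture-review` spec above: the "spawn in parallel, not sequentially" assertion can be illustrated with threads. This is only an analogy, since the real gates are director agents, not Python callables. `GATES_BY_MODE`, `run_gates`, and `stub_reviewer` are hypothetical names. The barrier in the stub makes the demo fail if the two reviewers are run one after the other.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Gate IDs and mode names come from the spec; the mapping itself is a sketch.
GATES_BY_MODE: Dict[str, List[str]] = {
    "full": ["TD-ARCHITECTURE", "LP-FEASIBILITY"],
    "lean": [],
    "solo": [],
}


def run_gates(mode: str, reviewers: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every gate for the mode concurrently and collect the verdicts."""
    gate_ids = GATES_BY_MODE[mode]
    if not gate_ids:
        return {}  # lean/solo: gates are skipped, nothing is spawned
    with ThreadPoolExecutor(max_workers=len(gate_ids)) as pool:
        futures = {gate_id: pool.submit(reviewers[gate_id]) for gate_id in gate_ids}
        # All gates must finish before a verdict could be issued.
        return {gate_id: future.result() for gate_id, future in futures.items()}


both_started = threading.Barrier(2)


def stub_reviewer() -> str:
    # Raises BrokenBarrierError after 5 s unless both reviewers run at once.
    both_started.wait(timeout=5)
    return "APPROVED"


stubs = {"TD-ARCHITECTURE": stub_reviewer, "LP-FEASIBILITY": stub_reviewer}
full_results = run_gates("full", stubs)
solo_results = run_gates("solo", stubs)
```

A test harness for the real skill would need some observable signal of concurrency (for example, overlapping start timestamps in the transcript); the barrier plays that role here.
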
--- a/Framework/skills/review/design-review.md
+++ b/Framework/skills/review/design-review.md
@@ -0,0 +1,170 @@
+# Skill Test Spec: /design-review
+
+## Skill Summary
+
+`/design-review` reads a game design document (GDD) and evaluates it against
+the project's 8-section design standard (Overview, Player Fantasy, Detailed
+Rules, Formulas, Edge Cases, Dependencies, Tuning Knobs, Acceptance Criteria).
+It checks for internal consistency, implementability, and cross-system
+conflicts. It produces a verdict of APPROVED, NEEDS REVISION, or MAJOR
+REVISION NEEDED. It is a read-only skill (no file writes) and runs as a
+`context: fork` subagent.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings or numbered steps
+- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
+- [ ] Does NOT require "May I write" language (read-only skill — `allowed-tools` excludes Write/Edit)
+- [ ] Output format is documented (review template shown in skill body)
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Complete GDD, all 8 sections present
+
+**Fixture:**
+- `design/gdd/light-manipulation.md` exists (use `_fixtures/minimal-game-concept.md`
+  as a stand-in — represents a complete document with all required content)
+- All 8 required sections are populated with substantive content
+- Formulas section contains at least one formula with defined variables
+- Acceptance Criteria section contains at least 3 testable criteria
+
+**Input:** `/design-review design/gdd/light-manipulation.md`
+
+**Expected behavior:**
+1. Skill reads the target document in full
+2. Skill reads CLAUDE.md for project context and standards
+3. Skill evaluates all 8 required sections (present/absent check)
+4. Skill checks internal consistency (formulas match described behavior)
+5. Skill checks implementability (rules are precise enough to code)
+6. Skill outputs structured review with section-by-section status
+7. Skill outputs APPROVED verdict
+
+**Assertions:**
+- [ ] Skill reads the target file before producing any output
+- [ ] Output includes a "Completeness" section showing X/8 sections present
+- [ ] Output includes an "Internal Consistency" section
+- [ ] Output includes an "Implementability" section
+- [ ] Output ends with a verdict line: APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED
+- [ ] APPROVED verdict is given when all 8 sections are present and consistent
+
+---
+
+### Case 2: Failure Path — Incomplete GDD (4/8 sections)
+
+**Fixture:**
+- `design/gdd/light-manipulation.md` exists using content from
+  `tests/skills/_fixtures/incomplete-gdd.md` (4 of 8 sections populated;
+  Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria are missing)
+
+**Input:** `/design-review design/gdd/light-manipulation.md`
+
+**Expected behavior:**
+1. Skill reads the document
+2. Skill identifies 4 missing sections
+3. Skill outputs "Completeness: 4/8 sections present"
+4. Skill lists specifically which 4 sections are missing
+5. Skill outputs MAJOR REVISION NEEDED verdict (not APPROVED or NEEDS REVISION)
+
+**Assertions:**
+- [ ] Output shows "4/8" in the completeness section (not a higher number)
+- [ ] Output explicitly names each missing section (Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria)
+- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) when ≥3 sections are missing
+- [ ] Output does not suggest the document is implementation-ready
+- [ ] Skill does not write any files (read-only enforcement)
+
+---
+
+### Case 3: Partial Path — 7/8 sections, minor inconsistency
+
+**Fixture:**
+- GDD has all sections except Formulas
+- The described behavior mentions numeric values but no formulas are defined
+- Acceptance Criteria exist but are vague ("feels good" rather than measurable)
+
+**Input:** `/design-review design/gdd/[document].md`
+
+**Expected behavior:**
+1. Skill identifies missing Formulas section
+2. Skill flags vague acceptance criteria as an implementability issue
+3. Skill outputs NEEDS REVISION verdict (not APPROVED, not MAJOR REVISION NEEDED)
+4. Skill provides specific remediation notes for each issue
+
+**Assertions:**
+- [ ] Verdict is NEEDS REVISION (not APPROVED, not MAJOR REVISION NEEDED) for 7/8 with issues
+- [ ] Output identifies the missing Formulas section specifically
+- [ ] Output flags the vague acceptance criteria as an implementability gap
+- [ ] Each flagged issue has a specific, actionable remediation note
+
+---
+
+### Case 4: Edge Case — File not found
+
+**Fixture:**
+- The path provided does not exist in the project
+
+**Input:** `/design-review design/gdd/nonexistent.md`
+
+**Expected behavior:**
+1. Skill attempts to read the file
+2. File not found
+3. Skill outputs an error message naming the missing file
+4. Skill suggests checking the path or listing files in `design/gdd/`
+5. Skill does NOT produce a verdict
+
+**Assertions:**
+- [ ] Skill outputs a clear error when the file is not found
+- [ ] Skill does NOT output APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED when file is missing
+- [ ] Skill suggests a corrective action (check path, list available GDDs)
+
+---
+
+---
+
+### Case 5: Director Gate — no gate spawned regardless of review mode
+
+**Fixture:**
+- `design/gdd/light-manipulation.md` exists with all 8 sections
+- `production/session-state/review-mode.txt` exists with `full` (the mode in which the most director gates would run)
+
+**Input:** `/design-review design/gdd/light-manipulation.md` (with full review mode active)
+
+**Expected behavior:**
+1. Skill reads the GDD document
+2. Skill does NOT read `review-mode.txt` — this skill has no director gates
+3. Skill produces the review output normally
+4. No director gate agents are spawned at any point
+5. Verdict is APPROVED (all 8 sections present in fixture)
+
+**Assertions:**
+- [ ] Skill does NOT spawn any director gate agent (CD-, TD-, PR-, AD- prefixed agents)
+- [ ] Skill does NOT read `review-mode.txt` or equivalent mode file
+- [ ] The `--review` flag or `full` mode state has NO effect on whether directors spawn
+- [ ] Output does not contain any "Gate: [GATE-ID]" entries
+- [ ] Skill IS the review — it does not delegate the review to a director
+
+---
+
+## Protocol Compliance
+
+- [ ] Does NOT use Write or Edit tools (read-only skill)
+- [ ] Presents complete findings before any verdict
+- [ ] Does not ask for approval before producing output (no writes to approve)
+- [ ] Ends with recommended next step (e.g., fix issues and re-run, or proceed to `/map-systems`)
+
+---
+
+## Coverage Notes
+
+- Cross-system consistency checking (Case 3 in the skill's own phase list) is
+  not directly tested here because it requires multiple GDD files to compare;
+  this is covered by the `/review-all-gdds` spec instead.
+- The skill's `context: fork` behavior (running as a subagent) is not tested
+  at the spec level — this is a runtime behavior verified manually.
+- Performance and edge cases involving very large GDD files are not in scope.
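
Reviewer note on the `/design-review` spec above: the completeness count ("X/8 sections present") and the thresholds in Cases 1 to 3 can be sketched as follows. This is a hypothetical Python illustration. It assumes each required section appears as a Markdown heading with exactly the name listed in the Skill Summary, and it assumes that two missing sections map to NEEDS REVISION, which the spec does not state (it only fixes 1 missing and 3 or more missing).

```python
import re
from typing import List, Tuple

# Section names are taken from the Skill Summary of the spec.
REQUIRED_SECTIONS = [
    "Overview", "Player Fantasy", "Detailed Rules", "Formulas",
    "Edge Cases", "Dependencies", "Tuning Knobs", "Acceptance Criteria",
]

HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def completeness(gdd_text: str) -> Tuple[int, List[str]]:
    """Return (number of required sections present, names of missing ones)."""
    headings = {m.group(1).lower() for m in HEADING.finditer(gdd_text)}
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in headings]
    return len(REQUIRED_SECTIONS) - len(missing), missing


def structural_verdict(missing_count: int, has_other_issues: bool = False) -> str:
    if missing_count >= 3:
        return "MAJOR REVISION NEEDED"   # Case 2 threshold
    if missing_count >= 1 or has_other_issues:
        return "NEEDS REVISION"          # Case 3; 2 missing is an assumption
    return "APPROVED"                    # Case 1


incomplete_gdd = """# Light Manipulation
## Overview
## Player Fantasy
## Detailed Rules
## Dependencies
"""
present, missing = completeness(incomplete_gdd)
```

For the sample text, `present` and `missing` reproduce the Case 2 fixture: four sections found, with Formulas, Edge Cases, Tuning Knobs, and Acceptance Criteria reported as missing.
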
--- a/Framework/skills/review/review-all-gdds.md
+++ b/Framework/skills/review/review-all-gdds.md
@@ -0,0 +1,178 @@
+# Skill Test Spec: /review-all-gdds
+
+## Skill Summary
+
+`/review-all-gdds` is an Opus-tier skill that performs a holistic cross-GDD review
+across all files in `design/gdd/`. It runs two complementary review phases in
+parallel: Phase 1 checks for consistency (contradictions, formula mismatches,
+stale references, competing ownership), and Phase 2 checks design theory (dominant
+strategies, pillar drift, cognitive overload, economic imbalance). Because the two
+phases are independent, they are spawned simultaneously to save time. The skill
+produces a CONSISTENT / MINOR ISSUES / MAJOR ISSUES verdict and is read-only by
+default: no files are written without explicit user approval.
+
+The skill is itself the holistic review gate in the pipeline. It is invoked after
+individual GDDs are complete and before architecture work begins. It does NOT spawn
+any director gate agents (it IS the director-level review).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥5 phase headings (complex multi-phase skill)
+- [ ] Contains verdict keywords: CONSISTENT, MINOR ISSUES, MAJOR ISSUES
+- [ ] Does NOT require "May I write" language (read-only skill)
+- [ ] Has a next-step handoff at the end
+- [ ] Documents parallel phase spawning (Phase 1 and Phase 2 are independent)
+
+---
+
+## Director Gate Checks
+
+No director gates — this skill spawns no director gate agents. It IS the holistic
+review; delegating to a director gate would create a circular dependency.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Clean GDD set with no conflicts
+
+**Fixture:**
+- `design/gdd/` contains ≥3 system GDDs
+- All GDDs are internally consistent: no formula contradictions, no competing ownership, no stale references
+- All GDDs align with the pillars defined in `design/gdd/game-pillars.md`
+
+**Input:** `/review-all-gdds`
+
+**Expected behavior:**
+1. Skill reads all GDD files in `design/gdd/`
+2. Phase 1 (consistency scan) and Phase 2 (design theory check) spawn in parallel
+3. Phase 1 finds no contradictions, no formula mismatches, no ownership conflicts
+4. Phase 2 finds no pillar drift, no dominant strategies, no cognitive overload
+5. Skill outputs a structured findings table with 0 blocking issues
+6. Verdict: CONSISTENT
+
+**Assertions:**
+- [ ] Both review phases are spawned in parallel (not sequentially)
+- [ ] Output includes a findings table (even if empty — shows "No issues found")
+- [ ] Verdict is CONSISTENT when no conflicts are found
+- [ ] Skill does NOT write any files without user approval
+- [ ] Next-step handoff to `/architecture-review` or `/create-architecture` is present
+
+---
+
+### Case 2: Failure Path — Conflicting rules between two GDDs
+
+**Fixture:**
+- GDD-A defines a floor value (e.g. "minimum [output] is [N]")
+- GDD-B states a mechanic that bypasses that floor (e.g. "[mechanic] can reduce [output] to 0")
+- The two GDDs are otherwise complete and valid
+
+**Input:** `/review-all-gdds`
+
+**Expected behavior:**
+1. Phase 1 (consistency scan) detects the contradiction between GDD-A and GDD-B
+2. Conflict is reported with: both filenames, the specific conflicting rules, and severity HIGH
+3. Verdict: MAJOR ISSUES
+4. Handoff instructs user to resolve the conflict and re-run before proceeding
+
+**Assertions:**
+- [ ] Verdict is MAJOR ISSUES (not CONSISTENT or MINOR ISSUES)
+- [ ] Both GDD filenames are named in the conflict entry
+- [ ] The specific contradicting rules are quoted or described (not vague "conflict found")
+- [ ] Issue is classified as severity HIGH (blocking)
+- [ ] Skill does NOT auto-resolve the conflict
+
+---
+
+### Case 3: Partial Path — Single GDD with orphaned dependency reference
+
+**Fixture:**
+- GDD-A lists a dependency in its Dependencies section pointing to "system-B"
+- No GDD for system-B exists in `design/gdd/`
+- All other GDDs are consistent
+
+**Input:** `/review-all-gdds`
+
+**Expected behavior:**
+1. Phase 1 detects the orphaned dependency reference in GDD-A
+2. Issue is reported as: DEPENDENCY GAP — GDD-A references system-B which has no GDD
+3. No other conflicts found
+4. Verdict: MINOR ISSUES (dependency gap is advisory, not blocking by itself)
+
+**Assertions:**
+- [ ] Verdict is MINOR ISSUES (not MAJOR ISSUES for a single orphaned reference)
+- [ ] The specific GDD filename and the missing dependency name are reported
+- [ ] Skill suggests running `/design-system system-B` to resolve the gap
+- [ ] Skill does NOT skip or silently ignore the missing dependency
+
+---
+
+### Case 4: Edge Case — No GDD files found
+
+**Fixture:**
+- `design/gdd/` directory is empty or does not exist
+- No GDD files are present
+
+**Input:** `/review-all-gdds`
+
+**Expected behavior:**
+1. Skill attempts to read files in `design/gdd/`
+2. No files found — skill outputs an error with guidance
+3. Skill recommends running `/brainstorm` and `/design-system` before re-running
+4. Skill does NOT produce a verdict (CONSISTENT / MINOR ISSUES / MAJOR ISSUES)
+
+**Assertions:**
+- [ ] Skill outputs a clear error message when no GDDs are found
+- [ ] No verdict is produced when the directory is empty
+- [ ] Skill recommends the correct next action (`/brainstorm` or `/design-system`)
+- [ ] Skill does NOT crash or produce a partial report
+
+---
+
+### Case 5: Director Gate — No gate spawned regardless of review mode
+
+**Fixture:**
+- `design/gdd/` contains ≥2 consistent system GDDs
+- `production/session-state/review-mode.txt` exists with content `full`
+
+**Input:** `/review-all-gdds`
+
+**Expected behavior:**
+1. Skill reads all GDDs and runs the two review phases
+2. Skill does NOT read `review-mode.txt`
+3. Skill does NOT spawn any director gate agent (CD-, TD-, PR-, AD- prefixed)
+4. Skill completes and outputs its verdict normally
+5. Review mode setting has no effect on this skill's behavior
+
+**Assertions:**
+- [ ] No director gate agents are spawned at any point
+- [ ] Skill does NOT read `production/session-state/review-mode.txt`
+- [ ] Output does not contain any "Gate: [GATE-ID]" or "skipped" gate entries
+- [ ] The skill produces a verdict regardless of review mode
+- [ ] R4 metric: gate count for this skill = 0 in all modes
+
+---
+
+## Protocol Compliance
+
+- [ ] Phase 1 (consistency) and Phase 2 (design theory) spawned in parallel — not sequentially
+- [ ] Does NOT write any files without "May I write" approval
+- [ ] Findings table shown before any write ask
+- [ ] Verdict is one of exactly: CONSISTENT, MINOR ISSUES, MAJOR ISSUES
+- [ ] Ends with appropriate handoff: MAJOR ISSUES → fix and re-run; MINOR ISSUES → may proceed with awareness; CONSISTENT → `/create-architecture`
+
+---
+
+## Coverage Notes
+
+- Economic balance analysis (source/sink loops) requires cross-GDD resource data — covered
+  structurally by Case 2 (the conflict detection pattern is the same).
+- The design theory (Phase 2) checks, including dominant strategy detection and
+  cognitive overload, are not individually fixture-tested; they follow the same
+  pattern as consistency checks and are validated via the pillar drift case structure.
+- The `since-last-review` scoping mode is not tested here — it is a runtime concern.
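
Reviewer note on the `/review-all-gdds` spec above: the orphaned-dependency check (Case 3) and the mapping from findings to a verdict (Cases 1, 2, and 4) can be sketched together. This is a hypothetical Python illustration. The `Finding` tuple, both function names, and the `ADVISORY` severity label are assumptions; the spec itself names only severity HIGH and the three verdicts.

```python
from typing import Dict, List, Optional, Tuple

Finding = Tuple[str, str, str]  # (severity, GDD name, message)


def find_dependency_gaps(dependencies: Dict[str, List[str]]) -> List[Finding]:
    """Flag every dependency that points at a system with no GDD of its own."""
    known = set(dependencies)
    return [
        ("ADVISORY", gdd, f"DEPENDENCY GAP: {gdd} references {dep} which has no GDD")
        for gdd, deps in sorted(dependencies.items())
        for dep in deps
        if dep not in known
    ]


def overall_verdict(gdd_count: int, findings: List[Finding]) -> Optional[str]:
    if gdd_count == 0:
        return None  # Case 4: report an error instead of a verdict
    if any(severity == "HIGH" for severity, _, _ in findings):
        return "MAJOR ISSUES"   # Case 2: a blocking contradiction
    if findings:
        return "MINOR ISSUES"   # Case 3: advisory findings only
    return "CONSISTENT"         # Case 1


# Keys are GDD names; values are the entries of each Dependencies section.
gdds = {"system-A": ["system-B"], "system-C": ["system-A"]}
gaps = find_dependency_gaps(gdds)
```

Returning `None` for an empty GDD set keeps "no verdict" distinct from any of the three verdict strings, which is what the Case 4 assertions require.
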
--- a/Framework/skills/sprint/changelog.md
+++ b/Framework/skills/sprint/changelog.md
@@ -0,0 +1,169 @@
+# Skill Test Spec: /changelog
+
+## Skill Summary
+
+`/changelog` is a Haiku-tier skill that auto-generates a developer-facing
+changelog by reading git commit history and closed sprint stories since the
+last release tag. It organizes entries into features, fixes, and known issues.
+No director gates are used. The skill asks "May I write to `docs/CHANGELOG.md`?"
+before persisting. Verdict is always COMPLETE.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" language (skill writes changelog)
+- [ ] Has a next-step handoff (e.g., run /patch-notes for player-facing version)
+
+---
+
+## Director Gate Checks
+
+None. Changelog generation is a fast compilation task; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Multiple sprints since last release tag
+
+**Fixture:**
+- Git history has a tag `v0.3.0` three sprints ago
+- Since that tag: 12 commits across sprints 006, 007, 008
+- Sprint story files reference task IDs matching commit messages
+- `docs/CHANGELOG.md` does not yet exist
+
+**Input:** `/changelog`
+
+**Expected behavior:**
+1. Skill reads git log since `v0.3.0` tag
+2. Skill reads sprint stories to cross-reference task IDs
+3. Skill compiles entries into Features, Fixes, and Known Issues sections
+4. Skill presents draft to user
+5. Skill asks "May I write to `docs/CHANGELOG.md`?"
+6. User approves; file written; verdict COMPLETE
+
+**Assertions:**
+- [ ] Changelog covers commits since the most recent git tag
+- [ ] Entries are organized into Features / Fixes / Known Issues sections
+- [ ] Sprint story references are used to enrich commit descriptions
+- [ ] "May I write" prompt appears before file write
+- [ ] Verdict is COMPLETE after write
+
+---
+
+### Case 2: No Git Tags Found — All commits used, version baseline noted
+
+**Fixture:**
+- Git repository has commits but no tags exist
+- 20 commits in history across 3 sprints
+
+**Input:** `/changelog`
+
+**Expected behavior:**
+1. Skill checks for git tags — finds none
+2. Skill uses all commits in history as the baseline
+3. Skill notes in the output: "No version tag found — using full commit history; version baseline is unset"
+4. Skill still compiles organized changelog from available commits
+5. Skill asks "May I write" and writes on approval
+
+**Assertions:**
+- [ ] Skill does not error when no git tags exist
+- [ ] Output explicitly notes that no version baseline was found
+- [ ] Full commit history is used as the source
+- [ ] Changelog is still organized into sections despite missing tag
+
+---
+
+### Case 3: Commit Messages Without Task IDs — Grouped by date with note
+
+**Fixture:**
+- Git log since last tag has 8 commits
+- 5 commits have no task ID in the message (e.g., "fix typo", "tweak values")
+- 3 commits reference task IDs matching sprint stories
+
+**Input:** `/changelog`
+
+**Expected behavior:**
+1. Skill reads commits and sprint stories
+2. 3 commits are matched to sprint stories and placed in appropriate sections
+3. 5 untagged commits are grouped by date under a "Misc" or "Other Changes" section
+4. Output notes: "5 commits without task IDs — grouped by date"
+5. Skill writes changelog on approval
+
+**Assertions:**
+- [ ] Commits with task IDs are placed in appropriate sections (Features or Fixes)
+- [ ] Commits without task IDs are grouped separately with a note
+- [ ] Output flags the number of commits missing task references
+- [ ] No commits are silently dropped from the changelog
+
+---
+
+### Case 4: Existing CHANGELOG.md — New section prepended, old entries preserved
+
+**Fixture:**
+- `docs/CHANGELOG.md` already exists with sections for `v0.2.0` and `v0.3.0`
+- New commits exist since `v0.3.0` tag
+
+**Input:** `/changelog`
+
+**Expected behavior:**
+1. Skill detects that `docs/CHANGELOG.md` already exists
+2. Skill compiles new entries for the period since `v0.3.0`
+3. Skill presents draft with new section prepended above existing content
+4. Skill asks "May I write to `docs/CHANGELOG.md`?" (confirming prepend strategy)
+5. User approves; new content is prepended, old entries intact; verdict COMPLETE
+
+**Assertions:**
+- [ ] Skill reads existing changelog before writing to detect prior content
+- [ ] New section is prepended (not appended or overwriting) existing entries
+- [ ] Old changelog entries for v0.2.0 and v0.3.0 are preserved in the written file
+- [ ] "May I write" prompt reflects the prepend operation
+
+---
+
+### Case 5: Gate Compliance — No gate; read-then-write with approval
+
+**Fixture:**
+- Git history has commits since last tag
+- `review-mode.txt` contains `full`
+
+**Input:** `/changelog`
+
+**Expected behavior:**
+1. Skill compiles changelog in full mode
+2. No director gate is invoked (changelog generation is compilation, not a delivery gate)
+3. Skill runs on Haiku model — fast compilation
+4. Skill asks user for approval and writes file on confirmation
+
+**Assertions:**
+- [ ] No director gate is invoked regardless of review mode
+- [ ] Output does not reference any gate result
+- [ ] Skill proceeds directly from compilation to "May I write" prompt
+- [ ] Verdict is COMPLETE
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads git log and sprint story files before compiling
+- [ ] Always asks "May I write" before writing changelog
+- [ ] No director gates are invoked
+- [ ] Verdict is always COMPLETE
+- [ ] Runs on Haiku model tier (fast, low-cost)
+
+---
+
+## Coverage Notes
+
+- The case where git is not initialized in the repository is not tested;
+  behavior would depend on git command failure handling.
+- Merge commits vs. squash commits are not explicitly differentiated in
+  these tests; implementation detail of the git log parsing phase.
+- The `/patch-notes` skill should be run after `/changelog` for player-facing
+  output; that handoff is verified in the patch-notes spec.
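A minimal sketch, outside the spec itself, of the grouping rule that Case 3 of the `/changelog` spec describes: commits whose task ID matches a sprint story go into Features or Fixes, and everything else is grouped by date and counted. The `TASK-42` ID format, the function name `group_commits`, and the tuple/dict shapes are assumptions for illustration; the spec does not define them.

```python
import re
from collections import defaultdict

# Hypothetical task ID format (e.g. "TASK-42"); the spec does not define one.
TASK_ID = re.compile(r"\b[A-Z]+-\d+\b")


def group_commits(commits, story_sections):
    """Split commits into changelog sections without dropping any.

    commits: list of (iso_date, message) tuples.
    story_sections: maps a task ID to "Features" or "Fixes" (taken from sprint stories).
    Returns (sections, other_by_date, untagged_count).
    """
    sections = {"Features": [], "Fixes": []}
    other_by_date = defaultdict(list)  # no task ID, or no matching story
    for date, message in commits:
        match = TASK_ID.search(message)
        section = story_sections.get(match.group(0)) if match else None
        if section:
            sections[section].append(message)
        else:
            other_by_date[date].append(message)
    untagged_count = sum(len(msgs) for msgs in other_by_date.values())
    return sections, dict(other_by_date), untagged_count
```

The returned count is what the "commits without task IDs" note in Case 3 would report; every input commit ends up in exactly one bucket, which mirrors the "no commits are silently dropped" assertion.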
--- a/Framework/skills/sprint/milestone-review.md
+++ b/Framework/skills/sprint/milestone-review.md
@@ -0,0 +1,171 @@
+# Skill Test Spec: /milestone-review
+
+## Skill Summary
+
+`/milestone-review` generates a comprehensive review of a completed milestone:
+what shipped, velocity metrics, deferred items, risks surfaced, and retrospective
+seeds. In full mode the PR-MILESTONE director gate runs after the review is
+compiled (producer reviews scope delivery). In lean and solo modes the gate is
+skipped. The skill asks "May I write to `production/milestones/review-milestone-N.md`?"
+before persisting. Verdicts: MILESTONE COMPLETE or MILESTONE INCOMPLETE; BLOCKED if the milestone file is missing (Case 4).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: MILESTONE COMPLETE, MILESTONE INCOMPLETE
+- [ ] Contains "May I write" language (skill writes review document)
+- [ ] Has a next-step handoff (what to do after review is written)
+
+---
+
+## Director Gate Checks
+
+| Gate ID       | Trigger condition              | Mode guard                |
+|---------------|--------------------------------|---------------------------|
+| PR-MILESTONE  | After review document compiled | full only (not lean/solo) |
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Nearly complete milestone with one deferred story
+
+**Fixture:**
+- `production/milestones/milestone-03.md` exists with 8 stories
+- 7 stories have `Status: Complete`
+- 1 story has `Status: Deferred` (deferred to milestone-04)
+- `review-mode.txt` contains `full`
+
+**Input:** `/milestone-review milestone-03`
+
+**Expected behavior:**
+1. Skill reads `milestone-03.md` and all referenced sprint files
+2. Skill compiles: 7 shipped, 1 deferred; velocity; no blockers
+3. Skill presents review draft to user
+4. PR-MILESTONE gate invoked; producer approves
+5. Skill asks "May I write to `production/milestones/review-milestone-03.md`?"
+6. User approves; file is written; verdict MILESTONE COMPLETE
+
+**Assertions:**
+- [ ] Deferred story is noted in the review with its target milestone
+- [ ] Verdict is MILESTONE COMPLETE despite the one deferred story
+- [ ] PR-MILESTONE gate is invoked after draft compilation in full mode
+- [ ] Skill asks "May I write" before writing review file
+- [ ] Review document path matches `production/milestones/review-milestone-03.md`
+
+---
+
+### Case 2: Blocked Milestone — Multiple blocked stories
+
+**Fixture:**
+- `production/milestones/milestone-03.md` exists with 5 stories
+- 2 stories have `Status: Complete`
+- 3 stories have `Status: Blocked` (named blockers listed in each story)
+- `review-mode.txt` contains `full`
+
+**Input:** `/milestone-review milestone-03`
+
+**Expected behavior:**
+1. Skill reads milestone and sprint files
+2. Skill finds 3 blocked stories; compiles blocker details
+3. Verdict is MILESTONE INCOMPLETE
+4. PR-MILESTONE gate runs; producer notes the unresolved blockers
+5. Review is written with blocker list on approval
+
+**Assertions:**
+- [ ] Verdict is MILESTONE INCOMPLETE when any stories are Blocked
+- [ ] Each blocked story's name and blocker reason is listed in the review
+- [ ] PR-MILESTONE gate is still invoked in full mode even for INCOMPLETE verdict
+- [ ] "May I write" prompt still appears before file write
+
+---
+
+### Case 3: Full Mode — PR-MILESTONE returns CONCERNS
+
+**Fixture:**
+- Milestone-03 has 6 complete stories but 2 were not in the original scope (added mid-sprint)
+- `review-mode.txt` contains `full`
+
+**Input:** `/milestone-review milestone-03`
+
+**Expected behavior:**
+1. Skill compiles review; notes 2 out-of-scope stories shipped
+2. PR-MILESTONE gate invoked; producer returns CONCERNS about scope drift
+3. Skill surfaces the CONCERNS to the user and adds a "scope drift" note to the review
+4. User approves revised review; file written as MILESTONE COMPLETE with caveat
+
+**Assertions:**
+- [ ] CONCERNS from PR-MILESTONE gate are shown to user before write
+- [ ] Scope drift is explicitly noted in the written review document
+- [ ] Verdict is MILESTONE COMPLETE (stories shipped) with CONCERNS annotation
+- [ ] Skill does not suppress gate feedback
+
+---
+
+### Case 4: Edge Case — No milestone file found for specified milestone
+
+**Fixture:**
+- User calls `/milestone-review milestone-07`
+- `production/milestones/milestone-07.md` does NOT exist
+
+**Input:** `/milestone-review milestone-07`
+
+**Expected behavior:**
+1. Skill attempts to read `production/milestones/milestone-07.md`
+2. File not found; skill outputs an error message
+3. Skill suggests checking available milestones in `production/milestones/`
+4. No gate is invoked; no file is written
+
+**Assertions:**
+- [ ] Skill does not crash when milestone file is absent
+- [ ] Output names the expected file path in the error message
+- [ ] Output suggests checking `production/milestones/` for valid milestone names
+- [ ] Verdict is BLOCKED (cannot review a non-existent milestone)
+
+---
+
+### Case 5: Lean/Solo Mode — PR-MILESTONE gate skipped
+
+**Fixture:**
+- `production/milestones/milestone-03.md` exists with 5 complete stories
+- `review-mode.txt` contains `solo`
+
+**Input:** `/milestone-review milestone-03`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `solo`
+2. Skill compiles review draft
+3. PR-MILESTONE gate is skipped; output notes "[PR-MILESTONE] skipped — Solo mode"
+4. Skill asks user for direct approval of the review
+5. User approves; review file is written; verdict MILESTONE COMPLETE
+
+**Assertions:**
+- [ ] PR-MILESTONE gate is NOT invoked in solo (or lean) mode
+- [ ] Skip is explicitly noted in skill output
+- [ ] User direct approval is still required before write
+- [ ] Verdict is MILESTONE COMPLETE after successful write
+
+---
+
+## Protocol Compliance
+
+- [ ] Shows compiled review draft before invoking PR-MILESTONE or asking to write
+- [ ] Always asks "May I write" before writing review document
+- [ ] PR-MILESTONE gate only runs in full mode
+- [ ] Skip message appears in lean and solo output
+- [ ] Verdict is MILESTONE COMPLETE or MILESTONE INCOMPLETE, stated clearly
+
+---
+
+## Coverage Notes
+
+- The case where the milestone has zero stories is not tested; it follows the
+  MILESTONE INCOMPLETE pattern with a note suggesting the milestone may not
+  have been planned.
+- Velocity calculation specifics (story points vs. story count) are not
+  verified here; they are implementation details of the review compilation phase.
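A minimal sketch, outside the spec itself, of the two decisions the `/milestone-review` cases exercise: the verdict derived from story statuses, and the mode guard on the PR-MILESTONE gate. The function names are hypothetical. The treatment of Deferred (does not block completion), Blocked, and the zero-story milestone follows Cases 1 and 2 and the Coverage Notes; treating any other status as incomplete is an assumption the spec does not state.

```python
def milestone_verdict(statuses):
    """Derive the review verdict from a list of story status strings."""
    # An empty milestone follows the INCOMPLETE pattern (Coverage Notes).
    if not statuses:
        return "MILESTONE INCOMPLETE"
    # Assumption: only Complete and Deferred stories count toward completion;
    # Blocked (Case 2) and any other status make the milestone incomplete.
    if all(status in ("Complete", "Deferred") for status in statuses):
        return "MILESTONE COMPLETE"
    return "MILESTONE INCOMPLETE"


def gate_runs(review_mode):
    """PR-MILESTONE runs in full mode only; lean and solo skip it."""
    return review_mode.strip().lower() == "full"
```

Skipping the gate does not skip user approval: in lean and solo modes the skill still asks "May I write" before persisting the review.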
--- a/Framework/skills/sprint/patch-notes.md
+++ b/Framework/skills/sprint/patch-notes.md
@@ -0,0 +1,170 @@
+# Skill Test Spec: /patch-notes
+
+## Skill Summary
+
+`/patch-notes` is a Haiku-tier skill that generates player-facing patch notes
+from existing changelog content, stripping internal task IDs and technical
+jargon in favor of plain language. It filters entries to only those relevant
+to players (visible features and bug fixes; internal refactors are excluded).
+No director gates are used. The skill asks "May I write to `docs/patch-notes-vX.X.md`?"
+before persisting. Verdict is COMPLETE, or BLOCKED when no changelog exists (Case 2).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" language (skill writes patch notes file)
+- [ ] Has a next-step handoff (e.g., share with community manager)
+
+---
+
+## Director Gate Checks
+
+None. Patch notes generation is a fast compilation task; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Changelog filtered to player-facing entries
+
+**Fixture:**
+- `docs/CHANGELOG.md` exists with 5 entries:
+  - "Add dual-wield melee system" (Features — player-facing)
+  - "Fix crash on level transition" (Fixes — player-facing)
+  - "Add enemy patrol AI" (Features — player-facing)
+  - "Refactor input handler to use event bus" (Fixes — internal only)
+  - "Update dependency: Godot 4.6" (internal only)
+- Version is `v0.4.0`
+
+**Input:** `/patch-notes v0.4.0`
+
+**Expected behavior:**
+1. Skill reads `docs/CHANGELOG.md`
+2. Skill filters to 3 player-facing entries; excludes 2 internal entries
+3. Skill rewrites entries in plain language (no task IDs, no tech jargon)
+4. Skill presents draft to user
+5. Skill asks "May I write to `docs/patch-notes-v0.4.0.md`?"
+6. User approves; file written; verdict COMPLETE
+
+**Assertions:**
+- [ ] Only 3 entries appear in the patch notes (2 internal entries excluded)
+- [ ] Entries are written in plain language without internal task IDs
+- [ ] File path matches `docs/patch-notes-v0.4.0.md`
+- [ ] "May I write" prompt appears before file write
+- [ ] Verdict is COMPLETE after write
+
+---
+
+### Case 2: No Changelog Found — Directed to run /changelog first
+
+**Fixture:**
+- `docs/CHANGELOG.md` does NOT exist
+
+**Input:** `/patch-notes v0.4.0`
+
+**Expected behavior:**
+1. Skill attempts to read `docs/CHANGELOG.md` — not found
+2. Skill outputs: "No changelog found — run /changelog first to generate one"
+3. No patch notes are generated; no file is written
+
+**Assertions:**
+- [ ] Skill does not crash when changelog is absent
+- [ ] Output explicitly directs user to run `/changelog`
+- [ ] No "May I write" prompt appears (nothing to write)
+- [ ] Verdict is BLOCKED (dependency not met)
+
+---
+
+### Case 3: Tone Guidance from Design Folder — Incorporated into output
+
+**Fixture:**
+- `docs/CHANGELOG.md` exists with player-facing entries
+- `design/community/tone-guide.md` exists with guidance: "upbeat, encouraging tone; avoid passive voice"
+
+**Input:** `/patch-notes v0.4.0`
+
+**Expected behavior:**
+1. Skill reads changelog
+2. Skill detects tone guide at `design/community/tone-guide.md`
+3. Skill applies tone guidance when rewriting entries in plain language
+4. Patch notes use upbeat, active-voice phrasing
+5. Skill presents draft, asks to write, writes on approval
+
+**Assertions:**
+- [ ] Skill checks `design/` for a community or tone guidance file
+- [ ] Tone guide content influences phrasing of patch note entries
+- [ ] Output reflects active voice and upbeat tone where applicable
+- [ ] Skill notes that tone guidance was applied
+
+---
+
+### Case 4: Patch Note Template Exists — Used instead of generated structure
+
+**Fixture:**
+- `.claude/docs/templates/patch-notes-template.md` exists with a structured header format
+- `docs/CHANGELOG.md` exists with player-facing entries
+
+**Input:** `/patch-notes v0.4.0`
+
+**Expected behavior:**
+1. Skill reads changelog and detects template exists
+2. Skill populates the template with player-facing entries
+3. Template header/footer structure is preserved in the output
+4. Skill asks "May I write" and writes on approval
+
+**Assertions:**
+- [ ] Skill checks for a patch notes template before generating from scratch
+- [ ] Template structure is used when found (not overridden by default format)
+- [ ] Player-facing entries are inserted into the correct template section
+- [ ] Output note confirms template was used
+
+---
+
+### Case 5: Gate Compliance — No gate; community-manager is separate
+
+**Fixture:**
+- `docs/CHANGELOG.md` exists with player-facing entries
+- `review-mode.txt` contains `full`
+
+**Input:** `/patch-notes v0.4.0`
+
+**Expected behavior:**
+1. Skill compiles patch notes in full mode
+2. No director gate is invoked (community review is a separate, manual step)
+3. Skill runs on Haiku model — fast compilation
+4. Skill notes in output: "Consider sharing draft with community manager before publishing"
+5. Skill asks user for approval and writes on confirmation
+
+**Assertions:**
+- [ ] No director gate is invoked regardless of review mode
+- [ ] Output suggests (but does not require) community manager review
+- [ ] Skill proceeds directly from compilation to "May I write" prompt
+- [ ] Verdict is COMPLETE
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads `docs/CHANGELOG.md` before generating patch notes
+- [ ] Filters entries to player-facing items only
+- [ ] Rewrites entries in plain language without internal IDs
+- [ ] Always asks "May I write" before writing patch notes file
+- [ ] No director gates are invoked
+- [ ] Runs on Haiku model tier (fast, low-cost)
+
+---
+
+## Coverage Notes
+
+- The case where all changelog entries are internal (zero player-facing items)
+  is not tested; behavior is an empty patch notes draft with a warning.
+- Version number parsing from the changelog header is an implementation detail
+  not verified here.
+- The community manager consultation noted in Case 5 is advisory; a separate
+  skill or manual review handles that step.
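A minimal sketch, outside the spec itself, of the filter-then-rewrite step from Case 1 of the `/patch-notes` spec. The `player_facing` flag, the `TASK-12` ID format, and the function name are assumptions for illustration; in the skill the player-facing judgment is made while reading the changelog, and the plain-language rewrite goes beyond removing IDs.

```python
import re

# Assumption: task IDs look like "TASK-12" and lead the entry; the spec defines no format.
LEADING_TASK_ID = re.compile(r"^\s*[A-Z]+-\d+:?\s*")


def player_facing_notes(entries):
    """Keep player-facing entries and drop any leading task ID.

    entries: list of dicts with "text" and a boolean "player_facing" flag
    (the flag is hypothetical; it stands in for the skill's own judgment).
    """
    return [
        LEADING_TASK_ID.sub("", entry["text"])
        for entry in entries
        if entry["player_facing"]
    ]
```

With the five fixture entries from Case 1, three notes survive and the two internal entries are excluded. An empty result corresponds to the untested all-internal case in the Coverage Notes.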
--- a/Framework/skills/sprint/retrospective.md
+++ b/Framework/skills/sprint/retrospective.md
@@ -0,0 +1,169 @@
+# Skill Test Spec: /retrospective
+
+## Skill Summary
+
+`/retrospective` generates a structured sprint or milestone retrospective
+covering three categories: what went well, what didn't, and action items.
+It reads sprint files and session logs to compile observations, then produces
+a retrospective document. No director gates are used — retrospectives are
+team self-reflection artifacts. The skill asks "May I write to
+`production/retrospectives/retro-sprint-NNN.md`?" before persisting.
+Verdict is always COMPLETE (retrospective is structured output, not a pass/fail
+assessment).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" language (skill writes retrospective document)
+- [ ] Has a next-step handoff (what to do after retrospective is written)
+
+---
+
+## Director Gate Checks
+
+None. Retrospectives are team self-reflection documents; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Sprint with mixed outcomes
+
+**Fixture:**
+- `production/sprints/sprint-005.md` exists with 6 stories (4 Complete, 1 Blocked, 1 Deferred)
+- `production/session-logs/` contains log entries for the sprint period
+- No prior retrospective exists for sprint-005
+
+**Input:** `/retrospective sprint-005`
+
+**Expected behavior:**
+1. Skill reads sprint-005 and session logs
+2. Skill compiles three retrospective categories: went well (4 stories shipped),
+   didn't (1 blocked, 1 deferred), and action items (address blocker root cause)
+3. Skill presents retrospective draft to user
+4. Skill asks "May I write to `production/retrospectives/retro-sprint-005.md`?"
+5. User approves; file is written; verdict COMPLETE
+
+**Assertions:**
+- [ ] Retrospective contains all three categories (went well / didn't / actions)
+- [ ] Blocked and deferred stories appear in the "what didn't" section
+- [ ] At least one action item is generated from the blocked story
+- [ ] Skill asks "May I write" before writing file
+- [ ] Verdict is COMPLETE after successful write
+
+---
+
+### Case 2: No Sprint Data — Manual input fallback
+
+**Fixture:**
+- User calls `/retrospective sprint-009`
+- `production/sprints/sprint-009.md` does NOT exist
+- No session logs reference sprint-009
+
+**Input:** `/retrospective sprint-009`
+
+**Expected behavior:**
+1. Skill attempts to read sprint-009 — not found
+2. Skill informs user that no sprint data was found for sprint-009
+3. Skill prompts user to provide retrospective input manually (went well, didn't, actions)
+4. User provides input; skill formats it into the retrospective structure
+5. Skill asks "May I write" and writes the document on approval
+
+**Assertions:**
+- [ ] Skill does not crash or produce an empty document when sprint file is absent
+- [ ] User is prompted to provide manual input
+- [ ] Manual input is formatted into the three-category structure
+- [ ] "May I write" prompt still appears before file write
+
+---
+
+### Case 3: Prior Retrospective Exists — Offer to append or replace
+
+**Fixture:**
+- `production/retrospectives/retro-sprint-005.md` already exists with content
+- User re-runs `/retrospective sprint-005` after changes
+
+**Input:** `/retrospective sprint-005`
+
+**Expected behavior:**
+1. Skill detects that `retro-sprint-005.md` already exists
+2. Skill presents user with choice: append new observations or replace existing file
+3. User selects "replace"; skill compiles fresh retrospective
+4. Skill asks "May I write to `production/retrospectives/retro-sprint-005.md`?" (confirming overwrite)
+5. File is overwritten; verdict COMPLETE
+
+**Assertions:**
+- [ ] Skill checks for existing retrospective file before compiling
+- [ ] User is offered append or replace choice — not silently overwritten
+- [ ] "May I write" prompt reflects the overwrite scenario
+- [ ] Verdict is COMPLETE after write regardless of append vs. replace
+
+---
+
+### Case 4: Edge Case — Unresolved action items from previous retrospective
+
+**Fixture:**
+- `production/retrospectives/retro-sprint-004.md` exists with 2 action items marked `[ ]` (not done)
+- User runs `/retrospective sprint-005`
+
+**Input:** `/retrospective sprint-005`
+
+**Expected behavior:**
+1. Skill reads the most recent prior retrospective (retro-sprint-004)
+2. Skill detects 2 unchecked action items from sprint-004
+3. Skill includes a "Carry-over from Sprint 004" section in the new retrospective
+4. The unresolved items are listed with a note that they were not followed up
+
+**Assertions:**
+- [ ] Skill reads the most recent prior retrospective to check for open action items
+- [ ] Unresolved action items appear in the new retrospective under a carry-over section
+- [ ] Carry-over items are distinct from newly generated action items
+- [ ] Output notes that these items were not followed up in the previous sprint
+
+---
+
+### Case 5: Gate Compliance — No gate invoked in any mode
+
+**Fixture:**
+- `production/sprints/sprint-005.md` exists with complete stories
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/retrospective sprint-005`
+
+**Expected behavior:**
+1. Skill compiles retrospective in full mode
+2. No director gate is invoked (retrospectives are team self-reflection, not delivery gates)
+3. Skill asks user for approval and writes file on confirmation
+4. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] No director gate is invoked regardless of review mode
+- [ ] Output does not contain any gate invocation or gate result notation
+- [ ] Skill proceeds directly from compilation to "May I write" prompt
+- [ ] Review mode file content is irrelevant to this skill's behavior
+
+---
+
+## Protocol Compliance
+
+- [ ] Always shows retrospective draft before asking to write
+- [ ] Always asks "May I write" before writing retrospective file
+- [ ] No director gates are invoked
+- [ ] Verdict is always COMPLETE (not a pass/fail skill)
+- [ ] Checks prior retrospective for unresolved action items
+
+---
+
+## Coverage Notes
+
+- Milestone retrospectives (as opposed to sprint retrospectives) follow the
+  same pattern but read milestone files instead of sprint files; not
+  separately tested here.
+- The case where session logs are empty is similar to Case 2 (no data);
+  the skill falls back to manual input in both situations.
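A minimal sketch, outside the spec itself, of the carry-over detection in Case 4 of the `/retrospective` spec: collect action items still marked `[ ]` in the previous retrospective. It assumes the items are written as Markdown task-list bullets (`- [ ] ...` or `* [ ] ...`); the spec only says they are marked `[ ]`, so the bullet syntax and the function name are assumptions.

```python
import re

# Matches an unchecked task-list bullet such as "- [ ] Fix CI flakiness".
UNCHECKED_ITEM = re.compile(r"^\s*[-*]\s+\[ \]\s+(.*\S)\s*$")


def open_action_items(retro_markdown):
    """Return the text of every unchecked action item, in document order."""
    items = []
    for line in retro_markdown.splitlines():
        match = UNCHECKED_ITEM.match(line)
        if match:
            items.append(match.group(1))
    return items
```

The returned list is what would populate the "Carry-over from Sprint 004" section, kept separate from newly generated action items.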
--- a/Framework/skills/sprint/sprint-plan.md
+++ b/Framework/skills/sprint/sprint-plan.md
@@ -0,0 +1,177 @@
+# Skill Test Spec: /sprint-plan
+
+## Skill Summary
+
+`/sprint-plan` reads the current milestone file and backlog stories, then
+generates a new numbered sprint with stories prioritized by implementation layer
+and priority score. In full mode the PR-SPRINT director gate runs after the
+sprint draft is compiled (producer reviews the plan). In lean and solo modes
+the gate is skipped. The skill asks "May I write to `production/sprints/sprint-NNN.md`?"
+before persisting. Verdicts: COMPLETE (sprint generated and written) or
+BLOCKED (cannot proceed due to missing data or gate failure).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" language (skill writes sprint file)
+- [ ] Has a next-step handoff (what to do after sprint is written)
+
+---
+
+## Director Gate Checks
+
+| Gate ID   | Trigger condition        | Mode guard                |
+|-----------|--------------------------|---------------------------|
+| PR-SPRINT | After sprint draft built | full only (not lean/solo) |
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Backlog with stories generates sprint
+
+**Fixture:**
+- `production/milestones/milestone-02.md` exists with capacity `10 story points`
+- Backlog contains 5 unstarted stories across 2 epics, mixed priorities
+- `production/session-state/review-mode.txt` contains `full`
+- Next sprint number is `003` (sprints 001 and 002 already exist)
+
+**Input:** `/sprint-plan`
+
+**Expected behavior:**
+1. Skill reads current milestone to obtain capacity and goals
+2. Skill reads all unstarted stories from backlog; sorts by layer + priority
+3. Skill drafts sprint-003 with stories fitting within capacity
+4. Skill presents draft to user before invoking gate
+5. Skill invokes PR-SPRINT gate (full mode); producer approves
+6. Skill asks "May I write to `production/sprints/sprint-003.md`?"
+7. User approves; file is written
+
+**Assertions:**
+- [ ] Stories are sorted by implementation layer before priority
+- [ ] Sprint draft is shown before any write or gate invocation
+- [ ] PR-SPRINT gate is invoked in full mode after draft is ready
+- [ ] Skill asks "May I write" before writing the sprint file
+- [ ] Written file path matches `production/sprints/sprint-003.md`
+- [ ] Verdict is COMPLETE after successful write
+
+---
+
+### Case 2: Blocked Path — Backlog is empty
+
+**Fixture:**
+- `production/milestones/milestone-02.md` exists
+- No unstarted stories exist in any epic backlog
+
+**Input:** `/sprint-plan`
+
+**Expected behavior:**
+1. Skill reads backlog — finds no unstarted stories
+2. Skill outputs "No unstarted stories in backlog"
+3. Skill suggests running `/create-stories` to populate the backlog
+4. No gate is invoked; no file is written
+
+**Assertions:**
+- [ ] Verdict is BLOCKED
+- [ ] Output contains "No unstarted stories" or equivalent message
+- [ ] Output recommends `/create-stories`
+- [ ] PR-SPRINT gate is NOT invoked
+- [ ] No write tool is called
+
+---
+
+### Case 3: Gate returns CONCERNS — Sprint overloaded, revised before write
+
+**Fixture:**
+- Backlog has 8 stories totalling 16 points; milestone capacity is 10 points
+- `review-mode.txt` contains `full`
+
+**Input:** `/sprint-plan`
+
+**Expected behavior:**
+1. Skill drafts sprint with all 8 stories (over capacity)
+2. PR-SPRINT gate runs; producer returns CONCERNS: sprint is overloaded
+3. Skill presents concern to user and asks which stories to defer
+4. User selects 3 stories to defer; sprint is revised to 5 stories / 10 points
+5. Skill asks "May I write" with revised sprint; writes on approval
+
+**Assertions:**
+- [ ] CONCERNS from PR-SPRINT gate are surfaced to user before any write
+- [ ] Skill allows sprint to be revised after gate feedback
+- [ ] Revised sprint (not original) is written to file
+- [ ] Verdict is COMPLETE after revision and write
+
+---
+
+### Case 4: Lean Mode — PR-SPRINT gate skipped
+
+**Fixture:**
+- Backlog has 4 stories; milestone capacity is 8 points
+- `review-mode.txt` contains `lean`
+
+**Input:** `/sprint-plan`
+
+**Expected behavior:**
+1. Skill reads review mode — determines `lean`
+2. Skill drafts sprint and presents it to user
+3. PR-SPRINT gate is skipped; output notes "[PR-SPRINT] skipped — Lean mode"
+4. Skill asks user for direct approval of the sprint
+5. User approves; sprint file is written
+
+**Assertions:**
+- [ ] PR-SPRINT gate is NOT invoked in lean mode
+- [ ] Skip is explicitly noted in output
+- [ ] User approval is still required before write (gate skip ≠ approval skip)
+- [ ] Verdict is COMPLETE after write
+
+---
+
+### Case 5: Edge Case — Previous sprint still has open stories
+
+**Fixture:**
+- `production/sprints/sprint-002.md` exists with 2 stories still `Status: In Progress`
+- Backlog has 5 new unstarted stories
+- `review-mode.txt` contains `full`
+
+**Input:** `/sprint-plan`
+
+**Expected behavior:**
+1. Skill reads sprint-002 and detects 2 open (in-progress) stories
+2. Skill flags: "Sprint 002 has 2 open stories — confirm carry-over before planning sprint 003"
+3. Skill presents user with choice: carry stories over, defer them, or cancel
+4. User confirms carry-over; carried stories are prepended to new sprint with `[CARRY]` tag
+5. Sprint draft is built; PR-SPRINT gate runs; sprint is written on approval
+
+**Assertions:**
+- [ ] Skill checks the most recent sprint file for open stories
+- [ ] User is asked to confirm carry-over before sprint planning continues
+- [ ] Carried stories appear in the new sprint draft with a distinguishing label
+- [ ] Skill does not silently ignore open stories from the previous sprint
+
+---
+
+## Protocol Compliance
+
+- [ ] Shows draft sprint before invoking PR-SPRINT gate or asking to write
+- [ ] Always asks "May I write" before writing sprint file
+- [ ] PR-SPRINT gate only runs in full mode
+- [ ] Skip message appears in lean and solo mode output
+- [ ] Verdict is clearly stated at the end of the skill output
+
+---
+
+## Coverage Notes
+
+- The case where no milestone file exists is not explicitly tested; behavior
+  follows the BLOCKED pattern with a suggestion to run `/gate-check` for
+  milestone progression.
+- Solo mode behavior is equivalent to lean (gate skipped, user approval
+  required) and is not separately tested.
+- Parallel story selection algorithms are not tested here; those are unit
+  concerns for the sprint-plan subagent.
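
Editorial sketch for the `/sprint-plan` carry-over case (Case 5): one possible way to detect open stories in the previous sprint file and prepend them to the new draft with a `[CARRY]` tag. The sprint file layout assumed here (a `### <title>` heading followed by a `Status:` line) and the helper names `open_stories` and `build_draft` are hypothetical, and treating every status other than `Complete` as open is an assumption the spec does not state.

```python
import re

# Assumed layout (hypothetical): "### <story title>" then "Status: <value>".
STORY_RE = re.compile(r"^### (?P<title>.+)$")
STATUS_RE = re.compile(r"^Status:\s*(?P<status>.+?)\s*$")


def open_stories(sprint_text):
    """Return titles of stories whose status is anything other than Complete."""
    found = []
    title = None
    for line in sprint_text.splitlines():
        heading = STORY_RE.match(line)
        if heading:
            title = heading.group("title").strip()
            continue
        status = STATUS_RE.match(line)
        if status and title is not None:
            if status.group("status") != "Complete":
                found.append(title)
            title = None  # each story contributes at most one status line
    return found


def build_draft(carried, new_stories):
    """Prepend carried stories, tagged [CARRY], to the new sprint draft."""
    return [f"[CARRY] {title}" for title in carried] + list(new_stories)
```

In this sketch the carry-over decision stays with the caller: `build_draft` would only be invoked after the user has confirmed carry-over, which matches the assertion that open stories are never silently ignored.
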
--- a/Framework/skills/sprint/sprint-status.md
+++ b/Framework/skills/sprint/sprint-status.md
@@ -0,0 +1,167 @@
+# Skill Test Spec: /sprint-status
+
+## Skill Summary
+
+`/sprint-status` is a Haiku-tier read-only skill that reads the current active
+sprint file and the session state to produce a concise sprint health summary.
+It reports story counts by status (Complete / In Progress / Blocked / Not Started)
+and emits one of three sprint-health verdicts: ON TRACK, AT RISK, or BLOCKED.
+It never writes files and does not invoke any director gates. It is designed for
+fast, low-cost status checks during a session.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings or numbered check sections
+- [ ] Contains verdict keywords: ON TRACK, AT RISK, BLOCKED
+- [ ] Does NOT require "May I write" language (read-only skill)
+- [ ] Has a next-step handoff (what to do based on the verdict)
+
+---
+
+## Director Gate Checks
+
+None. `/sprint-status` is a read-only reporting skill; no gates are invoked.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Mixed sprint, AT RISK with named blocker
+
+**Fixture:**
+- `production/sprints/sprint-004.md` exists (active sprint, linked in `active.md`)
+- Sprint contains 6 stories:
+  - 3 with `Status: Complete`
+  - 2 with `Status: In Progress`
+  - 1 with `Status: Blocked` (blocker: "Waiting on physics ADR acceptance")
+- Sprint end date is 2 days away
+
+**Input:** `/sprint-status`
+
+**Expected behavior:**
+1. Skill reads `production/session-state/active.md` to find active sprint reference
+2. Skill reads `production/sprints/sprint-004.md`
+3. Skill counts stories by status: 3 Complete, 2 In Progress, 1 Blocked
+4. Skill detects a Blocked story and the approaching deadline
+5. Skill outputs AT RISK verdict with the blocker named explicitly
+
+**Assertions:**
+- [ ] Output includes story count breakdown by status
+- [ ] Output names the specific blocked story and its blocker reason
+- [ ] Verdict is AT RISK (not BLOCKED, not ON TRACK) when any story is Blocked
+- [ ] Skill does not write any files
+
+---
+
+### Case 2: All Stories Complete — ON TRACK verdict or SPRINT COMPLETE label
+
+**Fixture:**
+- `production/sprints/sprint-004.md` exists
+- All 5 stories have `Status: Complete`
+
+**Input:** `/sprint-status`
+
+**Expected behavior:**
+1. Skill reads sprint file — all stories are Complete
+2. Skill outputs ON TRACK verdict or SPRINT COMPLETE label
+3. Skill suggests running `/milestone-review` or `/sprint-plan` as next steps
+
+**Assertions:**
+- [ ] Verdict is ON TRACK or SPRINT COMPLETE when all stories are Complete
+- [ ] Output notes that the sprint is fully done
+- [ ] Next-step suggestion references `/milestone-review` or `/sprint-plan`
+- [ ] No files are written
+
+---
+
+### Case 3: No Active Sprint File — Guidance to run /sprint-plan
+
+**Fixture:**
+- `production/session-state/active.md` does not reference an active sprint
+- `production/sprints/` directory is empty or absent
+
+**Input:** `/sprint-status`
+
+**Expected behavior:**
+1. Skill reads `active.md` — finds no active sprint reference
+2. Skill checks `production/sprints/` — finds no files
+3. Skill outputs an informational message: no active sprint detected
+4. Skill suggests running `/sprint-plan` to create one
+
+**Assertions:**
+- [ ] Skill does not error or crash when no sprint file exists
+- [ ] Output clearly states no active sprint was found
+- [ ] Output recommends `/sprint-plan` as the next action
+- [ ] No verdict keyword is emitted (no sprint to assess)
+
+---
+
+### Case 4: Edge Case — Stale In Progress Story (flagged)
+
+**Fixture:**
+- `production/sprints/sprint-004.md` exists
+- One story has `Status: In Progress` with a note in `active.md`:
+  `Last updated: 2026-03-30` (more than 2 days before today's session date)
+- No stories are Blocked
+
+**Input:** `/sprint-status`
+
+**Expected behavior:**
+1. Skill reads sprint file and session state
+2. Skill detects the story has been In Progress for >2 days without update
+3. Skill flags the story as "stale" in the output
+4. Verdict is AT RISK (stale in-progress stories indicate a hidden blocker)
+
+**Assertions:**
+- [ ] Skill compares story "last updated" metadata against session date
+- [ ] Stale In Progress story is flagged by name in the output
+- [ ] Verdict is AT RISK, not ON TRACK, when a stale story is detected
+- [ ] Output does not conflate "stale" with "Blocked" — the label is distinct
+
+---
+
+### Case 5: Gate Compliance — Read-only; no gate invocation
+
+**Fixture:**
+- `production/sprints/sprint-004.md` exists with 4 stories (2 Complete, 2 In Progress)
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/sprint-status`
+
+**Expected behavior:**
+1. Skill reads sprint and produces status summary
+2. Skill does NOT invoke any director gate regardless of review mode
+3. Output is a plain status report with ON TRACK, AT RISK, or BLOCKED verdict
+4. Skill does not prompt for user approval or ask to write any file
+
+**Assertions:**
+- [ ] No director gate is invoked in any review mode
+- [ ] Output does not contain any "May I write" prompt
+- [ ] Skill completes and returns a verdict without user interaction
+- [ ] Review mode file is ignored (or confirmed irrelevant) by this skill
+
+---
+
+## Protocol Compliance
+
+- [ ] Does NOT use Write or Edit tools (read-only skill)
+- [ ] Presents story count breakdown before emitting verdict
+- [ ] Does not ask for approval
+- [ ] Ends with a recommended next step based on verdict
+- [ ] Runs on Haiku model tier (fast, low-cost)
+
+---
+
+## Coverage Notes
+
+- The case where multiple sprints are active simultaneously is not tested;
+  the skill reads whichever sprint `active.md` references.
+- Partial sprint completion percentages are not explicitly verified; the
+  count-by-status output implies them.
+- The `solo` mode review-mode variant is not separately tested; gate
+  behavior in Case 5 applies to all modes equally.
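
Editorial sketch of the verdict rules exercised by the `/sprint-status` cases above, assuming stories have already been parsed into simple records. The `Story` class, the function names, and the `STALE_AFTER_DAYS` constant are hypothetical. The spec excerpt does not say when the BLOCKED verdict is emitted, so this sketch deliberately leaves that verdict out rather than guess.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional

STALE_AFTER_DAYS = 2  # Case 4: In Progress for more than 2 days without update


@dataclass
class Story:
    name: str
    status: str  # "Complete", "In Progress", "Blocked", or "Not Started"
    last_updated: Optional[date] = None


def count_by_status(stories):
    """Story count breakdown by status (shown before the verdict)."""
    return Counter(story.status for story in stories)


def stale_stories(stories, today):
    """Names of In Progress stories not updated within the stale window."""
    return [
        story.name
        for story in stories
        if story.status == "In Progress"
        and story.last_updated is not None
        and (today - story.last_updated).days > STALE_AFTER_DAYS
    ]


def verdict(stories, today):
    """Return a verdict keyword, or None when there is no sprint to assess."""
    if not stories:
        return None  # Case 3: no verdict keyword is emitted
    if any(story.status == "Blocked" for story in stories):
        return "AT RISK"  # Case 1
    if stale_stories(stories, today):
        return "AT RISK"  # Case 4: stale is flagged separately from Blocked
    return "ON TRACK"
```

Keeping `stale_stories` separate from the Blocked check is one way to honor the assertion that "stale" and "Blocked" are reported under distinct labels.
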
--- a/Framework/skills/team/team-audio.md
+++ b/Framework/skills/team/team-audio.md
@@ -0,0 +1,210 @@
+# Skill Test Spec: /team-audio
+
+## Skill Summary
+
+Orchestrates the audio team through a four-step pipeline: audio direction
+(audio-director) → sound design + accessibility review in parallel (sound-designer +
+accessibility-specialist) → technical implementation + engine validation in
+parallel (technical-artist + primary engine specialist) → code integration
+(gameplay-programmer). Reads relevant GDDs, the sound bible (if present), and
+existing audio asset lists before spawning agents. Compiles all outputs into an
+audio design document saved to `design/gdd/audio-[feature].md`. Uses
+`AskUserQuestion` at each step transition. Verdict is COMPLETE when the audio
+design document is produced. Skips the engine specialist spawn gracefully when no
+engine is configured.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 step/phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "File Write Protocol" section
+- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
+- [ ] Sub-agents enforce "May I write to [path]?" before any write
+- [ ] Has a next-step handoff at the end (references `/dev-story`, `/asset-audit`)
+- [ ] Error Recovery Protocol section is present
+- [ ] `AskUserQuestion` is used at step transitions before proceeding
+- [ ] Step 2 explicitly spawns sound-designer and accessibility-specialist in parallel
+- [ ] Step 3 explicitly spawns technical-artist and engine specialist in parallel (when engine is configured)
+- [ ] Skill reads `design/gdd/sound-bible.md` during context gathering if it exists
+- [ ] Output document is saved to `design/gdd/audio-[feature].md`
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All steps complete, audio design document saved
+
+**Fixture:**
+- GDD for the target feature exists at `design/gdd/combat.md`
+- Sound bible exists at `design/gdd/sound-bible.md`
+- Existing audio assets are listed in `assets/audio/`
+- Engine is configured in `.claude/docs/technical-preferences.md`
+- No accessibility gaps exist in the planned audio event list
+
+**Input:** `/team-audio combat`
+
+**Expected behavior:**
+1. Context gathering: orchestrator reads `design/gdd/combat.md`, `design/gdd/sound-bible.md`, and `assets/audio/` asset list before spawning any agent
+2. Step 1: audio-director is spawned; defines sonic identity, emotional tone, adaptive music direction, mix targets, and adaptive audio rules for combat
+3. `AskUserQuestion` presents audio direction; user approves before Step 2 begins
+4. Step 2: sound-designer and accessibility-specialist are spawned in parallel; sound-designer produces SFX specifications, audio event list with trigger conditions, and mixing groups; accessibility-specialist identifies critical gameplay audio events and specifies visual fallback and subtitle requirements
+5. `AskUserQuestion` presents SFX spec and accessibility requirements; user approves before Step 3 begins
+6. Step 3: technical-artist and primary engine specialist are spawned in parallel; technical-artist designs bus structure, middleware integration, memory budgets, and streaming strategy; engine specialist validates that the integration approach is idiomatic for the configured engine
+7. `AskUserQuestion` presents technical plan; user approves before Step 4 begins
+8. Step 4: gameplay-programmer is spawned; wires up audio events to gameplay triggers, implements adaptive music, sets up occlusion zones, writes unit tests for audio event triggers
+9. Orchestrator compiles all outputs into a single audio design document
+10. Subagent asks "May I write the audio design document to `design/gdd/audio-combat.md`?" before writing
+11. Summary output lists: audio event count, estimated asset count, implementation tasks, and any open questions
+12. Verdict: COMPLETE
+
+**Assertions:**
+- [ ] Sound bible is read during context gathering (before Step 1) when it exists
+- [ ] audio-director is spawned before sound-designer or accessibility-specialist
+- [ ] `AskUserQuestion` appears after Step 1 output and before Step 2 launch
+- [ ] sound-designer and accessibility-specialist Task calls are issued simultaneously in Step 2
+- [ ] technical-artist and engine specialist Task calls are issued simultaneously in Step 3
+- [ ] gameplay-programmer is not launched until Step 3 `AskUserQuestion` is approved
+- [ ] Audio design document is written to `design/gdd/audio-combat.md` (not another path)
+- [ ] Summary includes audio event count and estimated asset count
+- [ ] No files are written by the orchestrator directly
+- [ ] Verdict is COMPLETE after document delivery
+
+---
+
+### Case 2: Accessibility Gap — Critical gameplay audio event has no visual fallback
+
+**Fixture:**
+- GDD for the target feature exists
+- Step 1 and Step 2 are in progress
+- sound-designer's audio event list includes "EnemyNearbyAlert" — a spatial audio cue that warns the player an enemy is approaching from off-screen
+- accessibility-specialist reviews the event list and finds "EnemyNearbyAlert" has no visual fallback (no on-screen indicator, no subtitle, no controller rumble specified)
+
+**Input:** `/team-audio stealth` (Step 2 scenario)
+
+**Expected behavior:**
+1. Steps 1–2 proceed; accessibility-specialist and sound-designer are spawned in parallel
+2. accessibility-specialist returns its review with a BLOCKING concern: "`EnemyNearbyAlert` is a critical gameplay audio event (warns player of off-screen threat) with no visual fallback — hearing-impaired players cannot detect this threat. This is a BLOCKING accessibility gap."
+3. Orchestrator surfaces the concern immediately in conversation before presenting `AskUserQuestion`
+4. `AskUserQuestion` presents the accessibility concern as a BLOCKING issue with options:
+   - Add a visual indicator for EnemyNearbyAlert (e.g., directional arrow on HUD) and continue
+   - Add controller haptic feedback as the fallback and continue
+   - Stop here and resolve all accessibility gaps before proceeding to Step 3
+5. Step 3 (technical-artist + engine specialist) is not launched until the user resolves or explicitly accepts the gap
+6. The accessibility gap is included in the final audio design document under "Open Accessibility Issues" if unresolved
+
+**Assertions:**
+- [ ] Accessibility gap is labeled BLOCKING (not advisory) in the report
+- [ ] The specific event name ("EnemyNearbyAlert") and the nature of the gap are stated
+- [ ] `AskUserQuestion` surfaces the gap before Step 3 is launched
+- [ ] At least one resolution option is offered (add visual fallback, add haptic fallback)
+- [ ] Step 3 is not launched while the gap is unresolved without explicit user authorization
+- [ ] If the gap is carried forward unresolved, it is documented in the audio design doc as an open issue
+
+---
+
+### Case 3: No Argument — Usage guidance shown, no design doc inference
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-audio` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Outputs usage guidance: e.g., "Usage: `/team-audio [feature or area]` — specify the feature or area to design audio for (e.g., `combat`, `main menu`, `forest biome`, `boss encounter`)"
+3. Skill exits without spawning any agents
+
+**Assertions:**
+- [ ] Skill does NOT spawn any agents when no argument is provided
+- [ ] Usage message includes the correct invocation format with argument examples
+- [ ] Skill does NOT attempt to infer a feature from existing design docs without user direction
+- [ ] No `AskUserQuestion` is used — output is direct guidance
+
+---
+
+### Case 4: Missing Sound Bible — Skill notes the gap and proceeds without it
+
+**Fixture:**
+- GDD for the target feature exists at `design/gdd/main-menu.md`
+- `design/gdd/sound-bible.md` does NOT exist
+- Engine is configured; other context files are present
+
+**Input:** `/team-audio main menu`
+
+**Expected behavior:**
+1. Context gathering: orchestrator reads `design/gdd/main-menu.md` and checks for `design/gdd/sound-bible.md`
+2. Sound bible is not found; orchestrator notes the gap in conversation: "Note: `design/gdd/sound-bible.md` not found — audio direction will proceed without a project-wide sonic identity reference. Consider creating a sound bible if this is an ongoing project."
+3. Pipeline proceeds normally through all four steps without the sound bible as input
+4. audio-director in Step 1 is informed that no sound bible exists and must establish sonic identity from the feature GDD alone
+5. The missing sound bible is mentioned in the final summary as a recommended next step
+
+**Assertions:**
+- [ ] Orchestrator checks for the sound bible during context gathering (before Step 1)
+- [ ] Missing sound bible is noted explicitly in conversation — not silently ignored
+- [ ] Pipeline does NOT halt due to the missing sound bible
+- [ ] audio-director is notified that no sound bible exists in its prompt context
+- [ ] Summary or Next Steps section recommends creating a sound bible
+- [ ] Verdict is still COMPLETE if all other steps succeed
+
+---
+
+### Case 5: Engine Not Configured — Engine specialist step skipped gracefully
+
+**Fixture:**
+- Engine is NOT configured in `.claude/docs/technical-preferences.md` (shows `[TO BE CONFIGURED]`)
+- GDD for the target feature exists
+- Sound bible may or may not exist
+
+**Input:** `/team-audio boss encounter`
+
+**Expected behavior:**
+1. Context gathering: orchestrator reads `.claude/docs/technical-preferences.md` and detects no engine is configured
+2. Steps 1–2 proceed normally (audio-director, sound-designer, accessibility-specialist)
+3. Step 3: technical-artist is spawned normally; engine specialist spawn is SKIPPED
+4. Orchestrator notes in conversation: "Engine specialist not spawned — no engine configured in technical-preferences.md. Engine integration validation will be deferred until an engine is selected."
+5. Step 4: gameplay-programmer proceeds with a note that engine-specific audio integration patterns could not be validated
+6. The engine specialist gap is included in the audio design document under "Deferred Validation"
+7. Verdict: COMPLETE (skip is graceful, not a blocker)
+
+**Assertions:**
+- [ ] Engine specialist is NOT spawned when no engine is configured
+- [ ] Skill does NOT error out due to the missing engine configuration
+- [ ] The skip is explicitly noted in conversation — not silently omitted
+- [ ] technical-artist is still spawned in Step 3 (skip applies only to the engine specialist)
+- [ ] gameplay-programmer proceeds in Step 4 with the deferred validation noted
+- [ ] Deferred engine validation is recorded in the audio design document
+- [ ] Verdict is COMPLETE (engine not configured is a known graceful case)
+
+---
+
+## Protocol Compliance
+
+- [ ] Context gathering (GDDs, sound bible, asset list) runs before any agent is spawned
+- [ ] `AskUserQuestion` is used after every step output before the next step launches
+- [ ] Parallel spawning: Step 2 (sound-designer + accessibility-specialist) and Step 3 (technical-artist + engine specialist) issue all Task calls before waiting for results
+- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
+- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
+- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
+- [ ] A partial report is always produced when some agents complete and others block
+- [ ] Audio design document path follows the pattern `design/gdd/audio-[feature].md`
+- [ ] Verdict is exactly COMPLETE or BLOCKED — no other verdict values used
+- [ ] Next Steps handoff references `/dev-story` and `/asset-audit`
+
+---
+
+## Coverage Notes
+
+- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
+  Recovery Protocol are not separately tested — they follow the same `AskUserQuestion`
+  + partial-report pattern validated in Cases 2 and 5.
+- Step 4 (gameplay-programmer) happy-path behavior is validated implicitly by Case 1.
+  Failure modes for this step follow the standard Error Recovery Protocol.
+- The accessibility-specialist's subtitle and caption requirements (beyond visual fallbacks)
+  are validated implicitly by Case 1. Case 2 focuses on the more severe case where a
+  critical gameplay event has no fallback at all.
+- Engine specialist validation logic (idiomatic integration, version-specific changes) is
+  tested only for the configured and unconfigured states. The specific content of the
+  engine specialist's output is out of scope for this behavioral spec.
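
Editorial sketch of the accessibility rule behind Case 2 of the `/team-audio` spec: a critical gameplay audio event with no fallback at all is a BLOCKING gap, and Step 3 is held until the gap is resolved or the user explicitly accepts it. The `AudioEvent` record, its field names, and both function names are hypothetical; the spec does not define a data model for the audio event list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AudioEvent:
    name: str
    critical: bool  # True when gameplay depends on hearing this cue
    fallbacks: List[str] = field(default_factory=list)  # e.g. "hud_arrow", "haptic"


def blocking_gaps(events):
    """Names of critical events that have no visual, subtitle, or haptic fallback."""
    return [event.name for event in events if event.critical and not event.fallbacks]


def may_launch_step3(events, user_accepted_gap=False):
    """Step 3 may start only with no BLOCKING gaps or explicit user acceptance."""
    return not blocking_gaps(events) or user_accepted_gap
```

For example, an `EnemyNearbyAlert` event marked critical with an empty `fallbacks` list would be reported by `blocking_gaps`, and `may_launch_step3` would stay false until a fallback is added or the user authorizes proceeding.
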
--- a/Framework/skills/team/team-combat.md
+++ b/Framework/skills/team/team-combat.md
@@ -0,0 +1,180 @@
+# Skill Test Spec: /team-combat
+
+## Skill Summary
+
+Orchestrates the full combat team pipeline end-to-end for a single combat feature.
+Coordinates game-designer, gameplay-programmer, ai-programmer, technical-artist,
+sound-designer, the primary engine specialist, and qa-tester through six structured
+phases: Design → Architecture (with engine specialist validation) → Implementation
+(parallel) → Integration → Validation → Sign-off. Uses `AskUserQuestion` at each
+phase transition. Delegates all file writes to sub-agents. Produces a summary report
+with verdict COMPLETE / NEEDS WORK / BLOCKED and handoffs to `/code-review`,
+`/balance-check`, and `/team-polish`.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings (Phase 1 through Phase 6 are all present)
+- [ ] Contains verdict keywords: COMPLETE, NEEDS WORK, BLOCKED
+- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents, orchestrator does not write files directly
+- [ ] Has a next-step handoff at the end (references `/code-review`, `/balance-check`, `/team-polish`)
+- [ ] Error Recovery Protocol section is present with all four recovery steps
+- [ ] Uses `AskUserQuestion` at phase transitions for user approval before proceeding
+- [ ] Phase 3 is explicitly marked as parallel (gameplay-programmer, ai-programmer, technical-artist, sound-designer)
+- [ ] Phase 2 includes spawning the primary engine specialist (read from `.claude/docs/technical-preferences.md`)
+- [ ] Team Composition lists all seven roles (game-designer, gameplay-programmer, ai-programmer, technical-artist, sound-designer, engine specialist, qa-tester)
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All agents succeed, full pipeline runs to completion
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists and is populated
+- Engine is configured in `.claude/docs/technical-preferences.md` (Engine Specialists section filled)
+- No existing GDD for the requested combat feature
+
+**Input:** `/team-combat parry and riposte system`
+
+**Expected behavior:**
+1. Phase 1 — game-designer spawned; produces `design/gdd/parry-riposte.md` covering all 8 required sections (overview, player fantasy, rules, formulas, edge cases, dependencies, tuning knobs, acceptance criteria); asks user to approve design doc
+2. Phase 2 — gameplay-programmer + ai-programmer spawned; produce architecture sketch with class structure, interfaces, and file list; then primary engine specialist is spawned to validate idioms; engine specialist output incorporated; `AskUserQuestion` presented with architecture options before Phase 3 begins
+3. Phase 3 — gameplay-programmer, ai-programmer, technical-artist, sound-designer spawned in parallel; all four return outputs before Phase 4 begins
+4. Phase 4 — integration wires together all Phase 3 outputs; tuning knobs verified as data-driven; `AskUserQuestion` confirms integration before Phase 5
+5. Phase 5 — qa-tester spawned; writes test cases from acceptance criteria; verifies edge cases; performance impact checked against budget
+6. Phase 6 — summary report produced: design COMPLETE, all team members COMPLETE, test cases listed, verdict: COMPLETE
+7. Next steps listed: `/code-review`, `/balance-check`, `/team-polish`
+
+**Assertions:**
+- [ ] `AskUserQuestion` called at each phase gate (at minimum before Phase 3 and before Phase 5)
+- [ ] Phase 3 agents launched simultaneously — no sequential dependency between gameplay-programmer, ai-programmer, technical-artist, sound-designer
+- [ ] Engine specialist runs in Phase 2 before Phase 3 begins (output incorporated into architecture)
+- [ ] All file writes delegated to sub-agents (orchestrator never calls Write/Edit directly)
+- [ ] Verdict COMPLETE present in final report
+- [ ] Next steps include `/code-review`, `/balance-check`, `/team-polish`
+- [ ] Design doc covers all 8 required GDD sections
+
+---
+
+### Case 2: Blocked Agent — One subagent returns BLOCKED mid-pipeline
+
+**Fixture:**
+- `design/gdd/parry-riposte.md` exists (Phase 1 already complete)
+- ai-programmer agent returns BLOCKED because the AI system architecture ADR has not been Accepted (ADR status is Proposed)
+
+**Input:** `/team-combat parry and riposte system`
+
+**Expected behavior:**
+1. Phase 1 — design doc found; game-designer confirms it is valid; phase approved
+2. Phase 2 — gameplay-programmer completes architecture sketch; ai-programmer returns BLOCKED: "ADR for AI behavior system is Proposed — cannot implement until ADR is Accepted"
+3. Error Recovery Protocol triggered: "ai-programmer: BLOCKED — AI behavior ADR is Proposed"
+4. `AskUserQuestion` presented with options: (a) Skip ai-programmer and note the gap; (b) Retry with narrower scope; (c) Stop here and run `/architecture-decision` first
+5. If user chooses (a): Phase 3 proceeds with gameplay-programmer, technical-artist, sound-designer only; ai-programmer gap noted in partial report
+6. Final report produced: partial implementation documented, ai-programmer section marked BLOCKED, overall verdict: BLOCKED
+
+**Assertions:**
+- [ ] BLOCKED surface message appears before any dependent phase continues
+- [ ] `AskUserQuestion` offers at minimum three options: skip / retry / stop
+- [ ] Partial report produced — completed agents' work is not discarded
+- [ ] Overall verdict is BLOCKED (not COMPLETE) when any agent is unresolved
+- [ ] Blocked reason references the ADR and suggests `/architecture-decision`
+- [ ] Orchestrator does not silently proceed past the blocked dependency
+
+---
+
+### Case 3: No Argument — Clear usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-combat` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument provided
+2. Outputs usage message explaining the required argument (combat feature description)
+3. Provides an example invocation: `/team-combat [combat feature description]`
+4. Skill exits without spawning any subagents
+
+**Assertions:**
+- [ ] Skill does NOT spawn any subagents when no argument is given
+- [ ] Usage message includes the argument-hint format from frontmatter
+- [ ] Usage message includes at least one example of a valid invocation
+- [ ] No file reads beyond what is needed to detect the missing argument
+- [ ] Verdict is NOT shown (pipeline never runs)
+
+---
+
+### Case 4: Parallel Phase Validation — Phase 3 agents run simultaneously
+
+**Fixture:**
+- `design/gdd/parry-riposte.md` exists and is complete
+- Architecture sketch has been approved
+- Engine specialist has validated architecture
+
+**Input:** `/team-combat parry and riposte system` (resuming from Phase 2 complete)
+
+**Expected behavior:**
+1. Phase 3 begins after architecture approval
+2. All four Task calls — gameplay-programmer, ai-programmer, technical-artist, sound-designer — are issued before any result is awaited
+3. Skill waits for all four agents to complete before proceeding to Phase 4
+4. If any single agent completes early, skill does not begin Phase 4 until all four have returned
+
+**Assertions:**
+- [ ] Four Task calls issued in a single batch (no sequential waiting between them)
+- [ ] Phase 4 does not begin until all four Phase 3 agents have returned results
+- [ ] Skill does not pass one Phase 3 agent's output as input to another Phase 3 agent (they are independent)
+- [ ] All four Phase 3 agent results referenced in the Phase 4 integration step
+
+---
+
+### Case 5: Architecture Phase Engine Routing — Engine specialist receives correct context
+
+**Fixture:**
+- `.claude/docs/technical-preferences.md` has Engine Specialists section populated (e.g., Primary: godot-specialist)
+- Architecture sketch produced by gameplay-programmer is available
+- Engine version pinned in `docs/engine-reference/godot/VERSION.md`
+
+**Input:** `/team-combat parry and riposte system`
+
+**Expected behavior:**
+1. Phase 2 — gameplay-programmer produces architecture sketch
+2. Skill reads `.claude/docs/technical-preferences.md` Engine Specialists section to identify the primary engine specialist agent type
+3. Engine specialist is spawned with: the architecture sketch, the GDD path, the engine version from `VERSION.md`, and explicit instructions to check for deprecated APIs
+4. Engine specialist output (idiom notes, deprecated API warnings, native system recommendations) is returned to orchestrator
+5. Orchestrator incorporates engine notes into the architecture before presenting Phase 2 results to user
+6. `AskUserQuestion` includes engine specialist's notes alongside the architecture sketch
+
+**Assertions:**
+- [ ] Engine specialist agent type is read from `.claude/docs/technical-preferences.md` — not hardcoded
+- [ ] Engine specialist prompt includes the architecture sketch and GDD path
+- [ ] Engine specialist checks for deprecated APIs against the pinned engine version
+- [ ] Engine specialist output is incorporated before Phase 3 begins (not skipped or appended separately)
+- [ ] If no engine is configured, engine specialist step is skipped and a note is added to the report
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at each phase transition — user approves before pipeline advances
+- [ ] All file writes delegated to sub-agents via Task — orchestrator does not call Write or Edit directly
+- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
+- [ ] Phase 3 agents launched in parallel per skill spec
+- [ ] Partial report always produced even when agents are BLOCKED
+- [ ] Verdict is one of COMPLETE / NEEDS WORK / BLOCKED
+- [ ] Next steps present at end of output: `/code-review`, `/balance-check`, `/team-polish`
+
+---
+
+## Coverage Notes
+
+- The NEEDS WORK verdict path (qa-tester finds failures in Phase 5) is not separately tested
+  here; it follows the same error recovery and partial report protocol as Case 2.
+- "Retry with narrower scope" error recovery option is listed in assertions but its full
+  recursive behavior (splitting via `/create-stories`) is covered by the `/create-stories` spec.
+- Phase 4 integration logic (wiring gameplay, AI, VFX, audio) is validated implicitly by
+  the Happy Path case; a dedicated integration test would require fixture code files.
+- Engine specialist unavailable (no engine configured) is partially covered in Case 5
+  assertions — a dedicated fixture for unconfigured engine state would strengthen coverage.
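
Editorial sketch of the Phase 3 behavior described in Cases 2 and 4 of the `/team-combat` spec: all four agents are issued before any result is awaited, Phase 4 waits for every result, and the overall verdict is BLOCKED when any agent is unresolved while completed work is kept. It uses Python's `asyncio.gather` as a stand-in for the real Task mechanism, which the spec does not show; `spawn`, `run_phase3`, and the `log` parameter are hypothetical, and the NEEDS WORK verdict is left out.

```python
import asyncio

PHASE3_AGENTS = [
    "gameplay-programmer",
    "ai-programmer",
    "technical-artist",
    "sound-designer",
]


async def spawn(agent, blocked, log):
    """Stand-in for one Task call; records when it was issued and returned."""
    log.append(("issued", agent))
    await asyncio.sleep(0)  # yield so the other agents can be issued first
    status = "BLOCKED" if agent in blocked else "COMPLETE"
    log.append(("returned", agent))
    return agent, status


async def run_phase3(blocked=frozenset(), log=None):
    """Issue all Phase 3 agents in one batch, then wait for every result."""
    log = [] if log is None else log
    results = await asyncio.gather(*(spawn(a, blocked, log) for a in PHASE3_AGENTS))
    report = dict(results)  # partial report: completed work is never discarded
    verdict = "BLOCKED" if "BLOCKED" in report.values() else "COMPLETE"
    return report, verdict
```

Calling `asyncio.run(run_phase3(blocked={"ai-programmer"}))` in this sketch returns a report in which the other three agents remain COMPLETE while the overall verdict is BLOCKED.
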
--- a/Framework/skills/team/team-level.md
+++ b/Framework/skills/team/team-level.md
@@ -0,0 +1,209 @@
+# Skill Test Spec: /team-level
+
+## Skill Summary
+
+Orchestrates the full level design team for a single level or area. Coordinates
+narrative-director, world-builder, level-designer, systems-designer, art-director,
+accessibility-specialist, and qa-tester through five sequential steps with one
+parallel phase (Step 4). Compiles all team outputs into a single level design
+document saved to `design/levels/[level-name].md`. Uses `AskUserQuestion` at each
+step transition. Delegates all file writes to sub-agents. Produces a summary report
+with verdict COMPLETE / BLOCKED and handoffs to `/design-review`, `/dev-story`,
+`/qa-plan`.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase/step headings (Step 1 through Step 5 are all present)
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents, orchestrator does not write files directly
+- [ ] Has a next-step handoff at the end (references `/design-review`, `/dev-story`, `/qa-plan`)
+- [ ] Error Recovery Protocol section is present with all four recovery steps
+- [ ] Uses `AskUserQuestion` at step transitions for user approval before proceeding
+- [ ] Step 4 is explicitly marked as parallel (art-director and accessibility-specialist run simultaneously)
+- [ ] Context gathering reads: `design/gdd/game-concept.md`, `design/gdd/game-pillars.md`, `design/levels/`, `design/narrative/`, and relevant world-building docs
+- [ ] Team Composition lists all seven roles (narrative-director, world-builder, level-designer, systems-designer, art-director, accessibility-specialist, qa-tester)
+- [ ] accessibility-specialist output includes severity ratings (BLOCKING / RECOMMENDED / NICE TO HAVE)
+- [ ] Final level design document saved to `design/levels/[level-name].md`
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All team members produce outputs, document compiled and saved
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists and is populated
+- `design/gdd/game-pillars.md` exists
+- `design/levels/` directory exists (may contain other level docs)
+- `design/narrative/` directory exists with relevant narrative docs
+
+**Input:** `/team-level forest dungeon`
+
+**Expected behavior:**
+1. Context gathering — orchestrator reads game-concept.md, game-pillars.md, existing level docs in `design/levels/`, narrative docs in `design/narrative/`, and world-building docs for the forest region
+2. Step 1 — narrative-director spawned: defines narrative purpose, key characters, dialogue triggers, emotional arc; world-builder spawned: provides lore context, environmental storytelling opportunities, world rules; `AskUserQuestion` confirms Step 1 outputs before Step 2
+3. Step 2 — level-designer spawned: designs spatial layout (critical path, optional paths, secrets), pacing curve, encounters, puzzles, entry/exit points and connections to adjacent areas; `AskUserQuestion` confirms layout before Step 3
+4. Step 3 — systems-designer spawned: specifies enemy compositions, loot tables, difficulty balance, area-specific mechanics, resource distribution; `AskUserQuestion` confirms systems before Step 4
+5. Step 4 — art-director and accessibility-specialist spawned in parallel; art-director: visual theme, color palette, lighting, asset list, VFX needs; accessibility-specialist: navigation clarity, colorblind safety, cognitive load check — each concern rated BLOCKING / RECOMMENDED / NICE TO HAVE; `AskUserQuestion` presents both outputs before Step 5
+6. Step 5 — qa-tester spawned: test cases for critical path, boundary/edge cases (sequence breaks, softlocks), playtest checklist, acceptance criteria
+7. Orchestrator compiles all team outputs into level design document format; the sub-agent asks "May I write to `design/levels/forest-dungeon.md`?" and saves the file once the user approves
+8. Summary report: area overview, encounter count, estimated asset list, narrative beats, cross-team dependencies, verdict: COMPLETE
+9. Next steps listed: `/design-review design/levels/forest-dungeon.md`, `/dev-story`, `/qa-plan`
+
+**Assertions:**
+- [ ] All five sources read during context gathering before any agent is spawned
+- [ ] narrative-director and world-builder both spawned in Step 1 (may be sequential or parallel — both must complete before Step 2)
+- [ ] `AskUserQuestion` called at each step gate (minimum: after Step 1, Step 2, Step 3, Step 4)
+- [ ] Step 4 agents (art-director, accessibility-specialist) launched simultaneously
+- [ ] All file writes delegated to sub-agents — orchestrator does not write directly
+- [ ] Level doc saved to `design/levels/forest-dungeon.md` (slugified from argument)
+- [ ] Verdict COMPLETE in final summary report
+- [ ] Next steps include `/design-review`, `/dev-story`, `/qa-plan`
+- [ ] Summary report includes: area overview, encounter count, estimated asset list, narrative beats
+
+---
+
+### Case 2: Blocked Agent (world-builder) — Partial report produced with gap noted
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists
+- World-building docs for the forest region do NOT exist
+- world-builder agent returns BLOCKED: "No world-building docs found for the forest region — cannot provide lore context"
+
+**Input:** `/team-level forest dungeon`
+
+**Expected behavior:**
+1. Context gathering completes; missing world-building docs noted
+2. Step 1 — narrative-director completes successfully; world-builder spawned and returns BLOCKED
+3. Error Recovery Protocol triggered: "world-builder: BLOCKED — no world-building docs for forest region"
+4. `AskUserQuestion` presented with options:
+   - (a) Skip world-builder and note the lore gap in the level doc
+   - (b) Retry with narrower scope (world-builder focuses only on what can be inferred from game-concept.md)
+   - (c) Stop here and create world-building docs first
+5. If user chooses (a): pipeline continues with Steps 2–5 using narrative-director context only; level doc compiled with a clearly marked gap section: "World-building context: NOT PROVIDED — see open dependency"
+6. Final report produced: partial outputs documented, world-builder section marked BLOCKED, overall verdict: BLOCKED
+
+**Assertions:**
+- [ ] BLOCKED message is surfaced immediately when world-builder fails; Step 2 does not begin without user input
+- [ ] `AskUserQuestion` offers at minimum three options (skip / retry / stop)
+- [ ] Partial report produced — narrative-director's completed work is not discarded
+- [ ] Level doc (if compiled) contains an explicit gap notation for the missing world-building context
+- [ ] Overall verdict is BLOCKED (not COMPLETE) when world-builder remains unresolved
+- [ ] Skill does NOT silently fabricate lore content to fill the gap
+
+---
+
+### Case 3: No Argument — Usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-level` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument provided
+2. Outputs usage message explaining the required argument (level name or area to design)
+3. Provides example invocations: `/team-level tutorial`, `/team-level forest dungeon`, `/team-level final boss arena`
+4. Skill exits without reading any project files or spawning any subagents
+
+**Assertions:**
+- [ ] Skill does NOT spawn any subagents when no argument is given
+- [ ] Usage message includes the argument-hint format from frontmatter
+- [ ] At least one example of a valid invocation is shown
+- [ ] No GDD or level files read before failing
+- [ ] Verdict is NOT shown (pipeline never starts)
+
+---
+
+### Case 4: Accessibility Review Gate — Blocking concern surfaces before sign-off
+
+**Fixture:**
+- Steps 1–3 complete successfully
+- `design/accessibility-requirements.md` commits the project to the Enhanced accessibility tier
+- accessibility-specialist (Step 4, parallel) flags a BLOCKING concern: the critical path through the forest dungeon requires players to distinguish between two environmental hazards (toxic pools vs. shallow water) using color alone — no shape, icon, or audio cue differentiates them
+
+**Input:** `/team-level forest dungeon`
+
+**Expected behavior:**
+1. Steps 1–3 complete; Step 4 parallel phase begins
+2. accessibility-specialist returns: BLOCKING concern — "Critical path hazard distinction relies on color only (toxic pools vs. shallow water). Shape, icon, or audio cue required per Enhanced accessibility tier."
+3. art-director returns Step 4 output (complete)
+4. Skill presents both Step 4 results via `AskUserQuestion` — BLOCKING concern highlighted prominently
+5. `AskUserQuestion` offers:
+   - (a) Return to level-designer + art-director to redesign hazard visual/audio language before Step 5
+   - (b) Document as a known accessibility gap and proceed to Step 5 with the concern logged
+6. Skill does NOT silently proceed past the BLOCKING concern
+7. If user chooses (a): level-designer and art-director are re-spawned to revise the hazard design; the Step 4 accessibility check is then re-run
+8. Final report includes BLOCKING concern and its resolution status regardless of user choice
+
+**Assertions:**
+- [ ] BLOCKING accessibility concern is not treated as advisory — it is surfaced as a blocker
+- [ ] `AskUserQuestion` presents the specific concern text (not just "accessibility issue found")
+- [ ] Step 5 (qa-tester) does NOT begin without user acknowledging the BLOCKING concern
+- [ ] Revision path offered: level-designer + art-director can be sent back before proceeding
+- [ ] Final report includes the accessibility concern and its resolution status
+- [ ] art-director's completed output is NOT discarded when accessibility-specialist blocks
+
+---
+
+### Case 5: Forward Level Reference — Adjacent area dependency flagged
+
+**Fixture:**
+- Steps 1–3 in progress
+- level-designer (Step 2) produces a layout that specifies entry/exit points connecting to "the crystal caves" (an adjacent area)
+- `design/levels/crystal-caves.md` does NOT exist — the crystal caves area has not been designed yet
+
+**Input:** `/team-level forest dungeon`
+
+**Expected behavior:**
+1. Step 2 — level-designer produces layout including: "West exit connects to crystal-caves entry point A"
+2. Orchestrator (or level-designer subagent) checks `design/levels/` for `crystal-caves.md`; file not found
+3. Dependency gap surfaced: "Level references crystal-caves as an adjacent area but `design/levels/crystal-caves.md` does not exist"
+4. `AskUserQuestion` presented with options:
+   - (a) Proceed with a placeholder reference — note the dependency in the level doc as UNRESOLVED
+   - (b) Pause and run `/team-level crystal caves` first to establish that area
+5. Skill does NOT invent crystal caves content to satisfy the reference
+6. If user chooses (a): level doc compiled with the west exit marked "→ crystal-caves (UNRESOLVED — area not yet designed)"; flagged in the open dependencies section of the summary report
+7. Final report includes open cross-level dependencies section
+
+**Assertions:**
+- [ ] Skill detects the missing adjacent area by checking `design/levels/` — does not assume it will be created later
+- [ ] Skill does NOT fabricate crystal caves content (lore, layout, connections) to resolve the reference
+- [ ] `AskUserQuestion` offers a "design crystal caves first" option referencing `/team-level`
+- [ ] If user proceeds with placeholder, level doc explicitly marks the west exit as UNRESOLVED
+- [ ] Summary report includes an open cross-level dependencies section listing unresolved references
+- [ ] Circular or forward references do not cause the skill to loop or crash
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at each step transition — user approves before pipeline advances
+- [ ] All file writes delegated to sub-agents via Task — orchestrator does not call Write or Edit directly
+- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
+- [ ] Step 4 agents (art-director, accessibility-specialist) launched in parallel per skill spec
+- [ ] Partial report always produced even when agents are BLOCKED
+- [ ] Accessibility BLOCKING concerns surface before sign-off and require explicit user acknowledgment
+- [ ] Verdict is one of COMPLETE / BLOCKED
+- [ ] Next steps present at end: `/design-review`, `/dev-story`, `/qa-plan`
+
+---
+
+## Coverage Notes
+
+- narrative-director and world-builder in Step 1 may be sequential or parallel — the skill spec
+  spawns both but does not mandate simultaneous launch; coverage of parallel Step 1 would require
+  an explicit timing assertion fixture.
+- The "Retry with narrower scope" option in Case 2 (blocked world-builder) is not tested
+  in depth; its full path is analogous to the blocked-agent pattern covered in Case 2
+  and in other team-* specs.
+- systems-designer (Step 3) block scenarios are not separately tested; the same Error Recovery
+  Protocol applies and the pattern is validated by Case 2.
+- Step 4 parallel ordering (art-director completing before or after accessibility-specialist)
+  does not affect outcomes — both must return before Step 5 regardless of order.
+- The level doc slug convention (argument → filename) is implicitly tested by Case 1
+  (`forest dungeon` → `forest-dungeon.md`); multi-word slugification edge cases (special
+  characters, very long names) are not covered.
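The slug convention noted in the Coverage Notes above (`forest dungeon` → `forest-dungeon.md`) could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the spec only confirms the single `forest dungeon` example, so the helper names `slugify` and `level_doc_path`, the handling of special characters, and the `ValueError` on an empty name are all assumptions made for this sketch.

```python
import re

# Directory named in the spec; everything else below is hypothetical.
LEVELS_DIR = "design/levels"


def slugify(argument: str) -> str:
    """Lowercase the argument and collapse each run of non-alphanumeric
    characters into a single hyphen (assumed rule, not confirmed by the spec)."""
    slug = re.sub(r"[^a-z0-9]+", "-", argument.strip().lower())
    return slug.strip("-")


def level_doc_path(argument: str) -> str:
    """Map a /team-level argument to the level document path."""
    slug = slugify(argument)
    if not slug:
        # Mirrors the no-argument case: nothing to design, so no path.
        raise ValueError("level name required")
    return f"{LEVELS_DIR}/{slug}.md"
```

Under these assumptions, `level_doc_path("forest dungeon")` returns `design/levels/forest-dungeon.md`, matching the path asserted in Case 1. The uncovered edge cases (special characters, very long names) would need their own fixtures before this rule could be treated as settled.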
--- a/Framework/skills/team/team-live-ops.md
+++ b/Framework/skills/team/team-live-ops.md
@@ -0,0 +1,178 @@
+# Skill Test Spec: /team-live-ops
+
+## Skill Summary
+
+Orchestrates the live-ops team through a 7-phase planning pipeline to produce a
+season or event plan. Coordinates live-ops-designer, economy-designer,
+analytics-engineer, community-manager, narrative-director, and writer. Phases 3
+and 4 (economy design and analytics) run simultaneously. Ends with a consolidated
+season plan requiring user approval before handoff to production.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" language in the File Write Protocol section (delegated to sub-agents)
+- [ ] Has a File Write Protocol section stating that the orchestrator does not write files directly
+- [ ] Has a next-step handoff at the end referencing `/design-review`, `/sprint-plan`, and `/team-release`
+- [ ] Uses `AskUserQuestion` at phase transitions to capture user approval before proceeding
+- [ ] States explicitly that Phases 3 and 4 can run simultaneously (parallel spawning)
+- [ ] Error recovery section present (or implied through BLOCKED handling)
+- [ ] Output documents section specifies paths under `design/live-ops/seasons/`
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All 7 phases complete, season plan produced
+
+**Fixture:**
+- `design/live-ops/economy-rules.md` exists with current economy configuration
+- `design/live-ops/ethics-policy.md` exists with the project ethics policy
+- Game concept document exists at its standard path
+- No existing season documents for the new season name being planned
+
+**Input:** `/team-live-ops "Season 2: The Frozen Wastes"`
+
+**Expected behavior:**
+1. Phase 1: Spawns `live-ops-designer` via Task; receives season brief with scope, content list, and retention mechanic; presents to user
+2. AskUserQuestion: user approves Phase 1 output before Phase 2 begins
+3. Phase 2: Spawns `narrative-director` via Task; reads the Phase 1 season brief; produces narrative framing document (theme, story hook, lore connections); presents to user
+4. Phases 3 and 4 (parallel): Spawns `economy-designer` and `analytics-engineer` simultaneously via two Task calls before waiting for either result; economy-designer reads `design/live-ops/economy-rules.md`
+5. Phase 5: Spawns `narrative-director` and `writer` in parallel to produce in-game narrative text and player-facing copy; both read Phase 2 narrative framing doc
+6. Phase 6: Spawns `community-manager` via Task; reads season brief, economy design, and narrative framing; produces communication calendar with draft copy
+7. Phase 7: Collects all phase outputs; presents consolidated season plan summary including economy health check, analytics readiness, ethics review, and open questions
+8. AskUserQuestion: user approves the full season plan
+9. Each sub-agent asks "May I write to [path]?" before writing its output document: `design/live-ops/seasons/S2_The_Frozen_Wastes.md`, `...analytics.md`, and `...comms.md`
+10. Verdict: COMPLETE — season plan produced and handed off for production
+
+**Assertions:**
+- [ ] All 7 phases execute in order; Phases 3 and 4 are issued as parallel Task calls
+- [ ] Phase 7 consolidated summary includes all six sections (season brief, narrative framing, economy design, analytics plan, content inventory, communication calendar)
+- [ ] Ethics review section in Phase 7 explicitly references `design/live-ops/ethics-policy.md`
+- [ ] Three output documents written to `design/live-ops/seasons/` with correct naming convention
+- [ ] File writes are delegated to sub-agents — orchestrator does not write directly
+- [ ] Verdict: COMPLETE appears in final output
+- [ ] Next steps reference `/design-review`, `/sprint-plan`, and `/team-release`
+
+---
+
+### Case 2: Ethics Violation Found — Reward element violates ethics policy
+
+**Fixture:**
+- All standard live-ops fixtures present (economy-rules.md, ethics-policy.md)
+- `design/live-ops/ethics-policy.md` explicitly prohibits loot boxes targeting players under 18 and non-transparent random premium rewards
+- economy-designer (Phase 3) proposes a "Mystery Chest" mechanic with randomized premium rewards and no pity timer
+
+**Input:** `/team-live-ops "Season 3: Shadow Tournament"`
+
+**Expected behavior:**
+1. Phases 1–4 proceed normally; economy-designer proposes Mystery Chest mechanic
+2. Phase 7: Orchestrator reviews Phase 3 output against ethics policy; identifies Mystery Chest as a violation of the "no non-transparent random premium rewards" rule in the ethics policy
+3. Ethics review section of the Phase 7 summary flags the violation explicitly: "ETHICS FLAG: Mystery Chest mechanic in Phase 3 economy design violates [policy rule]. Approval is blocked until this is resolved."
+4. AskUserQuestion presented with resolution options before season plan approval is offered
+5. Skill does NOT issue a COMPLETE verdict or write output documents until the ethics violation is resolved or explicitly waived by the user
+
+**Assertions:**
+- [ ] Phase 7 ethics review section explicitly names the violating element and the policy rule it breaks
+- [ ] Skill does not auto-approve the season plan when an ethics violation is present
+- [ ] AskUserQuestion is used to surface the violation and offer resolution options (revise economy design, override with documented rationale, cancel)
+- [ ] Output documents are NOT written while the violation is unresolved
+- [ ] If user chooses to revise: skill re-spawns economy-designer to produce a corrected design before returning to Phase 7 review
+- [ ] Verdict: COMPLETE is only issued after the ethics flag is cleared
+
+---
+
+### Case 3: No Argument — Usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-live-ops` (no argument)
+
+**Expected behavior:**
+1. Skill detects that no argument was provided
+2. Outputs: "Usage: `/team-live-ops [season name or event description]` — Provide the name or description of the season or live event to plan."
+3. Skill exits immediately without spawning any subagents
+
+**Assertions:**
+- [ ] Skill does NOT guess a season name or fabricate a scope
+- [ ] Error message includes the correct usage format with the argument-hint
+- [ ] No Task calls are issued before the argument check fails
+- [ ] No files are read or written
+
+---
+
+### Case 4: Parallel Phase Validation — Phases 3 and 4 run simultaneously
+
+**Fixture:**
+- All standard live-ops fixtures present
+- Phase 1 (season brief) and Phase 2 (narrative framing) already approved
+- Phase 3 (economy-designer) and Phase 4 (analytics-engineer) inputs are independent of each other
+
+**Input:** `/team-live-ops "Season 1: The First Thaw"` (observed at Phase 3/4 transition)
+
+**Expected behavior:**
+1. After Phase 2 is approved by the user, the orchestrator issues both Task calls (economy-designer and analytics-engineer) before awaiting either result
+2. Both agents receive the season brief as context; analytics-engineer does NOT wait for economy-designer output to begin
+3. Economy-designer output and analytics-engineer output are collected together before Phase 5 begins
+4. If one of the two parallel agents blocks, the other continues; a partial result is reported
+
+**Assertions:**
+- [ ] Both Task calls for Phase 3 and Phase 4 are issued before either result is awaited — they are not sequential
+- [ ] Analytics-engineer prompt does NOT include economy-designer output as a required input (the inputs are independent)
+- [ ] If economy-designer blocks but analytics-engineer succeeds, analytics output is preserved and the block is surfaced via AskUserQuestion
+- [ ] Phase 5 does not begin until BOTH Phase 3 and Phase 4 results are collected
+- [ ] Skill documentation explicitly states "Phases 3 and 4 can run simultaneously"
+
+---
+
+### Case 5: Missing Ethics Policy — `design/live-ops/ethics-policy.md` does not exist
+
+**Fixture:**
+- `design/live-ops/economy-rules.md` exists
+- `design/live-ops/ethics-policy.md` does NOT exist
+- All other fixtures are present
+
+**Input:** `/team-live-ops "Season 4: Desert Heat"`
+
+**Expected behavior:**
+1. Phases 1–4 proceed; economy-designer and analytics-engineer are given the ethics policy path, but the file is absent
+2. Phase 7: Orchestrator attempts to run ethics review; detects that `design/live-ops/ethics-policy.md` is missing
+3. Phase 7 summary includes a gap flag: "ETHICS REVIEW SKIPPED: `design/live-ops/ethics-policy.md` not found. Economy design was not reviewed against an ethics policy. Recommend creating one before production begins."
+4. Skill still completes the season plan and reaches COMPLETE verdict, but the gap is prominently flagged in the output and in the season design document
+5. Next steps include a recommendation to create the ethics policy document
+
+**Assertions:**
+- [ ] Skill does NOT error out when the ethics policy file is missing
+- [ ] Skill does NOT fabricate ethics policy rules in the absence of the file
+- [ ] Phase 7 summary explicitly notes that ethics review was skipped and why
+- [ ] Verdict: COMPLETE is still reachable despite the missing file
+- [ ] Gap flag appears in the season design output document (not just in conversation)
+- [ ] Next steps recommend creating `design/live-ops/ethics-policy.md`
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at every phase transition — user approves before the next phase begins
+- [ ] Phases 3 and 4 are always spawned in parallel, not sequentially
+- [ ] File Write Protocol: orchestrator never calls Write/Edit directly — all writes are delegated to sub-agents
+- [ ] Each output document gets its own "May I write to [path]?" ask from the relevant sub-agent
+- [ ] Ethics review in Phase 7 always references the ethics policy file path explicitly
+- [ ] Error recovery: any BLOCKED agent is surfaced immediately with AskUserQuestion options (skip / retry / stop)
+- [ ] Partial reports are produced if any phase blocks — work is never discarded
+- [ ] Verdict: COMPLETE only after user approves the consolidated season plan; BLOCKED if any unresolved ethics violation exists
+- [ ] Next steps always include `/design-review`, `/sprint-plan`, and `/team-release`
+
+---
+
+## Coverage Notes
+
+- Phase 5 parallel spawning (narrative-director + writer) follows the same pattern as Phases 3/4 but is not separately tested here — it uses the same parallel Task protocol validated in Case 4.
+- The "economy-rules.md absent" edge case is not separately tested — it would surface as a BLOCKED result from economy-designer and follow the standard error recovery path tested implicitly in Case 4.
+- The full content writing pipeline (Phase 5 output validation) is validated implicitly by the Case 1 happy path consolidated summary check.
+- Community manager communication calendar format (pre-launch, launch day, mid-season, final week) is validated implicitly by Case 1; no separate edge case is needed.
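The parallel-spawn behavior asserted in Case 4 above (both Task calls issued before either result is awaited, with one agent's output preserved when the other blocks) can be illustrated with a small `asyncio` sketch. It is only an analogy under stated assumptions: `run_agent` and `run_parallel_phase` are hypothetical stand-ins for the orchestrator's Task calls, and a raised `RuntimeError` stands in for a BLOCKED result.

```python
import asyncio


async def run_agent(name, log, fail=False):
    """Hypothetical stand-in for one Task call to a sub-agent."""
    log.append(f"start:{name}")
    await asyncio.sleep(0)  # yield control so the sibling agent can start
    if fail:
        raise RuntimeError(f"{name}: BLOCKED")
    log.append(f"done:{name}")
    return f"{name} output"


async def run_parallel_phase(agents, log, blocked=()):
    """Issue every agent before awaiting any result, then collect them together."""
    tasks = [
        asyncio.create_task(run_agent(name, log, name in blocked))
        for name in agents
    ]
    # return_exceptions=True keeps a blocked agent from discarding
    # the completed work of the other agent.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    outputs, blocks = {}, {}
    for name, result in zip(agents, results):
        if isinstance(result, Exception):
            blocks[name] = str(result)
        else:
            outputs[name] = result
    return outputs, blocks


if __name__ == "__main__":
    log = []
    outputs, blocks = asyncio.run(
        run_parallel_phase(
            ["economy-designer", "analytics-engineer"],
            log,
            blocked={"economy-designer"},
        )
    )
    print(log)
    print(outputs, blocks)
```

In this sketch both `start:` entries are logged before either agent finishes, which is the ordering the "not sequential" assertion looks for. The `blocks` mapping corresponds to what the spec says should be surfaced via `AskUserQuestion`, while `outputs` holds the partial result that must not be discarded.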
--- a/Framework/skills/team/team-narrative.md
+++ b/Framework/skills/team/team-narrative.md
@@ -0,0 +1,209 @@
+# Skill Test Spec: /team-narrative
+
+## Skill Summary
+
+Orchestrates the narrative team through a five-phase pipeline: narrative direction
+(narrative-director) → world foundation + dialogue drafting (world-builder and writer
+in parallel) → level narrative integration (level-designer) → consistency review
+(narrative-director) → polish + localization compliance (writer, localization-lead,
+and world-builder in parallel). Uses `AskUserQuestion` at each phase transition to
+present proposals as selectable options. Produces a narrative summary report and
+delivers narrative documents via subagents that each enforce the "May I write?"
+protocol. Verdict is COMPLETE when all phases succeed, or BLOCKED when a dependency
+is unresolved.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "File Write Protocol" section
+- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
+- [ ] Sub-agents enforce "May I write to [path]?" before any write
+- [ ] Has a next-step handoff at the end (references `/design-review`, `/localize extract`, `/dev-story`)
+- [ ] Error Recovery Protocol section is present
+- [ ] `AskUserQuestion` is used at phase transitions before proceeding
+- [ ] Phase 2 explicitly spawns world-builder and writer in parallel
+- [ ] Phase 5 explicitly spawns writer, localization-lead, and world-builder in parallel
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All five phases complete, narrative doc delivered
+
+**Fixture:**
+- A game concept and GDD exist for the target feature (e.g., `design/gdd/faction-intro.md`)
+- Character voice profiles exist (e.g., `design/narrative/characters/`)
+- Existing lore entries exist for cross-reference (e.g., `design/narrative/lore/`)
+- No lore contradictions exist between existing entries and the new content
+
+**Input:** `/team-narrative faction introduction cutscene for the Ironveil faction`
+
+**Expected behavior:**
+1. Phase 1: narrative-director is spawned; outputs a narrative brief defining the story beat, characters involved, emotional tone, and lore dependencies
+2. `AskUserQuestion` presents the narrative brief; user approves before Phase 2 begins
+3. Phase 2: world-builder and writer are spawned in parallel; world-builder produces lore entries for the Ironveil faction; writer drafts dialogue lines using character voice profiles
+4. `AskUserQuestion` presents world foundation and dialogue drafts; user approves before Phase 3 begins
+5. Phase 3: level-designer is spawned; produces environmental storytelling layout, trigger placement, and pacing plan
+6. `AskUserQuestion` presents level narrative plan; user approves before Phase 4 begins
+7. Phase 4: narrative-director reviews all dialogue against voice profiles, verifies lore consistency, confirms pacing; approves or flags issues
+8. `AskUserQuestion` presents review results; user approves before Phase 5 begins
+9. Phase 5: writer, localization-lead, and world-builder are spawned in parallel; writer performs final self-review; localization-lead validates i18n compliance; world-builder finalizes canon levels
+10. Final summary report is presented; subagent asks "May I write the narrative document to [path]?" before writing
+11. Verdict: COMPLETE
+
+**Assertions:**
+- [ ] narrative-director is spawned in Phase 1 before any other agents
+- [ ] `AskUserQuestion` appears after Phase 1 output and before Phase 2 launch
+- [ ] world-builder and writer Task calls are issued simultaneously in Phase 2 (not sequentially)
+- [ ] level-designer is not launched until Phase 2 `AskUserQuestion` is approved
+- [ ] narrative-director is re-spawned in Phase 4 for consistency review
+- [ ] Phase 5 spawns all three agents (writer, localization-lead, world-builder) simultaneously
+- [ ] Summary report includes: narrative brief status, lore entries created/updated, dialogue lines written, level narrative integration points, consistency review results
+- [ ] No files are written by the orchestrator directly
+- [ ] Verdict is COMPLETE after delivery
+
+---
+
+### Case 2: Lore Contradiction Found — world-builder finds conflict before writer proceeds
+
+**Fixture:**
+- Existing lore entry at `design/narrative/lore/ironveil-history.md` states the Ironveil faction was founded 200 years ago
+- The new narrative brief (from Phase 1) states the Ironveil were founded 50 years ago
+- The writer has been spawned in parallel with the world-builder in Phase 2
+
+**Input:** `/team-narrative ironveil faction introduction cutscene`
+
+**Expected behavior:**
+1. Phases 1–2 begin normally
+2. Phase 2 world-builder detects a factual contradiction between the narrative brief and existing lore: founding date conflict
+3. world-builder returns BLOCKED with reason: "Lore contradiction found — founding date conflicts with `design/narrative/lore/ironveil-history.md`"
+4. Orchestrator surfaces the contradiction immediately: "world-builder: BLOCKED — Lore contradiction: founding date in narrative brief (50 years ago) conflicts with existing canon (200 years ago in `ironveil-history.md`)"
+5. Orchestrator assesses dependency: the writer's dialogue depends on canon lore — the writer's draft cannot be finalized without resolving the contradiction
+6. `AskUserQuestion` presents options:
+   - Revise the narrative brief to match existing canon (200 years ago)
+   - Update the existing lore entry to reflect the new canon (50 years ago)
+   - Stop here and resolve the contradiction in the lore docs first
+7. Writer output is preserved but flagged as pending canon resolution — work is not discarded
+8. Orchestrator does NOT proceed to Phase 3 until the contradiction is resolved or user explicitly chooses to skip
+
+**Assertions:**
+- [ ] Contradiction is surfaced before Phase 3 begins
+- [ ] Orchestrator does not silently resolve the contradiction by picking one version
+- [ ] `AskUserQuestion` presents at least 3 options including "stop and resolve first"
+- [ ] Writer's draft output is preserved in the partial report, not discarded
+- [ ] Phase 3 (level-designer) is not launched until the user resolves the contradiction
+- [ ] Verdict is BLOCKED (not COMPLETE) if the user stops to resolve the contradiction
+
+---
+
+### Case 3: No Argument — Usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-narrative` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Outputs usage guidance: e.g., "Usage: `/team-narrative [narrative content description]` — describe the story content, scene, or narrative area to work on (e.g., `boss encounter cutscene`, `faction intro dialogue`, `tutorial narrative`)"
+3. Skill exits without spawning any agents
+
+**Assertions:**
+- [ ] Skill does NOT spawn any agents when no argument is provided
+- [ ] Usage message includes the correct invocation format with an argument example
+- [ ] Skill does NOT attempt to guess or infer a narrative topic from project files
+- [ ] No `AskUserQuestion` is used — output is direct guidance
+
+---
+
+### Case 4: Localization Compliance — localization-lead flags a non-translatable string
+
+**Fixture:**
+- Phases 1–4 complete successfully
+- Phase 5 begins; writer and world-builder complete without issues
+- localization-lead finds a dialogue line that uses a hardcoded formatted date string (e.g., `"On March 12th, Year 3"`) that cannot survive locale-specific translation without a locale-aware formatter
+
+**Input:** `/team-narrative ironveil faction introduction cutscene` (Phase 5 scenario)
+
+**Expected behavior:**
+1. Phase 5 spawns writer, localization-lead, and world-builder in parallel
+2. localization-lead completes its review and flags: "String key `dialogue.ironveil.intro.003` contains a hardcoded date format (`March 12th, Year 3`) that will not localize correctly — requires a locale-aware date placeholder"
+3. Orchestrator surfaces the localization blocker in the summary report
+4. The localization issue is labeled as BLOCKING in the final report (not advisory)
+5. `AskUserQuestion` presents options:
+   - Fix the string now (writer revises the line)
+   - Note the gap and deliver the narrative doc with the issue flagged
+   - Stop and resolve before finalizing
+6. If the user chooses to proceed with the issue flagged, verdict is COMPLETE with noted localization debt; if user stops, verdict is BLOCKED
+
+**Assertions:**
+- [ ] localization-lead is spawned in Phase 5 simultaneously with writer and world-builder
+- [ ] Hardcoded date format is identified as a localization blocker (not silently passed)
+- [ ] The specific string key and reason are included in the issue report
+- [ ] `AskUserQuestion` offers the option to fix now vs. flag and proceed
+- [ ] Verdict notes the localization debt if the user proceeds without fixing
+- [ ] Skill does NOT automatically rewrite the offending line without user approval
+
+---
+
+### Case 5: Writer Blocked — Missing character voice profiles
+
+**Fixture:**
+- Phase 1 narrative-director produces a narrative brief referencing two characters: Commander Varek and Advisor Selene
+- No character voice profiles exist in `design/narrative/characters/` for either character
+- Phase 2 begins; world-builder proceeds normally
+
+**Input:** `/team-narrative ironveil surrender negotiation scene`
+
+**Expected behavior:**
+1. Phase 1 completes; narrative brief lists Commander Varek and Advisor Selene as characters
+2. Phase 2: writer is spawned in parallel with world-builder
+3. writer returns BLOCKED: "Cannot produce dialogue — no voice profiles found for Commander Varek or Advisor Selene in `design/narrative/characters/`. Voice profiles required to match character tone and speech patterns."
+4. Orchestrator surfaces the blocker immediately: "writer: BLOCKED — Missing prerequisite: character voice profiles for Commander Varek and Advisor Selene"
+5. world-builder output is preserved; partial report is produced with lore entries
+6. `AskUserQuestion` presents options:
+   - Create voice profiles first (redirects to the narrative-director or design workflow)
+   - Provide minimal voice direction inline and retry the writer with that context
+   - Stop here and create voice profiles before proceeding
+7. Orchestrator does NOT proceed to Phase 3 (level-designer) without writer output
+
+**Assertions:**
+- [ ] Writer block is surfaced before Phase 3 begins
+- [ ] world-builder's completed lore output is preserved in the partial report
+- [ ] Missing prerequisite (voice profiles) is named specifically (character names and expected file path)
+- [ ] `AskUserQuestion` offers at least one option to resolve the missing prerequisite
+- [ ] Orchestrator does not fabricate voice profiles or invent character voices
+- [ ] Phase 3 is not launched while writer is BLOCKED without explicit user authorization
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` is used after every phase output before the next phase launches
+- [ ] Parallel spawning: Phase 2 (world-builder + writer) and Phase 5 (writer + localization-lead + world-builder) issue all Task calls before waiting for results
+- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
+- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
+- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
+- [ ] A partial report is always produced when some agents complete and others block
+- [ ] Verdict is exactly COMPLETE or BLOCKED — no other verdict values used
+- [ ] Next Steps handoff references `/design-review`, `/localize extract`, and `/dev-story`
+
+---
+
+## Coverage Notes
+
+- Phase 3 (level-designer) and Phase 4 (narrative-director review) happy-path behavior is
+  validated implicitly by Case 1. Separate edge cases are not needed for these phases, as
+  their failure modes follow the standard Error Recovery Protocol.
+- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
+  Recovery Protocol are not separately tested — they follow the same `AskUserQuestion`
+  + partial-report pattern validated in Cases 2 and 5.
+- Localization concerns that are advisory (e.g., German/Finnish +30% expansion warnings)
+  vs. blocking (hardcoded formats) are distinguished in Case 4; advisory-only scenarios
+  follow the same pattern but do not change the verdict.
+- The writer's "all lines under 120 characters" and "string keys not raw strings" checks
+  in Phase 5 are covered implicitly by Case 4's localization compliance scenario.
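Annotation on the `/team-narrative` spec above: Case 4 expects localization-lead to flag a hardcoded date inside a dialogue string. The sketch below shows one way a fixture checker might detect such strings. It is a minimal heuristic under stated assumptions, not the skill's actual implementation: the function name `find_hardcoded_dates`, the English-month regex, the key `dialogue.ironveil.intro.002`, and both sample sentences are hypothetical. Only the key `dialogue.ironveil.intro.003` and the phrase `March 12th, Year 3` come from the spec.

```python
import re

# Hypothetical heuristic: an English month name followed by a day number,
# optionally with an ordinal suffix (e.g. "March 12th").
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
HARDCODED_DATE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}}(?:st|nd|rd|th)?\b")


def find_hardcoded_dates(strings):
    """Return (string_key, matched_text) pairs for lines with a literal date."""
    findings = []
    for key, text in strings.items():
        match = HARDCODED_DATE.search(text)
        if match:
            findings.append((key, match.group(0)))
    return findings


# Illustrative string table; the sentences are invented for this sketch.
dialogue = {
    "dialogue.ironveil.intro.002": "The Ironveil never forget a debt.",
    "dialogue.ironveil.intro.003": "On March 12th, Year 3, the gates fell.",
}
blockers = find_hardcoded_dates(dialogue)
```

Each finding carries the string key and the offending text, which matches the assertion that the issue report names both the key and the reason. A real check would also need to cover non-English source text and numeric date formats.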
--- a/Framework/skills/team/team-polish.md
+++ b/Framework/skills/team/team-polish.md
@@ -0,0 +1,218 @@
+# Skill Test Spec: /team-polish
+
+## Skill Summary
+
+Orchestrates the polish team through a six-phase pipeline: performance assessment
+(performance-analyst) → optimization (performance-analyst, joined by
+engine-programmer only when Phase 1 identifies engine-level root causes) → visual
+polish (technical-artist, parallel with Phase 2) → audio polish (sound-designer,
+parallel with Phase 2) → hardening (qa-tester) → sign-off (orchestrator collects
+all results, compares metrics against budgets, and issues the verdict). Uses
+`AskUserQuestion` at each phase transition. Verdict is READY FOR RELEASE or
+NEEDS MORE WORK.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: READY FOR RELEASE, NEEDS MORE WORK
+- [ ] Contains "File Write Protocol" section
+- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
+- [ ] Sub-agents enforce "May I write to [path]?" before any write
+- [ ] Has a next-step handoff at the end (references `/release-checklist`, `/sprint-plan update`, `/gate-check`)
+- [ ] Error Recovery Protocol section is present
+- [ ] `AskUserQuestion` is used at phase transitions before proceeding
+- [ ] Phase 3 (visual polish) and Phase 4 (audio polish) are explicitly run in parallel with Phase 2
+- [ ] engine-programmer is conditionally spawned in Phase 2 only when Phase 1 identifies engine-level root causes
+- [ ] Phase 6 sign-off compares metrics against budgets before issuing verdict
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Full pipeline completes, READY FOR RELEASE verdict
+
+**Fixture:**
+- Feature exists and is functionally complete (e.g., `combat` system)
+- Performance budgets are defined in technical-preferences.md (e.g., target 60fps, 16ms frame budget)
+- No frame budget violations exist before polishing begins
+- No audio events are missing; VFX assets are complete
+- No regressions are introduced by polish changes
+
+**Input:** `/team-polish combat`
+
+**Expected behavior:**
+1. Phase 1: performance-analyst is spawned; profiles the combat system, measures frame budget, checks memory usage; output: performance report showing all metrics within budget, no violations
+2. `AskUserQuestion` presents performance report; user approves before Phases 2, 3, and 4 begin
+3. Phase 2: performance-analyst applies minor optimizations (e.g., draw call batching); no engine-programmer needed (no engine-level root causes identified)
+4. Phases 3 and 4 are launched in parallel alongside Phase 2:
+   - Phase 3: technical-artist reviews VFX for quality, optimizes particle systems, adds screen shake and visual juice
+   - Phase 4: sound-designer reviews audio events for completeness, checks mix levels, adds ambient audio layers
+5. All three parallel phases complete; `AskUserQuestion` presents results; user approves before Phase 5 begins
+6. Phase 5: qa-tester runs edge case tests, soak tests, stress tests, and regression tests; all pass
+7. `AskUserQuestion` presents test results; user approves before Phase 6
+8. Phase 6: orchestrator collects all results; compares before/after performance metrics against budgets; all metrics pass
+9. Subagent asks "May I write the polish report to `production/qa/evidence/polish-combat-[date].md`?" before writing
+10. Verdict: READY FOR RELEASE
+
+**Assertions:**
+- [ ] performance-analyst is spawned first in Phase 1 before any other agents
+- [ ] `AskUserQuestion` appears after Phase 1 output and before Phases 2/3/4 launch
+- [ ] Phases 3 and 4 Task calls are issued at the same time as Phase 2 (not after Phase 2 completes)
+- [ ] engine-programmer is NOT spawned when Phase 1 finds no engine-level root causes
+- [ ] qa-tester (Phase 5) is not launched until the parallel phases complete and user approves
+- [ ] Phase 6 verdict is based on comparison of metrics against defined budgets
+- [ ] Summary report includes: before/after performance metrics, visual polish changes, audio polish changes, test results
+- [ ] No files are written by the orchestrator directly
+- [ ] Verdict is READY FOR RELEASE
+
+---
+
+### Case 2: Performance Blocker — Frame budget violation cannot be fully resolved
+
+**Fixture:**
+- Feature being polished: `particle-storm` VFX system
+- Phase 1 identifies a frame budget violation: particle-storm costs 12ms on target hardware (budget is 6ms for this system)
+- Phase 2 performance-analyst applies optimizations reducing cost to 9ms — still over the 6ms budget
+- Phase 2 cannot fully resolve the violation without a fundamental design change
+
+**Input:** `/team-polish particle-storm`
+
+**Expected behavior:**
+1. Phase 1: performance-analyst identifies the 12ms frame cost vs. 6ms budget; reports "FRAME BUDGET VIOLATION: particle-storm costs 12ms, budget is 6ms"
+2. `AskUserQuestion` presents the violation; user chooses to proceed with optimization attempt
+3. Phase 2: performance-analyst applies optimizations; achieves 9ms — reduced but still over budget; reports "Optimization reduced cost to 9ms (was 12ms) — 3ms over budget. No further gains achievable without design changes."
+4. Phases 3 and 4 run in parallel with Phase 2 (visual and audio polish)
+5. Phase 5: qa-tester runs regression and edge case tests; all pass
+6. Phase 6: orchestrator collects results; frame budget violation (9ms vs 6ms budget) remains unresolved
+7. Verdict: NEEDS MORE WORK
+8. Report lists the specific unresolved issue: "particle-storm frame cost (9ms) exceeds budget (6ms) by 3ms — requires design scope reduction or budget renegotiation"
+9. Next Steps: schedule the remaining issue in `/sprint-plan update`; re-run `/team-polish` after fix
+
+**Assertions:**
+- [ ] Frame budget violation is flagged in Phase 1 with specific numbers (actual vs. budget)
+- [ ] Phase 2 reports the post-optimization metric explicitly (9ms achieved, 3ms still over)
+- [ ] Verdict is NEEDS MORE WORK (not READY FOR RELEASE) when a budget violation remains
+- [ ] The specific unresolved issue is listed by name with the remaining gap quantified
+- [ ] Next Steps references `/sprint-plan update` for scheduling the remaining fix
+- [ ] Phases 3 and 4 still run (polish work is not abandoned due to a Phase 2 partial resolution)
+- [ ] Phase 5 qa-tester still runs (regression testing is independent of the performance outcome)
+
+---
+
+### Case 3: No Argument — Usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-polish` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument is provided
+2. Outputs usage guidance: e.g., "Usage: `/team-polish [feature or area]` — specify the feature or area to polish (e.g., `combat`, `main menu`, `inventory system`, `level-1`)"
+3. Skill exits without spawning any agents
+
+**Assertions:**
+- [ ] Skill does NOT spawn any agents when no argument is provided
+- [ ] Usage message includes the correct invocation format with argument examples
+- [ ] Skill does NOT attempt to guess a feature from project files
+- [ ] No `AskUserQuestion` is used — output is direct guidance
+
+---
+
+### Case 4: Engine-Level Bottleneck — engine-programmer spawned conditionally in Phase 2
+
+**Fixture:**
+- Feature being polished: `open-world` environment streaming
+- Phase 1 identifies a performance bottleneck with a root cause in the rendering pipeline: "draw call overhead is caused by the engine's scene tree traversal in the spatial indexer — this is an engine-level issue, not a game code issue"
+- Performance budgets are defined; the rendering overhead exceeds target frame budget
+
+**Input:** `/team-polish open-world`
+
+**Expected behavior:**
+1. Phase 1: performance-analyst profiles the environment; identifies frame budget violation; root cause analysis points to engine-level rendering pipeline (spatial indexer traversal overhead)
+2. Phase 1 output explicitly classifies the root cause as engine-level
+3. `AskUserQuestion` presents the performance report including the engine-level root cause; user approves before Phase 2
+4. Phase 2: performance-analyst is spawned for game-code-level optimizations AND engine-programmer is spawned in parallel for the engine-level rendering fix
+5. Phases 3 and 4 also run in parallel with Phase 2 (visual and audio polish)
+6. engine-programmer addresses the spatial indexer traversal; provides profiler validation showing the fix reduces overhead
+7. Phase 5: qa-tester runs regression tests including tests for the engine-level fix
+8. Phase 6: orchestrator collects all results; if metrics are now within budget, verdict is READY FOR RELEASE; if not, NEEDS MORE WORK
+
+**Assertions:**
+- [ ] engine-programmer is NOT spawned in Phase 2 unless Phase 1 explicitly identifies an engine-level root cause
+- [ ] engine-programmer is spawned in Phase 2 when Phase 1 identifies an engine-level root cause
+- [ ] engine-programmer and performance-analyst Task calls in Phase 2 are issued simultaneously (not sequentially)
+- [ ] Phases 3 and 4 also run in parallel with Phase 2 (not deferred until Phase 2 completes)
+- [ ] engine-programmer's output includes profiler validation of the fix
+- [ ] qa-tester in Phase 5 runs regression tests that cover the engine-level change
+- [ ] Verdict correctly reflects whether all metrics including the engine fix now meet budgets
+
+---
+
+### Case 5: Regression Found — Polish change broke an existing feature
+
+**Fixture:**
+- Feature being polished: `inventory-ui`
+- Phases 1–4 complete successfully; performance and polish changes are applied
+- Phase 5: qa-tester runs regression tests and finds that a shader optimization applied in Phase 3 broke the item highlight glow effect on hover — an existing feature that was working before the polish pass
+
+**Input:** `/team-polish inventory-ui` (Phase 5 scenario)
+
+**Expected behavior:**
+1. Phases 1–4 complete; polish changes include a shader optimization from technical-artist
+2. Phase 5: qa-tester runs regression tests and detects "Item highlight glow on hover no longer renders — regression introduced by shader optimization in Phase 3"
+3. qa-tester returns test results with the regression noted
+4. Orchestrator surfaces the regression immediately: "qa-tester: REGRESSION FOUND — `item-highlight-hover` glow broken by Phase 3 shader optimization"
+5. Subagent asks "May I write the bug report to `production/qa/evidence/bug-polish-inventory-ui-[date].md`?" before filing the bug report
+6. Bug report is written after approval; it includes: the broken behavior, the polish change that caused it, reproduction steps, and severity
+7. `AskUserQuestion` presents the regression with options:
+   - Revert the shader optimization and find an alternative approach
+   - Fix the shader optimization to preserve the glow effect
+   - Accept the regression and schedule a fix in the next sprint
+8. Verdict: NEEDS MORE WORK (the regression counts as present whichever resolution path the user chooses, unless the fix is applied and confirmed by a qa-tester re-run within the current session)
+
+**Assertions:**
+- [ ] Regression is surfaced before Phase 6 sign-off
+- [ ] The specific broken behavior and the responsible change are both named in the report
+- [ ] Subagent asks "May I write the bug report to [path]?" before filing
+- [ ] Bug report includes: broken behavior, causal change, reproduction steps, severity
+- [ ] `AskUserQuestion` offers options including revert, fix in place, and schedule later
+- [ ] Verdict is NEEDS MORE WORK when a regression is present and unresolved
+- [ ] Verdict may become READY FOR RELEASE only if the regression is fixed within the current polish session and qa-tester re-runs to confirm
+
+---
+
+## Protocol Compliance
+
+- [ ] Phase 1 (assessment) must complete before any other phase begins
+- [ ] `AskUserQuestion` is used after every phase output before the next phase launches
+- [ ] Phases 3 and 4 are always launched in parallel with Phase 2 (not deferred)
+- [ ] engine-programmer is only spawned when Phase 1 explicitly identifies engine-level root causes
+- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
+- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
+- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
+- [ ] A partial report is always produced when some agents complete and others block
+- [ ] Verdict is exactly READY FOR RELEASE or NEEDS MORE WORK — no other verdict values used
+- [ ] NEEDS MORE WORK verdict always lists specific remaining issues with severity
+- [ ] Next Steps handoff references `/release-checklist` (on success) and `/sprint-plan update` + `/gate-check` (on failure)
+
+---
+
+## Coverage Notes
+
+- The tools-programmer optional agent (for content pipeline tool verification) is not
+  separately tested — it follows the same conditional spawn pattern as engine-programmer
+  and is invoked only when content authoring tools are involved in the polished area.
+- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
+  Recovery Protocol are not separately tested — they follow the same `AskUserQuestion`
+  + partial-report pattern validated in Cases 2 and 5.
+- Phase 6 sign-off logic (collecting and comparing all metrics) is validated implicitly
+  by Cases 1 and 2. The distinction between READY FOR RELEASE and NEEDS MORE WORK is
+  exercised in both directions across these cases.
+- Soak testing and stress testing (Phase 5) are validated implicitly by Case 1's
+  qa-tester output. Case 5 focuses on the regression detection aspect of Phase 5.
+- The "minimum spec hardware" test path in Phase 5 is not separately tested — it follows
+  the same qa-tester delegation pattern when the hardware is available.
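Annotation on the `/team-polish` spec above: the Phase 6 sign-off rule (compare post-polish metrics against budgets, and name every remaining gap) can be sketched as a small function. This is an illustrative sketch only. The function name `polish_verdict`, its parameters, and the message wording for regressions are assumptions; the spec defines the two verdict strings and the numbers in Case 2 (9ms measured against a 6ms budget).

```python
def polish_verdict(measured_ms, budgets_ms, open_regressions=()):
    """Return (verdict, issues) for a Phase 6 sign-off.

    measured_ms / budgets_ms map a system name to its frame cost and budget.
    open_regressions lists regressions that were not fixed and re-verified.
    """
    issues = []
    for system, budget in budgets_ms.items():
        cost = measured_ms[system]
        if cost > budget:
            # Quantify the remaining gap, as Case 2 requires.
            issues.append(
                f"{system} frame cost ({cost}ms) exceeds budget "
                f"({budget}ms) by {cost - budget}ms"
            )
    issues.extend(f"unresolved regression: {name}" for name in open_regressions)
    verdict = "NEEDS MORE WORK" if issues else "READY FOR RELEASE"
    return verdict, issues


# Case 2: optimization reached 9ms against a 6ms budget.
case2 = polish_verdict({"particle-storm": 9}, {"particle-storm": 6})
```

Returning the issue list together with the verdict keeps the "NEEDS MORE WORK always lists specific remaining issues" requirement checkable in one place.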
--- a/Framework/skills/team/team-qa.md
+++ b/Framework/skills/team/team-qa.md
@@ -0,0 +1,204 @@
+# Skill Test Spec: /team-qa
+
+## Skill Summary
+
+Orchestrates the QA team through a 7-phase structured testing cycle. Coordinates
+qa-lead (strategy, test plan, sign-off report) and qa-tester (test case writing,
+bug report writing). Covers scope detection, story classification, QA plan
+generation, smoke check gate, test case writing, manual QA execution with bug
+filing, and a final sign-off report with an APPROVED / APPROVED WITH CONDITIONS /
+NOT APPROVED verdict. Parallel qa-tester spawning is used in Phase 5 for
+independent stories.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains verdict keywords for sign-off report: APPROVED, APPROVED WITH CONDITIONS, NOT APPROVED
+- [ ] Contains "May I write" language for both the QA plan and the sign-off report
+- [ ] Has an Error Recovery Protocol section
+- [ ] Uses `AskUserQuestion` at phase transitions to capture user approval before proceeding
+- [ ] Phase 4 (smoke check) is a hard gate: FAIL stops the cycle
+- [ ] Bug reports are written to `production/qa/bugs/` with `BUG-[NNN]-[short-slug].md` naming
+- [ ] Next-step guidance differs by verdict (APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED)
+- [ ] Independent qa-tester tasks in Phase 5 are spawned in parallel
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All stories pass manual QA, APPROVED verdict
+
+**Fixture:**
+- `production/sprints/sprint-03/` exists with 4 story files
+- Stories are a mix of types: 1 Logic, 1 Integration, 2 Visual/Feel
+- All stories have acceptance criteria populated
+- `tests/smoke/` contains a smoke test list; all items are verifiable
+- No existing bugs in `production/qa/bugs/`
+
+**Input:** `/team-qa sprint-03`
+
+**Expected behavior:**
+1. Phase 1: Reads all story files in `production/sprints/sprint-03/`; reads `production/stage.txt`; reports "Found 4 stories. Current stage: [stage]. Ready to begin QA strategy?"
+2. Phase 2: Spawns `qa-lead` via Task; produces strategy table classifying all 4 stories; no blockers flagged; presents to user; AskUserQuestion: user selects "Looks good — proceed to test plan"
+3. Phase 3: Produces QA plan document; asks "May I write the QA plan to `production/qa/qa-plan-sprint-03-[date].md`?"; writes after approval
+4. Phase 4: Spawns `qa-lead` via Task; reviews `tests/smoke/`; returns PASS; reports "Smoke check passed. Proceeding to test case writing."
+5. Phase 5: Spawns `qa-tester` via Task for each Visual/Feel and Integration story (3 stories in this fixture); tasks run in parallel; presents test cases grouped by story; AskUserQuestion per group; user approves
+6. Phase 6: Walks through each approved story; user marks all as PASS; result summary: "Stories PASS: 4, FAIL: 0, BLOCKED: 0"
+7. Phase 7: Spawns `qa-lead` via Task to produce sign-off report; report shows all stories PASS; no bugs filed; Verdict: APPROVED; asks "May I write this QA sign-off report to `production/qa/qa-signoff-sprint-03-[date].md`?"; writes after approval
+8. Verdict: COMPLETE — QA cycle finished
+
+**Assertions:**
+- [ ] Phase 1 correctly counts and reports 4 stories with current stage
+- [ ] Strategy table in Phase 2 classifies all 4 stories with correct types
+- [ ] QA plan written only after "May I write?" approval
+- [ ] Smoke check PASS allows pipeline to continue without user intervention
+- [ ] Phase 5 qa-tester tasks for independent stories are issued in parallel
+- [ ] Sign-off report includes Test Coverage Summary table and Verdict: APPROVED
+- [ ] Sign-off report written only after "May I write?" approval
+- [ ] Verdict: COMPLETE appears in final output
+- [ ] Next step: "Run `/gate-check` to validate advancement."
+
+---
+
+### Case 2: Smoke Check Fail — QA cycle stops at Phase 4
+
+**Fixture:**
+- `production/sprints/sprint-04/` exists with 3 story files
+- `tests/smoke/` exists with 5 smoke test items; 2 items cannot be verified (e.g., build is unstable, core navigation broken)
+
+**Input:** `/team-qa sprint-04`
+
+**Expected behavior:**
+1. Phases 1–3 complete normally; QA plan is written
+2. Phase 4: Spawns `qa-lead` via Task; smoke check returns FAIL; two specific failures are identified
+3. Skill reports: "Smoke check failed. QA cannot begin until these issues are resolved: [list of 2 failures]. Fix them and re-run `/smoke-check`, or re-run `/team-qa` once resolved."
+4. Skill stops immediately after Phase 4 — no Phase 5, 6, or 7 is executed
+5. No sign-off report is produced; no "May I write?" for a sign-off is issued
+
+**Assertions:**
+- [ ] Smoke check FAIL causes the pipeline to halt at Phase 4 — Phases 5, 6, 7 are NOT executed
+- [ ] Failure list is shown to the user explicitly (not summarized vaguely)
+- [ ] Skill recommends `/smoke-check` and `/team-qa` re-run as remediation steps
+- [ ] No QA sign-off report is written or offered
+- [ ] Skill does NOT produce a COMPLETE verdict
+- [ ] Any QA plan already written in Phase 3 is preserved (not deleted)
+
+---
+
+### Case 3: Bug Found — Visual/Feel story fails manual QA, bug report filed
+
+**Fixture:**
+- `production/sprints/sprint-05/` exists with 2 story files: 1 Logic (passes automated tests), 1 Visual/Feel
+- `tests/smoke/` smoke check passes
+- The Visual/Feel story's animation timing is visibly wrong (acceptance criterion not met)
+- `production/qa/bugs/` directory exists (empty or with existing bugs)
+
+**Input:** `/team-qa sprint-05`
+
+**Expected behavior:**
+1. Phases 1–5 complete normally; test cases are written for the Visual/Feel story
+2. Phase 6: User marks Visual/Feel story as FAIL; AskUserQuestion collects failure description: "Animation plays at 2x speed — jitter visible on every loop"
+3. Phase 6: Spawns `qa-tester` via Task to write a formal bug report; bug report written to `production/qa/bugs/BUG-001-animation-speed-jitter.md` (or next increment if bugs exist); report includes severity field
+4. Result summary: "Stories PASS: 1, FAIL: 1 — bugs filed: BUG-001"
+5. Phase 7: Spawns `qa-lead` to produce sign-off report; Bugs Found table lists BUG-001 with severity and status Open; Verdict: NOT APPROVED (S1/S2 bug open, or FAIL without documented workaround)
+6. Sign-off report write is offered; writes after approval
+7. Next step: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."
+
+**Assertions:**
+- [ ] FAIL result in Phase 6 triggers AskUserQuestion to collect the failure description before the bug report is written
+- [ ] `qa-tester` is spawned via Task to write the bug report — orchestrator does not write it directly
+- [ ] Bug report follows naming convention: `BUG-[NNN]-[short-slug].md` in `production/qa/bugs/`
+- [ ] Bug report NNN is incremented correctly from existing bugs in the directory
+- [ ] Phase 7 sign-off report Bugs Found table includes the bug ID, story name, severity, and status
+- [ ] Verdict in sign-off report is NOT APPROVED
+- [ ] Next step explicitly mentions re-running `/team-qa`
+- [ ] Verdict: COMPLETE is still issued by the orchestrator (the QA cycle finished — the verdict is NOT APPROVED, but the skill completed its pipeline)
+
+---
+
+### Case 4: No Argument — Skill infers active sprint or asks user
+
+**Fixture (variant A — state files present):**
+- `production/session-state/active.md` exists and contains a reference to `sprint-06`
+- `production/sprint-status.yaml` exists and identifies `sprint-06` as active
+
+**Fixture (variant B — state files absent):**
+- `production/session-state/active.md` does NOT exist
+- `production/sprint-status.yaml` does NOT exist
+
+**Input:** `/team-qa` (no argument)
+
+**Expected behavior (variant A):**
+1. Phase 1: No argument provided; reads `production/session-state/active.md`; reads `production/sprint-status.yaml`
+2. Detects `sprint-06` as the active sprint from both sources
+3. Proceeds as if `/team-qa sprint-06` were the input; reports "No sprint argument provided — inferred sprint-06 from session state. Found [N] stories."
+
+**Expected behavior (variant B):**
+1. Phase 1: No argument provided; attempts to read `production/session-state/active.md` — file missing; attempts to read `production/sprint-status.yaml` — file missing
+2. Cannot infer sprint; uses AskUserQuestion: "Which sprint or feature should QA cover?" with options to type a sprint identifier or cancel
+
+**Assertions:**
+- [ ] Skill does NOT default to a hardcoded sprint name when no argument is provided
+- [ ] Skill reads both `production/session-state/active.md` AND `production/sprint-status.yaml` before asking the user (variant A)
+- [ ] When both state files are absent, skill uses AskUserQuestion rather than guessing (variant B)
+- [ ] Inferred sprint is reported to the user before proceeding (variant A transparency)
+- [ ] Skill does NOT error out when state files are missing — it falls back to asking (variant B)
+
+---
+
+### Case 5: Mixed Results — Some PASS, one FAIL with S1 bug, one BLOCKED
+
+**Fixture:**
+- `production/sprints/sprint-07/` exists with 4 story files
+- Smoke check passes
+- Story A (Logic): automated test passes — PASS
+- Story B (UI): manual QA — PASS WITH NOTES (minor text overflow)
+- Story C (Visual/Feel): manual QA — FAIL; tester identifies S1 crash on ability activation
+- Story D (Integration): cannot test — BLOCKED (dependency system not yet implemented)
+
+**Input:** `/team-qa sprint-07`
+
+**Expected behavior:**
+1. Phases 1–5 proceed; Phase 5 test cases cover stories B, C, D
+2. Phase 6: Story A is recorded as PASS from its automated test (no manual QA); user marks Story B: PASS WITH NOTES; Story C: FAIL; Story D: BLOCKED
+3. After Story C FAIL: qa-tester spawned to write bug report `BUG-001-crash-ability-activation.md` with S1 severity
+4. Result summary presented: "Stories PASS: 1, PASS WITH NOTES: 1, FAIL: 1 — bugs filed: BUG-001 (S1), BLOCKED: 1"
+5. Phase 7: qa-lead produces sign-off report covering all 4 stories; BUG-001 listed as S1/Open; Story D listed as BLOCKED; Verdict: NOT APPROVED
+6. Sign-off report written after "May I write?" approval
+7. Next step: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."
+
+**Assertions:**
+- [ ] All 4 stories appear in the Phase 7 sign-off report Test Coverage Summary table — none are silently omitted
+- [ ] Story D (BLOCKED) is listed in the report with a BLOCKED status, not silently dropped
+- [ ] S1 bug causes Verdict: NOT APPROVED regardless of the other stories passing
+- [ ] PASS WITH NOTES stories do not downgrade to FAIL — they are tracked separately
+- [ ] BUG-001 severity is listed as S1 in the Bugs Found table
+- [ ] Partial results are preserved — the sign-off report is still produced even with failures and blocks
+- [ ] Verdict: COMPLETE is issued by the orchestrator (pipeline completed); sign-off verdict is NOT APPROVED
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at Phase 2 (strategy review), Phase 5 (test case approval per group), and Phase 6 (per-story manual QA result)
+- [ ] Phase 4 smoke check is a hard gate: FAIL halts the pipeline at Phase 4 with no exceptions
+- [ ] "May I write?" asked separately for QA plan (Phase 3) and sign-off report (Phase 7)
+- [ ] Bug reports are always written by `qa-tester` via Task — orchestrator does not write directly
+- [ ] Phase 5 qa-tester tasks for independent stories are issued in parallel where possible
+- [ ] Error recovery: any BLOCKED agent is surfaced immediately with AskUserQuestion options
+- [ ] Partial report always produced — no work is discarded because one story failed or blocked
+- [ ] Sign-off verdict rules are strictly applied: any S1/S2 bug open = NOT APPROVED; no exceptions
+- [ ] Orchestrator-level Verdict: COMPLETE is distinct from the sign-off report's APPROVED/NOT APPROVED verdict
+
+---
+
+## Coverage Notes
+
+- The "APPROVED WITH CONDITIONS" verdict path (S3/S4 bugs, PASS WITH NOTES) is covered implicitly by Case 5's PASS WITH NOTES story (Story B) — if no S1/S2 bugs existed, that case would produce APPROVED WITH CONDITIONS. A dedicated case is not required as the verdict logic is table-driven.
+- The `feature: [system-name]` argument form is not separately tested — it follows the same Phase 1 logic as the sprint form, using glob instead of directory read. The no-argument inference path (Case 4) provides sufficient coverage of the detection logic.
+- Logic stories with passing automated tests do not need manual QA — this is validated implicitly by Case 5 (Story A) where the Logic story receives no manual QA phase.
+- Parallel qa-tester spawning in Phase 5 is validated implicitly by Case 1 (multiple Visual/Feel stories issued simultaneously); no dedicated parallelism case is required beyond the Static Assertions check.
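Annotation on the `/team-qa` spec above: two rules in this spec are mechanical enough to sketch in code, the `BUG-[NNN]-[short-slug].md` numbering and the table-driven sign-off verdict. The sketch below is hedged. The function names are hypothetical, a FAIL is treated as NOT APPROVED without modelling the "documented workaround" exception, and treating a BLOCKED story as a condition rather than a rejection is an assumption, since the spec does not state how BLOCKED alone affects the verdict.

```python
import re

BUG_NAME = re.compile(r"^BUG-(\d{3})-[a-z0-9-]+\.md$")


def next_bug_filename(existing_names, slug):
    """Build the next BUG-[NNN]-[short-slug].md name from a directory listing."""
    numbers = [
        int(m.group(1)) for name in existing_names if (m := BUG_NAME.match(name))
    ]
    return f"BUG-{max(numbers, default=0) + 1:03d}-{slug}.md"


def signoff_verdict(story_results, open_bug_severities):
    """Map story results and open bug severities to a sign-off verdict."""
    # Any open S1/S2 bug means NOT APPROVED, with no exceptions.
    if any(sev in ("S1", "S2") for sev in open_bug_severities):
        return "NOT APPROVED"
    # Simplification: every FAIL is treated as lacking a documented workaround.
    if "FAIL" in story_results:
        return "NOT APPROVED"
    # Assumption: BLOCKED stories count as conditions, like PASS WITH NOTES.
    has_conditions = bool(open_bug_severities) or any(
        result in ("PASS WITH NOTES", "BLOCKED") for result in story_results
    )
    return "APPROVED WITH CONDITIONS" if has_conditions else "APPROVED"


# Case 5: one S1 bug open, mixed story results.
case5 = signoff_verdict(["PASS", "PASS WITH NOTES", "FAIL", "BLOCKED"], ["S1"])
```

This sign-off verdict is separate from the orchestrator-level COMPLETE verdict, which only records that the pipeline ran to the end.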
--- a/Framework/skills/team/team-release.md
+++ b/Framework/skills/team/team-release.md
@@ -0,0 +1,215 @@
+# Skill Test Spec: /team-release
+
+## Skill Summary
+
+Orchestrates the release team through a 7-phase pipeline from release candidate to
+deployment and post-release monitoring. Coordinates release-manager, qa-lead,
+devops-engineer, producer, security-engineer (optional, required for online/
+multiplayer), network-programmer (optional, required for multiplayer),
+analytics-engineer, and community-manager. Phase 3 agents run in parallel. Ends
+with a go/no-go decision; deployment (Phase 6) is skipped if the producer calls
+NO-GO. Closes with a post-release monitoring plan.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" language in the File Write Protocol section (delegated to sub-agents)
+- [ ] Has a File Write Protocol section stating that the orchestrator does not write files directly
+- [ ] Has an Error Recovery Protocol section with four recovery steps (surface / assess / offer options / partial report)
+- [ ] Has a next-step handoff referencing post-release monitoring, `/retrospective`, and `production/stage.txt`
+- [ ] Uses `AskUserQuestion` at phase transitions requiring user approval before proceeding
+- [ ] Phase 3 agents (qa-lead, devops-engineer, and optionally security-engineer, network-programmer) are explicitly stated to run in parallel
+- [ ] Phase 6 (Deployment) is conditional on a GO decision from Phase 5
+- [ ] security-engineer is described as conditional on online features / player data — not always spawned
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path (Single-Player) — All phases complete, version deployed
+
+**Fixture:**
+- `production/stage.txt` exists and contains a Production-or-later stage
+- Milestone acceptance criteria are all met (producer can confirm)
+- No online features, no multiplayer, no player data collection
+- All CI builds are clean on the current branch
+- No open S1/S2 bugs
+- `production/sprints/` contains the completed sprint stories for this milestone
+
+**Input:** `/team-release v1.0.0`
+
+**Expected behavior:**
+1. Phase 1: Spawns `producer` via Task; confirms all milestone acceptance criteria met; identifies any deferred scope; produces release authorization; presents to user; AskUserQuestion: user approves before Phase 2
+2. Phase 2: Spawns `release-manager` via Task; cuts release branch from agreed commit; bumps version numbers; invokes `/release-checklist`; freezes branch; output: branch name and checklist; AskUserQuestion: user approves before Phase 3
+3. Phase 3 (parallel): Issues Task calls simultaneously for `qa-lead` (regression suite, critical path sign-off) and `devops-engineer` (build artifacts, CI verification); security-engineer is NOT spawned (no online features); network-programmer is NOT spawned (no multiplayer); both complete successfully
+4. Phase 4: Verifies localization strings all translated; `analytics-engineer` verifies telemetry fires correctly on the release build; performance benchmarks pass; sign-off produced
+5. Phase 5: Spawns `producer` via Task; collects sign-offs from qa-lead, release-manager, devops-engineer; no open blocking issues; producer declares GO; AskUserQuestion: user sees GO decision and confirms deployment
+6. Phase 6: Spawns `release-manager` + `devops-engineer` (parallel); tags release in version control; invokes `/changelog`; deploys to staging; smoke test passes; deploys to production; simultaneously spawns `community-manager` to finalize patch notes via `/patch-notes v1.0.0` and prepare launch announcement
+7. Phase 7: release-manager generates release report; producer updates milestone tracking; qa-lead begins monitoring for regressions; community-manager publishes communication; analytics-engineer confirms live dashboards healthy
+8. Verdict: COMPLETE — release executed and deployed
+
+**Assertions:**
+- [ ] Phase 3 qa-lead and devops-engineer Task calls are issued simultaneously, not sequentially
+- [ ] security-engineer is NOT spawned when the game has no online features, multiplayer, or player data
+- [ ] Phase 5 producer collects sign-offs from all required parties before declaring GO
+- [ ] Phase 6 deployment only begins after GO decision is confirmed by the user
+- [ ] `/changelog` is invoked by release-manager in Phase 6 (not written directly)
+- [ ] `/patch-notes v1.0.0` is invoked by community-manager in Phase 6
+- [ ] Phase 7 monitoring plan includes a 48-hour post-release monitoring commitment
+- [ ] Next steps recommend updating `production/stage.txt` to `Live` after successful deployment
+- [ ] Verdict: COMPLETE appears in the final output
+
+---
+
+### Case 2: Go/No-Go: NO — S1 bug found in Phase 3, deployment skipped
+
+**Fixture:**
+- Release candidate branch exists for v0.9.0
+- qa-lead discovers a previously unreported S1 crash in the main menu during Phase 3 regression testing
+- devops-engineer build is clean and artifacts are ready
+- producer is aware of the S1 bug
+
+**Input:** `/team-release v0.9.0`
+
+**Expected behavior:**
+1. Phases 1–2 complete normally; release candidate is cut
+2. Phase 3 (parallel): devops-engineer returns clean build sign-off; qa-lead returns with an S1 bug identified and regression suite failing; qa-lead declares quality gate: NOT PASSED
+3. Orchestrator surfaces the qa-lead result immediately: "QA-LEAD: S1 bug found — [crash description]. Quality gate: NOT PASSED."
+4. Phase 4 does not start automatically; AskUserQuestion asks whether to continue to Phase 4 or skip straight to Phase 5 for the go/no-go decision
+5. Phase 5: Spawns `producer` via Task; producer receives qa-lead's NOT PASSED verdict; no QA sign-off is available; producer declares NO-GO with rationale: "S1 bug [ID] is open and unresolved. Releasing is not safe."
+6. AskUserQuestion: user is presented with the NO-GO decision and the S1 bug details; options: fix the bug and re-run, defer the release, or override (with documented rationale)
+7. Phase 6 (Deployment) is SKIPPED entirely — no branch tagging, no deploy to staging, no deploy to production
+8. community-manager is NOT spawned in Phase 6 (no deployment to announce)
+9. Skill ends with a partial report summarizing what was completed (Phases 1–5) and what was skipped (Phase 6) and why
+10. Verdict: BLOCKED — release not deployed
+
+**Assertions:**
+- [ ] qa-lead S1 bug finding is surfaced to the user immediately after Phase 3 completes — not suppressed until Phase 5
+- [ ] producer's NO-GO decision explicitly references the S1 bug and the quality gate result
+- [ ] Phase 6 Deployment is completely skipped when producer declares NO-GO
+- [ ] community-manager is NOT spawned for patch notes or launch announcement on NO-GO
+- [ ] The partial report clearly states which phases completed and which were skipped, with reasons
+- [ ] Verdict: BLOCKED (not COMPLETE) when deployment is skipped due to NO-GO
+- [ ] AskUserQuestion offers the user resolution options (fix and re-run / defer / override with rationale)
+- [ ] Override path (if chosen) requires user to provide a documented rationale before proceeding to Phase 6
+
+---
+
+### Case 3: Security Audit for Online Game — security-engineer is spawned in Phase 3
+
+**Fixture:**
+- Game has multiplayer features and stores player account data
+- Release candidate exists for v2.1.0
+- qa-lead and devops-engineer both return clean sign-offs
+- security-engineer audit is required per team composition rules
+
+**Input:** `/team-release v2.1.0`
+
+**Expected behavior:**
+1. Phases 1–2 complete normally
+2. Phase 3 (parallel): Orchestrator detects that the game has online/multiplayer features and player data; issues Task calls simultaneously for `qa-lead`, `devops-engineer`, AND `security-engineer`; also spawns `network-programmer` for netcode stability sign-off
+3. security-engineer conducts pre-release security audit: reviews authentication flows, anti-cheat presence, data privacy compliance; returns sign-off
+4. network-programmer verifies lag compensation, reconnect handling, and bandwidth under load; returns sign-off
+5. All four Phase 3 agents complete; their results are collected before Phase 4 begins
+6. Phase 5: producer collects sign-offs from all four Phase 3 agents (qa-lead, devops-engineer, security-engineer, network-programmer) before making the go/no-go call
+7. Remaining phases proceed normally to COMPLETE
+
+**Assertions:**
+- [ ] security-engineer IS spawned in Phase 3 when the game has online features, multiplayer, or player data — this is not skipped
+- [ ] network-programmer IS spawned in Phase 3 when the game has multiplayer
+- [ ] All four Phase 3 Task calls (qa-lead, devops-engineer, security-engineer, network-programmer) are issued simultaneously
+- [ ] security-engineer audit covers authentication, anti-cheat, and data privacy compliance
+- [ ] Phase 5 producer sign-off collection includes security-engineer (four parties, not two)
+- [ ] Phase 6 deployment does not begin until security-engineer has signed off
+- [ ] Skill does NOT treat security-engineer as optional for a game with player data
+
+---
+
+### Case 4: Localization Miss — Untranslated strings block the ship
+
+**Fixture:**
+- Release candidate exists for v1.2.0
+- Phase 3 (qa-lead, devops-engineer) complete with clean sign-offs
+- Phase 4: localization verification detects 47 untranslated strings in the French locale (a supported language in the game's localization scope)
+- localization-lead is available as a delegatable agent
+
+**Input:** `/team-release v1.2.0`
+
+**Expected behavior:**
+1. Phases 1–3 complete with clean sign-offs
+2. Phase 4: Localization verification step detects untranslated strings; identifies 47 strings in French locale; localization-lead (if available) is spawned to assess the severity
+3. Orchestrator surfaces: "LOCALIZATION MISS: 47 untranslated strings found in French locale. Localization sign-off is required before shipping."
+4. AskUserQuestion: options presented — (a) Fix translations and re-run Phase 4, (b) Remove French locale from this release, (c) Ship as-is with a known issues note
+5. If user selects (a): Phase 4 is re-run after translations are provided; skill waits for localization sign-off
+6. Phase 5 go/no-go does NOT proceed while localization sign-off is outstanding
+7. Ship is blocked (Phase 6 not entered) until localization issue is resolved or explicitly waived
+
+**Assertions:**
+- [ ] Localization verification in Phase 4 detects untranslated strings and counts them (not just "some strings missing")
+- [ ] Untranslated strings for a supported locale block the pipeline before Phase 5
+- [ ] AskUserQuestion is used to offer the user resolution choices — the skill does not auto-waive
+- [ ] Phase 5 go/no-go is NOT called while localization sign-off is pending
+- [ ] If user chooses to re-run Phase 4: the skill does not require restarting from Phase 1
+- [ ] If user explicitly waives (ships as-is): the waiver is documented in the release report (Phase 7) as a known issue
+- [ ] Skill does NOT fabricate translated strings to unblock itself
+
+---
+
+### Case 5: No Argument — Skill infers version or asks
+
+**Fixture (variant A — milestone data present):**
+- `production/milestones/` exists with a milestone file; most recent milestone is "v1.1.0 — Gold"
+- `production/session-state/active.md` references a version or milestone
+
+**Fixture (variant B — no discoverable version):**
+- `production/milestones/` does not exist
+- `production/session-state/active.md` does not reference a version
+- No git tags are present from which to infer a version
+
+**Input:** `/team-release` (no argument)
+
+**Expected behavior (variant A):**
+1. Phase 1: No argument provided; reads `production/session-state/active.md`; reads most recent milestone file in `production/milestones/`
+2. Infers v1.1.0 as the target version; reports "No version argument provided — inferred v1.1.0 from milestone data. Proceeding."
+3. Confirms with AskUserQuestion before beginning Phase 1 proper: "Releasing v1.1.0. Is this correct?"
+4. Proceeds as if `/team-release v1.1.0` had been the input
+
+**Expected behavior (variant B):**
+1. Phase 1: No argument provided; reads available state files — no version discoverable
+2. Uses AskUserQuestion: "What version number should be released? (e.g., v1.0.0)"
+3. Waits for user input before proceeding
+
+**Assertions:**
+- [ ] Skill does NOT default to a hardcoded version string when no argument is provided
+- [ ] Skill reads `production/session-state/active.md` and milestone files before asking (variant A)
+- [ ] Inferred version is confirmed with the user via AskUserQuestion before proceeding (variant A)
+- [ ] When no version is discoverable, AskUserQuestion is used — skill does not guess (variant B)
+- [ ] Skill does NOT error out when milestone files are absent — it falls back to asking (variant B)
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at each phase transition gate (post-Phase 1, post-Phase 2, post-Phase 3/4 if issues, post-Phase 5 go/no-go)
+- [ ] Phase 3 agents are always issued as parallel Task calls — qa-lead and devops-engineer are never sequential
+- [ ] security-engineer is conditionally spawned based on game features — never silently skipped when features are present
+- [ ] File Write Protocol: orchestrator never calls Write/Edit directly — all writes are delegated to sub-agents or sub-skills
+- [ ] Phase 6 Deployment is strictly conditional on a GO verdict from Phase 5 — never auto-triggered
+- [ ] Error recovery: any BLOCKED agent is surfaced immediately before continuing to dependent phases
+- [ ] Partial reports are always produced if any phase fails or the pipeline is halted (Case 2)
+- [ ] Verdict: COMPLETE only when deployment completes; BLOCKED when go/no-go is NO or a hard blocker is unresolved
+- [ ] Next steps always include 48-hour post-release monitoring, `/retrospective` recommendation, and `production/stage.txt` update to `Live`
+
+---
+
+## Coverage Notes
+
+- Phase 7 post-release actions (release report, milestone tracking, community publishing, dashboard monitoring) are validated implicitly by Case 1. No separate edge case is required as Phase 7 is non-gated and does not have a blocking failure mode.
+- The "devops-engineer build fails" path is not separately tested — it would surface as a BLOCKED result in Phase 3 and follow the standard error recovery protocol (surface → assess → AskUserQuestion options). This is validated structurally by the Static Assertions error recovery check.
+- The parallel Phase 4 path (localization + performance + analytics simultaneously with Phase 3) is a documented option in the skill ("can run in parallel with Phase 3 if resources available"). Case 4 tests Phase 4 as a sequential gate; the parallel variant is left to the skill's implementation judgment.
+- The `network-programmer` sign-off path for multiplayer is validated as part of Case 3 rather than a separate case, as it follows the same parallel-spawn pattern as security-engineer.
+- The "override NO-GO with documented rationale" path in Case 2 is referenced but not exhaustively tested — it is an escape hatch that the skill must support, and its existence is validated by the AskUserQuestion options assertion in Case 2.
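The conditional spawning and verdict rules described in the `/team-release` spec above can be sketched as a small Python model. This is a minimal illustration under stated assumptions: `GameFeatures`, `phase3_agents`, and `final_verdict` are hypothetical names that do not appear in the skill, and the real orchestrator issues Task calls rather than returning lists.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GameFeatures:
    """Hypothetical feature flags; the spec describes these only in prose."""
    online: bool = False
    multiplayer: bool = False
    player_data: bool = False


def phase3_agents(features: GameFeatures) -> list[str]:
    """Return the agents that Phase 3 would spawn in parallel."""
    agents = ["qa-lead", "devops-engineer"]  # always spawned
    # security-engineer is required for online features, multiplayer, or player data
    if features.online or features.multiplayer or features.player_data:
        agents.append("security-engineer")
    # network-programmer is required only for multiplayer
    if features.multiplayer:
        agents.append("network-programmer")
    return agents


def final_verdict(go_decision: bool, deployed: bool) -> str:
    """COMPLETE only when a GO decision was followed by a finished deployment."""
    return "COMPLETE" if go_decision and deployed else "BLOCKED"
```

With default flags the function yields the two always-on agents (the Case 1 fixture); setting `multiplayer=True` and `player_data=True` yields all four (Case 3); `final_verdict(False, False)` corresponds to the NO-GO outcome in Case 2.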
--- a/Framework/skills/team/team-ui.md
+++ b/Framework/skills/team/team-ui.md
@@ -0,0 +1,201 @@
+# Skill Test Spec: /team-ui
+
+## Skill Summary
+
+Orchestrates the UI team through the full UX pipeline for a single UI feature.
+Coordinates ux-designer, ui-programmer, art-director, the engine UI specialist,
+and accessibility-specialist through five structured phases: Context Gathering +
+UX Spec (Phase 1a/1b) → UX Review Gate (Phase 1c) → Visual Design (Phase 2) →
+Implementation (Phase 3) → Review in parallel (Phase 4) → Polish (Phase 5).
+Uses `AskUserQuestion` at each phase transition. Delegates all file writes to
+sub-agents and sub-skills (`/ux-design`, `ui-programmer`). Produces a summary report
+with verdict COMPLETE / BLOCKED and handoffs to `/ux-review`, `/code-review`,
+`/team-polish`.
+
+---
+
+## Static Assertions (Structural)
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings (Phase 1a through Phase 5 are all present)
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents and sub-skills, orchestrator does not write files directly
+- [ ] Has a next-step handoff at the end (references `/ux-review`, `/code-review`, `/team-polish`)
+- [ ] Error Recovery Protocol section is present with all four recovery steps
+- [ ] Uses `AskUserQuestion` at phase transitions for user approval before proceeding
+- [ ] Phase 4 is explicitly marked as parallel (ux-designer, art-director, accessibility-specialist)
+- [ ] UX Review Gate (Phase 1c) is defined as a blocking gate — skill must not proceed to Phase 2 without APPROVED verdict
+- [ ] Team Composition lists all five roles (ux-designer, ui-programmer, art-director, engine UI specialist, accessibility-specialist)
+- [ ] References the interaction pattern library (`design/ux/interaction-patterns.md`) — ui-programmer must use existing patterns
+- [ ] Phase 1a reads `design/accessibility-requirements.md` before design begins
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Full pipeline from UX spec through polish succeeds
+
+**Fixture:**
+- `design/gdd/game-concept.md` exists with platform targets and intended audience
+- `design/player-journey.md` exists
+- `design/ux/interaction-patterns.md` exists with relevant patterns
+- `design/accessibility-requirements.md` exists with committed tier (e.g., Enhanced)
+- Engine UI specialist configured in `.claude/docs/technical-preferences.md`
+
+**Input:** `/team-ui inventory screen`
+
+**Expected behavior:**
+1. Phase 1a — orchestrator reads game-concept.md, player-journey.md, relevant GDD UI sections, interaction-patterns.md, accessibility-requirements.md; summarizes a brief for the ux-designer
+2. Phase 1b — `/ux-design inventory-screen` invoked (or ux-designer spawned directly); produces `design/ux/inventory-screen.md` using `ux-spec.md` template; `AskUserQuestion` confirms spec before review
+3. Phase 1c — `/ux-review design/ux/inventory-screen.md` invoked; returns APPROVED; gate passed, proceed to Phase 2
+4. Phase 2 — art-director spawned; reviews full UX spec (not only wireframes); applies visual treatment; verifies color contrast; produces visual design spec with asset manifest; `AskUserQuestion` confirms before Phase 3
+5. Phase 3 — engine UI specialist spawned first (read from technical-preferences.md); produces implementation notes for ui-programmer; ui-programmer spawned with UX spec + visual spec + engine notes; implementation produced; interaction-patterns.md updated if new patterns introduced
+6. Phase 4 — ux-designer, art-director, accessibility-specialist spawned in parallel; all three return results before Phase 5
+7. Phase 5 — review feedback addressed; animations verified skippable; UI sounds confirmed through audio event system; interaction-patterns.md final check; verdict: COMPLETE
+8. Summary report: UX spec APPROVED, visual design COMPLETE, implementation COMPLETE, accessibility COMPLIANT, all input methods supported, pattern library updated, verdict: COMPLETE
+
+**Assertions:**
+- [ ] Phase 1a reads all five sources before briefing ux-designer
+- [ ] UX Review Gate checked before Phase 2 — Phase 2 does NOT begin until APPROVED
+- [ ] Art-director in Phase 2 reviews full spec, not just wireframe images
+- [ ] Engine UI specialist spawned before ui-programmer in Phase 3
+- [ ] Phase 4 agents launched simultaneously (ux-designer, art-director, accessibility-specialist)
+- [ ] All file writes delegated to sub-agents and sub-skills
+- [ ] Verdict COMPLETE in final summary report
+- [ ] Next steps include `/ux-review`, `/code-review`, `/team-polish`
+
+---
+
+### Case 2: UX Review Gate — Spec fails review; skill halts before implementation
+
+**Fixture:**
+- `design/ux/inventory-screen.md` produced by Phase 1b
+- `/ux-review` returns verdict NEEDS REVISION with specific concerns flagged (e.g., gamepad navigation flow incomplete, contrast ratio below minimum)
+
+**Input:** `/team-ui inventory screen`
+
+**Expected behavior:**
+1. Phase 1a + 1b complete — UX spec produced
+2. Phase 1c — `/ux-review design/ux/inventory-screen.md` returns NEEDS REVISION
+3. Skill does NOT advance to Phase 2
+4. `AskUserQuestion` presented with the specific flagged concerns and options:
+   - (a) Return to ux-designer to address the issues and re-review
+   - (b) Accept the risk and proceed to Phase 2 anyway (conscious decision)
+5. If user chooses (a): ux-designer revises spec, `/ux-review` re-run; loop continues until APPROVED or user overrides
+6. If user chooses (b): skill proceeds with an explicit NEEDS REVISION note in the final report
+7. Skill does NOT silently proceed past the gate
+
+**Assertions:**
+- [ ] Phase 2 does NOT begin while UX review verdict is NEEDS REVISION
+- [ ] `AskUserQuestion` presents the specific flagged concerns before offering options
+- [ ] User must make a conscious choice to override — skill does not assume override
+- [ ] If user accepts risk, NEEDS REVISION concern is documented in the final report
+- [ ] Revision-and-re-review loop is offered (not just a one-shot failure)
+- [ ] Skill does NOT discard the produced UX spec on review failure
+
+---
+
+### Case 3: No Argument — Usage guidance shown
+
+**Fixture:**
+- Any project state
+
+**Input:** `/team-ui` (no argument)
+
+**Expected behavior:**
+1. Skill detects no argument provided
+2. Outputs usage message explaining the required argument (UI feature description)
+3. Provides an example invocation in the `/team-ui [UI feature description]` format, e.g., `/team-ui inventory screen`
+4. Skill exits without spawning any subagents or reading any project files
+
+**Assertions:**
+- [ ] Skill does NOT spawn any subagents when no argument is given
+- [ ] Usage message includes the argument-hint format from frontmatter
+- [ ] At least one example of a valid invocation is shown
+- [ ] No UX spec files or GDDs read before failing
+- [ ] Verdict is NOT shown (pipeline never starts)
+
+---
+
+### Case 4: Accessibility Parallel Review — Phase 4 runs three streams simultaneously
+
+**Fixture:**
+- `design/ux/inventory-screen.md` exists (APPROVED)
+- Visual design spec complete
+- Implementation complete
+- `design/accessibility-requirements.md` committed tier: Enhanced
+
+**Input:** `/team-ui inventory screen` (resuming from Phase 3 complete)
+
+**Expected behavior:**
+1. Phase 4 begins after implementation is confirmed complete
+2. Three Task calls issued simultaneously: ux-designer, art-director, accessibility-specialist
+3. Each stream operates independently:
+   - ux-designer: verifies implementation matches wireframes, tests keyboard-only and gamepad-only navigation, checks accessibility features function
+   - art-director: verifies visual consistency with art bible at minimum and maximum supported resolutions
+   - accessibility-specialist: audits against the Enhanced accessibility tier in `design/accessibility-requirements.md`; any violation flagged as a blocker
+4. Skill waits for all three results before proceeding to Phase 5
+5. `AskUserQuestion` presents all three review results before Phase 5 begins
+
+**Assertions:**
+- [ ] All three Task calls issued before any result is awaited (parallel, not sequential)
+- [ ] Phase 5 does NOT begin until all three Phase 4 agents have returned
+- [ ] Accessibility-specialist explicitly reads `design/accessibility-requirements.md` for the committed tier
+- [ ] Accessibility violations flagged as BLOCKING (not merely advisory)
+- [ ] `AskUserQuestion` shows all three review streams' results together before Phase 5 approval
+- [ ] No Phase 4 agent's output is used as input for another Phase 4 agent
+
+---
+
+### Case 5: Missing Interaction Pattern Library — Skill notes the gap rather than inventing patterns
+
+**Fixture:**
+- `design/ux/interaction-patterns.md` does NOT exist
+- All other required files present
+
+**Input:** `/team-ui settings menu`
+
+**Expected behavior:**
+1. Phase 1a — orchestrator attempts to read `design/ux/interaction-patterns.md`; file not found
+2. Skill surfaces the gap: "interaction-patterns.md does not exist — no existing patterns to reuse"
+3. `AskUserQuestion` presented with options:
+   - (a) Run `/ux-design patterns` first to establish the pattern library, then continue
+   - (b) Proceed without the pattern library — ux-designer will document new patterns as they are created
+4. Skill does NOT invent or assume patterns from other sources
+5. If user chooses (b): ui-programmer is explicitly instructed to treat all patterns created as new and to add each to a new `design/ux/interaction-patterns.md` at completion
+6. Final report notes that interaction-patterns.md was created (or is still absent if user skipped)
+
+**Assertions:**
+- [ ] Skill does NOT silently ignore the missing pattern library
+- [ ] Skill does NOT invent patterns by guessing from the feature name or GDD alone
+- [ ] `AskUserQuestion` offers a "create pattern library first" option (referencing `/ux-design patterns`)
+- [ ] If user proceeds without the library, ui-programmer is told to treat all patterns as new
+- [ ] Final report documents pattern library status (created / absent / updated)
+- [ ] Skill does NOT fail entirely — the gap is noted and user is given a choice
+
+---
+
+## Protocol Compliance
+
+- [ ] `AskUserQuestion` used at each phase transition — user approves before pipeline advances
+- [ ] UX Review Gate (Phase 1c) is blocking — Phase 2 cannot begin without APPROVED or explicit user override
+- [ ] All file writes delegated to sub-agents and sub-skills — orchestrator does not call Write or Edit directly
+- [ ] Phase 4 agents launched in parallel per skill spec
+- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
+- [ ] Partial report always produced even when agents are BLOCKED
+- [ ] Verdict is one of COMPLETE / BLOCKED
+- [ ] Next steps present at end: `/ux-review`, `/code-review`, `/team-polish`
+
+---
+
+## Coverage Notes
+
+- The HUD-specific path (`/ux-design hud` + `hud-design.md` template + visual budget check in Phase 5)
+  is not separately tested here; it shares the same phase structure but uses different templates.
+- The "Update in place" path for interaction-patterns.md (new pattern added during implementation)
+  is exercised implicitly in Case 1 Step 5 — a dedicated fixture with a known new pattern would
+  strengthen coverage.
+- Engine UI specialist unavailable (no engine configured): the skill spec states "skip if no engine
+  configured". Case 1 asserts only the configured-engine path; the skip path has no dedicated fixture.
+- The NEEDS REVISION acceptance-risk override (Case 2 option b) requires the override to be
+  explicitly documented in the report; this is asserted but not further tested for downstream effects.
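The Phase 4 requirement in the `/team-ui` spec above (all three review calls issued before any result is awaited, and Phase 5 held until all have returned) maps onto a standard fan-out/fan-in pattern. The following is a minimal sketch using Python's `asyncio`; `run_review`, `phase4`, and `ready_for_phase5` are hypothetical stand-ins, and `run_review` only simulates what a real Task call to a sub-agent would do.

```python
import asyncio

PHASE4_REVIEWERS = ("ux-designer", "art-director", "accessibility-specialist")


async def run_review(agent: str, spec_path: str) -> dict:
    """Stand-in for a Task call; a real orchestrator would spawn the agent here."""
    await asyncio.sleep(0)  # yield control so the three reviews can interleave
    return {"agent": agent, "spec": spec_path, "blocking_issues": []}


async def phase4(spec_path: str) -> dict:
    """Start every review before awaiting any result, then collect them all."""
    tasks = [asyncio.create_task(run_review(a, spec_path)) for a in PHASE4_REVIEWERS]
    results = await asyncio.gather(*tasks)
    return {r["agent"]: r for r in results}


def ready_for_phase5(results: dict) -> bool:
    """Phase 5 may be offered only once every reviewer has reported."""
    return set(results) == set(PHASE4_REVIEWERS)
```

Calling `asyncio.run(phase4("design/ux/inventory-screen.md"))` returns one entry per reviewer. No reviewer receives another reviewer's output, which mirrors the independence assertion in Case 4; the user approval step via `AskUserQuestion` would still sit between this collection and Phase 5.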
--- a/Framework/skills/utility/adopt.md
+++ b/Framework/skills/utility/adopt.md
@@ -0,0 +1,214 @@
+# Skill Test Spec: /adopt
+
+## Skill Summary
+
+`/adopt` audits an existing project's artifacts — GDDs, ADRs, stories, infrastructure
+files, and `technical-preferences.md` — for format compliance with the template's
+skill pipeline. It classifies every gap by severity (BLOCKING / HIGH / MEDIUM / LOW),
+composes a numbered, ordered migration plan, and writes it to `docs/adoption-plan-[date].md`
+after explicit user approval via `AskUserQuestion`.
+
+This skill is distinct from `/project-stage-detect` (which checks what exists).
+`/adopt` checks whether what exists will actually work with the template's skills.
+
+No director gates apply. The skill does NOT invoke any director agents.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains severity tier keywords: BLOCKING, HIGH, MEDIUM, LOW
+- [ ] Contains "May I write" or `AskUserQuestion` language before writing the adoption plan
+- [ ] Has a next-step handoff at the end (e.g., offering to fix the highest-priority gap immediately)
+
+---
+
+## Director Gate Checks
+
+None. `/adopt` is a brownfield audit utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All GDDs compliant, no gaps, COMPLIANT
+
+**Fixture:**
+- `design/gdd/` contains 3 GDD files; each has all 8 required sections with content
+- `docs/architecture/adr-0001.md` exists with `## Status`, `## Engine Compatibility`,
+  and all other required sections
+- `production/stage.txt` exists
+- `docs/architecture/tr-registry.yaml` and `docs/architecture/control-manifest.md` exist
+- Engine configured in `technical-preferences.md`
+
+**Input:** `/adopt`
+
+**Expected behavior:**
+1. Skill emits "Scanning project artifacts..." then reads all artifacts silently
+2. Reports detected phase, GDD count, ADR count, story count
+3. Phase 2 audit: all 3 GDDs have all 8 sections, Status field present and valid
+4. ADR audit: all required sections present
+5. Infrastructure audit: all critical files exist
+6. Phase 3: zero BLOCKING, zero HIGH, zero MEDIUM, zero LOW gaps
+7. Summary reports: "No blocking gaps — this project is template-compatible"
+8. Uses `AskUserQuestion` to ask about writing the plan; user selects write
+9. Adoption plan is written to `docs/adoption-plan-[date].md`
+10. Phase 7 offers next action: no blocking gaps, offers options for next steps
+
+**Assertions:**
+- [ ] Skill reads silently before presenting any results
+- [ ] "Scanning project artifacts..." appears before the silent read phase
+- [ ] Gap counts show 0 BLOCKING, 0 HIGH, and 0 MEDIUM (LOW gaps, if any, are acceptable)
+- [ ] `AskUserQuestion` is used before writing the adoption plan
+- [ ] Adoption plan file is written to `docs/adoption-plan-[date].md`
+- [ ] Phase 7 offers a specific next action (not just a list)
+
+---
+
+### Case 2: Non-Compliant Documents — GDDs missing sections, NEEDS MIGRATION
+
+**Fixture:**
+- `design/gdd/` contains 2 GDD files:
+  - `combat.md` — missing `## Acceptance Criteria` and `## Formulas` sections
+  - `movement.md` — all 8 sections present
+- One ADR (`adr-0001.md`) is missing `## Status` section
+- `docs/architecture/tr-registry.yaml` does not exist
+
+**Input:** `/adopt`
+
+**Expected behavior:**
+1. Skill scans all artifacts
+2. Phase 2 audit finds:
+   - `combat.md`: 2 missing sections (Acceptance Criteria, Formulas)
+   - `adr-0001.md`: missing `## Status` — BLOCKING impact
+   - `tr-registry.yaml`: missing — HIGH impact
+3. Phase 3 classifies:
+   - BLOCKING: `adr-0001.md` missing `## Status` (story-readiness silently passes)
+   - HIGH: `tr-registry.yaml` missing; `combat.md` missing Acceptance Criteria (can't generate stories)
+   - MEDIUM: `combat.md` missing Formulas
+4. Phase 4 builds ordered migration plan:
+   - Step 1 (BLOCKING): Add `## Status` to `adr-0001.md` — command: `/architecture-decision retrofit`
+   - Step 2 (HIGH): Run `/architecture-review` to bootstrap tr-registry.yaml
+   - Step 3 (HIGH): Add Acceptance Criteria to `combat.md` — command: `/design-system retrofit`
+   - Step 4 (MEDIUM): Add Formulas to `combat.md`
+5. Gap Preview shows BLOCKING items as bullets (actual file names), HIGH/MEDIUM as counts
+6. `AskUserQuestion` asks to write the plan; writes after approval
+7. Phase 7 offers to fix the highest-priority gap (ADR Status) immediately
+
+**Assertions:**
+- [ ] BLOCKING gaps are listed as explicit file-name bullets in the Gap Preview
+- [ ] HIGH and MEDIUM shown as counts in Gap Preview
+- [ ] Migration plan items are in BLOCKING-first order
+- [ ] Each plan item includes the fix command or manual steps
+- [ ] `AskUserQuestion` is used before writing
+- [ ] Phase 7 offers to immediately retrofit the first BLOCKING item
+
+---
+
+### Case 3: Mixed State — Some docs compliant, some not, partial report
+
+**Fixture:**
+- 4 GDD files: 2 fully compliant, 2 with gaps (one missing Tuning Knobs, one missing Edge Cases)
+- ADRs: 3 files — 2 compliant, 1 missing `## ADR Dependencies`
+- Stories: 5 files — 3 have TR-ID references, 2 do not
+- Infrastructure: all critical files present; `technical-preferences.md` fully configured
+
+**Input:** `/adopt`
+
+**Expected behavior:**
+1. Skill audits all artifact types
+2. Audit summary shows totals: "4 GDDs (2 fully compliant, 2 with gaps); 3 ADRs
+   (2 fully compliant, 1 with gaps); 5 stories (3 with TR-IDs, 2 without)"
+3. Gap classification:
+   - No BLOCKING gaps
+   - HIGH: 1 ADR missing `## ADR Dependencies`
+   - MEDIUM: 2 GDDs with missing sections; 2 stories missing TR-IDs
+   - LOW: none
+4. Migration plan lists HIGH gap first, then MEDIUM gaps in order
+5. Note included: "Existing stories continue to work — do not regenerate stories
+   that are in progress or done"
+6. `AskUserQuestion` to write plan; writes after approval
+
+**Assertions:**
+- [ ] Per-artifact compliance tallies are shown (N compliant, M with gaps)
+- [ ] Existing story compatibility note is included in the plan
+- [ ] With no BLOCKING gaps, the migration plan contains no BLOCKING section
+- [ ] HIGH gap precedes MEDIUM gaps in plan ordering
+- [ ] `AskUserQuestion` is used before writing
+
+---
+
+### Case 4: No Artifacts Found — Fresh project, guidance to run /start
+
+**Fixture:**
+- Repository has no files in `design/gdd/`, `docs/architecture/`, `production/epics/`
+- `production/stage.txt` does not exist
+- `src/` directory does not exist or has fewer than 10 files
+- No game-concept.md, no systems-index.md
+
+**Input:** `/adopt`
+
+**Expected behavior:**
+1. Phase 1 existence check finds no artifacts
+2. Skill infers "Fresh" — no brownfield work to migrate
+3. Uses `AskUserQuestion`:
+   - "This looks like a fresh project — no existing artifacts found. `/adopt` is for
+     projects with work to migrate. What would you like to do?"
+   - Options: "Run `/start`", "My artifacts are in a non-standard location", "Cancel"
+4. Skill stops — does not proceed to audit regardless of user selection
+
+**Assertions:**
+- [ ] `AskUserQuestion` is used (not a plain text message) when no artifacts are found
+- [ ] `/start` is presented as a named option
+- [ ] Skill stops after the question — no audit phases run
+- [ ] No adoption plan file is written
+
+---
+
+### Case 5: Director Gate Check — No gate; adopt is a utility audit skill
+
+**Fixture:**
+- Project with a mix of compliant and non-compliant GDDs
+
+**Input:** `/adopt`
+
+**Expected behavior:**
+1. Skill completes full audit and produces migration plan
+2. No director agents are spawned at any point
+3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
+4. No `/gate-check` is invoked during the skill run
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Skill reaches plan-writing or cancellation without any gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Emits "Scanning project artifacts..." before silent read phase
+- [ ] Reads all artifacts silently before presenting any results
+- [ ] Shows Adoption Audit Summary and Gap Preview before asking to write
+- [ ] Uses `AskUserQuestion` before writing the adoption plan file
+- [ ] Adoption plan written to `docs/adoption-plan-[date].md` — not to any other path
+- [ ] Migration plan items ordered: BLOCKING first, HIGH second, MEDIUM third, LOW last
+- [ ] Phase 7 always offers a single specific next action (not a generic list)
+- [ ] Never regenerates existing artifacts — only fills gaps in what exists
+- [ ] Does not invoke director gates at any point
+
+---
+
+## Coverage Notes
+
+- The `gdds`, `adrs`, `stories`, and `infra` argument modes narrow the audit scope;
+  each follows the same pattern as the full audit but limited to that artifact type.
+  Not separately fixture-tested here.
+- The systems-index.md parenthetical status value check (BLOCKING) is a special case
+  that triggers an immediate fix offer before writing the plan; not separately tested.
+- The review-mode.txt prompt (Phase 6b) runs after plan writing if `production/review-mode.txt`
+  does not exist; not separately tested here.
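The ordering and Gap Preview rules asserted in the `/adopt` spec above can be sketched in a few lines. This is a minimal illustration, not the skill's implementation: the `gaps` record shape, the sample file names, and the helper names `order_plan` and `gap_preview` are all assumptions made for the example.

```python
from collections import Counter

# Severity levels in the order the spec requires them to appear in the plan.
SEVERITY_ORDER = ["BLOCKING", "HIGH", "MEDIUM", "LOW"]


def order_plan(gaps):
    """Sort gaps BLOCKING first, LOW last; sorted() is stable, so ties keep input order."""
    return sorted(gaps, key=lambda gap: SEVERITY_ORDER.index(gap["severity"]))


def gap_preview(gaps):
    """List BLOCKING gaps as file-name bullets; summarise other levels as counts."""
    lines = [f"- BLOCKING: {gap['file']}" for gap in gaps if gap["severity"] == "BLOCKING"]
    counts = Counter(gap["severity"] for gap in gaps)
    for level in SEVERITY_ORDER[1:]:
        if counts[level]:
            lines.append(f"- {level}: {counts[level]}")
    return lines


# Hypothetical records shaped like the Case 3 fixture: 1 HIGH gap, 4 MEDIUM gaps.
gaps = [
    {"file": "design/gdd/combat.md", "severity": "MEDIUM"},
    {"file": "docs/architecture/adr-003.md", "severity": "HIGH"},
    {"file": "design/gdd/inventory.md", "severity": "MEDIUM"},
    {"file": "production/epics/story-004.md", "severity": "MEDIUM"},
    {"file": "production/epics/story-005.md", "severity": "MEDIUM"},
]

plan = order_plan(gaps)
preview = gap_preview(gaps)
```

With the Case 3 data there is no BLOCKING bullet at all, which matches the assertion that a plan without BLOCKING gaps carries no BLOCKING section.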
--- a/Framework/skills/utility/asset-spec.md
+++ b/Framework/skills/utility/asset-spec.md
@@ -0,0 +1,179 @@
+# Skill Test Spec: /asset-spec
+
+## Skill Summary
+
+`/asset-spec` generates per-asset visual specification documents from design
+requirements. It reads the relevant GDD, art bible, and design system to produce
+a structured asset spec sheet that defines: dimensions, animation states (if
+applicable), color palette reference, style notes, technical constraints
+(format, file size budget), and deliverable checklist.
+
+Spec sheets are written to `assets/specs/[asset-name]-spec.md` after a "May I write"
+ask. If a spec already exists, the skill offers to update it. When multiple assets
+are requested in a single invocation, a "May I write" ask is made per asset. No
+director gates apply. The verdict is COMPLETE when all approved specs are written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language (per asset)
+- [ ] Has a next-step handoff (e.g., assign to an artist, or `/asset-audit` later)
+
+---
+
+## Director Gate Checks
+
+None. `/asset-spec` is a design documentation utility. Technical artists may
+review specs separately but this is not a gate within this skill.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Enemy sprite spec with full GDD and art bible
+
+**Fixture:**
+- `design/gdd/enemies.md` exists with enemy variants defined
+- `design/art-bible.md` exists with color palette and style notes
+- No existing asset spec for "goblin-enemy"
+
+**Input:** `/asset-spec goblin-enemy`
+
+**Expected behavior:**
+1. Skill reads enemies GDD and art bible
+2. Skill generates a spec for the goblin enemy sprite:
+   - Dimensions: inferred from engine defaults or explicitly from GDD
+   - Animation states: idle, walk, attack, hurt, death
+   - Color palette reference: links to art-bible palette section
+   - Style notes: from art bible character design rules
+   - Technical constraints: format (PNG), size budget
+   - Deliverable checklist
+3. Skill asks "May I write to `assets/specs/goblin-enemy-spec.md`?"
+4. File written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 6 spec components are present (dimensions, animations, palette, style, tech, checklist)
+- [ ] Color palette reference links to art bible (not duplicated)
+- [ ] Animation states are drawn from GDD (not invented)
+- [ ] "May I write" is asked with the correct path
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: No Art Bible Found — Spec with Placeholder Style Notes, Dependency Flagged
+
+**Fixture:**
+- `design/gdd/player.md` exists
+- `design/art-bible.md` does NOT exist
+
+**Input:** `/asset-spec player-sprite`
+
+**Expected behavior:**
+1. Skill reads player GDD but cannot find the art bible
+2. Skill generates spec with placeholder style notes: "DEPENDENCY GAP: art bible
+   not found — style notes are placeholders"
+3. Color palette section uses: "TBD — see art bible when created"
+4. Skill asks "May I write to `assets/specs/player-sprite-spec.md`?"
+5. File written with placeholders and dependency flag; verdict is COMPLETE with advisory
+
+**Assertions:**
+- [ ] DEPENDENCY GAP is flagged for the missing art bible
+- [ ] Spec is still generated (not blocked)
+- [ ] Style notes contain placeholder markers, not invented styles
+- [ ] Verdict is COMPLETE with advisory note
+
+---
+
+### Case 3: Asset Spec Already Exists — Offers to Update
+
+**Fixture:**
+- `assets/specs/goblin-enemy-spec.md` already exists
+- GDD has been updated since the spec was written (new attack animation added)
+
+**Input:** `/asset-spec goblin-enemy`
+
+**Expected behavior:**
+1. Skill detects existing spec file
+2. Skill reports: "Asset spec already exists for goblin-enemy — checking for updates"
+3. Skill diffs GDD against existing spec and identifies: new "charge-attack" animation
+   state added in GDD but not in spec
+4. Skill presents the diff: "1 new animation state found — offering to update spec"
+5. Skill asks "May I update `assets/specs/goblin-enemy-spec.md`?" (not overwrite)
+6. Spec is updated; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Existing spec is detected and "update" path is offered
+- [ ] Diff between GDD and existing spec is shown
+- [ ] "May I update" language is used (not "May I write")
+- [ ] Existing spec content is preserved; only the diff is applied
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: Multiple Assets Requested — May-I-Write Per Asset
+
+**Fixture:**
+- GDD and art bible exist
+- User requests specs for 3 assets: goblin-enemy, orc-enemy, treasure-chest
+
+**Input:** `/asset-spec goblin-enemy orc-enemy treasure-chest`
+
+**Expected behavior:**
+1. Skill generates all 3 specs in sequence
+2. For each asset, skill shows the draft and asks "May I write to
+   `assets/specs/[name]-spec.md`?" individually
+3. User can approve all 3 or skip individual assets
+4. All approved specs are written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] "May I write" is asked 3 times (once per asset), not once for all
+- [ ] User can decline one asset without blocking the others
+- [ ] A spec file is written for each approved asset (all 3 if none are declined)
+- [ ] Verdict is COMPLETE when all approved specs are written
+
+---
+
+### Case 5: Director Gate Check — No gate; asset-spec is a design utility
+
+**Fixture:**
+- GDD and art bible exist
+
+**Input:** `/asset-spec goblin-enemy`
+
+**Expected behavior:**
+1. Skill generates and writes the asset spec
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads GDD, art bible, and design system before generating spec
+- [ ] Includes all 6 spec components (dimensions, animations, palette, style, tech, checklist)
+- [ ] Flags missing dependencies (art bible, GDD) with DEPENDENCY GAP notes
+- [ ] Asks "May I write" (or "May I update") per asset
+- [ ] Handles multiple assets with individual write confirmations
+- [ ] Verdict is COMPLETE when all approved specs are written
+
+---
+
+## Coverage Notes
+
+- Audio asset specs (sound effects, music) follow the same structure with
+  different fields (duration, sample rate, looping) and are not separately tested.
+- UI asset specs (icons, button states) follow the same flow, with interaction
+  state requirements aligned to the UX spec, and are not separately tested.
+- The case where GDD is also missing (neither GDD nor art bible exists) is not
+  separately tested; spec would be generated with both dependency gaps flagged.
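The per-asset confirmation rule in the `/asset-spec` spec above (one ask per asset, "May I update" when a spec already exists) might look like the following sketch. The helper names `spec_path` and `write_prompts` and the `existing` set are hypothetical; only the path pattern and the prompt wording come from the spec.

```python
from pathlib import Path


def spec_path(asset_name, root="assets/specs"):
    """Build the spec location: assets/specs/[asset-name]-spec.md."""
    return (Path(root) / f"{asset_name}-spec.md").as_posix()


def write_prompts(asset_names, existing):
    """Return one confirmation prompt per asset, never a single combined prompt."""
    prompts = []
    for name in asset_names:
        path = spec_path(name)
        # An existing spec is updated in place rather than overwritten.
        verb = "update" if path in existing else "write to"
        prompts.append(f"May I {verb} `{path}`?")
    return prompts


# Case 4 input, assuming the goblin spec from Case 3 is already on disk.
prompts = write_prompts(
    ["goblin-enemy", "orc-enemy", "treasure-chest"],
    existing={"assets/specs/goblin-enemy-spec.md"},
)
```

Keeping the prompts as a list of independent strings reflects the assertion that the user can decline one asset without blocking the others.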
--- a/Framework/skills/utility/brainstorm.md
+++ b/Framework/skills/utility/brainstorm.md
@@ -0,0 +1,189 @@
+# Skill Test Spec: /brainstorm
+
+## Skill Summary
+
+`/brainstorm` facilitates guided game concept ideation. It presents 2-4 concept
+options with pros/cons, lets the user choose and refine a concept, and produces
+a structured `design/gdd/game-concept.md` document. The skill is collaborative —
+it asks questions before proposing options and iterates until the user approves
+a concept direction.
+
+In `full` review mode, four director gates spawn in parallel after the concept
+is drafted: CD-PILLARS (creative-director), AD-CONCEPT-VISUAL (art-director),
+TD-FEASIBILITY (technical-director), and PR-SCOPE (producer). In `lean` mode,
+all 4 inline gates are skipped (lean mode only runs PHASE-GATEs, and brainstorm
+has none). In `solo` mode, all gates are skipped. The skill asks "May I write"
+before writing `design/gdd/game-concept.md`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: APPROVED, REJECTED, CONCERNS
+- [ ] Contains "May I write" collaborative protocol language (for game-concept.md)
+- [ ] Has a next-step handoff at the end (`/map-systems`)
+- [ ] Documents 4 director gates in full mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
+- [ ] Documents that all 4 gates are skipped in lean and solo modes
+
+---
+
+## Director Gate Checks
+
+In `full` mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, and PR-SCOPE
+spawn in parallel after the concept draft is approved by the user.
+
+In `lean` mode: all 4 inline gates are skipped (brainstorm has no PHASE-GATEs,
+so lean mode skips everything). Output notes all 4 as: "[GATE-ID] skipped — lean mode".
+
+In `solo` mode: all 4 gates are skipped. Output notes all 4 as: "[GATE-ID] skipped — solo mode".
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Full mode, 3 concepts, user picks one, all 4 directors approve
+
+**Fixture:**
+- No existing `design/gdd/game-concept.md`
+- `production/session-state/review-mode.txt` contains `full`
+
+**Input:** `/brainstorm`
+
+**Expected behavior:**
+1. Skill asks the user questions about genre, scope, and target feeling
+2. Skill presents 3 concept options with pros/cons each
+3. User selects one concept
+4. Skill elaborates the chosen concept into a structured draft
+5. All 4 director gates spawn in parallel: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
+6. All 4 return APPROVED
+7. Skill asks "May I write `design/gdd/game-concept.md`?"
+8. Concept written after approval
+
+**Assertions:**
+- [ ] Exactly 3 concept options are presented, within the documented 2-4 range
+- [ ] All 4 director gates spawn in parallel (not sequentially)
+- [ ] All 4 gates complete before the "May I write" ask
+- [ ] "May I write `design/gdd/game-concept.md`?" is asked before writing
+- [ ] Concept file is NOT written without user approval
+- [ ] Next-step handoff to `/map-systems` is present
+
+---
+
+### Case 2: Failure Path — CD-PILLARS returns REJECT
+
+**Fixture:**
+- Concept draft is complete
+- `production/session-state/review-mode.txt` contains `full`
+- CD-PILLARS gate returns REJECT: "The concept has no identifiable creative pillar"
+
+**Input:** `/brainstorm`
+
+**Expected behavior:**
+1. CD-PILLARS gate returns REJECT with specific feedback
+2. Skill surfaces the rejection to the user
+3. Concept is NOT written to file
+4. User is asked: rethink the concept direction, or override the rejection
+5. If rethinking: skill returns to the concept options phase
+
+**Assertions:**
+- [ ] Concept is NOT written when CD-PILLARS returns REJECT
+- [ ] Rejection feedback is shown to the user verbatim
+- [ ] User is given the option to rethink or override
+- [ ] Skill returns to concept ideation phase if user chooses to rethink
+
+---
+
+### Case 3: Lean Mode — All 4 gates skipped; concept written after user confirms
+
+**Fixture:**
+- No existing game concept
+- `production/session-state/review-mode.txt` contains `lean`
+
+**Input:** `/brainstorm`
+
+**Expected behavior:**
+1. Concept options are presented and user selects one
+2. Concept is elaborated into a structured draft
+3. All 4 director gates are skipped — each noted: "[GATE-ID] skipped — lean mode"
+4. Skill asks user to confirm the concept is ready to write
+5. "May I write `design/gdd/game-concept.md`?" asked after confirmation
+6. Concept written after approval
+
+**Assertions:**
+- [ ] All 4 gate skip notes appear: "CD-PILLARS skipped — lean mode", "AD-CONCEPT-VISUAL skipped — lean mode", "TD-FEASIBILITY skipped — lean mode", "PR-SCOPE skipped — lean mode"
+- [ ] Concept is written after user confirmation only (no director approval needed in lean)
+- [ ] "May I write" is still asked before writing
+
+---
+
+### Case 4: Solo Mode — All gates skipped; concept written with only user approval
+
+**Fixture:**
+- No existing game concept
+- `production/session-state/review-mode.txt` contains `solo`
+
+**Input:** `/brainstorm`
+
+**Expected behavior:**
+1. Concept options are presented and user selects one
+2. Concept draft is shown to user
+3. All 4 director gates are skipped — each noted with "solo mode"
+4. "May I write `design/gdd/game-concept.md`?" asked
+5. Concept written after user approval
+
+**Assertions:**
+- [ ] All 4 skip notes appear with "solo mode" label
+- [ ] No director agents are spawned
+- [ ] Concept is written with only user approval
+- [ ] Behavior is otherwise equivalent to lean mode for this skill
+
+---
+
+### Case 5: Director Gate — PR-SCOPE returns CONCERNS (scope too large)
+
+**Fixture:**
+- Concept draft is complete
+- `production/session-state/review-mode.txt` contains `full`
+- PR-SCOPE gate returns CONCERNS: "The concept scope would require 18+ months for a solo developer"
+
+**Input:** `/brainstorm`
+
+**Expected behavior:**
+1. PR-SCOPE gate returns CONCERNS with specific scope feedback
+2. Skill surfaces the scope concerns to the user
+3. Scope concerns are documented in the concept draft before writing
+4. User is asked: reduce scope, accept concerns and document them, or rethink
+5. If concerns are accepted: concept is written with a "Scope Risk" note embedded
+
+**Assertions:**
+- [ ] PR-SCOPE concerns are shown to the user before the "May I write" ask
+- [ ] Skill does NOT write concept without surfacing scope concerns
+- [ ] If user accepts: scope concerns are documented in the concept file
+- [ ] Skill does NOT auto-reject a concept due to PR-SCOPE CONCERNS (user decides)
+
+---
+
+## Protocol Compliance
+
+- [ ] Presents 2-4 concept options with pros/cons before user commits
+- [ ] User confirms concept direction before director gates are invoked
+- [ ] All 4 director gates spawn in parallel in full mode
+- [ ] All 4 gates skipped in lean AND solo mode — each noted by name
+- [ ] "May I write `design/gdd/game-concept.md`?" asked before writing
+- [ ] Ends with next-step handoff: `/map-systems`
+
+---
+
+## Coverage Notes
+
+- AD-CONCEPT-VISUAL gate (art director feasibility) is grouped with the other
+  3 gates in the parallel spawn — not independently fixture-tested.
+- The iterative concept refinement loop (user rejects all options, skill
+  generates new ones) is not fixture-tested — it follows the same pattern as
+  the option selection phase.
+- The game-concept.md document structure (required sections) is defined in the
+  skill body and not re-enumerated in test assertions.
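The review-mode behaviour described in the `/brainstorm` spec above reduces to a small lookup: `full` spawns all four gates, while `lean` and `solo` skip them and emit one note per gate. The sketch below assumes a hypothetical `resolve_gates` helper and treats any other mode value as an error, which the spec does not define.

```python
GATES = ["CD-PILLARS", "AD-CONCEPT-VISUAL", "TD-FEASIBILITY", "PR-SCOPE"]

# The dash character used in the spec's skip notes, written as an escape.
DASH = "\u2014"


def resolve_gates(review_mode):
    """Return (gates_to_spawn, skip_notes) for the contents of review-mode.txt."""
    mode = review_mode.strip().lower()
    if mode == "full":
        return list(GATES), []
    if mode in ("lean", "solo"):
        # brainstorm has no PHASE-GATEs, so lean skips everything, same as solo.
        return [], [f"{gate} skipped {DASH} {mode} mode" for gate in GATES]
    # Assumption: unknown modes are rejected; the spec leaves this case open.
    raise ValueError(f"unknown review mode: {review_mode!r}")


full_gates, full_notes = resolve_gates("full\n")
lean_gates, lean_notes = resolve_gates("lean")
solo_gates, solo_notes = resolve_gates("solo")
```

In every mode the "May I write" ask still follows; this sketch covers only which gates run and which skip notes appear.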
--- a/Framework/skills/utility/bug-report.md
+++ b/Framework/skills/utility/bug-report.md
@@ -0,0 +1,174 @@
+# Skill Test Spec: /bug-report
+
+## Skill Summary
+
+`/bug-report` creates a structured bug report document from a user description.
+It produces a report with the following required fields: Title, Repro Steps,
+Expected Behavior, Actual Behavior, Severity (CRITICAL/HIGH/MEDIUM/LOW), Affected
+System(s), and Build/Version. If the user's initial description is missing any
+required field, the skill asks follow-up questions to fill the gaps before
+producing the draft.
+
+The skill checks for possibly duplicate reports (by comparing to existing files
+in `production/bugs/`) and offers to link rather than create a new report. Each
+report is written to `production/bugs/bug-[date]-[slug].md` after a "May I write"
+ask. No director gates are used — bug reporting is an operational utility.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing the report
+- [ ] Has a next-step handoff (e.g., `/bug-triage` to reprioritize, `/hotfix` for critical)
+
+---
+
+## Director Gate Checks
+
+None. `/bug-report` is an operational documentation skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — User describes a crash, full report produced
+
+**Fixture:**
+- `production/bugs/` directory exists and is empty
+- No similar existing reports
+
+**Input:** `/bug-report` (user describes: "Game crashes when player enters the boss arena")
+
+**Expected behavior:**
+1. Skill extracts: Title = "Game crashes when entering boss arena"
+2. Skill recognizes crash reports as CRITICAL severity
+3. Skill confirms repro steps, expected (no crash), actual (crash), affected system
+   (arena/boss), and build version with the user
+4. Skill drafts the full structured report
+5. Skill asks "May I write to `production/bugs/bug-2026-04-06-game-crashes-boss-arena.md`?"
+6. File is written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 7 required fields are present in the report
+- [ ] Severity is CRITICAL for a crash report
+- [ ] Filename follows the `bug-[date]-[slug].md` convention
+- [ ] "May I write" is asked with the full file path
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Minimal Input — Skill asks follow-up questions for missing fields
+
+**Fixture:**
+- User provides: "Sometimes the audio cuts out"
+- No existing reports
+
+**Input:** `/bug-report`
+
+**Expected behavior:**
+1. Skill identifies missing required fields: repro steps, expected vs. actual,
+   severity, affected system, build
+2. Skill asks targeted follow-up questions for each missing field (one at a time
+   or in a structured prompt)
+3. User provides answers
+4. Skill compiles complete report from answers
+5. Skill asks "May I write?" and writes on approval
+
+**Assertions:**
+- [ ] At least 3 follow-up questions are asked to fill missing fields
+- [ ] Each required field is filled before the report is finalized
+- [ ] Report is not written until all required fields are present
+- [ ] Verdict is COMPLETE after all fields are filled and file is written
+
+---
+
+### Case 3: Possible Duplicate — Offers to link rather than create new
+
+**Fixture:**
+- `production/bugs/bug-2026-03-20-audio-cut-out.md` already exists with
+  similar title and MEDIUM severity
+
+**Input:** `/bug-report` (user describes: "Audio randomly stops working")
+
+**Expected behavior:**
+1. Skill scans existing reports and finds the similar audio bug
+2. Skill reports: "A similar bug report exists: bug-2026-03-20-audio-cut-out.md"
+3. Skill presents options: link as duplicate (add note to existing), create new anyway
+4. If user chooses link: skill adds a cross-reference note to the existing file
+   (asks "May I update the existing report?")
+5. If user chooses create new: normal report creation proceeds
+
+**Assertions:**
+- [ ] Existing similar report is surfaced before creating a new one
+- [ ] User is given the choice (not forced to link or create)
+- [ ] If linking: "May I update" is asked before modifying the existing file
+- [ ] Verdict is COMPLETE in either path
+
+---
+
+### Case 4: Multi-System Bug — Report created with multiple system tags
+
+**Fixture:**
+- No existing reports
+
+**Input:** `/bug-report` (user describes: "After finishing a level, the save system
+  freezes and the UI doesn't show the completion screen")
+
+**Expected behavior:**
+1. Skill identifies 2 affected systems from the description: Save System and UI
+2. Report is drafted with both systems listed under Affected System(s)
+3. Severity is assessed (likely HIGH — data loss risk from save freeze)
+4. Skill asks "May I write" with the appropriate filename
+5. Report is written with both systems tagged; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Both affected systems are listed in the report
+- [ ] Single report is created (not one per system)
+- [ ] Severity reflects the most impactful component (save freeze → HIGH or CRITICAL)
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; bug reporting is operational
+
+**Fixture:**
+- Any bug description provided
+
+**Input:** `/bug-report`
+
+**Expected behavior:**
+1. Skill creates and writes the bug report
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Skill reaches COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Collects all 7 required fields before drafting the report
+- [ ] Asks follow-up questions for any missing required fields
+- [ ] Checks for similar existing reports before creating a new one
+- [ ] Asks "May I write to `production/bugs/bug-[date]-[slug].md`?" before writing
+- [ ] Verdict is COMPLETE when the report file is written
+
+---
+
+## Coverage Notes
+
+- The case where the user provides a severity that seems too low for the
+  described impact (e.g., LOW for a crash) is not tested; the skill may suggest
+  a higher severity but ultimately respects user input.
+- Build/version field is required but may be "unknown" if the user doesn't know —
+  this is accepted as a valid value and not tested separately.
+- Report slug generation (sanitizing the title into a filename) is an
+  implementation detail not assertion-tested here.
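Two checks from the `/bug-report` spec above lend themselves to a short sketch: finding which of the 7 required fields are still missing, and building the `bug-[date]-[slug].md` path. The field list comes from the spec; the `slugify` rule (lowercase, runs of non-alphanumerics collapsed to hyphens) is an assumption, since the spec treats slug generation as an implementation detail.

```python
import re
from datetime import date

REQUIRED_FIELDS = [
    "Title",
    "Repro Steps",
    "Expected Behavior",
    "Actual Behavior",
    "Severity",
    "Affected System(s)",
    "Build/Version",
]


def missing_fields(report):
    """Return the required fields that are absent or blank, in spec order."""
    return [name for name in REQUIRED_FIELDS if not str(report.get(name, "")).strip()]


def slugify(title):
    """Hypothetical slug rule: lowercase, non-alphanumeric runs become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def report_path(title, day):
    """Build production/bugs/bug-[date]-[slug].md for a given title and date."""
    return f"production/bugs/bug-{day.isoformat()}-{slugify(title)}.md"


# Case 2: the user has supplied only a one-line description.
todo = missing_fields({"Title": "Sometimes the audio cuts out"})
path = report_path("Audio cut out", date(2026, 3, 20))
```

A value of "unknown" for Build/Version passes `missing_fields`, consistent with the coverage note that accepts it as valid.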
--- a/Framework/skills/utility/bug-triage.md
+++ b/Framework/skills/utility/bug-triage.md
@@ -0,0 +1,174 @@
+# Skill Test Spec: /bug-triage
+
+## Skill Summary
+
+`/bug-triage` reads all open bug reports in `production/bugs/` and produces a
+prioritized triage table sorted by severity (CRITICAL → HIGH → MEDIUM → LOW).
+It runs on the Haiku model (read-only, formatting/sorting task) and produces no
+file writes — the triage output is conversational. The skill flags bugs missing
+reproduction steps and identifies possible duplicates by comparing titles and
+affected systems.
+
+The verdict is always TRIAGED — the skill is advisory and informational. No
+director gates apply. The output is intended to help a producer or QA lead
+prioritize which bugs to address next.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: TRIAGED
+- [ ] Does NOT contain "May I write" language (skill is read-only)
+- [ ] Has a next-step handoff (e.g., `/bug-report` to create new reports, `/hotfix` for critical bugs)
+
+---
+
+## Director Gate Checks
+
+None. `/bug-triage` is a read-only advisory skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — 5 bugs of varying severity, sorted table produced
+
+**Fixture:**
+- `production/bugs/` contains 5 bug report files:
+  - bug-2026-03-10-audio-crash.md (CRITICAL)
+  - bug-2026-03-12-score-overflow.md (HIGH)
+  - bug-2026-03-14-ui-overlap.md (MEDIUM)
+  - bug-2026-03-15-typo-tutorial.md (LOW)
+  - bug-2026-03-16-vfx-flicker.md (HIGH)
+
+**Input:** `/bug-triage`
+
+**Expected behavior:**
+1. Skill reads all 5 bug report files
+2. Skill extracts severity, title, system, and repro status from each
+3. Skill produces a triage table sorted: CRITICAL first, then HIGH, MEDIUM, LOW
+4. Within the same severity, bugs are ordered by date (oldest first)
+5. Verdict is TRIAGED
+
+**Assertions:**
+- [ ] Triage table has exactly 5 rows
+- [ ] CRITICAL bug appears before both HIGH bugs
+- [ ] HIGH bugs appear before MEDIUM and LOW bugs
+- [ ] Verdict is TRIAGED
+- [ ] No files are written
+
+---
+
+### Case 2: No Bug Reports Found — Guidance to run /bug-report
+
+**Fixture:**
+- `production/bugs/` directory exists but is empty (or does not exist)
+
+**Input:** `/bug-triage`
+
+**Expected behavior:**
+1. Skill scans `production/bugs/` and finds no reports
+2. Skill outputs: "No open bug reports found in production/bugs/"
+3. Skill suggests running `/bug-report` to create a bug report
+4. No triage table is produced
+
+**Assertions:**
+- [ ] Output explicitly states no bugs were found
+- [ ] `/bug-report` is suggested as the next step
+- [ ] Skill does not error out — it handles empty directory gracefully
+- [ ] Verdict is TRIAGED (with "no bugs found" context)
+
+---
+
+### Case 3: Bug Missing Reproduction Steps — Flagged as NEEDS REPRO INFO
+
+**Fixture:**
+- `production/bugs/` contains 3 bug reports; one has an empty "Repro Steps" section
+
+**Input:** `/bug-triage`
+
+**Expected behavior:**
+1. Skill reads all 3 reports
+2. Skill detects the report with no repro steps
+3. That bug appears in the triage table with a `NEEDS REPRO INFO` tag
+4. Other bugs are triaged normally
+5. Verdict is TRIAGED
+
+**Assertions:**
+- [ ] `NEEDS REPRO INFO` tag appears next to the bug missing repro steps
+- [ ] The flagged bug is still included in the table (not excluded)
+- [ ] Other bugs are unaffected
+- [ ] Verdict is TRIAGED
+
+---
+
+### Case 4: Possible Duplicate Bugs — Flagged in triage output
+
+**Fixture:**
+- `production/bugs/` contains 2 bug reports with similar titles:
+  - bug-2026-03-18-player-fall-through-floor.md
+  - bug-2026-03-20-player-clips-through-floor.md
+- Both affect the "Physics" system with identical severity
+
+**Input:** `/bug-triage`
+
+**Expected behavior:**
+1. Skill reads both reports and detects similar title + same system + same severity
+2. Both bugs are included in the triage table
+3. Each is tagged with `POSSIBLE DUPLICATE` and cross-references the other report
+4. No bugs are merged or deleted — flagging is advisory
+5. Verdict is TRIAGED
+
+**Assertions:**
+- [ ] Both bugs appear in the table (not merged)
+- [ ] Both are tagged `POSSIBLE DUPLICATE`
+- [ ] Each cross-references the other (by filename or title)
+- [ ] Verdict is TRIAGED
+
+---
+
+### Case 5: Director Gate Check — No gate; triage is advisory
+
+**Fixture:**
+- `production/bugs/` contains any number of reports
+
+**Input:** `/bug-triage`
+
+**Expected behavior:**
+1. Skill produces the triage table
+2. No director agents are spawned
+3. No gate IDs appear in output
+4. No write tool is called
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No write tool is called
+- [ ] No gate skip messages appear
+- [ ] Verdict is TRIAGED without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads all files in `production/bugs/` before generating the table
+- [ ] Sorts by severity (CRITICAL → HIGH → MEDIUM → LOW)
+- [ ] Flags bugs missing repro steps
+- [ ] Flags possible duplicates by title/system similarity
+- [ ] Does not write any files
+- [ ] Verdict is TRIAGED in all cases (even empty)
+
+---
+
+## Coverage Notes
+
+- The case where a bug report is malformed (missing severity field entirely)
+  is not fixture-tested; skill would flag it as `UNKNOWN SEVERITY` and sort it
+  last in the table.
+- Status transitions (marking bugs as resolved) are outside this skill's scope —
+  bug-triage is read-only.
+- The duplicate detection heuristic (title similarity + same system) is
+  approximate; exact matching logic is defined in the skill body.
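The sort and flagging rules in the `/bug-triage` spec above can be illustrated as follows. This is a sketch under stated assumptions: the record shape and the `triage` helper are hypothetical, and date ordering relies on file names starting with `bug-YYYY-MM-DD`, so a plain string comparison gives oldest first.

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def triage(bugs):
    """Return (file, tags) rows sorted by severity, then by date via the file name."""

    def sort_key(bug):
        # Unrecognised or missing severities rank after LOW, so they sort last.
        rank = SEVERITY_RANK.get(bug.get("severity"), len(SEVERITY_RANK))
        return (rank, bug["file"])

    rows = []
    for bug in sorted(bugs, key=sort_key):
        tags = []
        if bug.get("severity") not in SEVERITY_RANK:
            tags.append("UNKNOWN SEVERITY")
        if not bug.get("repro", "").strip():
            tags.append("NEEDS REPRO INFO")
        rows.append((bug["file"], tags))
    return rows


# Case 1 file names; the empty repro on the last record borrows from Case 3.
bugs = [
    {"file": "bug-2026-03-16-vfx-flicker.md", "severity": "HIGH", "repro": "steps"},
    {"file": "bug-2026-03-14-ui-overlap.md", "severity": "MEDIUM", "repro": "steps"},
    {"file": "bug-2026-03-10-audio-crash.md", "severity": "CRITICAL", "repro": "steps"},
    {"file": "bug-2026-03-12-score-overflow.md", "severity": "HIGH", "repro": "steps"},
    {"file": "bug-2026-03-15-typo-tutorial.md", "severity": "LOW", "repro": ""},
]

rows = triage(bugs)
```

Flagged bugs stay in the table with their tags rather than being dropped, and nothing is written to disk, in line with the skill's read-only contract.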
--- a/Framework/skills/utility/day-one-patch.md
+++ b/Framework/skills/utility/day-one-patch.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /day-one-patch
+
+## Skill Summary
+
+`/day-one-patch` prepares a day-one patch plan for issues that are known at
+launch but deferred from the v1.0 release. It reads open bug reports in
+`production/bugs/` and deferred acceptance criteria from story files (stories
+marked `Status: Done` but with noted deferred ACs), then produces a prioritized
+patch plan with an estimated fix timeline per issue.
+
+The patch plan is written to `production/releases/day-one-patch.md` after a
+"May I write" ask. If a P0 (critical post-ship) issue is discovered, the skill
+directs the user to run `/hotfix` before the patch. No director gates apply.
+The verdict is always COMPLETE.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing the plan
+- [ ] Has a next-step handoff (e.g., `/hotfix` for P0 issues, `/release-checklist` for follow-up)
+
+---
+
+## Director Gate Checks
+
+None. `/day-one-patch` is a release planning utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — 3 Known Issues, Patch Plan With Fix Estimates
+
+**Fixture:**
+- `production/bugs/` contains 3 open bugs with severities: 1 MEDIUM, 2 LOW
+- No deferred ACs in sprint stories
+- All bugs have repro steps and system identifications
+
+**Input:** `/day-one-patch`
+
+**Expected behavior:**
+1. Skill reads all 3 open bugs
+2. Skill assigns fix effort estimates: MEDIUM bug = 1-2 days, LOW bugs = 4 hours each
+3. Skill produces a patch plan prioritizing MEDIUM bug first
+4. Plan includes: priority order, estimated timeline, responsible system, fix description
+5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
+6. File written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 3 bugs appear in the plan
+- [ ] Bugs are prioritized by severity (MEDIUM before LOW)
+- [ ] Fix estimates are provided per issue
+- [ ] "May I write" is asked before writing
+- [ ] Verdict is COMPLETE
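
The severity-first ordering asserted in Case 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the skill's implementation: `Bug`, `plan_patch`, `SEVERITY_RANK`, and `ESTIMATES` are hypothetical names, the `HIGH` level is assumed, and the estimate strings only echo the rough values from step 2.

```python
from dataclasses import dataclass

# Hypothetical severity ranking; a lower rank is scheduled first.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

# Rough per-severity estimates echoing Case 1 step 2 (assumed, not measured).
ESTIMATES = {"MEDIUM": "1-2 days", "LOW": "4 hours"}


@dataclass
class Bug:
    title: str
    severity: str
    system: str


def plan_patch(bugs):
    """Return plan rows ordered by severity, each with a rough estimate."""
    # sorted() is stable, so bugs of equal severity keep their input order.
    ordered = sorted(bugs, key=lambda b: SEVERITY_RANK[b.severity])
    return [
        {
            "priority": i + 1,
            "title": b.title,
            "system": b.system,
            "estimate": ESTIMATES.get(b.severity, "unestimated"),
        }
        for i, b in enumerate(ordered)
    ]


bugs = [
    Bug("Tooltip typo", "LOW", "ui"),
    Bug("Audio pops on pause", "MEDIUM", "audio"),
    Bug("Minimap icon offset", "LOW", "ui"),
]
plan = plan_patch(bugs)
```

Here the MEDIUM bug lands at priority 1 and both LOW bugs follow in their original order, which is what the "MEDIUM before LOW" assertion checks.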
+
+---
+
+### Case 2: Critical Issue Discovered Post-Ship — P0, Triggers /hotfix Guidance
+
+**Fixture:**
+- A CRITICAL severity bug is found in `production/bugs/` after the v1.0 release
+- The bug causes data loss for all save files
+
+**Input:** `/day-one-patch`
+
+**Expected behavior:**
+1. Skill reads bugs and identifies the CRITICAL severity issue
+2. Skill escalates: "P0 ISSUE DETECTED — data loss bug requires immediate hotfix
+   before patch planning can proceed"
+3. Skill does NOT include the P0 issue in the patch plan timeline
+4. Skill explicitly directs: "Run `/hotfix` to resolve this issue first"
+5. After P0 guidance is issued: plan for remaining lower-severity bugs is still
+   generated and written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] P0 escalation message appears prominently before the patch plan
+- [ ] `/hotfix` is explicitly directed for the P0 issue
+- [ ] P0 issue is NOT scheduled in the patch plan timeline (it needs immediate action)
+- [ ] Non-P0 issues are still planned; verdict is COMPLETE
+
+---
+
+### Case 3: Deferred AC From Story-Done — Pulled Into Patch Plan Automatically
+
+**Fixture:**
+- `production/sprints/sprint-008.md` has a story with `Status: Done` and a note:
+  "DEFERRED AC: Gamepad vibration on damage — deferred to post-launch patch"
+- No open bugs for the same system
+
+**Input:** `/day-one-patch`
+
+**Expected behavior:**
+1. Skill reads sprint stories and detects the deferred AC note
+2. Deferred AC is automatically included in the patch plan as a work item
+3. Plan entry: "Deferred from sprint-008: Gamepad vibration on damage"
+4. Fix estimate is assigned; patch plan written after "May I write" approval
+5. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] Deferred ACs from story files are automatically pulled into the plan
+- [ ] Deferred items are labeled by their source story (sprint-008)
+- [ ] Deferred AC gets a fix estimate like bug entries
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: No Known Issues — Empty Plan With Template Note
+
+**Fixture:**
+- `production/bugs/` is empty
+- No stories have deferred ACs
+
+**Input:** `/day-one-patch`
+
+**Expected behavior:**
+1. Skill reads bugs — none found
+2. Skill reads story deferred ACs — none found
+3. Skill produces an empty patch plan with a note: "No known issues at launch"
+4. Template structure is preserved (headers intact) for future use
+5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
+6. File written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] "No known issues at launch" note appears in the written file
+- [ ] Template headers are present in the empty plan
+- [ ] Skill does NOT error out when there are no issues to plan
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; day-one-patch is a planning utility
+
+**Fixture:**
+- Known issues present in `production/bugs/`
+
+**Input:** `/day-one-patch`
+
+**Expected behavior:**
+1. Skill generates and writes the patch plan
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads open bugs from `production/bugs/` before generating the plan
+- [ ] Scans story files for deferred AC notes
+- [ ] Escalates CRITICAL (P0) bugs with explicit `/hotfix` guidance
+- [ ] Produces an empty plan with note when no issues exist (not an error)
+- [ ] Asks "May I write to `production/releases/day-one-patch.md`?" before writing
+- [ ] Verdict is COMPLETE in all paths
+
+---
+
+## Coverage Notes
+
+- The case where multiple CRITICAL bugs exist is handled the same as Case 2;
+  all P0 issues are escalated together.
+- Timeline estimation for the patch (e.g., "patch available in 3 days")
+  requires manual QA and build time estimates; the skill uses rough estimates
+  based on severity, not actual team velocity.
+- Player-facing patch notes are produced by a separate skill (`/patch-notes`),
+  invoked after the patch plan is executed.
--- a/Framework/skills/utility/help.md
+++ b/Framework/skills/utility/help.md
@@ -0,0 +1,172 @@
+# Skill Test Spec: /help
+
+## Skill Summary
+
+`/help` analyzes what has been done and what comes next in the project workflow.
+It runs on the Haiku model (read-only, formatting task) and reads `production/stage.txt`,
+the active sprint file, and recent session state to produce a concise situational
+guidance summary. The skill optionally accepts a context query (e.g., `/help testing`)
+to surface relevant skills for a specific topic.
+
+The output is always informational — no files are written and no director gates
+are invoked. The verdict is always HELP COMPLETE. The skill serves as a workflow
+navigator, suggesting 2-3 next skills based on the current project state.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: HELP COMPLETE
+- [ ] Does NOT contain "May I write" language (skill is read-only)
+- [ ] Has a next-step handoff (suggests 2-3 relevant skills based on state)
+
+---
+
+## Director Gate Checks
+
+None. `/help` is a read-only navigation skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Production stage with active sprint
+
+**Fixture:**
+- `production/stage.txt` contains `Production`
+- `production/sprints/sprint-004.md` exists with in-progress stories
+- `production/session-state/active.md` has a recent checkpoint
+
+**Input:** `/help`
+
+**Expected behavior:**
+1. Skill reads `stage.txt`, the active sprint, and recent session state
+2. Skill identifies current sprint number and in-progress story count
+3. Skill outputs: current stage, sprint summary, and 3 suggested next skills
+   (e.g., `/sprint-status`, `/dev-story`, `/story-done`)
+4. Suggestions are ranked by relevance to current sprint state
+5. Verdict is HELP COMPLETE
+
+**Assertions:**
+- [ ] Current stage is shown (Production)
+- [ ] Active sprint number and story count are mentioned
+- [ ] 2-3 next-skill suggestions are given (not a list of all skills)
+- [ ] Suggestions are appropriate for Production stage
+- [ ] Verdict is HELP COMPLETE
+- [ ] No files are written
+
+---
+
+### Case 2: Concept Stage — Shows concept-to-systems-design workflow path
+
+**Fixture:**
+- `production/stage.txt` contains `Concept`
+- No sprint files, no GDD files
+- `technical-preferences.md` is configured (engine selected)
+
+**Input:** `/help`
+
+**Expected behavior:**
+1. Skill reads stage.txt — detects Concept stage
+2. Skill outputs the Concept-stage workflow: brainstorm → map-systems → design-system
+3. Suggested skills are: `/brainstorm` first, then `/map-systems` once a concept exists
+4. Current progress is noted: "Engine configured, concept not yet created"
+
+**Assertions:**
+- [ ] Stage is identified as Concept
+- [ ] Workflow path shows the expected sequence for this stage
+- [ ] Suggestions do not include Production-stage skills (e.g., `/dev-story`)
+- [ ] Verdict is HELP COMPLETE
+
+---
+
+### Case 3: No stage.txt — Shows full workflow overview
+
+**Fixture:**
+- No `production/stage.txt`
+- No sprint files
+- `technical-preferences.md` has placeholders
+
+**Input:** `/help`
+
+**Expected behavior:**
+1. Skill cannot determine stage from stage.txt
+2. Skill runs project-stage-detect logic to infer stage from artifacts
+3. If stage cannot be inferred: outputs the full workflow overview from
+   Concept through Release as a reference map
+4. Primary suggestion is `/start` to begin configuration
+
+**Assertions:**
+- [ ] Skill does not crash when stage.txt is absent
+- [ ] Full workflow overview is shown when stage cannot be determined
+- [ ] `/start` or `/project-stage-detect` is a top suggestion
+- [ ] Verdict is HELP COMPLETE
+
+---
+
+### Case 4: Context Query — User asks for help with testing
+
+**Fixture:**
+- `production/stage.txt` contains `Production`
+- Active sprint has a story with `Status: In Review`
+
+**Input:** `/help testing`
+
+**Expected behavior:**
+1. Skill reads context query: "testing"
+2. Skill surfaces skills relevant to testing: `/qa-plan`, `/smoke-check`,
+   `/regression-suite`, `/test-setup`, `/test-evidence-review`
+3. Output is focused on testing workflow, not general sprint navigation
+4. Currently in-review story is highlighted as a testing candidate
+
+**Assertions:**
+- [ ] Context query is acknowledged in output ("Help topic: testing")
+- [ ] At least 3 testing-relevant skills are listed
+- [ ] General sprint skills (e.g., `/sprint-plan`) are not the primary suggestions
+- [ ] Verdict is HELP COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; help is read-only navigation
+
+**Fixture:**
+- Any project state
+
+**Input:** `/help`
+
+**Expected behavior:**
+1. Skill produces workflow guidance summary
+2. No director agents are spawned
+3. No gate IDs appear in output
+4. No write tool is called
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No write tool is called
+- [ ] No gate skip messages appear
+- [ ] Verdict is HELP COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads stage, sprint, and session state before generating suggestions
+- [ ] Suggestions are specific to the current project state (not generic)
+- [ ] Context query (if provided) narrows the suggestion set
+- [ ] Does not write any files
+- [ ] Verdict is HELP COMPLETE in all cases
+
+---
+
+## Coverage Notes
+
+- The case where the active sprint is complete (all stories Done) is not
+  separately tested; the skill would suggest `/sprint-plan` for the next sprint.
+- The `/help` skill does not validate whether suggested skills are available —
+  it assumes standard skill catalog availability.
+- Stage detection fallback (when stage.txt is absent) delegates to the same
+  logic as `/project-stage-detect` and is not re-tested here in detail.
--- a/Framework/skills/utility/hotfix.md
+++ b/Framework/skills/utility/hotfix.md
@@ -0,0 +1,173 @@
+# Skill Test Spec: /hotfix
+
+## Skill Summary
+
+`/hotfix` manages an emergency fix workflow: it creates a hotfix branch from
+main, applies a targeted fix to the identified file(s), runs `/smoke-check` to
+validate the fix doesn't introduce regressions, and prompts the user to confirm
+merge back to main. Each code change requires a "May I write to [filepath]?" ask.
+Git operations (branch creation, merge) are presented as Bash commands for user
+confirmation before execution.
+
+The skill is time-sensitive — director review is optional post-hoc, not a
+blocking gate. Verdicts: HOTFIX COMPLETE (fix applied, smoke check passed, merged)
+or HOTFIX BLOCKED (fix introduced regression or user declined).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: HOTFIX COMPLETE, HOTFIX BLOCKED
+- [ ] Contains "May I write" language for code changes
+- [ ] Has a next-step handoff (e.g., `/bug-report` to document the issue, or version bump)
+
+---
+
+## Director Gate Checks
+
+None. Hotfixes are time-critical. Director review may follow separately as a
+post-hoc step. No gate is invoked within this skill.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Critical crash bug fixed, smoke check passes
+
+**Fixture:**
+- `main` branch is clean
+- Bug is identified in `src/gameplay/arena.gd` (crash on boss arena entry)
+- Repro steps are provided by user
+
+**Input:** `/hotfix` (user describes the crash and affected file)
+
+**Expected behavior:**
+1. Skill proposes creating a hotfix branch: `hotfix/boss-arena-crash`
+2. User confirms; Bash command for branch creation is shown and confirmed
+3. Skill identifies the fix location in `arena.gd` and drafts the change
+4. Skill asks "May I write to `src/gameplay/arena.gd`?" and applies fix on approval
+5. Skill runs `/smoke-check` — PASS
+6. Skill presents the merge command and asks user to confirm merge to `main`
+7. User confirms; merge executes; verdict is HOTFIX COMPLETE
+
+**Assertions:**
+- [ ] Hotfix branch is created before any code changes
+- [ ] "May I write" is asked before modifying any source file
+- [ ] `/smoke-check` runs after the fix is applied
+- [ ] Merge requires explicit user confirmation (not automatic)
+- [ ] Verdict is HOTFIX COMPLETE after successful merge
+
+---
+
+### Case 2: Smoke Check Fails — HOTFIX BLOCKED
+
+**Fixture:**
+- Fix has been applied to `src/gameplay/arena.gd`
+- `/smoke-check` returns FAIL: "Player health clamping regression detected"
+
+**Input:** `/hotfix`
+
+**Expected behavior:**
+1. Skill applies the fix and runs `/smoke-check`
+2. Smoke check returns FAIL with specific regression identified
+3. Skill reports: "HOTFIX BLOCKED — smoke check failed: [regression detail]"
+4. Skill presents options: attempt revised fix, revert changes, or merge with
+   known regression (user acknowledges risk)
+5. No automatic merge occurs when smoke check fails
+
+**Assertions:**
+- [ ] Verdict is HOTFIX BLOCKED
+- [ ] Smoke check failure is shown verbatim to user
+- [ ] Merge is NOT performed automatically when smoke check fails
+- [ ] User is given explicit options for how to proceed
+
+---
+
+### Case 3: Fix to Already-Released Build — Version tag noted, patch bump prompted
+
+**Fixture:**
+- Latest git tag is `v1.2.0`
+- Hotfix targets a bug in the v1.2.0 release
+
+**Input:** `/hotfix`
+
+**Expected behavior:**
+1. Skill detects that the current HEAD is a tagged release (v1.2.0)
+2. Skill notes: "Hotfix targeting tagged release v1.2.0"
+3. After smoke check passes, skill prompts: "Should version be bumped to v1.2.1?"
+4. If user confirms version bump: skill asks "May I write to VERSION or equivalent?"
+5. After version update and merge: verdict is HOTFIX COMPLETE with version noted
+
+**Assertions:**
+- [ ] Version tag context is detected and surfaced to user
+- [ ] Patch version bump is suggested (not required) after the smoke check passes
+- [ ] Version bump requires its own "May I write" confirmation
+- [ ] Verdict is HOTFIX COMPLETE
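
The suggested bump from `v1.2.0` to `v1.2.1` in Case 3 amounts to incrementing the last numeric component of the tag. A minimal sketch, assuming tags always have the shape `vMAJOR.MINOR.PATCH`; `bump_patch` is a hypothetical helper, not part of the skill:

```python
import re


def bump_patch(tag):
    """Return the next patch version for a tag shaped like 'v1.2.0'."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        # Assumed behavior: refuse tags that do not fit the expected shape.
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"v{major}.{minor}.{patch + 1}"


suggested = bump_patch("v1.2.0")
```

Parsing the components as integers rather than strings keeps a tag such as `v1.2.9` from rolling over incorrectly.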
+
+---
+
+### Case 4: No Repro Steps — Skill Asks Before Applying Fix
+
+**Fixture:**
+- User invokes `/hotfix` with a vague description: "something is broken on level 3"
+- No repro steps provided
+
+**Input:** `/hotfix` (vague description)
+
+**Expected behavior:**
+1. Skill detects insufficient information to identify the fix location
+2. Skill asks: "Please provide reproduction steps and the affected file or system"
+3. Skill does NOT create a branch or modify any file until repro steps are provided
+4. After user provides repro steps: normal hotfix flow begins
+
+**Assertions:**
+- [ ] No branch is created without repro steps
+- [ ] No code changes are made without a clearly identified fix location
+- [ ] Repro step request is specific (not a generic "please provide more info")
+- [ ] Normal hotfix flow resumes after user provides repro steps
+
+---
+
+### Case 5: Director Gate Check — No gate; hotfixes are time-critical
+
+**Fixture:**
+- Critical bug with repro steps identified
+
+**Input:** `/hotfix`
+
+**Expected behavior:**
+1. Skill completes the hotfix workflow
+2. No director agents are spawned during execution
+3. No gate IDs appear in output
+4. Post-hoc director review (if needed) is a manual follow-up, not invoked here
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Creates hotfix branch before making any code changes
+- [ ] Asks "May I write" before modifying any source files
+- [ ] Runs `/smoke-check` after applying the fix
+- [ ] Requires explicit user confirmation before merging
+- [ ] HOTFIX BLOCKED when smoke check fails — no automatic merge
+- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED
+
+---
+
+## Coverage Notes
+
+- The case where multiple files need to be modified for one fix follows the same
+  "May I write" per-file pattern and is not separately tested.
+- The post-hotfix steps (create bug report, update changelog) are suggested in
+  the handoff but not tested as part of this skill's execution.
+- Conflict resolution during the merge (if main has diverged) is not tested;
+  the skill would surface the conflict and ask the user to resolve it manually.
--- a/Framework/skills/utility/launch-checklist.md
+++ b/Framework/skills/utility/launch-checklist.md
@@ -0,0 +1,180 @@
+# Skill Test Spec: /launch-checklist
+
+## Skill Summary
+
+`/launch-checklist` generates and evaluates a complete launch readiness checklist
+covering: legal compliance (EULA, privacy policy, ESRB/PEGI ratings), platform
+certification status, store page completeness (screenshots, description, metadata),
+build validation (version tag, reproducible build), analytics and crash reporting
+configuration, and first-run experience verification.
+
+The skill produces a checklist report written to `production/launch/launch-checklist-[date].md`
+after a "May I write" ask. If a previous launch checklist exists, it compares the
+new results against the old to highlight newly resolved and newly blocked items. No
+director gates apply — `/team-release` orchestrates the full release pipeline. Verdicts:
+LAUNCH READY, LAUNCH BLOCKED, or CONCERNS.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: LAUNCH READY, LAUNCH BLOCKED, CONCERNS
+- [ ] Contains "May I write" collaborative protocol language before writing the checklist
+- [ ] Has a next-step handoff (e.g., `/team-release` or `/day-one-patch`)
+
+---
+
+## Director Gate Checks
+
+None. `/launch-checklist` is a readiness audit utility. The full release pipeline
+is managed by `/team-release`.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All Checklist Items Verified, LAUNCH READY
+
+**Fixture:**
+- Legal docs present: EULA, privacy policy in `production/legal/`
+- Platform certification: marked as submitted and approved in production notes
+- Store page assets: screenshots, description, metadata all present in `production/store/`
+- Build: version tag `v1.0.0` exists, reproducible build confirmed
+- Crash reporting: configured in `technical-preferences.md`
+
+**Input:** `/launch-checklist`
+
+**Expected behavior:**
+1. Skill checks all checklist categories
+2. All items pass their verification checks
+3. Skill produces checklist report with all items marked PASS
+4. Skill asks "May I write to `production/launch/launch-checklist-2026-04-06.md`?"
+5. Report written on approval; verdict is LAUNCH READY
+
+**Assertions:**
+- [ ] All checklist categories are checked (legal, platform, store, build, analytics, UX)
+- [ ] All items appear in the report with PASS markers
+- [ ] Verdict is LAUNCH READY
+- [ ] "May I write" is asked with the correct dated filename
+
+---
+
+### Case 2: Platform Certification Not Submitted — LAUNCH BLOCKED
+
+**Fixture:**
+- All other checklist items pass
+- Platform certification section: "not submitted" (no submission record found)
+
+**Input:** `/launch-checklist`
+
+**Expected behavior:**
+1. Skill checks all items
+2. Platform certification check fails: no submission record
+3. Skill reports: "LAUNCH BLOCKED — Platform certification not submitted"
+4. Specific platform(s) missing certification are named
+5. Verdict is LAUNCH BLOCKED
+
+**Assertions:**
+- [ ] Verdict is LAUNCH BLOCKED (not CONCERNS)
+- [ ] Platform certification is identified as the blocking item
+- [ ] Missing platform names are specified
+- [ ] All other passing items are still shown in the report
+
+---
+
+### Case 3: Manual Check Required — CONCERNS Verdict
+
+**Fixture:**
+- All critical checklist items pass
+- First-run experience item: "MANUAL CHECK NEEDED — human must play the first 5
+  minutes and verify tutorial completion flow"
+- Store screenshots item: "MANUAL CHECK NEEDED — art team must verify screenshot
+  quality matches current build"
+
+**Input:** `/launch-checklist`
+
+**Expected behavior:**
+1. Skill checks all items
+2. 2 items are flagged as requiring human verification
+3. Skill reports: "CONCERNS — 2 items require manual verification before launch"
+4. Both items are listed with instructions for what to manually verify
+5. Verdict is CONCERNS (not LAUNCH BLOCKED, since these are advisory)
+
+**Assertions:**
+- [ ] Verdict is CONCERNS (not LAUNCH READY or LAUNCH BLOCKED)
+- [ ] Both manual check items are listed with verification instructions
+- [ ] Skill does not auto-block on MANUAL CHECK items
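
Cases 1 to 3 together imply a simple precedence: any hard failure blocks launch, otherwise any item awaiting manual verification yields CONCERNS, otherwise the launch is ready. The sketch below illustrates that precedence only. The status labels `FAIL` and `MANUAL` and the function `launch_verdict` are assumptions made for the example; the spec itself only fixes the three verdict strings.

```python
def launch_verdict(statuses):
    """Map {item: status} to a verdict.

    Any FAIL blocks launch; otherwise any MANUAL item raises CONCERNS.
    """
    values = set(statuses.values())
    if "FAIL" in values:
        return "LAUNCH BLOCKED"
    if "MANUAL" in values:
        return "CONCERNS"
    return "LAUNCH READY"


# Fixture shaped like Case 3: critical items pass, two need a human check.
case_3 = {
    "legal": "PASS",
    "platform_cert": "PASS",
    "first_run": "MANUAL",
    "store_screenshots": "MANUAL",
}
verdict = launch_verdict(case_3)
```

Checking `FAIL` before `MANUAL` is what keeps a checklist with both kinds of item at LAUNCH BLOCKED rather than CONCERNS.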
+
+---
+
+### Case 4: Previous Checklist Exists — Delta Comparison
+
+**Fixture:**
+- `production/launch/launch-checklist-2026-03-25.md` exists with previous results:
+  - 2 items were BLOCKED (platform cert, crash reporting)
+  - 1 item had a MANUAL CHECK
+- New checklist: platform cert is now PASS, crash reporting is now PASS,
+  manual check still open; 1 new item flagged (EULA last updated date)
+
+**Input:** `/launch-checklist`
+
+**Expected behavior:**
+1. Skill finds the previous checklist and loads it for comparison
+2. Skill produces the new checklist and compares:
+   - Newly resolved: "Platform cert — was BLOCKED, now PASS"
+   - Newly resolved: "Crash reporting — was BLOCKED, now PASS"
+   - Still open: manual check (unchanged)
+   - New issue: EULA last updated date (not in previous checklist)
+3. Delta is shown prominently in the report
+4. Verdict is CONCERNS (manual check + new EULA question)
+
+**Assertions:**
+- [ ] Delta section shows newly resolved items
+- [ ] Delta section shows new issues (not present in previous checklist)
+- [ ] Still-open items from the previous checklist are noted as persistent
+- [ ] Verdict reflects the current state (not the previous state)
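
The delta in Case 4 can be read as a comparison of two item-to-status mappings. The following is a rough sketch under assumptions: the dictionaries stand in for parsed checklist files, the status labels other than `PASS` are illustrative, and `checklist_delta` is a hypothetical name.

```python
def checklist_delta(previous, current):
    """Classify current items against a previous {item: status} mapping."""
    delta = {"resolved": [], "persistent": [], "new": []}
    for item, status in current.items():
        old = previous.get(item)
        was_open = old is not None and old != "PASS"
        is_open = status != "PASS"
        if was_open and not is_open:
            delta["resolved"].append(item)
        elif was_open and is_open:
            delta["persistent"].append(item)
        elif is_open:
            # Absent from the previous checklist, or it passed last time.
            delta["new"].append(item)
    return delta


previous = {
    "platform_cert": "BLOCKED",
    "crash_reporting": "BLOCKED",
    "first_run": "MANUAL",
}
current = {
    "platform_cert": "PASS",
    "crash_reporting": "PASS",
    "first_run": "MANUAL",
    "eula_date": "MANUAL",
}
delta = checklist_delta(previous, current)
```

With the Case 4 fixture this yields two resolved items, one persistent item, and one new item. The verdict is still derived from `current` alone, matching the last assertion.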
+
+---
+
+### Case 5: Director Gate Check — No gate; launch-checklist is an audit utility
+
+**Fixture:**
+- All checklist dependencies present
+
+**Input:** `/launch-checklist`
+
+**Expected behavior:**
+1. Skill runs the full checklist and writes the report
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Checks all required categories (legal, platform, store, build, analytics, UX)
+- [ ] LAUNCH BLOCKED for hard failures (incomplete certifications, missing legal docs)
+- [ ] CONCERNS for advisory items requiring manual verification
+- [ ] Compares against previous checklist when one exists
+- [ ] Asks "May I write" before creating the checklist report
+- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS
+
+---
+
+## Coverage Notes
+
+- Region-specific compliance (GDPR data handling, COPPA for under-13 audiences)
+  is checked but the specific requirements are not enumerated in test assertions.
+- The store page completeness check (screenshots, description) relies on the
+  presence of files in `production/store/`; it cannot verify visual quality.
+- Build reproducibility check validates the presence of a version tag and build
+  configuration but does not execute the build process.
--- a/Framework/skills/utility/localize.md
+++ b/Framework/skills/utility/localize.md
@@ -0,0 +1,176 @@
+# Skill Test Spec: /localize
+
+## Skill Summary
+
+`/localize` manages the full localization pipeline: it extracts all player-facing
+strings from source files, manages translation files in `assets/localization/`,
+and validates completeness across all locale files. For new languages, it creates
+a locale file skeleton with all current strings as keys and empty values. For
+existing locale files, it produces a diff showing additions, removals, and
+changed keys.
+
+Translation files are written to `assets/localization/[locale-code].csv` (or
+engine-appropriate format) after a "May I write" ask. No director gates apply.
+Verdicts: LOCALIZATION COMPLETE (every locale has every key with a non-empty
+value) or GAPS FOUND (at least one locale has missing keys or empty values).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: LOCALIZATION COMPLETE, GAPS FOUND
+- [ ] Contains "May I write" collaborative protocol language before writing locale files
+- [ ] Has a next-step handoff (e.g., send locale skeletons to translators)
+
+---
+
+## Director Gate Checks
+
+None. `/localize` is a pipeline utility. No director gates apply. The localization
+lead agent may review separately but is not invoked within this skill.
+
+---
+
+## Test Cases
+
+### Case 1: New Language — String Extraction and Locale Skeleton Created
+
+**Fixture:**
+- Source code in `src/` contains player-facing strings (UI text, tutorial messages)
+- Existing locale: `assets/localization/en.csv`
+- No French locale exists
+
+**Input:** `/localize fr`
+
+**Expected behavior:**
+1. Skill extracts all player-facing strings from source files
+2. Skill matches the extracted strings against `en.csv`, which serves as the reference
+3. Skill generates `fr.csv` skeleton with all string keys and empty values
+4. Skill asks "May I write to `assets/localization/fr.csv`?"
+5. File written on approval; verdict is GAPS FOUND (file created but empty values)
+6. Skill notes: "fr.csv created — send to translator to fill values"
+
+**Assertions:**
+- [ ] All string keys from `en.csv` are present in `fr.csv`
+- [ ] All values in `fr.csv` are empty (not copied from English)
+- [ ] "May I write" is asked before creating the file
+- [ ] Verdict is GAPS FOUND (file is created but untranslated)
+
+---
+
+### Case 2: Existing Locale Diff — Additions, Removals, and Changes Listed
+
+**Fixture:**
+- `assets/localization/fr.csv` exists with 20 string keys translated
+- Source code has changed: 3 new strings added, 1 string removed, 2 strings
+  with changed English source text
+
+**Input:** `/localize fr`
+
+**Expected behavior:**
+1. Skill extracts current strings from source
+2. Skill diffs against existing `fr.csv`
+3. Skill produces diff report:
+   - 3 new keys (need translation — listed as empty in fr.csv)
+   - 1 removed key (marked as obsolete — suggest removal)
+   - 2 changed keys (English source changed — French may need update, flagged)
+4. Skill asks "May I write to `assets/localization/fr.csv`?"
+5. File updated with new empty keys added, obsolete keys marked; verdict is GAPS FOUND
+
+**Assertions:**
+- [ ] New keys appear as empty in the updated file (not auto-translated)
+- [ ] Removed keys are flagged as obsolete (not silently deleted)
+- [ ] Changed source strings are flagged for translator review
+- [ ] Verdict is GAPS FOUND (new empty keys exist)
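
The three diff categories in Case 2 can be sketched with plain dictionaries. One assumption is worth calling out: detecting a changed English source string requires a record of the English text that the existing translation was based on, and the spec does not say where that record lives. The `old_source` argument and the `diff_locale` function below are hypothetical.

```python
def diff_locale(source, locale, old_source):
    """Compare current source strings with an existing locale file.

    source:     {key: current English text}
    locale:     {key: translated text} from the existing locale file
    old_source: {key: English text when the translation was made} (assumed)
    """
    added = sorted(k for k in source if k not in locale)
    obsolete = sorted(k for k in locale if k not in source)
    changed = sorted(
        k for k in source
        if k in locale and k in old_source and source[k] != old_source[k]
    )
    return {"added": added, "obsolete": obsolete, "changed": changed}


old_source = {"menu.start": "Start", "menu.quit": "Quit", "hud.hp": "HP"}
locale = {"menu.start": "Jouer", "menu.quit": "Quitter", "hud.hp": "PV"}
source = {"menu.start": "Start Game", "menu.quit": "Quit", "menu.options": "Options"}
report = diff_locale(source, locale, old_source)
```

The function only reports. Adding empty entries for `added` keys and marking `obsolete` keys would happen afterwards, and only after the write is approved.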
+
+---
+
+### Case 3: String Missing in One Locale — GAPS FOUND With Missing Key List
+
+**Fixture:**
+- 3 locale files exist: `en.csv`, `fr.csv`, `de.csv`
+- `de.csv` is missing 4 keys that exist in both `en.csv` and `fr.csv`
+
+**Input:** `/localize`
+
+**Expected behavior:**
+1. Skill reads all 3 locale files and cross-references keys
+2. `de.csv` is missing 4 keys
+3. Skill produces GAPS FOUND report listing the 4 missing keys by locale:
+   "de.csv missing: [key1], [key2], [key3], [key4]"
+4. Skill offers to add the missing keys as empty values to `de.csv`
+5. After approval: file updated; verdict remains GAPS FOUND (values still empty)
+
+**Assertions:**
+- [ ] Missing keys are listed explicitly (not just a count)
+- [ ] Missing keys are attributed to the specific locale file
+- [ ] Verdict is GAPS FOUND (not LOCALIZATION COMPLETE)
+- [ ] Missing keys are added as empty (not auto-translated from English)
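
The cross-reference in Case 3 reduces to a set difference between the union of all keys and each locale's own keys. A minimal sketch, assuming each locale file has already been parsed into a list of keys; `missing_keys` and the sample key names are hypothetical. It checks key presence only, so empty values would need a separate pass.

```python
def missing_keys(locales):
    """Return {locale name: sorted keys present elsewhere but absent here}."""
    all_keys = set().union(*(keys for keys in locales.values()))
    report = {}
    for name, keys in locales.items():
        gaps = sorted(all_keys - set(keys))
        if gaps:
            report[name] = gaps
    return report


full = ["ui.ok", "ui.cancel", "ui.save", "ui.load", "ui.quit"]
locales = {"en.csv": full, "fr.csv": full, "de.csv": ["ui.ok"]}
gaps = missing_keys(locales)
```

Keying the report by file name keeps each gap attributed to a specific locale, and listing the keys themselves (not a count) matches the first two assertions above.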
+
+---
+
+### Case 4: Translation File Has Syntax Error — Error With Line Reference
+
+**Fixture:**
+- `assets/localization/fr.csv` has a malformed line at line 47
+  (missing quote closure)
+
+**Input:** `/localize fr`
+
+**Expected behavior:**
+1. Skill reads `fr.csv` and encounters a parse error at line 47
+2. Skill outputs: "Parse error in fr.csv at line 47: [error detail]"
+3. Skill cannot diff or validate the file until the error is fixed
+4. Skill does NOT attempt to overwrite or auto-fix the malformed file
+5. Skill suggests fixing the file manually and re-running `/localize`
+
+**Assertions:**
+- [ ] Error message includes line number (line 47)
+- [ ] Error detail describes the nature of the parse error
+- [ ] Skill does NOT overwrite or modify the malformed file
+- [ ] Manual fix + re-run is suggested as remediation
+
+---
+
+### Case 5: Director Gate Check — No gate; localization is a pipeline utility
+
+**Fixture:**
+- Source code with player-facing strings
+
+**Input:** `/localize fr`
+
+**Expected behavior:**
+1. Skill extracts strings and manages locale files
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is LOCALIZATION COMPLETE or GAPS FOUND — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Extracts strings from source before operating on locale files
+- [ ] Creates new locale files with all keys as empty values (not auto-translated)
+- [ ] Diffs existing locale files against current source strings
+- [ ] Flags missing keys by locale and by key name
+- [ ] Asks "May I write" before creating or updating any locale file
+- [ ] Verdict is LOCALIZATION COMPLETE (all locales fully translated) or GAPS FOUND
+
+---
+
+## Coverage Notes
+
+- LOCALIZATION COMPLETE is only achievable when all locale files have all keys
+  with non-empty values; new-language skeleton creation always results in GAPS FOUND.
+- Engine-specific locale formats (Godot `.translation`, Unity `.po` files) are
+  handled by the skill body; `.csv` is used as the canonical format in tests.
+- The case where source strings change at a very high rate (continuous integration
+  of new UI text) is not tested; the same diff logic is expected to apply.
--- a/Framework/skills/utility/onboard.md
+++ b/Framework/skills/utility/onboard.md
@@ -0,0 +1,179 @@
+# Skill Test Spec: /onboard
+
+## Skill Summary
+
+`/onboard` generates a contextual project onboarding summary tailored for a new
+team member. It reads `CLAUDE.md`, `technical-preferences.md`, the active sprint
+file, recent git commits, and `production/stage.txt` to produce a structured
+orientation document. The skill runs on the Haiku model (read-only, formatting
+task) and produces no file writes — all output is conversational.
+
+The skill optionally accepts a role argument (e.g., `/onboard artist`) to tailor
+the summary to a specific discipline. When the project is in an early stage or
+unconfigured, the output adapts to reflect what little is known. The verdict is
+always ONBOARDING COMPLETE — the skill is purely informational.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: ONBOARDING COMPLETE
+- [ ] Does NOT contain "May I write" language (skill is read-only)
+- [ ] Has a next-step handoff suggesting a relevant follow-on skill
+
+---
+
+## Director Gate Checks
+
+None. `/onboard` is a read-only orientation skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Configured project in Production stage with active sprint
+
+**Fixture:**
+- `production/stage.txt` contains `Production`
+- `technical-preferences.md` has engine, language, and specialists populated
+- `production/sprints/sprint-005.md` exists with stories in progress
+- Git log contains 5 recent commits
+
+**Input:** `/onboard`
+
+**Expected behavior:**
+1. Skill reads stage.txt, technical-preferences.md, active sprint, and git log
+2. Skill produces an onboarding summary with sections: Project Overview, Tech Stack,
+   Current Stage, Active Sprint Summary, Recent Activity
+3. Summary is formatted for readability (headers, bullet points)
+4. Next-step suggestions are appropriate for Production stage (e.g., `/sprint-status`,
+   `/dev-story`)
+5. Verdict ONBOARDING COMPLETE is stated
+
+**Assertions:**
+- [ ] Output includes current stage name from stage.txt
+- [ ] Output includes engine and language from technical-preferences.md
+- [ ] Active sprint stories are summarized (not just the sprint file name)
+- [ ] Recent commit context is present
+- [ ] Verdict is ONBOARDING COMPLETE
+- [ ] No files are written
+
+---
+
+### Case 2: Fresh Project — No engine, no sprint, suggests /start
+
+**Fixture:**
+- `technical-preferences.md` contains only placeholders (`[TO BE CONFIGURED]`)
+- No `production/stage.txt`
+- No sprint files
+- No CLAUDE.md overrides beyond defaults
+
+**Input:** `/onboard`
+
+**Expected behavior:**
+1. Skill reads all config files and detects unconfigured state
+2. Skill produces a minimal summary: "This project has not been configured yet"
+3. Output explains the onboarding workflow: `/start` → `/setup-engine` → `/brainstorm`
+4. Skill suggests running `/start` as the immediate next step
+5. Verdict is ONBOARDING COMPLETE (informational, not a failure)
+
+**Assertions:**
+- [ ] Output explicitly mentions the project is not yet configured
+- [ ] `/start` is recommended as the next step
+- [ ] Skill does NOT error out — it gracefully handles an empty project state
+- [ ] Verdict is still ONBOARDING COMPLETE
+
+---
+
+### Case 3: No CLAUDE.md Found — Error with remediation
+
+**Fixture:**
+- `CLAUDE.md` file does not exist (deleted or never created)
+- All other files may or may not exist
+
+**Input:** `/onboard`
+
+**Expected behavior:**
+1. Skill attempts to read CLAUDE.md and fails
+2. Skill outputs an error: "CLAUDE.md not found — cannot generate onboarding summary"
+3. Skill provides remediation: "Run `/start` to initialize the project configuration"
+4. No partial summary is generated
+
+**Assertions:**
+- [ ] Error message clearly identifies the missing file as CLAUDE.md
+- [ ] Remediation step (`/start`) is explicitly named
+- [ ] Skill does NOT produce a partial output when the root config is missing
+- [ ] Verdict is ONBOARDING COMPLETE (with error context, not a crash)
+
+---
+
+### Case 4: Role-Specific Onboarding — User specifies "artist" role
+
+**Fixture:**
+- Fully configured project in Production stage
+- `art-bible.md` exists in `design/`
+- Active sprint has visual story types (animation, VFX)
+
+**Input:** `/onboard artist`
+
+**Expected behavior:**
+1. Skill reads all standard files plus any art-relevant docs (art bible, asset specs)
+2. Summary is tailored to the artist role: art bible overview, asset pipeline,
+   current visual stories in the active sprint
+3. Technical architecture details (code structure, ADRs) are de-emphasized
+4. Specialist agents for art/audio are highlighted in the summary
+5. Verdict is ONBOARDING COMPLETE
+
+**Assertions:**
+- [ ] Role argument is acknowledged in the output ("Onboarding for: Artist")
+- [ ] Art bible summary is included if the file exists
+- [ ] Current visual stories from the active sprint are shown
+- [ ] Technical implementation details are not the primary focus
+- [ ] Verdict is ONBOARDING COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; onboard is read-only orientation
+
+**Fixture:**
+- Any configured project state
+
+**Input:** `/onboard`
+
+**Expected behavior:**
+1. Skill completes the full onboarding summary
+2. No director agents are spawned at any point
+3. No gate IDs appear in the output
+4. No "May I write" prompts appear
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No write tool is called
+- [ ] No gate skip messages appear
+- [ ] Verdict is ONBOARDING COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads all source files before generating output (no hallucinated project state)
+- [ ] Adapts output to project stage (Production ≠ Concept)
+- [ ] Respects role argument when provided
+- [ ] Does not write any files
+- [ ] Ends with ONBOARDING COMPLETE verdict in all paths
+
+---
+
+## Coverage Notes
+
+- The case where `technical-preferences.md` is missing entirely (as opposed to
+  having placeholders) is not separately tested; behavior follows the graceful
+  error pattern of Case 3.
+- Git history reading is assumed available; offline/no-git scenarios are not
+  tested here.
+- Discipline roles beyond "artist" (e.g., programmer, designer, producer) follow
+  the same tailoring pattern as Case 4 and are not separately tested.
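
The branching that Cases 1 to 3 of the `/onboard` spec describe can be sketched roughly as below. Everything here is hypothetical scaffolding for illustration: `ProjectState`, `onboard_outcome`, and the returned dictionary shape do not come from the skill itself, and "unconfigured" is assumed to mean that `technical-preferences.md` holds only placeholders.

```python
from dataclasses import dataclass
from typing import Optional

VERDICT = "ONBOARDING COMPLETE"  # the spec states this verdict on every path


@dataclass
class ProjectState:
    has_claude_md: bool
    engine_configured: bool  # assumption: False when only placeholders exist
    stage: Optional[str] = None  # contents of production/stage.txt, if any


def onboard_outcome(state: ProjectState) -> dict:
    """Decide whether a summary is produced and which next steps to suggest."""
    if not state.has_claude_md:
        # Case 3: no partial summary, remediation is /start.
        return {"summary": False, "next_steps": ["/start"], "verdict": VERDICT}
    if not state.engine_configured:
        # Case 2: minimal summary, /start is the immediate next step.
        return {"summary": True, "next_steps": ["/start"], "verdict": VERDICT}
    # Case 1 names these follow-ons for Production; other stages are not
    # specified in the spec, so this sketch leaves them empty.
    steps = ["/sprint-status", "/dev-story"] if state.stage == "Production" else []
    return {"summary": True, "next_steps": steps, "verdict": VERDICT}
```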
--- a/Framework/skills/utility/playtest-report.md
+++ b/Framework/skills/utility/playtest-report.md
@@ -0,0 +1,178 @@
+# Skill Test Spec: /playtest-report
+
+## Skill Summary
+
+`/playtest-report` generates a structured playtest report from session notes or
+user input. The report is organized into four sections: Feel/Accessibility,
+Bugs Observed, Design Feedback, and Next Steps. When multiple testers participated,
+the skill aggregates feedback and distinguishes majority opinions from minority
+ones. The skill links to existing bug reports when a reported bug matches a file
+in `production/bugs/`.
+
+Reports are written to `production/qa/playtest-[date].md` after a "May I write"
+ask. No director gates apply here — the CD-PLAYTEST director gate (if needed) is
+a separate invocation. The verdict is COMPLETE when the report is written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing the report
+- [ ] Has a next-step handoff (e.g., `/bug-report` for new issues found, `/design-review` for feedback)
+
+---
+
+## Director Gate Checks
+
+None. `/playtest-report` is a documentation utility. The CD-PLAYTEST gate is a
+separate invocation and not part of this skill.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — User provides playtest notes, structured report produced
+
+**Fixture:**
+- User provides typed playtest notes from a single session
+- Notes cover: game feel, one bug (framerate drop), and a design concern
+  (tutorial too long)
+- `production/bugs/` exists but is empty (bug not yet reported)
+
+**Input:** `/playtest-report` (user pastes session notes)
+
+**Expected behavior:**
+1. Skill reads the provided notes and structures them into the 4-section template
+2. Feel/Accessibility: extracts feel observations
+3. Bugs: notes the framerate drop with available repro details
+4. Design Feedback: notes the tutorial length concern
+5. Next Steps: suggests `/bug-report` for the framerate issue and `/design-review`
+   for the tutorial feedback
+6. Skill asks "May I write to `production/qa/playtest-2026-04-06.md`?"
+7. Report is written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 4 sections are present in the report
+- [ ] Bug is listed in the Bugs section (not the Design Feedback section)
+- [ ] Next Steps are appropriate (bug report for the framerate drop, design review for the tutorial feedback)
+- [ ] "May I write" is asked before writing
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Empty Input — Guided prompting through each section
+
+**Fixture:**
+- No notes provided by user at invocation
+
+**Input:** `/playtest-report`
+
+**Expected behavior:**
+1. Skill detects empty input
+2. Skill prompts through each section:
+   a. "Describe the overall feel and any accessibility observations"
+   b. "Were any bugs observed? Describe them"
+   c. "What design feedback did testers provide?"
+3. User answers each prompt
+4. Skill compiles report from answers and asks "May I write"
+5. Report written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] At least 3 guiding questions are asked (one per main section)
+- [ ] Report is not created until all sections have input (or user explicitly skips one)
+- [ ] Verdict is COMPLETE after file is written
+
+---
+
+### Case 3: Multiple Testers — Aggregated feedback with majority/minority notes
+
+**Fixture:**
+- User provides notes from 3 testers
+- 2/3 testers found the controls "intuitive"
+- 1/3 testers found the UI font too small
+- All 3 noted the same bug (player stuck on ledge)
+
+**Input:** `/playtest-report` (3-tester session)
+
+**Expected behavior:**
+1. Skill identifies 3 distinct tester perspectives in the input
+2. Control intuitiveness → noted as "Majority (2/3): controls intuitive"
+3. Font size → noted as "Minority (1/3): UI font size concern"
+4. Stuck-on-ledge bug → noted as "All testers: player stuck on ledge (confirmed)"
+5. Skill generates aggregated report with majority/minority labels
+6. Report written after "May I write" approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Majority opinion (2/3) is labeled as majority
+- [ ] Minority opinion (1/3) is labeled as minority
+- [ ] Unanimously reported bug is noted as confirmed by all testers
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: Bug Matches Existing Report — Links to existing file
+
+**Fixture:**
+- `production/bugs/bug-2026-03-30-player-stuck-ledge.md` exists
+- User's playtest notes describe "player gets stuck on ledges near walls"
+
+**Input:** `/playtest-report`
+
+**Expected behavior:**
+1. Skill structures the report and identifies the stuck-on-ledge bug
+2. Skill scans `production/bugs/` and finds `bug-2026-03-30-player-stuck-ledge.md`
+3. In the Bugs section, the report includes: "See existing report:
+   production/bugs/bug-2026-03-30-player-stuck-ledge.md"
+4. Skill does NOT suggest creating a new bug report for this issue
+5. Report written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Existing bug report is found and linked in the playtest report
+- [ ] `/bug-report` is NOT suggested for the already-reported issue
+- [ ] Cross-reference to existing file appears in the Bugs section
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; CD-PLAYTEST is a separate invocation
+
+**Fixture:**
+- Playtest notes provided
+
+**Input:** `/playtest-report`
+
+**Expected behavior:**
+1. Skill generates and writes the playtest report
+2. No director agents are spawned (CD-PLAYTEST is not invoked here)
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No CD-PLAYTEST gate skip message appears
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Structures output into all 4 sections (Feel, Bugs, Design Feedback, Next Steps)
+- [ ] Labels majority vs. minority opinions when multiple testers are involved
+- [ ] Cross-references existing bug reports when bugs match
+- [ ] Asks "May I write to `production/qa/playtest-[date].md`?" before writing
+- [ ] Verdict is COMPLETE when report is written
+
+---
+
+## Coverage Notes
+
+- The CD-PLAYTEST director gate (creative director reviews playtest insights
+  for design implications) is a separate invocation and is not tested here.
+- Video recording or screenshot attachments are not tested; the report is a
+  text-only document.
+- The case where a tester's identity is unknown (anonymous feedback) follows
+  the same aggregation pattern as Case 3 without tester labels.
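
The majority/minority labelling exercised in Case 3 of the `/playtest-report` spec amounts to a simple threshold on how many testers raised a point. The sketch below is one possible reading: the function name `opinion_label` and the exact label format are illustrative, and treating an exact half as a minority is an assumption, since the spec does not cover ties.

```python
def opinion_label(count: int, total: int) -> str:
    """Label a piece of feedback by how many of the testers reported it."""
    if total <= 0 or not 0 < count <= total:
        raise ValueError("count must be between 1 and total")
    if count == total:
        prefix = "All testers"
    elif count * 2 > total:
        prefix = "Majority"
    else:
        # Assumption: an exact 50/50 split is reported as a minority view.
        prefix = "Minority"
    return f"{prefix} ({count}/{total})"
```

With the Case 3 fixture, the controls feedback (2 of 3) is labelled as a majority, the font concern (1 of 3) as a minority, and the stuck-on-ledge bug (3 of 3) as reported by all testers.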
--- a/Framework/skills/utility/project-stage-detect.md
+++ b/Framework/skills/utility/project-stage-detect.md
@@ -0,0 +1,183 @@
+# Skill Test Spec: /project-stage-detect
+
+## Skill Summary
+
+`/project-stage-detect` automatically analyzes project artifacts to determine
+the current development stage. It runs on the Haiku model (read-only) and
+examines `production/stage.txt` (if present), design documents in `design/`,
+source code in `src/`, sprint and milestone files in `production/`, and the
+presence of engine configuration to classify the project into one of seven
+stages: Concept, Systems Design, Technical Setup, Pre-Production, Production,
+Polish, or Release.
+
+The skill is advisory — it never writes `stage.txt`. That file is only updated
+when `/gate-check` passes and the user confirms advancement. The skill reports
+its confidence level (HIGH if stage.txt was read directly and artifacts agree,
+MEDIUM if inferred from artifacts, LOW if conflicting signals were found).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains all seven stage names: Concept, Systems Design, Technical Setup, Pre-Production, Production, Polish, Release
+- [ ] Does NOT contain "May I write" language (skill is detection-only)
+- [ ] Has a next-step handoff (e.g., `/gate-check` to formally advance stage)
+
+---
+
+## Director Gate Checks
+
+None. `/project-stage-detect` is a read-only detection utility. No director
+gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: stage.txt Exists — Reads directly and cross-checks artifacts
+
+**Fixture:**
+- `production/stage.txt` contains `Production`
+- `design/gdd/` has 4 GDD files
+- `src/` has source code files
+- `production/sprints/sprint-002.md` exists
+
+**Input:** `/project-stage-detect`
+
+**Expected behavior:**
+1. Skill reads `production/stage.txt` — detects stage `Production`
+2. Skill cross-checks artifacts: GDDs present, source code present, sprint present
+3. Artifacts are consistent with Production stage
+4. Skill reports: Stage = Production, Confidence = HIGH (from stage.txt, confirmed by artifacts)
+5. Next step: continue with `/sprint-plan` or `/dev-story`
+
+**Assertions:**
+- [ ] Detected stage is Production
+- [ ] Confidence is reported as HIGH when stage.txt is present and artifacts are consistent with it
+- [ ] Cross-check result (consistent vs. discrepant) is noted
+- [ ] No files are written
+- [ ] Verdict clearly states the detected stage
+
+---
+
+### Case 2: No stage.txt but GDDs and Epics Exist — Infers Production
+
+**Fixture:**
+- No `production/stage.txt`
+- `design/gdd/` has 3 GDD files
+- `production/epics/` has 2 epic files
+- `src/` has source code files
+- `production/sprints/sprint-001.md` exists
+
+**Input:** `/project-stage-detect`
+
+**Expected behavior:**
+1. Skill finds no stage.txt — switches to artifact inference mode
+2. Skill finds GDDs (Systems Design complete), epics (Pre-Production complete),
+   source code and sprints (Production active)
+3. Skill infers: Stage = Production
+4. Confidence is MEDIUM (inferred from artifacts, not from stage.txt)
+5. Skill recommends running `/gate-check` to formalize and write stage.txt
+
+**Assertions:**
+- [ ] Inferred stage is Production
+- [ ] Confidence is MEDIUM (not HIGH, since stage.txt is absent)
+- [ ] Recommendation to run `/gate-check` is present
+- [ ] No stage.txt is written by this skill
+
+---
+
+### Case 3: No stage.txt, No Docs, No Source — Infers Concept
+
+**Fixture:**
+- No `production/stage.txt`
+- `design/` directory exists but is empty
+- `src/` exists but contains no code files
+- `technical-preferences.md` has placeholders only
+
+**Input:** `/project-stage-detect`
+
+**Expected behavior:**
+1. Skill finds no stage.txt
+2. Artifact scan: no GDDs, no source, no epics, no sprints, engine unconfigured
+3. Skill infers: Stage = Concept
+4. Confidence is MEDIUM
+5. Skill suggests `/start` to begin the onboarding workflow
+
+**Assertions:**
+- [ ] Inferred stage is Concept
+- [ ] Output lists the artifacts that were checked (and found absent)
+- [ ] `/start` is suggested as the next step
+- [ ] No files are written
+
+---
+
+### Case 4: Discrepancy — stage.txt says Production but no source code
+
+**Fixture:**
+- `production/stage.txt` contains `Production`
+- `design/gdd/` has GDD files
+- `src/` directory exists but contains no source code files
+- No sprint files exist
+
+**Input:** `/project-stage-detect`
+
+**Expected behavior:**
+1. Skill reads stage.txt — detects `Production`
+2. Cross-check finds: no source code, no sprints — inconsistent with Production
+3. Skill flags discrepancy: "stage.txt says Production but no source code or sprints found"
+4. Skill reports detected stage as Production (honoring stage.txt) but
+   confidence drops to LOW due to artifact mismatch
+5. Skill suggests reviewing stage.txt manually or running `/gate-check`
+
+**Assertions:**
+- [ ] Discrepancy is flagged explicitly in the output
+- [ ] Confidence is LOW when artifacts contradict stage.txt
+- [ ] stage.txt value is not silently overridden
+- [ ] User is advised to verify the discrepancy manually
+
+---
+
+### Case 5: Director Gate Check — No gate; detection is advisory
+
+**Fixture:**
+- Any project state with or without stage.txt
+
+**Input:** `/project-stage-detect`
+
+**Expected behavior:**
+1. Skill completes full stage detection
+2. No director agents are spawned at any point
+3. No gate IDs appear in output
+4. No write tool is called
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No write tool is called
+- [ ] Detection output is purely advisory
+- [ ] Verdict names the detected stage without triggering any gate
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads stage.txt if present; falls back to artifact inference if absent
+- [ ] Always reports a confidence level (HIGH / MEDIUM / LOW)
+- [ ] Cross-checks stage.txt against artifacts and flags discrepancies
+- [ ] Does not write stage.txt (that is `/gate-check`'s responsibility)
+- [ ] Ends with a next-step recommendation appropriate to the detected stage
+
+---
+
+## Coverage Notes
+
+- The Technical Setup stage (engine configured, no GDDs yet) and Pre-Production
+  stage (GDDs complete, no epics yet) follow the same artifact-inference pattern
+  as Cases 2 and 3 and are not separately fixture-tested.
+- The Polish and Release stages are not fixture-tested here; they follow the
+  same high-confidence (stage.txt present) or inference logic.
+- Confidence levels are advisory — the skill does not gate any actions on them.
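
The confidence rule that runs through Cases 1, 2, and 4 of the `/project-stage-detect` spec can be summarised in a few lines. This is a sketch under stated assumptions: `report_stage` is a hypothetical name, and the artifact inference itself (how `inferred_stage` and `artifacts_match` are computed) is left outside the example because the spec only fixes its outcome for a few fixtures.

```python
from typing import Optional


def report_stage(
    stage_txt: Optional[str],
    inferred_stage: str,
    artifacts_match: bool,
) -> tuple[str, str]:
    """Return (stage, confidence) following the spec's three confidence levels."""
    if stage_txt is None:
        # No stage.txt: fall back to artifact inference (Cases 2 and 3).
        return inferred_stage, "MEDIUM"
    if artifacts_match:
        # stage.txt present and confirmed by artifacts (Case 1).
        return stage_txt, "HIGH"
    # stage.txt is honoured, not overridden, but confidence drops (Case 4).
    return stage_txt, "LOW"
```

Note that the function only reports; nothing in it writes `stage.txt`, which the spec reserves for `/gate-check`.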
--- a/Framework/skills/utility/prototype.md
+++ b/Framework/skills/utility/prototype.md
@@ -0,0 +1,178 @@
+# Skill Test Spec: /prototype
+
+## Skill Summary
+
+`/prototype` manages a rapid prototyping workflow for validating a game mechanic
+before committing to full production implementation. Prototypes are created in
+`prototypes/[mechanic-name]/` and are intentionally disposable — coding standards
+are relaxed (no ADR required, minimal acceptance criteria, hardcoded values acceptable).
+After implementation, the skill produces a findings document summarizing what
+was learned and recommending next steps.
+
+The skill asks "May I write to `prototypes/[name]/`?" before creating files. If a
+prototype already exists, the skill offers to extend, replace, or archive. No
+director gates apply. Verdicts: PROTOTYPE COMPLETE (prototype built and findings
+documented) or PROTOTYPE ABANDONED (mechanic found to be unworkable).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: PROTOTYPE COMPLETE, PROTOTYPE ABANDONED
+- [ ] Contains "May I write" language before creating prototype files
+- [ ] Has a next-step handoff (e.g., `/design-system` to formalize, or archive)
+
+---
+
+## Director Gate Checks
+
+None. Prototypes are throwaway validation artifacts. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Mechanic concept prototyped, findings documented
+
+**Fixture:**
+- `prototypes/` directory exists
+- No existing prototype for "grapple-hook"
+
+**Input:** `/prototype grapple-hook`
+
+**Expected behavior:**
+1. Skill asks "May I write to `prototypes/grapple-hook/`?"
+2. After approval: creates `prototypes/grapple-hook/` directory and basic
+   implementation skeleton (main scene, player controller extension)
+3. Skill implements a minimal grapple hook mechanic (intentionally rough — no
+   polish, hardcoded values acceptable)
+4. Skill produces `prototypes/grapple-hook/findings.md` with:
+   - What was tested
+   - What worked
+   - What didn't work
+   - Recommendation (proceed / abandon / revise concept)
+5. Verdict is PROTOTYPE COMPLETE
+
+**Assertions:**
+- [ ] "May I write to `prototypes/grapple-hook/`?" is asked before any files are created
+- [ ] Implementation is isolated to `prototypes/` (not `src/`)
+- [ ] `findings.md` is created with at minimum: tested/worked/didn't-work/recommendation
+- [ ] Verdict is PROTOTYPE COMPLETE
+
+---
+
+### Case 2: Prototype Already Exists — Offers Extend, Replace, or Archive
+
+**Fixture:**
+- `prototypes/grapple-hook/` already exists from a previous prototype session
+- It contains a basic implementation and a findings.md
+
+**Input:** `/prototype grapple-hook`
+
+**Expected behavior:**
+1. Skill detects existing `prototypes/grapple-hook/` directory
+2. Skill reports: "Prototype already exists for grapple-hook"
+3. Skill presents 3 options:
+   - Extend: add new features to the existing prototype
+   - Replace: start fresh (asks "May I replace `prototypes/grapple-hook/`?")
+   - Archive: move to `prototypes/archive/grapple-hook/` and start fresh
+4. User selects; skill proceeds accordingly
+
+**Assertions:**
+- [ ] Existing prototype is detected and reported
+- [ ] Exactly 3 options are presented (extend, replace, archive)
+- [ ] Replace path includes a "May I replace" confirmation
+- [ ] Archive path moves (not deletes) the existing prototype
+
+---
+
+### Case 3: Prototype Validates Mechanic — Recommends Proceeding to Production
+
+**Fixture:**
+- Prototype implementation complete
+- Findings: grapple hook mechanic is fun and technically feasible
+
+**Input:** `/prototype grapple-hook` (prototype session complete)
+
+**Expected behavior:**
+1. After prototype is built and tested, findings are summarized
+2. Recommendation in findings.md: "Mechanic validated — recommend proceeding
+   to `/design-system` for full specification"
+3. Skill handoff message explicitly suggests `/design-system grapple-hook`
+4. Verdict is PROTOTYPE COMPLETE
+
+**Assertions:**
+- [ ] `findings.md` contains an explicit recommendation
+- [ ] Recommendation references `/design-system` when mechanic is validated
+- [ ] Handoff message echoes the recommendation
+- [ ] Verdict is PROTOTYPE COMPLETE (not PROTOTYPE ABANDONED)
+
+---
+
+### Case 4: Prototype Reveals Mechanic is Unworkable — PROTOTYPE ABANDONED
+
+**Fixture:**
+- Prototype implemented for "procedural-dialogue"
+- After testing: the mechanic creates incoherent dialogue trees and is
+  frustrating to play
+
+**Input:** `/prototype procedural-dialogue`
+
+**Expected behavior:**
+1. Prototype is built
+2. Findings document the failure: incoherent output, player confusion, technical complexity
+3. Recommendation in findings.md: "Mechanic not viable — abandoning"
+4. `findings.md` documents the specific reasons the mechanic failed
+5. Skill suggests alternatives in the handoff (e.g., curated dialogue instead)
+6. Verdict is PROTOTYPE ABANDONED
+
+**Assertions:**
+- [ ] Verdict is PROTOTYPE ABANDONED (not PROTOTYPE COMPLETE)
+- [ ] `findings.md` documents specific failure reasons (not vague)
+- [ ] Alternative approaches are suggested in the handoff
+- [ ] Prototype files are retained (not deleted) for reference
+
+---
+
+### Case 5: Director Gate Check — No gate; prototypes are validation artifacts
+
+**Fixture:**
+- Mechanic concept provided
+
+**Input:** `/prototype wall-jump`
+
+**Expected behavior:**
+1. Skill creates and documents the prototype
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Asks "May I write to `prototypes/[name]/`?" before creating any files
+- [ ] Creates all files under `prototypes/` (not `src/`)
+- [ ] Produces `findings.md` with tested/worked/didn't-work/recommendation
+- [ ] Notes that production coding standards are intentionally relaxed
+- [ ] Offers extend/replace/archive when prototype already exists
+- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED
+
+---
+
+## Coverage Notes
+
+- Prototype implementation quality (code style) is intentionally not tested —
+  prototypes are throwaway artifacts and quality standards do not apply.
+- The archiving mechanism is mentioned in Case 2 but the archive format is
+  not assertion-tested in detail.
+- Engine-specific prototype scaffolding (GDScript scenes vs. C# MonoBehaviour)
+  follows the same flow with engine-appropriate file types.
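
Case 2 of the `/prototype` spec requires that the archive option moves the existing prototype to `prototypes/archive/[name]/` instead of deleting it. A minimal sketch of that move, using only the Python standard library, might look like the following. The helper name `archive_prototype` and the refusal to overwrite an existing archive entry are assumptions for the example; the spec does not define the archive format in detail.

```python
import shutil
from pathlib import Path


def archive_prototype(prototypes_root: Path, name: str) -> Path:
    """Move prototypes/<name> to prototypes/archive/<name> and return the new path."""
    source = prototypes_root / name
    target = prototypes_root / "archive" / name
    if not source.is_dir():
        raise FileNotFoundError(f"no prototype directory at {source}")
    if target.exists():
        # Assumption: never clobber an earlier archive; let the caller decide.
        raise FileExistsError(f"archive entry already exists at {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Move, not delete: the old files stay available for reference.
    shutil.move(str(source), str(target))
    return target
```

In the real workflow this step would only run after the user picks the archive option, in line with the skill's ask-before-write protocol.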
--- a/Framework/skills/utility/qa-plan.md
+++ b/Framework/skills/utility/qa-plan.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /qa-plan
+
+## Skill Summary
+
+`/qa-plan` generates a structured QA test plan for a feature or sprint milestone.
+It reads story files for the specified sprint, extracts acceptance criteria from
+each story, cross-references test standards from `coding-standards.md` to assign
+the appropriate test type (unit, integration, visual, UI, or config/data), and
+produces a prioritized QA plan document.
+
+The skill asks "May I write to `production/qa/qa-plan-sprint-NNN.md`?" before
+persisting the output. If an existing test plan for the same sprint is found, the
+skill offers to update rather than replace. The verdict is COMPLETE when the plan
+is written. No director gates are used — gate-level story readiness is handled by
+`/story-readiness`.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing the plan
+- [ ] Has a next-step handoff (e.g., `/smoke-check` or `/story-readiness`)
+
+---
+
+## Director Gate Checks
+
+None. `/qa-plan` is a planning utility. Story readiness gates are separate.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Sprint with 4 stories generates full test plan
+
+**Fixture:**
+- `production/sprints/sprint-003.md` lists 4 stories with defined acceptance criteria
+- Stories span types: 1 logic (formula), 1 integration, 1 visual, 1 UI
+- `coding-standards.md` is present with test evidence table
+
+**Input:** `/qa-plan sprint-003`
+
+**Expected behavior:**
+1. Skill reads sprint-003.md and identifies 4 stories
+2. Skill reads each story's acceptance criteria
+3. Skill assigns test types per coding-standards.md table:
+   - Logic story → Unit test (BLOCKING)
+   - Integration story → Integration test (BLOCKING)
+   - Visual story → Screenshot + lead sign-off (ADVISORY)
+   - UI story → Manual walkthrough doc (ADVISORY)
+4. Skill drafts QA plan with story-by-story test type breakdown
+5. Skill asks "May I write to `production/qa/qa-plan-sprint-003.md`?"
+6. File is written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 4 stories are included in the plan
+- [ ] Test type is assigned per coding-standards.md (not guessed)
+- [ ] Gate level (BLOCKING vs ADVISORY) is noted for each story
+- [ ] "May I write" is asked with the correct file path
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Story With No Acceptance Criteria — Flagged as UNTESTABLE
+
+**Fixture:**
+- `production/sprints/sprint-004.md` lists 3 stories; one story has empty
+  acceptance criteria section
+
+**Input:** `/qa-plan sprint-004`
+
+**Expected behavior:**
+1. Skill reads all 3 stories
+2. Skill detects the story with no AC
+3. Story is flagged as `UNTESTABLE — Acceptance Criteria required` in the plan
+4. Other 2 stories receive normal test type assignments
+5. Plan is written with the UNTESTABLE story flagged; verdict is COMPLETE
+
+**Assertions:**
+- [ ] UNTESTABLE label appears for the story with no AC
+- [ ] Plan is not blocked — the other stories are still planned
+- [ ] Output suggests adding AC to the flagged story (next step)
+- [ ] Verdict is COMPLETE (the plan is still generated)
+
+---
+
+### Case 3: Existing Test Plan Found — Offers update rather than replace
+
+**Fixture:**
+- `production/qa/qa-plan-sprint-003.md` already exists from a previous run
+- Sprint-003 has 2 new stories added since the last plan
+
+**Input:** `/qa-plan sprint-003`
+
+**Expected behavior:**
+1. Skill reads sprint-003.md and detects 2 stories not in the existing plan
+2. Skill reports: "Existing QA plan found for sprint-003 — offering to update"
+3. Skill presents the 2 new stories and their proposed test assignments
+4. Skill asks "May I update `production/qa/qa-plan-sprint-003.md`?" (not overwrite)
+5. Updated plan is written on approval
+
+**Assertions:**
+- [ ] Skill detects the existing plan file
+- [ ] "update" language is used (not "overwrite")
+- [ ] Only new stories are proposed for addition — existing entries preserved
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: No Stories Found for Sprint — Error with guidance
+
+**Fixture:**
+- `production/sprints/sprint-007.md` does not exist
+- No other sprint file matching sprint-007
+
+**Input:** `/qa-plan sprint-007`
+
+**Expected behavior:**
+1. Skill attempts to read sprint-007.md — file not found
+2. Skill outputs: "No sprint file found for sprint-007"
+3. Skill suggests running `/sprint-plan` to create the sprint first
+4. No plan is written; no "May I write" is asked
+
+**Assertions:**
+- [ ] Error message names the missing sprint file
+- [ ] `/sprint-plan` is suggested as the remediation step
+- [ ] No write tool is called
+- [ ] Verdict is not COMPLETE (error state)
+
+---
+
+### Case 5: Director Gate Check — No gate; QA planning is a utility
+
+**Fixture:**
+- Sprint with valid stories and AC
+
+**Input:** `/qa-plan sprint-003`
+
+**Expected behavior:**
+1. Skill generates and writes QA plan
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Skill reaches COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads coding-standards.md test evidence table before assigning test types
+- [ ] Assigns BLOCKING or ADVISORY gate level per story type
+- [ ] Flags stories with no AC as UNTESTABLE (does not silently skip them)
+- [ ] Detects existing plan and offers update path
+- [ ] Asks "May I write" before creating or updating the plan file
+- [ ] Verdict is COMPLETE when plan is written
+
+---
+
+## Coverage Notes
+
+- The case where `coding-standards.md` is missing (skill cannot assign test types)
+  is not fixture-tested; behavior would follow the BLOCKED pattern with a note
+  to restore the standards file.
+- Multi-sprint planning (spanning 2 sprints) is not tested; the skill is designed
+  for one sprint at a time.
+- Config/data story type (balance tuning → smoke check) follows the same
+  assignment pattern as other types in Case 1 and is not separately tested.
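
The test-type assignment in Case 1 of the `/qa-plan` spec, together with the UNTESTABLE rule from Case 2, can be expressed as a lookup table. The sketch below hardcodes the four pairings listed in Case 1; in the skill these come from the test evidence table in `coding-standards.md`, so the dictionary, the lowercase story-type keys, and the function name `plan_entry` are all stand-ins. The config/data story type is omitted because the spec does not state its gate level.

```python
from typing import Optional

# Pairings copied from Case 1; the real source is coding-standards.md.
TEST_EVIDENCE = {
    "logic": ("Unit test", "BLOCKING"),
    "integration": ("Integration test", "BLOCKING"),
    "visual": ("Screenshot + lead sign-off", "ADVISORY"),
    "ui": ("Manual walkthrough doc", "ADVISORY"),
}


def plan_entry(story_type: str, acceptance_criteria: list[str]) -> tuple[str, Optional[str]]:
    """Return (test evidence, gate level) for one story."""
    if not acceptance_criteria:
        # Case 2: flag the story instead of silently skipping it.
        return "UNTESTABLE", None
    try:
        return TEST_EVIDENCE[story_type.lower()]
    except KeyError:
        raise ValueError(f"unknown story type: {story_type!r}") from None
```

Because an UNTESTABLE story is returned as an ordinary entry, the remaining stories in the sprint can still be planned, which is what Case 2 asserts.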
--- a/Framework/skills/utility/regression-suite.md
+++ b/Framework/skills/utility/regression-suite.md
@@ -0,0 +1,172 @@
+# Skill Test Spec: /regression-suite
+
+## Skill Summary
+
+`/regression-suite` maps test coverage to GDD requirements: it reads the
+acceptance criteria from story files in the current sprint (or a specified epic),
+then scans `tests/` for corresponding test files and checks whether each AC has
+a matching assertion. It produces a coverage report identifying which ACs are
+fully covered, partially covered, or untested, and which test files have no
+matching AC (orphan tests).
+
+The skill may write a coverage report to `production/qa/` after a "May I write"
+ask. No director gates apply. Verdicts: FULL COVERAGE (all ACs have tests),
+GAPS FOUND (some ACs are untested), or CRITICAL GAPS (a critical-priority AC
+has no test).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: FULL COVERAGE, GAPS FOUND, CRITICAL GAPS
+- [ ] Contains "May I write" language (skill may write coverage report)
+- [ ] Has a next-step handoff (e.g., `/test-setup` if framework missing, `/qa-plan` if plan missing)
+
+---
+
+## Director Gate Checks
+
+None. `/regression-suite` is a QA analysis utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Full Coverage — All ACs in sprint have corresponding tests
+
+**Fixture:**
+- `production/sprints/sprint-004.md` lists 3 stories with 2 ACs each (6 total)
+- `tests/unit/` and `tests/integration/` contain test files that match all 6 ACs
+  (by system name and scenario description)
+
+**Input:** `/regression-suite sprint-004`
+
+**Expected behavior:**
+1. Skill reads all 6 ACs from sprint-004 stories
+2. Skill scans test files and matches each AC to at least one test assertion
+3. All 6 ACs have coverage
+4. Skill produces coverage report: "6/6 ACs covered"
+5. Skill asks "May I write to `production/qa/regression-sprint-004.md`?"
+6. File is written on approval; verdict is FULL COVERAGE
+
+**Assertions:**
+- [ ] All 6 ACs appear in the coverage report
+- [ ] Each AC is marked as covered with the matching test file referenced
+- [ ] Verdict is FULL COVERAGE
+- [ ] "May I write" is asked before writing the report
+
+---
+
+### Case 2: Gaps Found — 3 ACs have no tests
+
+**Fixture:**
+- Sprint has 5 stories with 8 total ACs
+- Tests exist for 5 of the 8 ACs; 3 ACs have no corresponding test file or assertion
+
+**Input:** `/regression-suite`
+
+**Expected behavior:**
+1. Skill reads all 8 ACs
+2. Skill scans tests — 5 matched, 3 unmatched
+3. Coverage report lists the 3 untested ACs by story and AC text
+4. Skill asks "May I write to `production/qa/regression-[sprint]-[date].md`?"
+5. Report is written; verdict is GAPS FOUND
+
+**Assertions:**
+- [ ] The 3 untested ACs are listed by name in the report
+- [ ] Matched ACs are also shown (not only the gaps)
+- [ ] Verdict is GAPS FOUND (not FULL COVERAGE)
+- [ ] Report is written after "May I write" approval
+
+---
+
+### Case 3: Critical AC Untested — CRITICAL GAPS verdict, flagged prominently
+
+**Fixture:**
+- Sprint has 4 stories; one story is Priority: Critical with 2 ACs
+- One of the critical-priority ACs has no test
+
+**Input:** `/regression-suite`
+
+**Expected behavior:**
+1. Skill reads all stories and ACs, noting which stories are critical priority
+2. Skill scans tests — the critical AC has no match
+3. Report prominently flags: "CRITICAL GAP: [AC text] — no test found (Critical priority story)"
+4. Skill recommends blocking story completion until test is added
+5. Verdict is CRITICAL GAPS
+
+**Assertions:**
+- [ ] Verdict is CRITICAL GAPS (not GAPS FOUND)
+- [ ] Critical priority AC is flagged more prominently than normal gaps
+- [ ] Recommendation to block story completion is included
+- [ ] Non-critical gaps (if any) are also listed
+
+---
+
+### Case 4: Orphan Tests — Test file has no matching AC
+
+**Fixture:**
+- `tests/unit/save_system_test.gd` exists with assertions for scenarios
+  not present in any current story's AC list
+- Current sprint stories do not reference save system
+
+**Input:** `/regression-suite`
+
+**Expected behavior:**
+1. Skill scans tests and cross-references ACs
+2. `save_system_test.gd` assertions do not match any current AC
+3. Test file is flagged as ORPHAN TEST in the coverage report
+4. Report notes: "Orphan tests may belong to a past or future sprint, or AC was renamed"
+5. Verdict is FULL COVERAGE or GAPS FOUND depending on overall AC coverage
+   (orphan tests are advisory and do not affect the verdict)
+
+**Assertions:**
+- [ ] Orphan test is flagged in the report
+- [ ] Orphan flag includes the filename and suggestion (past sprint / renamed AC)
+- [ ] Orphan tests do not cause a GAPS FOUND verdict on their own
+- [ ] Overall verdict reflects AC coverage only
+
+---
+
+### Case 5: Director Gate Check — No gate; regression-suite is a QA utility
+
+**Fixture:**
+- Sprint with stories and test files
+
+**Input:** `/regression-suite`
+
+**Expected behavior:**
+1. Skill produces coverage report and writes it
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads story ACs from sprint files before scanning tests
+- [ ] Matches ACs to tests by system name and scenario (not file name alone)
+- [ ] Flags critical-priority untested ACs as CRITICAL GAPS
+- [ ] Flags orphan tests (exist in tests/ but no AC matches)
+- [ ] Asks "May I write" before persisting the coverage report
+- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS
+
+---
+
+## Coverage Notes
+
+- The heuristic for matching an AC to a test (by system name + scenario keywords)
+  is approximate; exact matching logic is defined in the skill body.
+- Integration test coverage is mapped the same way as unit test coverage; no
+  distinction in verdicts is made between the two.
+- This skill does not run the tests — it maps AC text to test assertions. Test
+  execution is handled by the CI pipeline.
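The verdict rules in the `/regression-suite` spec above can be sketched as a small function. This is only an illustration, assuming each AC has already been matched (or not) against the tests; the `AcceptanceCriterion` type and `coverage_verdict` name are hypothetical and not taken from the skill body.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceCriterion:
    text: str
    critical: bool  # owning story is Priority: Critical
    covered: bool   # at least one test assertion matched this AC


def coverage_verdict(criteria, orphan_tests=()):
    """Return the verdict for a list of acceptance criteria.

    orphan_tests is accepted only to make explicit that orphans are
    advisory: they are reported elsewhere and never change the verdict.
    """
    untested = [ac for ac in criteria if not ac.covered]
    if any(ac.critical for ac in untested):
        return "CRITICAL GAPS"
    if untested:
        return "GAPS FOUND"
    return "FULL COVERAGE"
```

Checking the critical case first mirrors Case 3, where a single untested critical-priority AC outranks any number of ordinary gaps.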
--- a/Framework/skills/utility/release-checklist.md
+++ b/Framework/skills/utility/release-checklist.md
@@ -0,0 +1,177 @@
+# Skill Test Spec: /release-checklist
+
+## Skill Summary
+
+`/release-checklist` generates an internal release readiness checklist covering:
+sprint story completion, open bug severity, QA sign-off status, build stability,
+and changelog readiness. It is an internal gate — not a platform/store checklist
+(that is `/launch-checklist`). When a previous release checklist exists, it shows
+a delta of resolved and newly introduced issues.
+
+The skill writes its checklist report to `production/releases/release-checklist-[date].md`
+after a "May I write" ask. No director gates apply — `/gate-check` handles
+formal phase gate logic. Verdicts: RELEASE READY, RELEASE BLOCKED, or CONCERNS.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: RELEASE READY, RELEASE BLOCKED, CONCERNS
+- [ ] Contains "May I write" collaborative protocol language before writing the report
+- [ ] Has a next-step handoff (e.g., `/launch-checklist` for external or `/gate-check` for phase)
+
+---
+
+## Director Gate Checks
+
+None. `/release-checklist` is an internal audit utility. Formal phase advancement
+is managed by `/gate-check`.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — All Sprint Stories Complete, QA Passed, RELEASE READY
+
+**Fixture:**
+- `production/sprints/sprint-008.md` — all stories are `Status: Done`
+- No open bugs with severity HIGH or CRITICAL in `production/bugs/`
+- `production/qa/qa-plan-sprint-008.md` has QA sign-off annotation
+- Changelog entry for this version exists
+- `production/stage.txt` contains `Polish`
+
+**Input:** `/release-checklist`
+
+**Expected behavior:**
+1. Skill reads sprint-008: all stories Done
+2. Skill reads bugs: no HIGH or CRITICAL open bugs
+3. Skill confirms QA plan has sign-off
+4. Skill confirms changelog entry exists
+5. All checks pass; skill asks "May I write to
+   `production/releases/release-checklist-2026-04-06.md`?"
+6. Report written; verdict is RELEASE READY
+
+**Assertions:**
+- [ ] The 4 fixture-backed check categories are evaluated (stories, bugs, QA, changelog)
+- [ ] All items appear with PASS markers
+- [ ] Verdict is RELEASE READY
+- [ ] "May I write" is asked before writing
+
+---
+
+### Case 2: Open HIGH Severity Bugs — RELEASE BLOCKED
+
+**Fixture:**
+- All sprint stories are Done
+- `production/bugs/` contains 2 open bugs with severity HIGH
+
+**Input:** `/release-checklist`
+
+**Expected behavior:**
+1. Skill reads sprint — stories complete
+2. Skill reads bugs — 2 HIGH severity bugs open
+3. Skill reports: "RELEASE BLOCKED — 2 open HIGH severity bugs must be resolved"
+4. Both bug filenames are listed in the report
+5. Verdict is RELEASE BLOCKED
+
+**Assertions:**
+- [ ] Verdict is RELEASE BLOCKED (not CONCERNS)
+- [ ] Both bug filenames are listed explicitly
+- [ ] Skill makes clear HIGH severity bugs are blocking (not advisory)
+
+---
+
+### Case 3: Changelog Not Generated — CONCERNS
+
+**Fixture:**
+- All stories Done, no HIGH/CRITICAL bugs
+- No changelog entry found for the current version/sprint
+
+**Input:** `/release-checklist`
+
+**Expected behavior:**
+1. Skill checks all items
+2. Changelog check fails: no changelog entry found
+3. Skill reports: "CONCERNS — Changelog not generated for this release"
+4. Skill suggests running `/changelog` to generate it
+5. Verdict is CONCERNS (advisory — not a hard block)
+
+**Assertions:**
+- [ ] Verdict is CONCERNS (not RELEASE BLOCKED — changelog is advisory)
+- [ ] `/changelog` is suggested as the remediation
+- [ ] Other passing checks are shown in the report
+- [ ] Missing changelog is described as advisory, not blocking
+
+---
+
+### Case 4: Previous Release Checklist Exists — Delta From Last Release
+
+**Fixture:**
+- `production/releases/release-checklist-2026-03-20.md` exists
+- Previous: 1 story was incomplete, 1 HIGH bug open
+- Current: all stories Done, HIGH bug resolved, but 1 new MEDIUM bug has appeared
+
+**Input:** `/release-checklist`
+
+**Expected behavior:**
+1. Skill finds the previous checklist and loads it
+2. New checklist is generated and compared:
+   - Newly resolved: "Story [X] — was open, now Done"
+   - Newly resolved: "HIGH bug [filename] — was open, now closed"
+   - New item: "1 MEDIUM bug appeared (advisory)"
+3. Delta section shows all changes prominently
+4. Verdict is CONCERNS (MEDIUM bug is advisory, not blocking)
+
+**Assertions:**
+- [ ] Delta section appears in the report with resolved and new items
+- [ ] Newly resolved items from the previous checklist are noted
+- [ ] New items not present in the previous checklist are highlighted
+- [ ] Verdict reflects current state (not previous state)
+
+---
+
+### Case 5: Director Gate Check — No gate; release-checklist is an internal audit
+
+**Fixture:**
+- Active sprint with stories and bug reports
+
+**Input:** `/release-checklist`
+
+**Expected behavior:**
+1. Skill runs the full checklist and writes the report
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Checks sprint story completion status
+- [ ] Checks open bug severity (CRITICAL/HIGH = BLOCKED; MEDIUM/LOW = CONCERNS)
+- [ ] Checks QA plan sign-off status
+- [ ] Checks changelog existence
+- [ ] Compares against previous checklist when one exists
+- [ ] Asks "May I write" before writing the report
+- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS
+
+---
+
+## Coverage Notes
+
+- Build stability verification (no failed CI runs) is listed as a check category
+  but relies on external CI system state; the skill notes this as a MANUAL CHECK
+  if CI integration is not configured.
+- CRITICAL bugs always result in RELEASE BLOCKED regardless of other items;
+  this is equivalent to the HIGH severity case in Case 2.
+- Stories with `Status: In Review` (not Done) are treated as incomplete
+  and result in RELEASE BLOCKED; this edge case follows the same pattern
+  as the HIGH bug case.
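The blocking and advisory rules in the `/release-checklist` spec above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and argument shapes are hypothetical, and QA sign-off and build stability are left out because the spec does not say whether a failure there is blocking or advisory.

```python
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def release_verdict(story_statuses, open_bug_severities, changelog_present):
    """Return (verdict, reasons) from the rules the spec states explicitly."""
    blockers, concerns = [], []

    # Any story that is not Done (including In Review) blocks the release.
    incomplete = [s for s in story_statuses if s != "Done"]
    if incomplete:
        blockers.append(f"{len(incomplete)} incomplete stories")

    # CRITICAL/HIGH bugs block; MEDIUM/LOW bugs are advisory.
    for severity in open_bug_severities:
        if severity in BLOCKING_SEVERITIES:
            blockers.append(f"open {severity} bug")
        else:
            concerns.append(f"open {severity} bug")

    # A missing changelog is advisory, not a hard block.
    if not changelog_present:
        concerns.append("changelog not generated")

    if blockers:
        return "RELEASE BLOCKED", blockers + concerns
    if concerns:
        return "CONCERNS", concerns
    return "RELEASE READY", []
```

Collecting blockers and concerns separately keeps advisory items visible in the report even when the verdict is RELEASE BLOCKED.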
--- a/Framework/skills/utility/reverse-document.md
+++ b/Framework/skills/utility/reverse-document.md
@@ -0,0 +1,180 @@
+# Skill Test Spec: /reverse-document
+
+## Skill Summary
+
+`/reverse-document` generates design or architecture documentation from existing
+source code. It reads the specified source file(s), infers design intent from
+class structure, method names, constants, and comments, and produces either a
+GDD skeleton (for gameplay systems) or an architecture overview (for technical
+systems). The output is a best-effort inference — magic numbers and undocumented
+logic may result in a PARTIAL verdict.
+
+The skill asks "May I write to [inferred path]?" before creating the document.
+No director gates apply. Verdicts: COMPLETE (clean inference) or PARTIAL (some
+fields are ambiguous and need human review).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, PARTIAL
+- [ ] Contains "May I write" collaborative protocol language before writing the doc
+- [ ] Has a next-step handoff (e.g., `/design-review` to validate the generated doc)
+
+---
+
+## Director Gate Checks
+
+None. `/reverse-document` is a documentation utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Well-Structured Source — Accurate design doc skeleton produced
+
+**Fixture:**
+- `src/gameplay/health_system.gd` exists with:
+  - `@export var max_health: int = 100`
+  - `func take_damage(amount: int)` with clamping logic
+  - `signal health_changed(new_value: int)`
+  - Docstrings on all public methods
+
+**Input:** `/reverse-document src/gameplay/health_system.gd`
+
+**Expected behavior:**
+1. Skill reads the source file and identifies the health system
+2. Skill infers design intent: max health, take_damage behavior, health signal
+3. Skill produces GDD skeleton for health system with 8 required sections:
+   Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases, Dependencies,
+   Tuning Knobs, Acceptance Criteria
+4. Formulas section includes the inferred clamping formula
+5. Tuning Knobs notes `max_health = 100` as a configurable value
+6. Skill asks "May I write to `design/gdd/health-system.md`?"
+7. File written; verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 8 required GDD sections are present in the output
+- [ ] `max_health = 100` appears as a Tuning Knob
+- [ ] Clamping formula is captured in the Formulas section
+- [ ] "May I write" is asked with the inferred path
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Ambiguous Source — Magic Numbers, PARTIAL Verdict
+
+**Fixture:**
+- `src/gameplay/enemy_ai.gd` exists with:
+  - Inline magic numbers: `if distance < 150:`, `speed = 3.5`
+  - No comments or docstrings
+  - Complex state machine logic that is not self-explanatory
+
+**Input:** `/reverse-document src/gameplay/enemy_ai.gd`
+
+**Expected behavior:**
+1. Skill reads the file and detects magic numbers with no context
+2. Skill produces a GDD skeleton with notes: "AMBIGUOUS VALUE: 150 (unknown units —
+   is this pixels, world units, or tiles?)"
+3. Skill marks the Formulas and Tuning Knobs sections as requiring human review
+4. Skill asks "May I write to `design/gdd/enemy-ai.md`?" with PARTIAL advisory
+5. File written with PARTIAL markers; verdict is PARTIAL
+
+**Assertions:**
+- [ ] AMBIGUOUS VALUE annotations appear for magic numbers
+- [ ] Sections needing human review are marked explicitly
+- [ ] Verdict is PARTIAL (not COMPLETE)
+- [ ] File is still written — PARTIAL is not a blocking failure
+
+---
+
+### Case 3: Multiple Interdependent Files — Cross-System Overview Produced
+
+**Fixture:**
+- User provides 2 source files: `combat_system.gd` and `damage_resolver.gd`
+- The files reference each other (combat calls damage_resolver)
+
+**Input:** `/reverse-document src/gameplay/combat_system.gd src/gameplay/damage_resolver.gd`
+
+**Expected behavior:**
+1. Skill reads both files and detects the dependency relationship
+2. Skill produces a cross-system architecture overview (not individual GDDs)
+3. Overview describes: Combat System → Damage Resolver interaction, shared
+   interfaces, data flow between the two
+4. Skill asks "May I write to `docs/architecture/combat-damage-overview.md`?"
+5. Overview written after approval; verdict is COMPLETE (or PARTIAL if ambiguous)
+
+**Assertions:**
+- [ ] Both files are analyzed together (not as two separate docs)
+- [ ] Cross-system dependency is documented in the output
+- [ ] Output file is written to `docs/architecture/` (not `design/gdd/`)
+- [ ] Verdict is COMPLETE or PARTIAL
+
+---
+
+### Case 4: Source File Not Found — Error
+
+**Fixture:**
+- `src/gameplay/inventory_system.gd` does not exist
+
+**Input:** `/reverse-document src/gameplay/inventory_system.gd`
+
+**Expected behavior:**
+1. Skill attempts to read the specified file — not found
+2. Skill outputs: "Source file not found: src/gameplay/inventory_system.gd"
+3. Skill suggests checking the path or running `/map-systems` to identify
+   the correct source file
+4. No document is created
+
+**Assertions:**
+- [ ] Error message names the missing file with the full path
+- [ ] Alternative suggestion (check path or `/map-systems`) is provided
+- [ ] No write tool is called
+- [ ] No verdict is issued (error state)
+
+---
+
+### Case 5: Director Gate Check — No gate; reverse-document is a utility
+
+**Fixture:**
+- Well-structured source file exists
+
+**Input:** `/reverse-document src/gameplay/health_system.gd`
+
+**Expected behavior:**
+1. Skill generates and writes the design doc
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE or PARTIAL — no gate verdict involved
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads source file(s) before generating any content
+- [ ] Produces all 8 required GDD sections when target is a gameplay system
+- [ ] Annotates ambiguous values with AMBIGUOUS VALUE markers
+- [ ] Produces cross-system overview (not individual GDDs) for multiple files
+- [ ] Asks "May I write" before creating any output file
+- [ ] Verdict is COMPLETE (clean inference) or PARTIAL (ambiguous fields)
+
+---
+
+## Coverage Notes
+
+- Architecture overview format (for technical/infrastructure systems) differs
+  from GDD format; the inferred output type is determined by the nature of the
+  source file (gameplay logic → GDD; engine/infra code → architecture doc).
+- The case where a source file is readable but contains only auto-generated
+  boilerplate with no meaningful logic is not tested; skill would likely produce
+  a near-empty skeleton with a PARTIAL verdict.
+- C# and Blueprint source files follow the same inference pattern as GDScript;
+  language-specific differences are handled in the skill body.
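The single-file path inference and verdict used in Cases 1 and 2 of the `/reverse-document` spec above can be sketched like this. It assumes the output name is just the source file stem converted from snake_case to kebab-case, which matches the two examples in the spec but is not confirmed as the skill's actual rule; both function names are hypothetical.

```python
from pathlib import PurePosixPath


def inferred_gdd_path(source_path):
    """Map a gameplay source file to a GDD path (assumed naming rule)."""
    stem = PurePosixPath(source_path).stem
    return f"design/gdd/{stem.replace('_', '-')}.md"


def inference_verdict(ambiguous_value_count):
    """PARTIAL when any AMBIGUOUS VALUE annotation was emitted, else COMPLETE."""
    return "PARTIAL" if ambiguous_value_count > 0 else "COMPLETE"
```

The multi-file case in Case 3 writes to `docs/architecture/` instead, and its naming rule is not derivable from the spec, so it is left out of this sketch.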
--- a/Framework/skills/utility/setup-engine.md
+++ b/Framework/skills/utility/setup-engine.md
@@ -0,0 +1,182 @@
+# Skill Test Spec: /setup-engine
+
+## Skill Summary
+
+`/setup-engine` configures the project's engine, language, rendering backend,
+physics engine, specialist agent assignments, and naming conventions by
+populating `technical-preferences.md`. It accepts an optional engine argument
+(e.g., `/setup-engine godot`) to skip the engine-selection step. For each
+section of `technical-preferences.md`, the skill presents a draft and asks
+"May I write to `technical-preferences.md`?" before updating.
+
+The skill also populates the specialist routing table (file extension → agent
+mappings) based on the chosen engine. It has no director gates — configuration
+is a technical utility task. The verdict is always COMPLETE when the file is
+fully written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before updating technical-preferences.md
+- [ ] Has a next-step handoff (e.g., `/brainstorm` or `/start` depending on flow)
+
+---
+
+## Director Gate Checks
+
+None. `/setup-engine` is a technical configuration skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Godot 4 + GDScript — Full engine configuration
+
+**Fixture:**
+- `technical-preferences.md` contains only placeholders
+- Engine argument provided: `godot`
+
+**Input:** `/setup-engine godot`
+
+**Expected behavior:**
+1. Skill skips engine-selection step (argument provided)
+2. Skill presents language options for Godot: GDScript or C#
+3. User selects GDScript
+4. Skill drafts all engine sections: engine/language/rendering/physics fields,
+   naming conventions (snake_case for GDScript), specialist assignments
+   (godot-specialist, gdscript-specialist, godot-shader-specialist, etc.)
+5. Skill populates the routing table: `.gd` → gdscript-specialist, `.gdshader` →
+   godot-shader-specialist, `.tscn` → godot-specialist
+6. Skill asks "May I write to `technical-preferences.md`?"
+7. File is written after approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Engine field is set to Godot 4 (not a placeholder)
+- [ ] Language field is set to GDScript
+- [ ] Naming conventions are GDScript-appropriate (snake_case)
+- [ ] Routing table includes `.gd`, `.gdshader`, and `.tscn` entries
+- [ ] Specialists are assigned (not placeholders)
+- [ ] "May I write" is asked before writing
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Unity + C# — Unity-specific configuration
+
+**Fixture:**
+- `technical-preferences.md` contains only placeholders
+- Engine argument provided: `unity`
+
+**Input:** `/setup-engine unity`
+
+**Expected behavior:**
+1. Skill sets engine to Unity, language to C#
+2. Naming conventions are C#-appropriate (PascalCase for classes, camelCase for fields)
+3. Specialist assignments reference unity-specialist, csharp-specialist
+4. Routing table: `.cs` → csharp-specialist, `.asmdef` → unity-specialist,
+   `.unity` (scene) → unity-specialist
+5. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
+
+**Assertions:**
+- [ ] Engine field is set to Unity (not Godot or Unreal)
+- [ ] Language field is set to C#
+- [ ] Naming conventions reflect C# conventions
+- [ ] Routing table includes `.cs` and `.unity` entries
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 3: Unreal + Blueprint — Unreal-specific configuration
+
+**Fixture:**
+- `technical-preferences.md` contains only placeholders
+- Engine argument provided: `unreal`
+
+**Input:** `/setup-engine unreal`
+
+**Expected behavior:**
+1. Skill sets engine to Unreal Engine 5, primary language to Blueprint (Visual Scripting)
+2. Specialist assignments reference unreal-specialist, blueprint-specialist
+3. Routing table: `.uasset` → blueprint-specialist or unreal-specialist,
+   `.umap` → unreal-specialist
+4. Performance budgets are pre-set with Unreal defaults (e.g., higher draw call budget)
+5. Skill asks "May I write" and writes on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Engine field is set to Unreal Engine 5
+- [ ] Routing table includes `.uasset` and `.umap` entries
+- [ ] Blueprint specialist is assigned
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: Engine Already Configured — Offers to reconfigure specific sections
+
+**Fixture:**
+- `technical-preferences.md` has engine set to Godot 4 with all fields populated
+- No engine argument provided
+
+**Input:** `/setup-engine`
+
+**Expected behavior:**
+1. Skill reads `technical-preferences.md` and detects fully configured engine (Godot 4)
+2. Skill reports: "Engine already configured as Godot 4 + GDScript"
+3. Skill presents options: reconfigure all, or reconfigure a specific section only
+   (Engine/Language, Naming Conventions, Specialists, Performance Budgets)
+4. User selects "Reconfigure Performance Budgets only"
+5. Only the performance budget section is updated; all other fields unchanged
+6. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
+
+**Assertions:**
+- [ ] Skill does NOT overwrite all fields when only a section update was requested
+- [ ] User is offered section-specific reconfiguration
+- [ ] Only the selected section is modified in the written file
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; setup-engine is a utility skill
+
+**Fixture:**
+- Fresh project with no engine configured
+
+**Input:** `/setup-engine godot`
+
+**Expected behavior:**
+1. Skill completes full engine configuration
+2. No director agents are spawned at any point
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Presents draft configuration before asking to write
+- [ ] Asks "May I write to `technical-preferences.md`?" before writing
+- [ ] Respects engine argument when provided (skips selection step)
+- [ ] Detects existing config and offers partial reconfigure
+- [ ] Routing table is populated for all key file types for the chosen engine
+- [ ] Verdict is COMPLETE after file is written
+
+---
+
+## Coverage Notes
+
+- Godot 4 + C# (instead of GDScript) follows the same flow as Case 1 with
+  different naming conventions and the godot-csharp-specialist assignment.
+  This variant is not separately tested.
+- The engine-version-specific guidance (e.g., Godot 4.6 knowledge gap warning
+  from VERSION.md) is surfaced by the skill but not assertion-tested here.
+- Performance budget defaults per engine are noted as engine-specific but
+  exact default values are not assertion-tested.
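The specialist routing table described in Cases 1 to 3 of the `/setup-engine` spec above could be represented as a nested mapping. This is a hedged sketch: the `ROUTING` layout and `route` helper are hypothetical, and `.uasset` is mapped to `blueprint-specialist` as an assumption, since the spec allows either that or `unreal-specialist`.

```python
import os

# Extension-to-specialist mappings as listed in the spec's test cases.
ROUTING = {
    "godot": {
        ".gd": "gdscript-specialist",
        ".gdshader": "godot-shader-specialist",
        ".tscn": "godot-specialist",
    },
    "unity": {
        ".cs": "csharp-specialist",
        ".asmdef": "unity-specialist",
        ".unity": "unity-specialist",
    },
    "unreal": {
        ".uasset": "blueprint-specialist",  # assumption: spec allows either specialist
        ".umap": "unreal-specialist",
    },
}


def route(engine, filename):
    """Return the specialist for a file, or None if the extension is unmapped."""
    _, ext = os.path.splitext(filename)
    return ROUTING[engine].get(ext.lower())
```

Returning `None` for unmapped extensions is a design choice of this sketch; the spec does not say how the skill handles file types outside the table.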
--- a/Framework/skills/utility/skill-improve.md
+++ b/Framework/skills/utility/skill-improve.md
@@ -0,0 +1,185 @@
+# Skill Test Spec: /skill-improve
+
+## Skill Summary
+
+`/skill-improve` runs an automated test-fix-retest improvement loop on a skill
+file. It invokes `/skill-test static` (and optionally `/skill-test category`) to
+establish a baseline score, diagnoses the failing checks, proposes targeted fixes
+to the SKILL.md file, asks "May I write the improvements to [skill path]?", applies
+the fixes, and re-runs the tests to confirm improvement.
+
+If the applied fix makes the skill worse (regression), the fix is reverted (with
+user confirmation) rather than kept. If the target skill is already perfect (0 failures),
+`/skill-improve` exits immediately without making changes. No director gates apply. Verdicts:
+IMPROVED (score went up), NO CHANGE (no improvements possible or user declined), or
+REVERTED (fix was applied but caused regression and was reverted).
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: IMPROVED, NO CHANGE, REVERTED
+- [ ] Contains "May I write" collaborative protocol language before applying fixes
+- [ ] Has a next-step handoff (e.g., run `/skill-test spec` to validate behavioral compliance)
+
+---
+
+## Director Gate Checks
+
+None. `/skill-improve` is a meta-utility skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Skill With 2 Static Failures, Both Fixed, IMPROVED
+
+**Fixture:**
+- `.claude/skills/some-skill/SKILL.md` has 2 static failures:
+  - Check 4: no "May I write" language despite having Write in allowed-tools
+  - Check 5: no next-step handoff at the end
+
+**Input:** `/skill-improve some-skill`
+
+**Expected behavior:**
+1. Skill runs `/skill-test static some-skill` — baseline: 5/7 checks pass
+2. Skill diagnoses the 2 failing checks (4 and 5)
+3. Skill proposes fixes:
+   - Add "May I write" language to the appropriate phase
+   - Add a next-step handoff section at the end
+4. Skill asks "May I write improvements to `.claude/skills/some-skill/SKILL.md`?"
+5. Fixes applied; `/skill-test static some-skill` re-run — now 7/7 checks pass
+6. Verdict is IMPROVED (5→7)
+
+**Assertions:**
+- [ ] Baseline score is established before any changes (5/7)
+- [ ] Both failing checks are diagnosed and addressed in the proposed fix
+- [ ] "May I write" is asked before applying the fix
+- [ ] Re-test confirms improvement (7/7)
+- [ ] Verdict is IMPROVED with before/after score shown
+
+---
+
+### Case 2: Fix Causes Regression — Score Comparison Shows Regression, REVERTED
+
+**Fixture:**
+- `.claude/skills/some-skill/SKILL.md` has 1 static failure (missing handoff)
+- Proposed fix inadvertently removes the verdict keywords section
+  (introducing a new failure)
+
+**Input:** `/skill-improve some-skill`
+
+**Expected behavior:**
+1. Baseline: 6/7 checks pass (1 failure: missing handoff)
+2. Skill proposes fix and asks "May I write improvements?"
+3. Fix is applied; re-test runs
+4. Re-test result: 5/7 (the handoff check still fails and the verdict keywords check now fails too)
+5. Skill detects regression: score went DOWN
+6. Skill asks user: "Fix caused a regression (6→5). May I revert the changes?"
+7. User confirms; changes are reverted; verdict is REVERTED
+
+**Assertions:**
+- [ ] Re-test score is compared to baseline before finalizing
+- [ ] Regression is detected when score decreases
+- [ ] User is asked to confirm revert (not automatic)
+- [ ] File is reverted on user confirmation
+- [ ] Verdict is REVERTED
+
+---
+
+### Case 3: Skill With Category Assignment — Baseline Captures Both Scores
+
+**Fixture:**
+- `.claude/skills/gate-check/SKILL.md` is a gate skill with 1 static failure
+  and 2 category (G-criteria) failures
+- `tests/skills/quality-rubric.md` has Gate Skills section
+
+**Input:** `/skill-improve gate-check`
+
+**Expected behavior:**
+1. Skill runs both static and category tests for the baseline:
+   - Static: 6/7 checks pass
+   - Category: 3/5 G-criteria pass
+2. Combined baseline: 9/12
+3. Skill diagnoses all 3 failures and proposes fixes
+4. "May I write improvements to `.claude/skills/gate-check/SKILL.md`?"
+5. Fixes applied; both test types re-run
+6. Re-test: static 7/7, category 5/5 = 12/12
+7. Verdict is IMPROVED (9→12)
+
+**Assertions:**
+- [ ] Both static and category scores are captured in the baseline
+- [ ] Combined score is used for comparison (not just one type)
+- [ ] All 3 failures are addressed in the proposed fix
+- [ ] Re-test confirms improvement in both score types
+- [ ] Verdict is IMPROVED with combined before/after
+
+---
+
+### Case 4: Skill Already Perfect — No Improvements Needed
+
+**Fixture:**
+- `.claude/skills/brainstorm/SKILL.md` has no static failures
+- Category score is also 5/5 (if applicable)
+
+**Input:** `/skill-improve brainstorm`
+
+**Expected behavior:**
+1. Skill runs `/skill-test static brainstorm` — 7/7 checks pass
+2. If category applies: 5/5 criteria pass
+3. Skill outputs: "No improvements needed — brainstorm is fully compliant"
+4. Skill exits without proposing any changes
+5. No "May I write" is asked; no files are modified
+6. Verdict is NO CHANGE
+
+**Assertions:**
+- [ ] Skill exits immediately after confirming 0 failures
+- [ ] "No improvements needed" message is shown
+- [ ] No changes are proposed
+- [ ] No "May I write" is asked
+- [ ] Verdict is NO CHANGE
+
+---
+
+### Case 5: Director Gate Check — No gate; skill-improve is a meta utility
+
+**Fixture:**
+- Skill with at least 1 static failure
+
+**Input:** `/skill-improve some-skill`
+
+**Expected behavior:**
+1. Skill runs the test-fix-retest loop
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is IMPROVED, NO CHANGE, or REVERTED — no gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Always establishes a baseline score before proposing any changes
+- [ ] Shows before/after score comparison in the output
+- [ ] Asks "May I write" before applying any fix
+- [ ] Detects regressions by comparing re-test score to baseline
+- [ ] Asks for user confirmation before reverting (not automatic)
+- [ ] Ends with IMPROVED, NO CHANGE, or REVERTED verdict
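
The baseline comparison described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `Score` and `compare_to_baseline` are hypothetical names, and treating an unchanged score as NO CHANGE is an assumption (the spec only defines NO CHANGE for a skill with zero failures).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    """Passed/total counts for one test type (static or category)."""
    passed: int
    total: int

    def __add__(self, other: "Score") -> "Score":
        # Combined score: sum the passed counts and the totals separately.
        return Score(self.passed + other.passed, self.total + other.total)

    def __str__(self) -> str:
        return f"{self.passed}/{self.total}"


def compare_to_baseline(baseline: Score, retest: Score) -> str:
    """Classify a re-test as IMPROVED, NO CHANGE, or REGRESSION.

    REGRESSION is only a signal. The spec requires user confirmation
    before the revert that leads to the REVERTED verdict.
    """
    if retest.passed > baseline.passed:
        return "IMPROVED"
    if retest.passed < baseline.passed:
        return "REGRESSION"
    return "NO CHANGE"  # assumption: equal scores count as no change


# Case 3 numbers: static 6/7 plus category 3/5, then 7/7 plus 5/5.
baseline = Score(6, 7) + Score(3, 5)
retest = Score(7, 7) + Score(5, 5)
print(f"{baseline} -> {retest}: {compare_to_baseline(baseline, retest)}")
```

Comparing the combined score rather than each type separately matches the Case 3 assertion that one combined figure is used for the before/after comparison.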
+
+---
+
+## Coverage Notes
+
+- The improvement loop is designed to run only one fix-retest cycle per
+  invocation; running multiple iterations requires re-invoking `/skill-improve`.
+- Behavioral compliance (spec-mode test results) is not included in the
+  improvement loop — only structural (static) and category scores are automated.
+- The case where the skill file cannot be read (permissions error or missing file)
+  is not tested; this would result in an error before the baseline is established.
--- a/Framework/skills/utility/skill-test.md
+++ b/Framework/skills/utility/skill-test.md
@@ -0,0 +1,188 @@
+# Skill Test Spec: /skill-test
+
+## Skill Summary
+
+`/skill-test` validates skill files for structural correctness, behavioral
+compliance, and category-rubric scoring. It operates in three modes:
+
+- **static**: Checks a single skill file for structural requirements
+  (frontmatter fields, phase headings, verdict keywords, "May I write" language,
+  next-step handoff) without needing a fixture. Produces a per-check PASS/FAIL
+  table.
+- **spec**: Reads a test spec file from `tests/skills/` and evaluates the skill
+  against each test case assertion, producing a case-by-case verdict.
+- **audit**: Produces a coverage table of all skills in `.claude/skills/` and
+  all agents in `.claude/agents/`, showing which have spec files and which do not.
+
+An additional **category** mode reads the quality rubric for a skill category
+(e.g., gate skills) and scores the skill against rubric criteria. The verdict
+system differs by mode.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdicts: COMPLIANT, NON-COMPLIANT, WARNINGS (static mode); PASS, FAIL, PARTIAL (spec mode); COMPLETE (audit mode)
+- [ ] Does NOT contain "May I write" language (skill is read-only in all modes)
+- [ ] Has a next-step handoff (e.g., `/skill-improve` to fix issues found)
+
+---
+
+## Director Gate Checks
+
+None. `/skill-test` is a meta-utility skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Static Mode — Well-formed skill, all 7 checks pass, COMPLIANT
+
+**Fixture:**
+- `.claude/skills/brainstorm/SKILL.md` exists and is well-formed:
+  - Has all required frontmatter fields
+  - Has ≥2 phase headings
+  - Has verdict keywords
+  - Has "May I write" language
+  - Has a next-step handoff
+  - Documents director gates
+  - Documents gate mode behavior (lean/solo skips)
+
+**Input:** `/skill-test static brainstorm`
+
+**Expected behavior:**
+1. Skill reads `.claude/skills/brainstorm/SKILL.md`
+2. Skill runs all 7 structural checks
+3. All 7 checks pass
+4. Skill outputs a PASS/FAIL table with all 7 checks marked PASS
+5. Verdict is COMPLIANT
+
+**Assertions:**
+- [ ] Exactly 7 structural checks are reported
+- [ ] All 7 are marked PASS
+- [ ] Verdict is COMPLIANT
+- [ ] No files are written
+
+---
+
+### Case 2: Static Mode — Skill Missing "May I Write" Despite Write Tool in allowed-tools
+
+**Fixture:**
+- `.claude/skills/some-skill/SKILL.md` has `Write` in `allowed-tools` frontmatter
+- The skill body has no "May I write" or "May I update" language
+
+**Input:** `/skill-test static some-skill`
+
+**Expected behavior:**
+1. Skill reads `some-skill/SKILL.md`
+2. Check 4 (collaborative write protocol) fails: `Write` in allowed-tools but no
+   "May I write" language found
+3. All other checks may pass
+4. Verdict is NON-COMPLIANT with Check 4 as the failing assertion
+5. Output lists Check 4 as FAIL with explanation
+
+**Assertions:**
+- [ ] Check 4 is marked FAIL
+- [ ] Explanation identifies the specific mismatch (Write tool without "May I write" language)
+- [ ] Verdict is NON-COMPLIANT
+- [ ] Other passing checks are shown (not only the failure)
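
As a rough sketch of the Check 4 logic in this case, assuming the frontmatter has already been parsed into a list of tool names: `check_write_protocol` and `APPROVAL_PATTERN` are hypothetical names, and case-insensitive matching is an assumption rather than something the spec states.

```python
import re

# Phrases treated as write-approval language. "May I update" is listed next
# to "May I write" in the Case 2 fixture. Ignoring case is an assumption.
APPROVAL_PATTERN = re.compile(r"\bMay I (write|update)\b", re.IGNORECASE)


def check_write_protocol(allowed_tools: list[str], body: str) -> tuple[bool, str]:
    """Return (passed, explanation) for the collaborative write check."""
    if "Write" not in allowed_tools:
        # Read-only skills are not required to ask for write approval.
        return True, "Write not in allowed-tools; approval language not required"
    if APPROVAL_PATTERN.search(body):
        return True, "Write in allowed-tools and approval language found"
    return False, 'Write in allowed-tools but no "May I write" language found'


passed, reason = check_write_protocol(["Read", "Write"], "Phase 1: scan files.")
print("PASS" if passed else "FAIL", "-", reason)
```

Returning the explanation alongside the result supports the assertion that the output names the specific mismatch, not just a bare FAIL.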
+
+---
+
+### Case 3: Spec Mode — gate-check Skill Evaluated Against Spec
+
+**Fixture:**
+- `tests/skills/gate-check.md` exists with 5 test cases
+- `.claude/skills/gate-check/SKILL.md` exists
+
+**Input:** `/skill-test spec gate-check`
+
+**Expected behavior:**
+1. Skill reads both the skill file and the spec file
+2. Skill evaluates each of the 5 test case assertions against the skill's behavior
+3. For each case: PASS if skill behavior matches spec assertions, FAIL if not
+4. Skill produces a case-by-case result table
+5. Overall verdict: PASS (all 5 pass), PARTIAL (some fail), or FAIL (majority fail)
+
+**Assertions:**
+- [ ] All 5 test cases from the spec are evaluated
+- [ ] Each case has an individual PASS/FAIL result
+- [ ] Overall verdict is PASS, PARTIAL, or FAIL based on case results
+- [ ] No files are written
+
+---
+
+### Case 4: Audit Mode — Coverage Table of All Skills and Agents
+
+**Fixture:**
+- `.claude/skills/` contains 72+ skill directories
+- `.claude/agents/` contains 49+ agent files
+- `tests/skills/` contains spec files for a subset of skills
+
+**Input:** `/skill-test audit`
+
+**Expected behavior:**
+1. Skill enumerates all skills in `.claude/skills/` and all agents in `.claude/agents/`
+2. Skill checks `tests/skills/` for a corresponding spec file for each
+3. Skill produces a coverage table:
+   - Each skill/agent listed
+   - "Has Spec" column: YES or NO
+   - Summary: "X of Y skills have specs; A of B agents have specs"
+4. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] All skill directories are enumerated (not just a sample)
+- [ ] "Has Spec" column is accurate for each entry
+- [ ] Summary counts are correct
+- [ ] Verdict is COMPLETE
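
The coverage table and summary line could be produced along these lines. This is a sketch under stated assumptions: names are passed in as plain lists instead of being read from `.claude/skills/` and `.claude/agents/`, one set of spec names covers both skills and agents, and the agent name `qa-lead` is purely illustrative.

```python
def coverage_rows(names: list[str], specs: set[str]) -> list[tuple[str, str]]:
    """Pair each name with YES or NO depending on whether a spec exists."""
    return [(name, "YES" if name in specs else "NO") for name in sorted(names)]


def audit_summary(skills: list[str], agents: list[str], specs: set[str]) -> str:
    """Build the summary line in the wording used by the expected behavior."""
    skill_hits = sum(1 for name in skills if name in specs)
    agent_hits = sum(1 for name in agents if name in specs)
    return (f"{skill_hits} of {len(skills)} skills have specs; "
            f"{agent_hits} of {len(agents)} agents have specs")


skills = ["smoke-check", "brainstorm", "gate-check"]
agents = ["qa-lead"]  # hypothetical agent name
specs = {"gate-check", "smoke-check"}

for name, has_spec in coverage_rows(skills + agents, specs):
    print(f"{name:<12} {has_spec}")
print(audit_summary(skills, agents, specs))
```

Deriving the counts from the enumerated lists, not from fixed numbers, keeps the summary correct as the number of skills and agents grows.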
+
+---
+
+### Case 5: Category Mode — Gate Skill Evaluated Against Quality Rubric
+
+**Fixture:**
+- `tests/skills/quality-rubric.md` exists with a "Gate Skills" section defining
+  criteria G1-G5 (e.g., G1: has mode guard, G2: has verdict table, etc.)
+- `.claude/skills/gate-check/SKILL.md` is a gate skill
+
+**Input:** `/skill-test category gate-check`
+
+**Expected behavior:**
+1. Skill reads `quality-rubric.md` and identifies the Gate Skills section
+2. Skill evaluates `gate-check/SKILL.md` against criteria G1-G5
+3. Each criterion is scored: PASS, PARTIAL, or FAIL
+4. Overall category score is computed (e.g., 4/5 criteria pass)
+5. Verdict is COMPLIANT (all pass), WARNINGS (some partial), or NON-COMPLIANT (failures)
+
+**Assertions:**
+- [ ] All gate criteria (G1-G5) from quality-rubric.md are evaluated
+- [ ] Each criterion has an individual score
+- [ ] Overall verdict reflects the score distribution
+- [ ] No files are written
+
+---
+
+## Protocol Compliance
+
+- [ ] Static mode checks exactly 7 structural assertions
+- [ ] Spec mode evaluates each test case from the spec file individually
+- [ ] Audit mode covers all skills AND agents (not just one category)
+- [ ] Category mode reads quality-rubric.md to get criteria (not hardcoded)
+- [ ] Does not write any files in any mode
+- [ ] Suggests `/skill-improve` as the next step when issues are found
+
+---
+
+## Coverage Notes
+
+- The skill-test skill is self-referential (it can test itself). The static
+  mode case for skill-test's own SKILL.md is not separately fixture-tested to
+  avoid infinite recursion in test design.
+- The specific 7 structural checks are defined in the skill body; only Check 4
+  (May I write) is individually tested here because it has the most nuanced logic.
+- Audit mode counts are approximate — the exact number of skills and agents will
+  change as the system grows; assertions use "all" rather than fixed counts.
--- a/Framework/skills/utility/smoke-check.md
+++ b/Framework/skills/utility/smoke-check.md
@@ -0,0 +1,193 @@
+# Skill Test Spec: /smoke-check
+
+## Skill Summary
+
+`/smoke-check` is the gate between implementation and QA hand-off. It detects the
+test environment, runs the automated test suite (via Bash), scans test coverage
+against sprint stories, and uses `AskUserQuestion` to batch-verify manual smoke
+checks with the developer. It writes a report to `production/qa/smoke-[date].md`
+after explicit user approval.
+
+Verdicts: PASS (tests pass, all smoke checks pass, no missing test evidence),
+PASS WITH WARNINGS (tests pass or NOT RUN, all critical checks pass, but advisory
+gaps exist such as missing test coverage), or FAIL (any automated test failure or
+any Batch 1/Batch 2 smoke check returns FAIL).
+
+No director gates apply. The skill does NOT invoke any director agents.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: PASS, PASS WITH WARNINGS, FAIL
+- [ ] Contains "May I write" collaborative protocol language before writing the report
+- [ ] Has a next-step handoff (e.g., `/bug-report` on FAIL, QA hand-off guidance on PASS)
+
+---
+
+## Director Gate Checks
+
+None. `/smoke-check` is a pre-QA utility skill. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Automated tests pass, manual items confirmed, PASS
+
+**Fixture:**
+- `tests/` directory exists with a GDUnit4 runner script
+- Engine detected as Godot from `technical-preferences.md`
+- `production/qa/qa-plan-sprint-005.md` exists
+- Automated test runner reports 12 tests, 12 passing, 0 failing
+- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
+- All sprint stories have matching test files (no MISSING coverage)
+
+**Input:** `/smoke-check`
+
+**Expected behavior:**
+1. Skill detects test directory and engine, notes QA plan found
+2. Runs `godot --headless --script tests/gdunit4_runner.gd` via Bash
+3. Parses output: 12/12 passing
+4. Scans test coverage — all stories COVERED or EXPECTED
+5. Uses `AskUserQuestion` for Batch 1 (core stability) and Batch 2 (sprint mechanics)
+6. Developer selects PASS for all items
+7. Report assembled: automated tests PASS, all smoke checks PASS, no MISSING coverage
+8. Asks "May I write this smoke check report to `production/qa/smoke-[date].md`?"
+9. Writes report after approval
+10. Delivers verdict: PASS
+
+**Assertions:**
+- [ ] Automated test runner is invoked via Bash
+- [ ] `AskUserQuestion` is used for manual smoke check batches
+- [ ] "May I write" is asked before writing the report file
+- [ ] Report is written to `production/qa/smoke-[date].md`
+- [ ] Verdict is PASS
+
+---
+
+### Case 2: Failure Path — Automated test fails, FAIL verdict
+
+**Fixture:**
+- `tests/` directory exists, engine is Godot
+- Automated test runner reports 10 tests run: 8 passing, 2 failing
+  - Failing tests: `test_health_clamp_at_zero`, `test_damage_calculation_negative`
+- QA plan exists
+
+**Input:** `/smoke-check`
+
+**Expected behavior:**
+1. Skill runs automated tests via Bash
+2. Parses output — 2 failures detected
+3. Records failing test names
+4. Proceeds through manual smoke check batches
+5. Report shows automated tests as FAIL with failing test names listed
+6. Asks to write report; writes after approval
+7. Delivers FAIL verdict with message: "The smoke check failed. Do not hand off to
+   QA until these failures are resolved." Lists failing tests and suggests fixing
+   then re-running `/smoke-check`
+
+**Assertions:**
+- [ ] Failing test names are listed in the report
+- [ ] Verdict is FAIL
+- [ ] Post-verdict message directs developer to fix failures before QA hand-off
+- [ ] `/smoke-check` re-run is suggested after fixing
+
+---
+
+### Case 3: Manual Confirmation — AskUserQuestion used, PASS WITH WARNINGS
+
+**Fixture:**
+- `tests/` directory exists, engine is Godot
+- Automated test runner reports all tests passing (8/8)
+- One Logic story has no matching test file (MISSING coverage)
+- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
+
+**Input:** `/smoke-check`
+
+**Expected behavior:**
+1. Automated tests PASS
+2. Coverage scan finds 1 MISSING entry for a Logic story
+3. `AskUserQuestion` is used for Batch 1 and Batch 2 — developer confirms all PASS
+4. Report shows: automated tests PASS, manual checks all PASS, 1 MISSING coverage entry
+5. Verdict is PASS WITH WARNINGS — build ready for QA, but MISSING entry must be
+   resolved before `/story-done` closes the affected story
+6. Asks to write report; writes after approval
+
+**Assertions:**
+- [ ] `AskUserQuestion` is used for manual smoke check batches (not inline text prompts)
+- [ ] MISSING test coverage entry appears in the report
+- [ ] Verdict is PASS WITH WARNINGS (not PASS, not FAIL)
+- [ ] Advisory note explains MISSING entry must be resolved before `/story-done`
+- [ ] Report file is written to `production/qa/smoke-[date].md`
+
+---
+
+### Case 4: No Test Directory — Skill stops with guidance
+
+**Fixture:**
+- `tests/` directory does not exist
+- Engine is configured as Godot
+
+**Input:** `/smoke-check`
+
+**Expected behavior:**
+1. Phase 1 checks for `tests/` directory — not found
+2. Skill outputs: "No test directory found at `tests/`. Run `/test-setup` to
+   scaffold the testing infrastructure, or create the directory manually if
+   tests live elsewhere."
+3. Skill stops — no automated tests run, no manual smoke checks, no report written
+
+**Assertions:**
+- [ ] Error message references the missing `tests/` directory
+- [ ] `/test-setup` is suggested as the remediation step
+- [ ] Skill stops after this message (no further phases run)
+- [ ] No report file is written
+
+---
+
+### Case 5: Director Gate Check — No gate; smoke-check is a QA pre-check utility
+
+**Fixture:**
+- Valid test setup, automated tests pass, manual smoke checks confirmed
+
+**Input:** `/smoke-check`
+
+**Expected behavior:**
+1. Skill runs all phases and produces a PASS or PASS WITH WARNINGS verdict
+2. No director agents are spawned at any point
+3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
+4. No `/gate-check` is invoked
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is PASS, PASS WITH WARNINGS, or FAIL — no gate verdict involved
+
+---
+
+## Protocol Compliance
+
+- [ ] Uses `AskUserQuestion` for all manual smoke check batches (Batch 1, Batch 2, Batch 3)
+- [ ] Runs automated tests via Bash before asking any manual questions
+- [ ] Asks "May I write" before creating the report file — never writes without approval
+- [ ] Verdict vocabulary is strictly PASS / PASS WITH WARNINGS / FAIL — no other verdicts
+- [ ] FAIL is triggered by automated test failures or Batch 1/Batch 2 FAIL responses
+- [ ] PASS WITH WARNINGS is triggered when MISSING test coverage exists but no critical failures
+- [ ] NOT RUN (engine binary unavailable) is recorded as a warning, not a FAIL
+- [ ] Does not invoke director gates at any point
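
The verdict rules listed above can be condensed into a small decision function. This is only a sketch: `smoke_verdict` and its parameters are hypothetical, and Batch 3 results are left out because the spec does not say how they affect the verdict.

```python
def smoke_verdict(automated: str, batch1: list[str], batch2: list[str],
                  missing_coverage: int) -> str:
    """Derive the smoke-check verdict.

    automated is "PASS", "FAIL", or "NOT RUN"; batch entries are "PASS" or
    "FAIL"; missing_coverage counts stories with no matching test file.
    """
    # Any automated failure or any Batch 1/Batch 2 failure is a hard FAIL.
    if automated == "FAIL" or "FAIL" in batch1 or "FAIL" in batch2:
        return "FAIL"
    # NOT RUN and MISSING coverage are advisory: they downgrade PASS only.
    if automated == "NOT RUN" or missing_coverage > 0:
        return "PASS WITH WARNINGS"
    return "PASS"


print(smoke_verdict("PASS", ["PASS", "PASS"], ["PASS"], 0))     # Case 1
print(smoke_verdict("FAIL", ["PASS", "PASS"], ["PASS"], 0))     # Case 2
print(smoke_verdict("PASS", ["PASS", "PASS"], ["PASS"], 1))     # Case 3
```

Checking the FAIL conditions first means a build with both a failing test and a coverage gap reports FAIL, never the softer warning verdict.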
+
+---
+
+## Coverage Notes
+
+- The `quick` argument (skips Phase 3 coverage scan and Batch 3) is not separately
+  fixture-tested; it follows the same pattern as Case 1 with a coverage-skip note in output.
+- The `--platform` argument adds platform-specific AskUserQuestion batches and a
+  per-platform verdict table; not separately tested here.
+- The case where the engine binary is not on PATH (NOT RUN) follows the PASS WITH
+  WARNINGS pattern and is covered by the protocol compliance assertions above.
--- a/Framework/skills/utility/soak-test.md
+++ b/Framework/skills/utility/soak-test.md
@@ -0,0 +1,178 @@
+# Skill Test Spec: /soak-test
+
+## Skill Summary
+
+`/soak-test` generates a structured soak test protocol — an extended runtime
+test plan designed to surface memory leaks, performance drift, and stability
+issues that only appear under sustained gameplay. The skill produces a document
+specifying the test duration, system under test, monitoring checkpoints (e.g.,
+memory sample every 30 minutes), pass/fail thresholds, and conditions for early
+termination.
+
+The skill asks "May I write to `production/qa/soak-[slug]-[date].md`?" before
+persisting. If a previous soak test for the same system exists, the skill offers
+to extend the duration or add new conditions. No director gates apply. The verdict
+is COMPLETE when the soak test protocol is written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing the protocol
+- [ ] Has a next-step handoff (e.g., `/regression-suite` or `/release-checklist`)
+
+---
+
+## Director Gate Checks
+
+None. `/soak-test` is a QA planning utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Online gameplay feature, 2-hour soak protocol
+
+**Fixture:**
+- User specifies: system = "online multiplayer lobby", duration = "2 hours"
+- `technical-preferences.md` has engine configured
+
+**Input:** `/soak-test online-lobby 2h`
+
+**Expected behavior:**
+1. Skill generates a 2-hour soak test protocol for the online lobby system
+2. Protocol includes: monitoring checkpoints every 30 minutes, metrics to track
+   (memory usage, connection count, packet loss), pass thresholds, early termination
+   conditions (crash or >20% memory growth)
+3. Networking-specific checks are included (session drop rate, reconnect handling)
+4. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
+5. File is written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Protocol duration matches the requested 2 hours
+- [ ] Monitoring checkpoints are at reasonable intervals (e.g., every 30 minutes)
+- [ ] Network-specific checks are included (not just generic memory checks)
+- [ ] "May I write" is asked with the correct file path
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: No Target Defined — Prompts for system, duration, and conditions
+
+**Fixture:**
+- No arguments provided
+- No soak test config in session state
+
+**Input:** `/soak-test`
+
+**Expected behavior:**
+1. Skill detects no target system or duration specified
+2. Skill asks: "What system or feature should be soak-tested?"
+3. After the user names the system, skill asks: "What duration? (e.g., 1h, 4h, 8h)"
+4. After the user gives a duration, skill asks for specific conditions, falling
+   back to defaults (normal gameplay loop, default player count) if none are given
+5. Skill generates protocol from collected inputs and asks "May I write"
+
+**Assertions:**
+- [ ] At least 2 follow-up questions are asked (system + duration)
+- [ ] Default conditions are applied when user doesn't specify custom ones
+- [ ] Protocol is not generated until system and duration are known
+- [ ] Verdict is COMPLETE after file is written
+
+---
+
+### Case 3: Previous Soak Test Exists — Offers to extend or add conditions
+
+**Fixture:**
+- `production/qa/soak-online-lobby-2026-03-15.md` exists with a 1-hour protocol
+- User wants to extend to 4 hours with new memory threshold conditions
+
+**Input:** `/soak-test online-lobby 4h`
+
+**Expected behavior:**
+1. Skill finds existing soak test for online-lobby
+2. Skill reports: "Previous soak test found: soak-online-lobby-2026-03-15.md (1h)"
+3. Skill presents options: create new protocol (4h standalone), or extend the
+   existing protocol to 4h and add new conditions
+4. User selects extend; existing checkpoints are preserved, new ones added
+5. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
+   (new file, not overwriting old one)
+
+**Assertions:**
+- [ ] Existing soak test is surfaced and referenced
+- [ ] User is offered extend vs. new options
+- [ ] New file is created (old file is not overwritten)
+- [ ] Extended protocol includes both old and new checkpoints
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: Mobile Target Platform — Memory-specific checkpoints added
+
+**Fixture:**
+- `technical-preferences.md` specifies target platform: Mobile
+- User requests soak test for "gameplay session" at 30 minutes
+
+**Input:** `/soak-test gameplay 30m`
+
+**Expected behavior:**
+1. Skill reads `technical-preferences.md` and detects mobile target platform
+2. Soak test protocol includes mobile-specific memory checkpoints:
+   - Check heap memory growth vs. device baseline
+   - Check texture memory at checkpoint intervals
+   - Add warning threshold at 300MB (mobile ceiling)
+3. Protocol also includes thermal/battery drain advisory notes
+4. Skill asks "May I write?" and writes on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Mobile platform is detected from technical-preferences.md
+- [ ] Memory checkpoints include mobile-appropriate thresholds (not desktop)
+- [ ] Thermal/battery notes are present in the protocol
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; soak-test is a planning utility
+
+**Fixture:**
+- Valid system and duration provided
+
+**Input:** `/soak-test combat 1h`
+
+**Expected behavior:**
+1. Skill generates and writes the soak test protocol
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Skill reaches COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Collects system, duration, and conditions before generating protocol
+- [ ] Includes monitoring checkpoints at regular intervals
+- [ ] Includes pass/fail thresholds and early termination conditions
+- [ ] Adapts checkpoints to target platform (mobile vs. desktop)
+- [ ] Asks "May I write" before creating the protocol file
+- [ ] Verdict is COMPLETE when file is written
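
To make the checkpoint and early-termination rules concrete, here is a minimal sketch using the 30-minute interval and the 20% memory growth limit from Case 1. Both function names are hypothetical, and always placing a final checkpoint at the full duration is an assumption.

```python
def checkpoint_minutes(duration_min: int, interval_min: int = 30) -> list[int]:
    """Checkpoint times in minutes from start, ending at the full duration."""
    points = list(range(interval_min, duration_min + 1, interval_min))
    # Assumption: always sample once at the very end of the run.
    if not points or points[-1] != duration_min:
        points.append(duration_min)
    return points


def should_terminate_early(baseline_mb: float, sample_mb: float,
                           max_growth: float = 0.20) -> bool:
    """True when memory has grown by more than max_growth over the baseline."""
    return (sample_mb - baseline_mb) / baseline_mb > max_growth


print(checkpoint_minutes(120))             # 2-hour soak from Case 1
print(should_terminate_early(500.0, 610.0))  # 22% growth over baseline
```

A platform-specific protocol, such as the mobile case, would pass different thresholds into the same checks instead of changing their structure.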
+
+---
+
+## Coverage Notes
+
+- Soak tests for specific engine subsystems (rendering pipeline, physics
+  simulation) follow the same protocol structure and are not separately tested.
+- The case where the user provides a duration shorter than the minimum useful
+  soak period (e.g., 5 minutes) is not tested; the skill would note this is
+  too short for meaningful results.
+- Automated execution of the soak test protocol is outside this skill's scope —
+  this skill generates the plan, not the runner.
--- a/Framework/skills/utility/start.md
+++ b/Framework/skills/utility/start.md
@@ -0,0 +1,173 @@
+# Skill Test Spec: /start
+
+## Skill Summary
+
+`/start` is the first-time onboarding skill for new projects. It guides the
+user through naming the project, choosing a game engine, and setting up the
+initial directory structure. It creates stub configuration files (CLAUDE.md,
+technical-preferences.md) and then routes to `/setup-engine` with the chosen
+engine as an argument. Each file or directory created is gated behind a
+"May I write" ask, following the collaborative protocol.
+
+The skill detects whether a project is already configured and whether a
+partial setup exists, offering to resume or restart as appropriate. It has
+no director gates — it is a utility setup skill that runs before any agent
+hierarchy exists.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keywords: COMPLETE, BLOCKED
+- [ ] Contains "May I write" collaborative protocol language for each config file
+- [ ] Has a next-step handoff at the end (routes to `/setup-engine`)
+
+---
+
+## Director Gate Checks
+
+None. `/start` is a utility setup skill. No director agents exist yet at the
+point this skill runs.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Fresh repo, no engine, full onboarding flow
+
+**Fixture:**
+- Empty repository: no CLAUDE.md overrides, no `production/stage.txt`, no
+  `technical-preferences.md` content beyond placeholders
+- No existing design docs or source code
+
+**Input:** `/start`
+
+**Expected behavior:**
+1. Skill detects no existing configuration and begins fresh onboarding
+2. Skill asks for project name
+3. Skill presents 3 engine options: Godot 4, Unity, Unreal Engine 5
+4. User selects an engine
+5. Skill asks "May I write the initial directory structure?"
+6. Skill creates all directories defined in `directory-structure.md`
+7. Skill asks "May I write CLAUDE.md stub?" and writes it on approval
+8. Skill routes to `/setup-engine [chosen-engine]` to complete technical config
+
+**Assertions:**
+- [ ] Project name is captured before any file is written
+- [ ] Exactly 3 engine options are presented
+- [ ] "May I write" is asked for each config file individually
+- [ ] No file is written without explicit user approval
+- [ ] Handoff to `/setup-engine` occurs at the end with the chosen engine argument
+- [ ] Verdict is COMPLETE after all files are written and handoff is issued
+
+---
+
+### Case 2: Already Configured — Detects existing config, offers to skip or reconfigure
+
+**Fixture:**
+- `technical-preferences.md` has engine already set (not placeholder)
+- `production/stage.txt` exists with `Concept`
+
+**Input:** `/start`
+
+**Expected behavior:**
+1. Skill reads `technical-preferences.md` and detects configured engine
+2. Skill reports: "This project is already configured with [engine]"
+3. Skill presents options: skip (exit), reconfigure engine, or reconfigure specific sections
+4. If user selects skip: skill exits cleanly with a summary of current config
+5. If user selects reconfigure: skill proceeds to the engine-selection step
+
+**Assertions:**
+- [ ] Skill does NOT overwrite existing config without user choosing reconfigure
+- [ ] Detected engine name is shown to the user in the status message
+- [ ] User is offered at least 2 options (skip or reconfigure)
+- [ ] Verdict is COMPLETE whether user skips or reconfigures
+
+---
+
+### Case 3: Engine Choice — User picks Godot 4, routes to /setup-engine godot
+
+**Fixture:**
+- Fresh repo — no existing configuration
+
+**Input:** `/start`
+
+**Expected behavior:**
+1. Skill presents engine options and user selects Godot 4
+2. Skill writes initial stubs (directory structure, CLAUDE.md) after approval
+3. Skill explicitly routes to `/setup-engine godot` as the next step
+4. Handoff message clearly names the engine and the next skill invocation
+
+**Assertions:**
+- [ ] Handoff command is `/setup-engine godot` (not generic `/setup-engine`)
+- [ ] Handoff is issued after all initial stubs are written, not before
+- [ ] Engine choice is echoed back to user before writing begins
+
+---
+
+### Case 4: Interrupted Setup — Partial config detected, offers resume or restart
+
+**Fixture:**
+- Directory structure exists (was created) but `technical-preferences.md` is
+  still all placeholders (engine was never chosen — setup was interrupted)
+- No `production/stage.txt`
+
+**Input:** `/start`
+
+**Expected behavior:**
+1. Skill detects partial state: directories exist but engine is unconfigured
+2. Skill reports: "A partial setup was detected — directories exist but engine is not configured"
+3. Skill offers: resume from engine selection, or restart from scratch
+4. If resume: skill skips directory creation, proceeds to engine choice
+5. If restart: skill asks "May I overwrite existing structure?" before proceeding
+
+**Assertions:**
+- [ ] Partial state is correctly identified (directories present, engine absent)
+- [ ] User is offered resume vs. restart choice — not forced into one path
+- [ ] Resume path skips re-creating directories (no redundant "May I write" for structure)
+- [ ] Restart path asks for permission to overwrite before touching any files
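
The state detection behind Cases 1, 2, and 4 might look something like the sketch below. The function name, state labels, and option strings are all hypothetical. Only the three situations themselves (fresh, already configured, partial) come from the spec.

```python
def setup_state(directories_exist: bool, engine_configured: bool) -> str:
    """Classify the project as "fresh", "partial", or "configured"."""
    if engine_configured:
        return "configured"
    if directories_exist:
        # Directories were created but the engine was never chosen.
        return "partial"
    return "fresh"


# Choices offered in each state, paraphrased from the expected behavior.
OPTIONS = {
    "fresh": ["begin onboarding"],
    "configured": ["skip", "reconfigure engine", "reconfigure specific sections"],
    "partial": ["resume from engine selection", "restart from scratch"],
}

state = setup_state(directories_exist=True, engine_configured=False)
print(state, OPTIONS[state])
```

Checking the engine first means a configured project is never misread as partial just because its directories exist.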
+
+---
+
+### Case 5: Director Gate Check — No gate; start is a utility setup skill
+
+**Fixture:**
+- Any fixture
+
+**Input:** `/start`
+
+**Expected behavior:**
+1. Skill completes full onboarding flow
+2. No director agents are spawned at any point
+3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in the output
+
+**Assertions:**
+- [ ] No director gate is invoked during the skill execution
+- [ ] No gate skip messages appear (gates are absent, not suppressed)
+- [ ] Skill reaches COMPLETE without any gate verdict
+
+---
+
+## Protocol Compliance
+
+- [ ] Asks for project name before any file is written
+- [ ] Presents engine options as a structured choice (not free text)
+- [ ] Asks "May I write" separately for directory structure and for CLAUDE.md stub
+- [ ] Ends with a handoff to `/setup-engine` with the engine name as argument
+- [ ] Verdict is clearly stated (COMPLETE or BLOCKED) at end of output
+
+---
+
+## Coverage Notes
+
+- The case where the user rejects all engine options and provides a custom
+  engine name is not tested — the skill is designed for the three supported
+  engines only.
+- Git initialization (if any) is not tested here; that is an infrastructure
+  concern outside the skill boundary.
+- Solo vs. lean mode behavior is not applicable — this skill has no gates and
+  mode selection is irrelevant.
--- a/Framework/skills/utility/test-helpers.md
+++ b/Framework/skills/utility/test-helpers.md
@@ -0,0 +1,175 @@
+# Skill Test Spec: /test-helpers
+
+## Skill Summary
+
+`/test-helpers` generates engine-specific test helper utilities for the project's
+test suite. Helpers include factory functions (for creating test entities with
+known state), fixture loaders, assertion helpers, and mock stubs for external
+dependencies. Generated helpers follow the naming and structure conventions in
+`coding-standards.md` and are written to `tests/helpers/`.
+
+Each helper file is gated behind a "May I write" ask. If a helper file already
+exists, the skill offers to extend it rather than replace. No director gates
+apply. The verdict is COMPLETE when helper files are written.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before writing helpers
+- [ ] Has a next-step handoff (e.g., write a test using the generated helper)
+
+---
+
+## Director Gate Checks
+
+None. `/test-helpers` is a scaffolding utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Player factory helper generated for Godot/GDScript
+
+**Fixture:**
+- `technical-preferences.md` has engine Godot 4, language GDScript
+- `tests/` directory exists (test-setup has been run)
+- `design/gdd/player.md` exists with defined player properties
+- No existing helpers in `tests/helpers/`
+
+**Input:** `/test-helpers player-factory`
+
+**Expected behavior:**
+1. Skill reads engine (Godot 4 / GDScript) and player GDD for property context
+2. Skill generates a deterministic `PlayerFactory` helper in GDScript:
+   - `create_player(health: int = 100, speed: float = 200.0)` function
+   - Returns a player node pre-configured to a known state
+   - Uses dependency injection (no singletons)
+3. Skill asks "May I write to `tests/helpers/player_factory.gd`?"
+4. File is written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Generated helper is in GDScript (not C# or Blueprint)
+- [ ] Factory function parameters use defaults matching GDD values
+- [ ] Helper uses dependency injection (no Autoload/singleton references)
+- [ ] Filename follows snake_case convention for GDScript
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: No Test Setup Exists — Redirects to /test-setup
+
+**Fixture:**
+- `tests/` directory does not exist
+
+**Input:** `/test-helpers player-factory`
+
+**Expected behavior:**
+1. Skill checks for `tests/` directory — not found
+2. Skill reports: "Test directory not found — test framework must be set up first"
+3. Skill suggests running `/test-setup` before generating helpers
+4. No helper file is created
+
+**Assertions:**
+- [ ] Error message identifies the missing tests/ directory
+- [ ] `/test-setup` is suggested as the prerequisite step
+- [ ] No write tool is called
+- [ ] Verdict is not COMPLETE (blocked state)
+
+---
+
+### Case 3: Helper Already Exists — Offers to extend rather than replace
+
+**Fixture:**
+- `tests/helpers/player_factory.gd` already exists with a `create_player()` function
+- User requests a new `create_enemy()` function be added to the factory
+
+**Input:** `/test-helpers enemy-factory`
+
+**Expected behavior:**
+1. Skill finds an existing `player_factory.gd` and checks if it's the right file
+   to extend (or if a separate `enemy_factory.gd` should be created)
+2. Skill presents options: add `create_enemy()` to existing factory or create
+   `tests/helpers/enemy_factory.gd`
+3. User selects extend; skill drafts the `create_enemy()` function
+4. Skill asks "May I extend `tests/helpers/player_factory.gd`?"
+5. Function is added on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Existing helper is detected and surfaced
+- [ ] User is given extend vs. new file choice
+- [ ] "May I extend" language is used (not "May I write" for replacement)
+- [ ] Existing `create_player()` is preserved in the extended file
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 4: System Has No GDD — Notes missing design context in helper
+
+**Fixture:**
+- `technical-preferences.md` has Godot 4 / GDScript
+- `tests/` exists
+- User requests a helper for the "inventory system" but no `design/gdd/inventory.md` exists
+
+**Input:** `/test-helpers inventory-factory`
+
+**Expected behavior:**
+1. Skill looks for `design/gdd/inventory.md` — not found
+2. Skill notes: "No GDD found for inventory — generating helper with placeholder defaults"
+3. Skill generates an `inventory_factory.gd` with generic placeholder values
+   (item_count = 0, max_capacity = 20) and a comment: "# TODO: align defaults
+   with inventory GDD when written"
+4. Skill asks "May I write to `tests/helpers/inventory_factory.gd`?"
+5. File is written on approval; verdict is COMPLETE with advisory note
+
+**Assertions:**
+- [ ] Skill proceeds without GDD (does not block)
+- [ ] Generated helper has placeholder defaults with TODO comment
+- [ ] Missing GDD is noted in the output (advisory warning)
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 5: Director Gate Check — No gate; test-helpers is a scaffolding utility
+
+**Fixture:**
+- Engine configured, tests/ exists
+
+**Input:** `/test-helpers player-factory`
+
+**Expected behavior:**
+1. Skill generates the helper and writes it after "May I write" approval
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads engine before generating any helper (helpers are engine-specific)
+- [ ] Reads GDD for default values when available
+- [ ] Notes missing GDD context rather than blocking
+- [ ] Detects existing helper files and offers extend rather than replace
+- [ ] Asks "May I write" (or "May I extend") before any file operation
+- [ ] Verdict is COMPLETE when helper is written
+
+---
+
+## Coverage Notes
+
+- Mock/stub helper generation (for dependencies like save systems or audio buses)
+  follows the same pattern as factory helpers and is not separately tested.
+- Unity C# helper generation (using NSubstitute or custom mocks) follows the
+  same logic as Case 1 with language-appropriate output.
+- The case where the requested helper type is not recognized is not tested;
+  the skill would ask the user to clarify the helper type.
--- a/Framework/skills/utility/test-setup.md
+++ b/Framework/skills/utility/test-setup.md
@@ -0,0 +1,173 @@
+# Skill Test Spec: /test-setup
+
+## Skill Summary
+
+`/test-setup` scaffolds the test framework for the project based on the
+configured engine. It creates the `tests/` directory structure defined in
+`coding-standards.md` (unit/, integration/, performance/, playtest/) and
+generates the appropriate test runner configuration for the detected engine:
+a GdUnit4 config for Godot, a Unity Test Runner asmdef for Unity, or a headless
+runner for Unreal Engine.
+
+Each file or directory created is gated behind a "May I write" ask. If the test
+framework already exists, the skill verifies the configuration rather than
+reinitializing. No director gates apply. The verdict is COMPLETE when the
+scaffold is in place.
+
+---
+
+## Static Assertions (Structural)
+
+Verified automatically by `/skill-test static` — no fixture needed.
+
+- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
+- [ ] Has ≥2 phase headings
+- [ ] Contains verdict keyword: COMPLETE
+- [ ] Contains "May I write" collaborative protocol language before creating files
+- [ ] Has a next-step handoff (e.g., `/test-helpers` to generate helper utilities)
+
+---
+
+## Director Gate Checks
+
+None. `/test-setup` is a scaffolding utility. No director gates apply.
+
+---
+
+## Test Cases
+
+### Case 1: Happy Path — Godot project, scaffolds GdUnit4 test structure
+
+**Fixture:**
+- `technical-preferences.md` has engine set to Godot 4, language GDScript
+- `tests/` directory does not exist yet
+
+**Input:** `/test-setup`
+
+**Expected behavior:**
+1. Skill reads engine from `technical-preferences.md` → Godot 4 + GDScript
+2. Skill drafts the test directory structure: tests/unit/, tests/integration/,
+   tests/performance/, tests/playtest/, and a GdUnit4 runner config file
+3. Skill asks "May I write the tests/ directory structure?"
+4. Directories and GdUnit4 runner script created on approval
+5. Skill confirms the runner script matches the CI command in coding-standards.md:
+   `godot --headless --script tests/gdunit4_runner.gd`
+6. Verdict is COMPLETE
+
+**Assertions:**
+- [ ] All 4 subdirectories (unit/, integration/, performance/, playtest/) are created
+- [ ] GdUnit4 runner config is generated
+- [ ] Runner script path matches coding-standards.md CI command
+- [ ] "May I write" is asked before creating any files
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 2: Unity Project — Scaffolds Unity Test Runner with asmdef
+
+**Fixture:**
+- `technical-preferences.md` has engine set to Unity, language C#
+- `tests/` directory does not exist
+
+**Input:** `/test-setup`
+
+**Expected behavior:**
+1. Skill reads engine → Unity + C#
+2. Skill drafts a `Tests/` directory following Unity conventions (capitalized)
+3. Skill drafts `Tests/Tests.asmdef` and `Tests/Editor/EditorTests.asmdef`
+4. Draft configures both EditMode and PlayMode test runner modes
+5. Skill asks "May I write the Tests/ directory structure?"
+6. Files are written on approval; verdict is COMPLETE
+
+**Assertions:**
+- [ ] Unity-specific `Tests/` structure is created (not the Godot structure)
+- [ ] `.asmdef` files are generated
+- [ ] EditMode and PlayMode runner config is present
+- [ ] Verdict is COMPLETE
+
+---
+
+### Case 3: Test Framework Already Exists — Verifies config, not re-initialized
+
+**Fixture:**
+- `tests/unit/`, `tests/integration/` exist
+- GdUnit4 runner script exists (Godot project)
+
+**Input:** `/test-setup`
+
+**Expected behavior:**
+1. Skill detects existing tests/ structure
+2. Skill reports: "Test framework already exists — verifying configuration"
+3. Skill checks: runner script path, directory completeness, CI command alignment
+4. If all checks pass: reports "Configuration verified — no changes needed"
+5. If checks fail (e.g., missing tests/performance/): reports specific gap and
+   asks "May I add the missing directories?"
+
+**Assertions:**
+- [ ] Skill does NOT reinitialize when framework exists
+- [ ] Verification checks are performed on existing structure
+- [ ] Only missing parts trigger a "May I write" ask
+- [ ] Verdict is COMPLETE whether everything was OK or gaps were fixed
+
+---
+
+### Case 4: No Engine Configured — Redirects to /setup-engine
+
+**Fixture:**
+- `technical-preferences.md` contains only placeholders (engine not set)
+
+**Input:** `/test-setup`
+
+**Expected behavior:**
+1. Skill reads `technical-preferences.md` and finds engine placeholder
+2. Skill reports: "Engine not configured — cannot scaffold engine-specific test framework"
+3. Skill suggests running `/setup-engine` first
+4. No directories or files are created
+
+**Assertions:**
+- [ ] Error message explicitly states engine is not configured
+- [ ] `/setup-engine` is suggested as the next step
+- [ ] No write tool is called
+- [ ] Verdict is not COMPLETE (blocked state)
+
+---
+
+### Case 5: Director Gate Check — No gate; test-setup is a scaffolding utility
+
+**Fixture:**
+- Engine configured, tests/ does not exist
+
+**Input:** `/test-setup`
+
+**Expected behavior:**
+1. Skill scaffolds the test framework and writes it after "May I write" approval
+2. No director agents are spawned
+3. No gate IDs appear in output
+
+**Assertions:**
+- [ ] No director gate is invoked
+- [ ] No gate skip messages appear
+- [ ] Verdict is COMPLETE without any gate check
+
+---
+
+## Protocol Compliance
+
+- [ ] Reads engine from `technical-preferences.md` before generating any scaffold
+- [ ] Generates engine-appropriate test runner config (not generic)
+- [ ] Creates all 4 subdirectories from coding-standards.md
+- [ ] Asks "May I write" before creating files
+- [ ] Detects existing framework and offers verification (not reinitialization)
+- [ ] Verdict is COMPLETE when scaffold is in place
+
+---
+
+## Coverage Notes
+
+- Unreal Engine test scaffolding (headless runner with `-nullrhi`) follows the
+  same pattern as Cases 1 and 2 and is not separately fixture-tested.
+- CI integration file generation (e.g., `.github/workflows/test.yml`) is
+  referenced but not assertion-tested here — it may be a separate skill concern.
+- The case where tests/ exists but is from a different engine (e.g., Unity tests
+  in a now-Godot project) is not tested; the skill would detect the mismatch
+  and offer to reconcile.