Add claude code game studios to the project

This commit is contained in:
panw
2026-05-15 14:52:29 +08:00
parent dff559462d
commit a16fe4bff7
415 changed files with 78609 additions and 0 deletions

@@ -0,0 +1,214 @@
# Skill Test Spec: /adopt
## Skill Summary
`/adopt` audits an existing project's artifacts — GDDs, ADRs, stories, infrastructure
files, and `technical-preferences.md` — for format compliance with the template's
skill pipeline. It classifies every gap by severity (BLOCKING / HIGH / MEDIUM / LOW),
composes a numbered, ordered migration plan, and writes it to `docs/adoption-plan-[date].md`
after explicit user approval via `AskUserQuestion`.
This skill is distinct from `/project-stage-detect` (which checks what exists).
`/adopt` checks whether what exists will actually work with the template's skills.
No director gates apply. The skill does NOT invoke any director agents.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains severity tier keywords: BLOCKING, HIGH, MEDIUM, LOW
- [ ] Contains "May I write" or `AskUserQuestion` language before writing the adoption plan
- [ ] Has a next-step handoff at the end (e.g., offering to fix the highest-priority gap immediately)
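The frontmatter-field assertion could be checked mechanically. A minimal sketch, assuming the frontmatter is a simple `---`-fenced block of `key: value` lines (the parsing approach is an assumption, not the template's actual checker):

```python
import re

# Required fields named in this spec's static assertions
REQUIRED_FIELDS = {"name", "description", "argument-hint", "user-invocable", "allowed-tools"}

def check_frontmatter(skill_md: str) -> set:
    """Return the set of required frontmatter fields missing from a skill file."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return set(REQUIRED_FIELDS)  # no frontmatter block at all
    # Collect top-level keys (lines like "key: value", not indented continuations)
    keys = {
        line.split(":", 1)[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line and not line.startswith((" ", "\t"))
    }
    return REQUIRED_FIELDS - keys

skill = "---\nname: adopt\ndescription: audit\nargument-hint: none\n---\n# Body"
print(sorted(check_frontmatter(skill)))  # → ['allowed-tools', 'user-invocable']
```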
---
## Director Gate Checks
None. `/adopt` is a brownfield audit utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — All GDDs compliant, no gaps, COMPLIANT
**Fixture:**
- `design/gdd/` contains 3 GDD files; each has all 8 required sections with content
- `docs/architecture/adr-0001.md` exists with `## Status`, `## Engine Compatibility`,
and all other required sections
- `production/stage.txt` exists
- `docs/architecture/tr-registry.yaml` and `docs/architecture/control-manifest.md` exist
- Engine configured in `technical-preferences.md`
**Input:** `/adopt`
**Expected behavior:**
1. Skill emits "Scanning project artifacts..." then reads all artifacts silently
2. Reports detected phase, GDD count, ADR count, story count
3. Phase 2 audit: all 3 GDDs have all 8 sections, Status field present and valid
4. ADR audit: all required sections present
5. Infrastructure audit: all critical files exist
6. Phase 3: zero BLOCKING, zero HIGH, zero MEDIUM, zero LOW gaps
7. Summary reports: "No blocking gaps — this project is template-compatible"
8. Uses `AskUserQuestion` to ask about writing the plan; user selects write
9. Adoption plan is written to `docs/adoption-plan-[date].md`
10. Phase 7 offers next action: no blocking gaps, offers options for next steps
**Assertions:**
- [ ] Skill reads silently before presenting any output
- [ ] "Scanning project artifacts..." appears before the silent read phase
- [ ] Gap counts show 0 BLOCKING, 0 HIGH, 0 MEDIUM (or only LOW)
- [ ] `AskUserQuestion` is used before writing the adoption plan
- [ ] Adoption plan file is written to `docs/adoption-plan-[date].md`
- [ ] Phase 7 offers a specific next action (not just a list)
---
### Case 2: Non-Compliant Documents — GDDs missing sections, NEEDS MIGRATION
**Fixture:**
- `design/gdd/` contains 2 GDD files:
- `combat.md` — missing `## Acceptance Criteria` and `## Formulas` sections
- `movement.md` — all 8 sections present
- One ADR (`adr-0001.md`) is missing `## Status` section
- `docs/architecture/tr-registry.yaml` does not exist
**Input:** `/adopt`
**Expected behavior:**
1. Skill scans all artifacts
2. Phase 2 audit finds:
- `combat.md`: 2 missing sections (Acceptance Criteria, Formulas)
- `adr-0001.md`: missing `## Status` — BLOCKING impact
- `tr-registry.yaml`: missing — HIGH impact
3. Phase 3 classifies:
   - BLOCKING: `adr-0001.md` missing `## Status` (without it, story-readiness checks silently pass)
- HIGH: `tr-registry.yaml` missing; `combat.md` missing Acceptance Criteria (can't generate stories)
- MEDIUM: `combat.md` missing Formulas
4. Phase 4 builds ordered migration plan:
- Step 1 (BLOCKING): Add `## Status` to `adr-0001.md` — command: `/architecture-decision retrofit`
- Step 2 (HIGH): Run `/architecture-review` to bootstrap tr-registry.yaml
- Step 3 (HIGH): Add Acceptance Criteria to `combat.md` — command: `/design-system retrofit`
- Step 4 (MEDIUM): Add Formulas to `combat.md`
5. Gap Preview shows BLOCKING items as bullets (actual file names), HIGH/MEDIUM as counts
6. `AskUserQuestion` asks to write the plan; writes after approval
7. Phase 7 offers to fix the highest-priority gap (ADR Status) immediately
**Assertions:**
- [ ] BLOCKING gaps are listed as explicit file-name bullets in the Gap Preview
- [ ] HIGH and MEDIUM shown as counts in Gap Preview
- [ ] Migration plan items are in BLOCKING-first order
- [ ] Each plan item includes the fix command or manual steps
- [ ] `AskUserQuestion` is used before writing
- [ ] Phase 7 offers to immediately retrofit the first BLOCKING item
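The BLOCKING-first plan ordering asserted above can be sketched as a stable severity sort; the gap records below are hypothetical, but the tier order and plan-item format come from this spec:

```python
SEVERITY_ORDER = {"BLOCKING": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def order_migration_plan(gaps):
    """Sort gaps BLOCKING-first; Python's sort is stable, so discovery
    order is preserved within a tier."""
    return sorted(gaps, key=lambda g: SEVERITY_ORDER[g["severity"]])

gaps = [
    {"file": "combat.md", "severity": "MEDIUM", "fix": "Add Formulas"},
    {"file": "adr-0001.md", "severity": "BLOCKING", "fix": "/architecture-decision retrofit"},
    {"file": "tr-registry.yaml", "severity": "HIGH", "fix": "/architecture-review"},
]
for step, gap in enumerate(order_migration_plan(gaps), start=1):
    print(f"Step {step} ({gap['severity']}): {gap['file']} — {gap['fix']}")
```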
---
### Case 3: Mixed State — Some docs compliant, some not, partial report
**Fixture:**
- 4 GDD files: 2 fully compliant, 2 with gaps (one missing Tuning Knobs, one missing Edge Cases)
- ADRs: 3 files — 2 compliant, 1 missing `## ADR Dependencies`
- Stories: 5 files — 3 have TR-ID references, 2 do not
- Infrastructure: all critical files present; `technical-preferences.md` fully configured
**Input:** `/adopt`
**Expected behavior:**
1. Skill audits all artifact types
2. Audit summary shows totals: "4 GDDs (2 fully compliant, 2 with gaps); 3 ADRs
(2 fully compliant, 1 with gaps); 5 stories (3 with TR-IDs, 2 without)"
3. Gap classification:
- No BLOCKING gaps
- HIGH: 1 ADR missing `## ADR Dependencies`
- MEDIUM: 2 GDDs with missing sections; 2 stories missing TR-IDs
- LOW: none
4. Migration plan lists HIGH gap first, then MEDIUM gaps in order
5. Note included: "Existing stories continue to work — do not regenerate stories
that are in progress or done"
6. `AskUserQuestion` to write plan; writes after approval
**Assertions:**
- [ ] Per-artifact compliance tallies are shown (N compliant, M with gaps)
- [ ] Existing story compatibility note is included in the plan
- [ ] No BLOCKING gaps results in no BLOCKING section in migration plan
- [ ] HIGH gap precedes MEDIUM gaps in plan ordering
- [ ] `AskUserQuestion` is used before writing
---
### Case 4: No Artifacts Found — Fresh project, guidance to run /start
**Fixture:**
- Repository has no files in `design/gdd/`, `docs/architecture/`, `production/epics/`
- `production/stage.txt` does not exist
- `src/` directory does not exist or has fewer than 10 files
- No game-concept.md, no systems-index.md
**Input:** `/adopt`
**Expected behavior:**
1. Phase 1 existence check finds no artifacts
2. Skill infers "Fresh" — no brownfield work to migrate
3. Uses `AskUserQuestion`:
- "This looks like a fresh project — no existing artifacts found. `/adopt` is for
projects with work to migrate. What would you like to do?"
- Options: "Run `/start`", "My artifacts are in a non-standard location", "Cancel"
4. Skill stops — does not proceed to audit regardless of user selection
**Assertions:**
- [ ] `AskUserQuestion` is used (not a plain text message) when no artifacts are found
- [ ] `/start` is presented as a named option
- [ ] Skill stops after the question — no audit phases run
- [ ] No adoption plan file is written
---
### Case 5: Director Gate Check — No gate; adopt is a utility audit skill
**Fixture:**
- Project with a mix of compliant and non-compliant GDDs
**Input:** `/adopt`
**Expected behavior:**
1. Skill completes full audit and produces migration plan
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked during the skill run
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches plan-writing or cancellation without any gate verdict
---
## Protocol Compliance
- [ ] Emits "Scanning project artifacts..." before silent read phase
- [ ] Reads all artifacts silently before presenting any results
- [ ] Shows Adoption Audit Summary and Gap Preview before asking to write
- [ ] Uses `AskUserQuestion` before writing the adoption plan file
- [ ] Adoption plan written to `docs/adoption-plan-[date].md` — not to any other path
- [ ] Migration plan items ordered: BLOCKING first, HIGH second, MEDIUM third, LOW last
- [ ] Phase 7 always offers a single specific next action (not a generic list)
- [ ] Never regenerates existing artifacts — only fills gaps in what exists
- [ ] Does not invoke director gates at any point
---
## Coverage Notes
- The `gdds`, `adrs`, `stories`, and `infra` argument modes narrow the audit scope;
each follows the same pattern as the full audit but limited to that artifact type.
Not separately fixture-tested here.
- The systems-index.md parenthetical status value check (BLOCKING) is a special case
that triggers an immediate fix offer before writing the plan; not separately tested.
- The review-mode.txt prompt (Phase 6b) runs after plan writing if `production/review-mode.txt`
does not exist; not separately tested here.

@@ -0,0 +1,179 @@
# Skill Test Spec: /asset-spec
## Skill Summary
`/asset-spec` generates per-asset visual specification documents from design
requirements. It reads the relevant GDD, art bible, and design system to produce
a structured asset spec sheet that defines: dimensions, animation states (if
applicable), color palette reference, style notes, technical constraints
(format, file size budget), and deliverable checklist.
Spec sheets are written to `assets/specs/[asset-name]-spec.md` after a "May I write"
ask. If a spec already exists, the skill offers to update it. When multiple assets
are requested in a single invocation, a "May I write" ask is made per asset. No
director gates apply. The verdict is COMPLETE when all requested specs are written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language (per asset)
- [ ] Has a next-step handoff (e.g., assign to an artist, or `/asset-audit` later)
---
## Director Gate Checks
None. `/asset-spec` is a design documentation utility. Technical artists may
review specs separately but this is not a gate within this skill.
---
## Test Cases
### Case 1: Happy Path — Enemy sprite spec with full GDD and art bible
**Fixture:**
- `design/gdd/enemies.md` exists with enemy variants defined
- `design/art-bible.md` exists with color palette and style notes
- No existing asset spec for "goblin-enemy"
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill reads enemies GDD and art bible
2. Skill generates a spec for the goblin enemy sprite:
- Dimensions: inferred from engine defaults or explicitly from GDD
- Animation states: idle, walk, attack, hurt, death
- Color palette reference: links to art-bible palette section
- Style notes: from art bible character design rules
- Technical constraints: format (PNG), size budget
- Deliverable checklist
3. Skill asks "May I write to `assets/specs/goblin-enemy-spec.md`?"
4. File written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 6 spec components are present (dimensions, animations, palette, style, tech, checklist)
- [ ] Color palette reference links to art bible (not duplicated)
- [ ] Animation states are drawn from GDD (not invented)
- [ ] "May I write" is asked with the correct path
- [ ] Verdict is COMPLETE
---
### Case 2: No Art Bible Found — Spec with Placeholder Style Notes, Dependency Flagged
**Fixture:**
- `design/gdd/player.md` exists
- `design/art-bible.md` does NOT exist
**Input:** `/asset-spec player-sprite`
**Expected behavior:**
1. Skill reads player GDD but cannot find the art bible
2. Skill generates spec with placeholder style notes: "DEPENDENCY GAP: art bible
not found — style notes are placeholders"
3. Color palette section uses: "TBD — see art bible when created"
4. Skill asks "May I write to `assets/specs/player-sprite-spec.md`?"
5. File written with placeholders and dependency flag; verdict is COMPLETE with advisory
**Assertions:**
- [ ] DEPENDENCY GAP is flagged for the missing art bible
- [ ] Spec is still generated (not blocked)
- [ ] Style notes contain placeholder markers, not invented styles
- [ ] Verdict is COMPLETE with advisory note
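The six spec components and the Case 2 placeholder behavior could be assembled roughly as follows; field names and fallback values beyond the strings quoted in this spec are assumptions:

```python
def draft_asset_spec(asset_name, gdd, art_bible=None):
    """Assemble the 6 spec components; a missing art bible yields
    placeholder style notes and a DEPENDENCY GAP flag instead of blocking."""
    has_bible = art_bible is not None
    animations = gdd.get("animations", [])
    return {
        "asset": asset_name,
        "dimensions": gdd.get("dimensions", "engine default"),
        "animation_states": animations,
        "palette_ref": art_bible["palette_section"] if has_bible
            else "TBD — see art bible when created",
        "style_notes": art_bible["style_rules"] if has_bible
            else "DEPENDENCY GAP: art bible not found — style notes are placeholders",
        "tech_constraints": {"format": "PNG", "size_budget": gdd.get("size_budget", "unspecified")},
        "checklist": [f"{state} frame set delivered" for state in animations],
    }
```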
---
### Case 3: Asset Spec Already Exists — Offers to Update
**Fixture:**
- `assets/specs/goblin-enemy-spec.md` already exists
- GDD has been updated since the spec was written (new attack animation added)
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill detects existing spec file
2. Skill reports: "Asset spec already exists for goblin-enemy — checking for updates"
3. Skill diffs GDD against existing spec and identifies: new "charge-attack" animation
state added in GDD but not in spec
4. Skill presents the diff: "1 new animation state found — offering to update spec"
5. Skill asks "May I update `assets/specs/goblin-enemy-spec.md`?" (not overwrite)
6. Spec is updated; verdict is COMPLETE
**Assertions:**
- [ ] Existing spec is detected and "update" path is offered
- [ ] Diff between GDD and existing spec is shown
- [ ] "May I update" language is used (not "May I write")
- [ ] Existing spec content is preserved; only the diff is applied
- [ ] Verdict is COMPLETE
---
### Case 4: Multiple Assets Requested — May-I-Write Per Asset
**Fixture:**
- GDD and art bible exist
- User requests specs for 3 assets: goblin-enemy, orc-enemy, treasure-chest
**Input:** `/asset-spec goblin-enemy orc-enemy treasure-chest`
**Expected behavior:**
1. Skill generates all 3 specs in sequence
2. For each asset, skill shows the draft and asks "May I write to
`assets/specs/[name]-spec.md`?" individually
3. User can approve all 3 or skip individual assets
4. All approved specs are written; verdict is COMPLETE
**Assertions:**
- [ ] "May I write" is asked 3 times (once per asset), not once for all
- [ ] User can decline one asset without blocking the others
- [ ] All 3 spec files are written for approved assets
- [ ] Verdict is COMPLETE when all approved specs are written
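The per-asset confirmation loop above can be sketched as follows; `ask_user` and `write_file` are hypothetical stand-ins for the real `AskUserQuestion` and write tools:

```python
def write_specs(asset_names, ask_user, write_file):
    """Ask 'May I write' once per asset; a decline skips only that asset."""
    written = []
    for name in asset_names:
        path = f"assets/specs/{name}-spec.md"
        if ask_user(f"May I write to `{path}`?"):
            write_file(path, f"# Asset Spec: {name}\n")
            written.append(path)
    return written
```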
---
### Case 5: Director Gate Check — No gate; asset-spec is a design utility
**Fixture:**
- GDD and art bible exist
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill generates and writes the asset spec
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads GDD, art bible, and design system before generating spec
- [ ] Includes all 6 spec components (dimensions, animations, palette, style, tech, checklist)
- [ ] Flags missing dependencies (art bible, GDD) with DEPENDENCY GAP notes
- [ ] Asks "May I write" (or "May I update") per asset
- [ ] Handles multiple assets with individual write confirmations
- [ ] Verdict is COMPLETE when all approved specs are written
---
## Coverage Notes
- Audio asset specs (sound effects, music) follow the same structure with
different fields (duration, sample rate, looping) and are not separately tested.
- UI asset specs (icons, button states) follow the same flow with interaction
state requirements aligned to the UX spec.
- The case where GDD is also missing (neither GDD nor art bible exists) is not
separately tested; spec would be generated with both dependency gaps flagged.

@@ -0,0 +1,189 @@
# Skill Test Spec: /brainstorm
## Skill Summary
`/brainstorm` facilitates guided game concept ideation. It presents 2-4 concept
options with pros/cons, lets the user choose and refine a concept, and produces
a structured `design/gdd/game-concept.md` document. The skill is collaborative —
it asks questions before proposing options and iterates until the user approves
a concept direction.
In `full` review mode, four director gates spawn in parallel after the concept
is drafted: CD-PILLARS (creative-director), AD-CONCEPT-VISUAL (art-director),
TD-FEASIBILITY (technical-director), and PR-SCOPE (producer). In `lean` mode,
all 4 inline gates are skipped (lean mode only runs PHASE-GATEs, and brainstorm
has none). In `solo` mode, all gates are skipped. The skill asks "May I write"
before writing `design/gdd/game-concept.md`.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: APPROVED, REJECTED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language (for game-concept.md)
- [ ] Has a next-step handoff at the end (`/map-systems`)
- [ ] Documents 4 director gates in full mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
- [ ] Documents that all 4 gates are skipped in lean and solo modes
---
## Director Gate Checks
In `full` mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, and PR-SCOPE
spawn in parallel after the concept draft is approved by the user.
In `lean` mode: all 4 inline gates are skipped (brainstorm has no PHASE-GATEs,
so lean mode skips everything). Output notes all 4 as: "[GATE-ID] skipped — lean mode".
In `solo` mode: all 4 gates are skipped. Output notes all 4 as: "[GATE-ID] skipped — solo mode".
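The mode-dependent gate behavior above can be sketched as follows; `spawn_gate` is a stand-in for spawning a director agent, and where the real skill spawns the four gates in parallel, this sketch calls them sequentially for simplicity:

```python
# Gate IDs and skip-note format are quoted from this spec
GATES = ["CD-PILLARS", "AD-CONCEPT-VISUAL", "TD-FEASIBILITY", "PR-SCOPE"]

def run_concept_gates(review_mode, spawn_gate):
    """Return (verdicts, skip_notes). In lean and solo modes all four
    inline gates are skipped and each is noted by name."""
    if review_mode in ("lean", "solo"):
        return {}, [f"{gate} skipped — {review_mode} mode" for gate in GATES]
    # full mode: collect a verdict (APPROVED/REJECTED/CONCERNS) per gate
    verdicts = {gate: spawn_gate(gate) for gate in GATES}
    return verdicts, []
```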
---
## Test Cases
### Case 1: Happy Path — Full mode, 3 concepts, user picks one, all 4 directors approve
**Fixture:**
- No existing `design/gdd/game-concept.md`
- `production/session-state/review-mode.txt` contains `full`
**Input:** `/brainstorm`
**Expected behavior:**
1. Skill asks the user questions about genre, scope, and target feeling
2. Skill presents 3 concept options with pros/cons each
3. User selects one concept
4. Skill elaborates the chosen concept into a structured draft
5. All 4 director gates spawn in parallel: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
6. All 4 return APPROVED
7. Skill asks "May I write `design/gdd/game-concept.md`?"
8. Concept written after approval
**Assertions:**
- [ ] Exactly 3 concept options are presented (not 1, not 5+)
- [ ] All 4 director gates spawn in parallel (not sequentially)
- [ ] All 4 gates complete before the "May I write" ask
- [ ] "May I write `design/gdd/game-concept.md`?" is asked before writing
- [ ] Concept file is NOT written without user approval
- [ ] Next-step handoff to `/map-systems` is present
---
### Case 2: Failure Path — CD-PILLARS returns REJECT
**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- CD-PILLARS gate returns REJECT: "The concept has no identifiable creative pillar"
**Input:** `/brainstorm`
**Expected behavior:**
1. CD-PILLARS gate returns REJECT with specific feedback
2. Skill surfaces the rejection to the user
3. Concept is NOT written to file
4. User is asked: rethink the concept direction, or override the rejection
5. If rethinking: skill returns to the concept options phase
**Assertions:**
- [ ] Concept is NOT written when CD-PILLARS returns REJECT
- [ ] Rejection feedback is shown to the user verbatim
- [ ] User is given the option to rethink or override
- [ ] Skill returns to concept ideation phase if user chooses to rethink
---
### Case 3: Lean Mode — All 4 gates skipped; concept written after user confirms
**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `lean`
**Input:** `/brainstorm`
**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept is elaborated into a structured draft
3. All 4 director gates are skipped — each noted: "[GATE-ID] skipped — lean mode"
4. Skill asks user to confirm the concept is ready to write
5. "May I write `design/gdd/game-concept.md`?" asked after confirmation
6. Concept written after approval
**Assertions:**
- [ ] All 4 gate skip notes appear: "CD-PILLARS skipped — lean mode", "AD-CONCEPT-VISUAL skipped — lean mode", "TD-FEASIBILITY skipped — lean mode", "PR-SCOPE skipped — lean mode"
- [ ] Concept is written after user confirmation only (no director approval needed in lean)
- [ ] "May I write" is still asked before writing
---
### Case 4: Solo Mode — All gates skipped; concept written with only user approval
**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `solo`
**Input:** `/brainstorm`
**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept draft is shown to user
3. All 4 director gates are skipped — each noted with "solo mode"
4. "May I write `design/gdd/game-concept.md`?" asked
5. Concept written after user approval
**Assertions:**
- [ ] All 4 skip notes appear with "solo mode" label
- [ ] No director agents are spawned
- [ ] Concept is written with only user approval
- [ ] Behavior is otherwise equivalent to lean mode for this skill
---
### Case 5: Director Gate — PR-SCOPE returns CONCERNS (scope too large)
**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- PR-SCOPE gate returns CONCERNS: "The concept scope would require 18+ months for a solo developer"
**Input:** `/brainstorm`
**Expected behavior:**
1. PR-SCOPE gate returns CONCERNS with specific scope feedback
2. Skill surfaces the scope concerns to the user
3. Scope concerns are documented in the concept draft before writing
4. User is asked: reduce scope, accept concerns and document them, or rethink
5. If concerns are accepted: concept is written with a "Scope Risk" note embedded
**Assertions:**
- [ ] PR-SCOPE concerns are shown to the user before the "May I write" ask
- [ ] Skill does NOT write concept without surfacing scope concerns
- [ ] If user accepts: scope concerns are documented in the concept file
- [ ] Skill does NOT auto-reject a concept due to PR-SCOPE CONCERNS (user decides)
---
## Protocol Compliance
- [ ] Presents 2-4 concept options with pros/cons before user commits
- [ ] User confirms concept direction before director gates are invoked
- [ ] All 4 director gates spawn in parallel in full mode
- [ ] All 4 gates skipped in lean AND solo mode — each noted by name
- [ ] "May I write `design/gdd/game-concept.md`?" asked before writing
- [ ] Ends with next-step handoff: `/map-systems`
---
## Coverage Notes
- AD-CONCEPT-VISUAL gate (art director feasibility) is grouped with the other
3 gates in the parallel spawn — not independently fixture-tested.
- The iterative concept refinement loop (user rejects all options, skill
generates new ones) is not fixture-tested — it follows the same pattern as
the option selection phase.
- The game-concept.md document structure (required sections) is defined in the
skill body and not re-enumerated in test assertions.

@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-report
## Skill Summary
`/bug-report` creates a structured bug report document from a user description.
It produces a report with the following required fields: Title, Repro Steps,
Expected Behavior, Actual Behavior, Severity (CRITICAL/HIGH/MEDIUM/LOW), Affected
System(s), and Build/Version. If the user's initial description is missing any
required field, the skill asks follow-up questions to fill the gaps before
producing the draft.
The skill checks for possibly duplicate reports (by comparing to existing files
in `production/bugs/`) and offers to link rather than create a new report. Each
report is written to `production/bugs/bug-[date]-[slug].md` after a "May I write"
ask. No director gates are used — bug reporting is an operational utility.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-triage` to reprioritize, `/hotfix` for critical)
---
## Director Gate Checks
None. `/bug-report` is an operational documentation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — User describes a crash, full report produced
**Fixture:**
- `production/bugs/` directory exists and is empty
- No similar existing reports
**Input:** `/bug-report` (user describes: "Game crashes when player enters the boss arena")
**Expected behavior:**
1. Skill extracts: Title = "Game crashes when entering boss arena"
2. Skill recognizes crash reports as CRITICAL severity
3. Skill confirms repro steps, expected (no crash), actual (crash), affected system
(arena/boss), and build version with the user
4. Skill drafts the full structured report
5. Skill asks "May I write to `production/bugs/bug-2026-04-06-game-crashes-boss-arena.md`?"
6. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 7 required fields are present in the report
- [ ] Severity is CRITICAL for a crash report
- [ ] Filename follows the `bug-[date]-[slug].md` convention
- [ ] "May I write" is asked with the full file path
- [ ] Verdict is COMPLETE
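The `bug-[date]-[slug].md` convention asserted above could be sketched like this; the slug rule (lowercase, runs of non-alphanumerics collapsed to hyphens) is an assumption, since the spec treats slug generation as an implementation detail:

```python
import re
from datetime import date

def bug_report_path(title, on=None):
    """Build production/bugs/bug-[date]-[slug].md from a bug title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    day = (on or date.today()).isoformat()  # ISO date, e.g. 2026-04-06
    return f"production/bugs/bug-{day}-{slug}.md"

print(bug_report_path("Game crashes when entering boss arena", on=date(2026, 4, 6)))
# → production/bugs/bug-2026-04-06-game-crashes-when-entering-boss-arena.md
```

The real skill may shorten the slug further (the example filename in Case 1 drops filler words); that truncation is not modeled here.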
---
### Case 2: Minimal Input — Skill asks follow-up questions for missing fields
**Fixture:**
- User provides: "Sometimes the audio cuts out"
- No existing reports
**Input:** `/bug-report`
**Expected behavior:**
1. Skill identifies missing required fields: repro steps, expected vs. actual,
severity, affected system, build
2. Skill asks targeted follow-up questions for each missing field (one at a time
or in a structured prompt)
3. User provides answers
4. Skill compiles complete report from answers
5. Skill asks "May I write?" and writes on approval
**Assertions:**
- [ ] At least 3 follow-up questions are asked to fill missing fields
- [ ] Each required field is filled before the report is finalized
- [ ] Report is not written until all required fields are present
- [ ] Verdict is COMPLETE after all fields are filled and file is written
---
### Case 3: Possible Duplicate — Offers to link rather than create new
**Fixture:**
- `production/bugs/bug-2026-03-20-audio-cut-out.md` already exists with
similar title and MEDIUM severity
**Input:** `/bug-report` (user describes: "Audio randomly stops working")
**Expected behavior:**
1. Skill scans existing reports and finds the similar audio bug
2. Skill reports: "A similar bug report exists: bug-2026-03-20-audio-cut-out.md"
3. Skill presents options: link as duplicate (add note to existing), create new anyway
4. If user chooses link: skill adds a cross-reference note to the existing file
(asks "May I update the existing report?")
5. If user chooses create new: normal report creation proceeds
**Assertions:**
- [ ] Existing similar report is surfaced before creating a new one
- [ ] User is given the choice (not forced to link or create)
- [ ] If linking: "May I update" is asked before modifying the existing file
- [ ] Verdict is COMPLETE in either path
---
### Case 4: Multi-System Bug — Report created with multiple system tags
**Fixture:**
- No existing reports
**Input:** `/bug-report` (user describes: "After finishing a level, the save system
freezes and the UI doesn't show the completion screen")
**Expected behavior:**
1. Skill identifies 2 affected systems from the description: Save System and UI
2. Report is drafted with both systems listed under Affected System(s)
3. Severity is assessed (likely HIGH — data loss risk from save freeze)
4. Skill asks "May I write" with the appropriate filename
5. Report is written with both systems tagged; verdict is COMPLETE
**Assertions:**
- [ ] Both affected systems are listed in the report
- [ ] Single report is created (not one per system)
- [ ] Severity reflects the most impactful component (save freeze → HIGH or CRITICAL)
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; bug reporting is operational
**Fixture:**
- Any bug description provided
**Input:** `/bug-report`
**Expected behavior:**
1. Skill creates and writes the bug report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Collects all 7 required fields before drafting the report
- [ ] Asks follow-up questions for any missing required fields
- [ ] Checks for similar existing reports before creating a new one
- [ ] Asks "May I write to `production/bugs/bug-[date]-[slug].md`?" before writing
- [ ] Verdict is COMPLETE when the report file is written
---
## Coverage Notes
- The case where the user provides a severity that seems too low for the
described impact (e.g., LOW for a crash) is not tested; the skill may suggest
a higher severity but ultimately respects user input.
- Build/version field is required but may be "unknown" if the user doesn't know —
this is accepted as a valid value and not tested separately.
- Report slug generation (sanitizing the title into a filename) is an
implementation detail not assertion-tested here.

@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-triage
## Skill Summary
`/bug-triage` reads all open bug reports in `production/bugs/` and produces a
prioritized triage table sorted by severity (CRITICAL → HIGH → MEDIUM → LOW).
It runs on the Haiku model (read-only, formatting/sorting task) and produces no
file writes — the triage output is conversational. The skill flags bugs missing
reproduction steps and identifies possible duplicates by comparing titles and
affected systems.
The verdict is always TRIAGED — the skill is advisory and informational. No
director gates apply. The output is intended to help a producer or QA lead
prioritize which bugs to address next.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: TRIAGED
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (e.g., `/bug-report` to create new reports, `/hotfix` for critical bugs)
---
## Director Gate Checks
None. `/bug-triage` is a read-only advisory skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — 5 bugs of varying severity, sorted table produced
**Fixture:**
- `production/bugs/` contains 5 bug report files:
- bug-2026-03-10-audio-crash.md (CRITICAL)
- bug-2026-03-12-score-overflow.md (HIGH)
- bug-2026-03-14-ui-overlap.md (MEDIUM)
- bug-2026-03-15-typo-tutorial.md (LOW)
- bug-2026-03-16-vfx-flicker.md (HIGH)
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads all 5 bug report files
2. Skill extracts severity, title, system, and repro status from each
3. Skill produces a triage table sorted: CRITICAL first, then HIGH, MEDIUM, LOW
4. Within the same severity, bugs are ordered by date (oldest first)
5. Verdict is TRIAGED
**Assertions:**
- [ ] Triage table has exactly 5 rows
- [ ] CRITICAL bug appears before both HIGH bugs
- [ ] HIGH bugs appear before MEDIUM and LOW bugs
- [ ] Verdict is TRIAGED
- [ ] No files are written
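The ordering asserted above can be sketched as a two-key sort, with the date tiebreak from step 4 (oldest first within a tier):

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def triage_order(bugs):
    """Sort CRITICAL→HIGH→MEDIUM→LOW; within a tier, oldest report first.
    ISO dates compare correctly as strings, so no date parsing is needed."""
    return sorted(bugs, key=lambda b: (SEVERITY_RANK[b["severity"]], b["date"]))

bugs = [
    {"file": "bug-2026-03-16-vfx-flicker.md", "severity": "HIGH", "date": "2026-03-16"},
    {"file": "bug-2026-03-10-audio-crash.md", "severity": "CRITICAL", "date": "2026-03-10"},
    {"file": "bug-2026-03-12-score-overflow.md", "severity": "HIGH", "date": "2026-03-12"},
]
for row in triage_order(bugs):
    print(f"{row['severity']:<8} {row['file']}")
```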
---
### Case 2: No Bug Reports Found — Guidance to run /bug-report
**Fixture:**
- `production/bugs/` directory exists but is empty (or does not exist)
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill scans `production/bugs/` and finds no reports
2. Skill outputs: "No open bug reports found in production/bugs/"
3. Skill suggests running `/bug-report` to create a bug report
4. No triage table is produced
**Assertions:**
- [ ] Output explicitly states no bugs were found
- [ ] `/bug-report` is suggested as the next step
- [ ] Skill does not error out — it handles empty directory gracefully
- [ ] Verdict is TRIAGED (with "no bugs found" context)
---
### Case 3: Bug Missing Reproduction Steps — Flagged as NEEDS REPRO INFO
**Fixture:**
- `production/bugs/` contains 3 bug reports; one has an empty "Repro Steps" section
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads all 3 reports
2. Skill detects the report with no repro steps
3. That bug appears in the triage table with a `NEEDS REPRO INFO` tag
4. Other bugs are triaged normally
5. Verdict is TRIAGED
**Assertions:**
- [ ] `NEEDS REPRO INFO` tag appears next to the bug missing repro steps
- [ ] The flagged bug is still included in the table (not excluded)
- [ ] Other bugs are unaffected
- [ ] Verdict is TRIAGED
---
### Case 4: Possible Duplicate Bugs — Flagged in triage output
**Fixture:**
- `production/bugs/` contains 2 bug reports with similar titles:
- bug-2026-03-18-player-fall-through-floor.md
- bug-2026-03-20-player-clips-through-floor.md
- Both affect the "Physics" system with identical severity
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads both reports and detects similar title + same system + same severity
2. Both bugs are included in the triage table
3. Each is tagged with `POSSIBLE DUPLICATE` and cross-references the other report
4. No bugs are merged or deleted — flagging is advisory
5. Verdict is TRIAGED
**Assertions:**
- [ ] Both bugs appear in the table (not merged)
- [ ] Both are tagged `POSSIBLE DUPLICATE`
- [ ] Each cross-references the other (by filename or title)
- [ ] Verdict is TRIAGED
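The advisory duplicate check described above can be approximated with title token overlap plus a same-system, same-severity match. This is a hedged sketch: the exact matching logic is defined in the skill body, and the overlap threshold here is an assumption.

```python
def title_tokens(title):
    """Lowercase word set from a bug title, ignoring very short filler words."""
    return {w for w in title.lower().replace("-", " ").split() if len(w) > 2}

def possible_duplicate(a, b, threshold=0.5):
    """Flag two bug records as possible duplicates: same system, same
    severity, and Jaccard overlap of title tokens above the threshold."""
    if a["system"] != b["system"] or a["severity"] != b["severity"]:
        return False
    ta, tb = title_tokens(a["title"]), title_tokens(b["title"])
    return len(ta & tb) / len(ta | tb) >= threshold

a = {"title": "player fall through floor",  "system": "Physics", "severity": "HIGH"}
b = {"title": "player clips through floor", "system": "Physics", "severity": "HIGH"}
print(possible_duplicate(a, b))  # → True
```

The check stays advisory by design: it returns a flag for the triage table rather than merging or deleting anything.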
---
### Case 5: Director Gate Check — No gate; triage is advisory
**Fixture:**
- `production/bugs/` contains any number of reports
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill produces the triage table
2. No director agents are spawned
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is TRIAGED without any gate check
---
## Protocol Compliance
- [ ] Reads all files in `production/bugs/` before generating the table
- [ ] Sorts by severity (CRITICAL → HIGH → MEDIUM → LOW)
- [ ] Flags bugs missing repro steps
- [ ] Flags possible duplicates by title/system similarity
- [ ] Does not write any files
- [ ] Verdict is TRIAGED in all cases (even empty)
---
## Coverage Notes
- The case where a bug report is malformed (missing severity field entirely)
is not fixture-tested; the skill would flag it as `UNKNOWN SEVERITY` and sort it
last in the table.
- Status transitions (marking bugs as resolved) are outside this skill's scope —
bug-triage is read-only.
- The duplicate detection heuristic (title similarity + same system) is
approximate; exact matching logic is defined in the skill body.

# Skill Test Spec: /day-one-patch
## Skill Summary
`/day-one-patch` prepares a day-one patch plan for issues that are known at
launch but deferred from the v1.0 release. It reads open bug reports in
`production/bugs/`, deferred acceptance criteria from story files (stories
marked `Status: Done` but with noted deferred ACs), and produces a prioritized
patch plan with estimated fix timelines per issue.
The patch plan is written to `production/releases/day-one-patch.md` after a
"May I write" ask. If a P0 (critical post-ship) issue is discovered, the skill
triggers guidance to run `/hotfix` before the patch. No director gates apply.
The verdict is always COMPLETE.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the plan
- [ ] Has a next-step handoff (e.g., `/hotfix` for P0 issues, `/release-checklist` for follow-up)
---
## Director Gate Checks
None. `/day-one-patch` is a release planning utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — 3 Known Issues, Patch Plan With Fix Estimates
**Fixture:**
- `production/bugs/` contains 3 open bugs with severities: 1 MEDIUM, 2 LOW
- No deferred ACs in sprint stories
- All bugs have repro steps and system identifications
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads all 3 open bugs
2. Skill assigns fix effort estimates: MEDIUM bug = 1-2 days, LOW bugs = 4 hours each
3. Skill produces a patch plan prioritizing MEDIUM bug first
4. Plan includes: priority order, estimated timeline, responsible system, fix description
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE
**Assertions:**
- [ ] All 3 bugs appear in the plan
- [ ] Bugs are prioritized by severity (MEDIUM before LOW)
- [ ] Fix estimates are provided per issue
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
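The severity-based effort estimates in step 2 amount to a lookup table plus a severity sort. This sketch mirrors the fixture's numbers (MEDIUM = 1-2 days, LOW = 4 hours); the HIGH estimate is an assumption, and per the coverage notes these are rough values, not team-velocity figures.

```python
# Rough fix-effort estimates keyed by severity. The HIGH value is an
# illustrative assumption; MEDIUM and LOW come from the fixture.
FIX_ESTIMATES = {"HIGH": "1 day", "MEDIUM": "1-2 days", "LOW": "4 hours"}

def plan_entries(bugs):
    """Patch-plan rows ordered by severity. CRITICAL (P0) bugs are excluded:
    the skill escalates those to /hotfix instead of scheduling them."""
    order = {"HIGH": 1, "MEDIUM": 2, "LOW": 3}
    planned = [b for b in bugs if b["severity"] != "CRITICAL"]
    ranked = sorted(planned, key=lambda b: order[b["severity"]])
    return [(b["title"], b["severity"], FIX_ESTIMATES[b["severity"]]) for b in ranked]

bugs = [
    {"title": "menu-typo",    "severity": "LOW"},
    {"title": "save-wipe",    "severity": "CRITICAL"},  # escalated, not planned
    {"title": "save-corrupt", "severity": "MEDIUM"},
    {"title": "icon-blur",    "severity": "LOW"},
]
for title, sev, est in plan_entries(bugs):
    print(f"{sev:8} {title}: {est}")
```

Note that filtering CRITICAL out of the plan matches Case 2: P0 issues get `/hotfix` guidance and never appear in the patch timeline.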
---
### Case 2: Critical Issue Discovered Post-Ship — P0, Triggers /hotfix Guidance
**Fixture:**
- A CRITICAL severity bug is found in `production/bugs/` after the v1.0 release
- The bug causes data loss for all save files
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads bugs and identifies the CRITICAL severity issue
2. Skill escalates: "P0 ISSUE DETECTED — data loss bug requires immediate hotfix
before patch planning can proceed"
3. Skill does NOT include the P0 issue in the patch plan timeline
4. Skill explicitly directs: "Run `/hotfix` to resolve this issue first"
5. After P0 guidance is issued: plan for remaining lower-severity bugs is still
generated and written; verdict is COMPLETE
**Assertions:**
- [ ] P0 escalation message appears prominently before the patch plan
- [ ] `/hotfix` is explicitly directed for the P0 issue
- [ ] P0 issue is NOT scheduled in the patch plan timeline (it needs immediate action)
- [ ] Non-P0 issues are still planned; verdict is COMPLETE
---
### Case 3: Deferred AC From Story-Done — Pulled Into Patch Plan Automatically
**Fixture:**
- `production/sprints/sprint-008.md` has a story with `Status: Done` and a note:
"DEFERRED AC: Gamepad vibration on damage — deferred to post-launch patch"
- No open bugs for the same system
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads sprint stories and detects the deferred AC note
2. Deferred AC is automatically included in the patch plan as a work item
3. Plan entry: "Deferred from sprint-008: Gamepad vibration on damage"
4. Fix estimate is assigned; patch plan written after "May I write" approval
5. Verdict is COMPLETE
**Assertions:**
- [ ] Deferred ACs from story files are automatically pulled into the plan
- [ ] Deferred items are labeled by their source story (sprint-008)
- [ ] Deferred AC gets a fix estimate like bug entries
- [ ] Verdict is COMPLETE
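Detecting deferred-AC notes in Done stories can be sketched as a marker scan over the story text. The `DEFERRED AC:` marker and the `Status: Done` line format are taken from the fixture; real story parsing in the skill body may differ.

```python
import re

def deferred_acs(story_text, source):
    """Extract 'DEFERRED AC: <description>' notes from a story file's text,
    but only when the story is marked Status: Done."""
    if not re.search(r"^Status:\s*Done\s*$", story_text, re.MULTILINE):
        return []
    return [
        {"source": source, "item": m.group(1).strip()}
        for m in re.finditer(r"DEFERRED AC:\s*(.+)", story_text)
    ]

story = """# Story: Damage feedback
Status: Done
Notes: DEFERRED AC: Gamepad vibration on damage - deferred to post-launch patch
"""
print(deferred_acs(story, "sprint-008"))
```

Gating on `Status: Done` keeps in-progress stories out of the patch plan: an unfinished story's remaining ACs are sprint work, not deferred launch debt.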
---
### Case 4: No Known Issues — Empty Plan With Template Note
**Fixture:**
- `production/bugs/` is empty
- No stories have deferred ACs
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads bugs — none found
2. Skill reads story deferred ACs — none found
3. Skill produces an empty patch plan with a note: "No known issues at launch"
4. Template structure is preserved (headers intact) for future use
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE
**Assertions:**
- [ ] "No known issues at launch" note appears in the written file
- [ ] Template headers are present in the empty plan
- [ ] Skill does NOT error out when there are no issues to plan
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; day-one-patch is a planning utility
**Fixture:**
- Known issues present in production/bugs/
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill generates and writes the patch plan
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads open bugs from `production/bugs/` before generating the plan
- [ ] Scans story files for deferred AC notes
- [ ] Escalates CRITICAL (P0) bugs with explicit `/hotfix` guidance
- [ ] Produces an empty plan with note when no issues exist (not an error)
- [ ] Asks "May I write to `production/releases/day-one-patch.md`?" before writing
- [ ] Verdict is COMPLETE in all paths
---
## Coverage Notes
- The case where multiple CRITICAL bugs exist is handled the same as Case 2;
all P0 issues are escalated together.
- Timeline estimation for the patch (e.g., "patch available in 3 days")
requires manual QA and build time estimates; the skill uses rough estimates
based on severity, not actual team velocity.
- The patch notes player communication document (`/patch-notes`) is a separate
skill invoked after the patch plan is executed.

# Skill Test Spec: /help
## Skill Summary
`/help` analyzes what has been done and what comes next in the project workflow.
It runs on the Haiku model (read-only, formatting task) and reads `production/stage.txt`,
the active sprint file, and recent session state to produce a concise situational
guidance summary. The skill optionally accepts a context query (e.g., `/help testing`)
to surface relevant skills for a specific topic.
The output is always informational — no files are written and no director gates
are invoked. The verdict is always HELP COMPLETE. The skill serves as a workflow
navigator, suggesting 2-3 next skills based on the current project state.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: HELP COMPLETE
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (suggests 2-3 relevant skills based on state)
---
## Director Gate Checks
None. `/help` is a read-only navigation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Production stage with active sprint
**Fixture:**
- `production/stage.txt` contains `Production`
- `production/sprints/sprint-004.md` exists with in-progress stories
- `production/session-state/active.md` has a recent checkpoint
**Input:** `/help`
**Expected behavior:**
1. Skill reads stage.txt and active sprint
2. Skill identifies current sprint number and in-progress story count
3. Skill outputs: current stage, sprint summary, and 3 suggested next skills
(e.g., `/sprint-status`, `/dev-story`, `/story-done`)
4. Suggestions are ranked by relevance to current sprint state
5. Verdict is HELP COMPLETE
**Assertions:**
- [ ] Current stage is shown (Production)
- [ ] Active sprint number and story count are mentioned
- [ ] Exactly 2-3 next-skill suggestions are given (not a list of all skills)
- [ ] Suggestions are appropriate for Production stage
- [ ] Verdict is HELP COMPLETE
- [ ] No files are written
---
### Case 2: Concept Stage — Shows concept-to-systems-design workflow path
**Fixture:**
- `production/stage.txt` contains `Concept`
- No sprint files, no GDD files
- `technical-preferences.md` is configured (engine selected)
**Input:** `/help`
**Expected behavior:**
1. Skill reads stage.txt — detects Concept stage
2. Skill outputs the Concept-stage workflow: brainstorm → map-systems → design-system
3. Suggested skills are: `/brainstorm`, `/map-systems` (if concept exists)
4. Current progress is noted: "Engine configured, concept not yet created"
**Assertions:**
- [ ] Stage is identified as Concept
- [ ] Workflow path shows the expected sequence for this stage
- [ ] Suggestions do not include Production-stage skills (e.g., `/dev-story`)
- [ ] Verdict is HELP COMPLETE
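The stage-appropriate suggestion filtering asserted in Cases 1 and 2 can be sketched as a static stage-to-skills map capped at three entries. The skill lists here are illustrative assumptions drawn only from the suggestions these cases name, not the full catalog.

```python
# Illustrative stage → suggested-skills map; entries are assumptions
# based on the suggestions named in Cases 1 and 2.
STAGE_SUGGESTIONS = {
    "Concept": ["/brainstorm", "/map-systems"],
    "Production": ["/sprint-status", "/dev-story", "/story-done"],
}

def suggest(stage, limit=3):
    """Return at most `limit` stage-appropriate suggestions; fall back to
    /start when the stage is unknown (mirrors Case 3's behavior)."""
    return STAGE_SUGGESTIONS.get(stage, ["/start"])[:limit]

print(suggest("Concept"))  # → ['/brainstorm', '/map-systems']
print(suggest("Unknown"))  # → ['/start']
```

Keeping the map stage-keyed is what guarantees the Case 2 assertion that Production skills such as `/dev-story` never surface in Concept-stage output.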
---
### Case 3: No stage.txt — Shows full workflow overview
**Fixture:**
- No `production/stage.txt`
- No sprint files
- `technical-preferences.md` has placeholders
**Input:** `/help`
**Expected behavior:**
1. Skill cannot determine stage from stage.txt
2. Skill runs project-stage-detect logic to infer stage from artifacts
3. If stage cannot be inferred: outputs the full workflow overview from
Concept through Release as a reference map
4. Primary suggestion is `/start` to begin configuration
**Assertions:**
- [ ] Skill does not crash when stage.txt is absent
- [ ] Full workflow overview is shown when stage cannot be determined
- [ ] `/start` or `/project-stage-detect` is a top suggestion
- [ ] Verdict is HELP COMPLETE
---
### Case 4: Context Query — User asks for help with testing
**Fixture:**
- `production/stage.txt` contains `Production`
- Active sprint has a story with `Status: In Review`
**Input:** `/help testing`
**Expected behavior:**
1. Skill reads context query: "testing"
2. Skill surfaces skills relevant to testing: `/qa-plan`, `/smoke-check`,
`/regression-suite`, `/test-setup`, `/test-evidence-review`
3. Output is focused on testing workflow, not general sprint navigation
4. Currently in-review story is highlighted as a testing candidate
**Assertions:**
- [ ] Context query is acknowledged in output ("Help topic: testing")
- [ ] At least 3 testing-relevant skills are listed
- [ ] General sprint skills (e.g., `/sprint-plan`) are not the primary suggestions
- [ ] Verdict is HELP COMPLETE
---
### Case 5: Director Gate Check — No gate; help is read-only navigation
**Fixture:**
- Any project state
**Input:** `/help`
**Expected behavior:**
1. Skill produces workflow guidance summary
2. No director agents are spawned
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is HELP COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads stage, sprint, and session state before generating suggestions
- [ ] Suggestions are specific to the current project state (not generic)
- [ ] Context query (if provided) narrows the suggestion set
- [ ] Does not write any files
- [ ] Verdict is HELP COMPLETE in all cases
---
## Coverage Notes
- The case where the active sprint is complete (all stories Done) is not
separately tested; the skill would suggest `/sprint-plan` for the next sprint.
- The `/help` skill does not validate whether suggested skills are available —
it assumes standard skill catalog availability.
- Stage detection fallback (when stage.txt is absent) delegates to the same
logic as `/project-stage-detect` and is not re-tested here in detail.

# Skill Test Spec: /hotfix
## Skill Summary
`/hotfix` manages an emergency fix workflow: it creates a hotfix branch from
main, applies a targeted fix to the identified file(s), runs `/smoke-check` to
validate the fix doesn't introduce regressions, and prompts the user to confirm
merge back to main. Each code change requires a "May I write to [filepath]?" ask.
Git operations (branch creation, merge) are presented as Bash commands for user
confirmation before execution.
The skill is time-sensitive — director review is optional post-hoc, not a
blocking gate. Verdicts: HOTFIX COMPLETE (fix applied, smoke check passed, merged)
or HOTFIX BLOCKED (fix introduced regression or user declined).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: HOTFIX COMPLETE, HOTFIX BLOCKED
- [ ] Contains "May I write" language for code changes
- [ ] Has a next-step handoff (e.g., `/bug-report` to document the issue, or version bump)
---
## Director Gate Checks
None. Hotfixes are time-critical. Director review may follow separately as a
post-hoc step. No gate is invoked within this skill.
---
## Test Cases
### Case 1: Happy Path — Critical crash bug fixed, smoke check passes
**Fixture:**
- `main` branch is clean
- Bug is identified in `src/gameplay/arena.gd` (crash on boss arena entry)
- Repro steps are provided by user
**Input:** `/hotfix` (user describes the crash and affected file)
**Expected behavior:**
1. Skill proposes creating a hotfix branch: `hotfix/boss-arena-crash`
2. User confirms; Bash command for branch creation is shown and confirmed
3. Skill identifies the fix location in `arena.gd` and drafts the change
4. Skill asks "May I write to `src/gameplay/arena.gd`?" and applies fix on approval
5. Skill runs `/smoke-check` — PASS
6. Skill presents the merge command and asks user to confirm merge to `main`
7. User confirms; merge executes; verdict is HOTFIX COMPLETE
**Assertions:**
- [ ] Hotfix branch is created before any code changes
- [ ] "May I write" is asked before modifying any source file
- [ ] `/smoke-check` runs after the fix is applied
- [ ] Merge requires explicit user confirmation (not automatic)
- [ ] Verdict is HOTFIX COMPLETE after successful merge
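The branch-before-change flow in steps 1 and 2 reduces to constructing commands for user confirmation. This sketch only builds the strings the skill would show; nothing is executed. The branch naming convention is inferred from the example `hotfix/boss-arena-crash`.

```python
import re

def hotfix_branch(description):
    """Slugify a short bug description into a hotfix branch name."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"hotfix/{slug}"

def proposed_commands(description):
    """Git commands the skill would present for confirmation, never run silently."""
    branch = hotfix_branch(description)
    return [
        f"git checkout -b {branch} main",  # create the hotfix branch first
        "git checkout main",               # after the fix and smoke check pass
        f"git merge --no-ff {branch}",     # merge only on explicit confirmation
    ]

print(hotfix_branch("Boss arena crash"))  # → hotfix/boss-arena-crash
```

Separating command construction from execution is what makes the "merge requires explicit user confirmation" assertion testable.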
---
### Case 2: Smoke Check Fails — HOTFIX BLOCKED
**Fixture:**
- Fix has been applied to `src/gameplay/arena.gd`
- `/smoke-check` returns FAIL: "Player health clamping regression detected"
**Input:** `/hotfix`
**Expected behavior:**
1. Skill applies the fix and runs `/smoke-check`
2. Smoke check returns FAIL with specific regression identified
3. Skill reports: "HOTFIX BLOCKED — smoke check failed: [regression detail]"
4. Skill presents options: attempt revised fix, revert changes, or merge with
known regression (user acknowledges risk)
5. No automatic merge occurs when smoke check fails
**Assertions:**
- [ ] Verdict is HOTFIX BLOCKED
- [ ] Smoke check failure is shown verbatim to user
- [ ] Merge is NOT performed automatically when smoke check fails
- [ ] User is given explicit options for how to proceed
---
### Case 3: Fix to Already-Released Build — Version tag noted, patch bump prompted
**Fixture:**
- Latest git tag is `v1.2.0`
- Hotfix targets a bug in the v1.2.0 release
**Input:** `/hotfix`
**Expected behavior:**
1. Skill detects that the current HEAD is a tagged release (v1.2.0)
2. Skill notes: "Hotfix targeting tagged release v1.2.0"
3. After smoke check passes, skill prompts: "Should version be bumped to v1.2.1?"
4. If user confirms the version bump: skill asks "May I write to VERSION (or the engine's equivalent file)?"
5. After version update and merge: verdict is HOTFIX COMPLETE with version noted
**Assertions:**
- [ ] Version tag context is detected and surfaced to user
- [ ] Patch version bump is suggested (not required) after merge
- [ ] Version bump requires its own "May I write" confirmation
- [ ] Verdict is HOTFIX COMPLETE
---
### Case 4: No Repro Steps — Skill Asks Before Applying Fix
**Fixture:**
- User invokes `/hotfix` with a vague description: "something is broken on level 3"
- No repro steps provided
**Input:** `/hotfix` (vague description)
**Expected behavior:**
1. Skill detects insufficient information to identify the fix location
2. Skill asks: "Please provide reproduction steps and the affected file or system"
3. Skill does NOT create a branch or modify any file until repro steps are provided
4. After user provides repro steps: normal hotfix flow begins
**Assertions:**
- [ ] No branch is created without repro steps
- [ ] No code changes are made without a clearly identified fix location
- [ ] Repro step request is specific (not a generic "please provide more info")
- [ ] Normal hotfix flow resumes after user provides repro steps
---
### Case 5: Director Gate Check — No gate; hotfixes are time-critical
**Fixture:**
- Critical bug with repro steps identified
**Input:** `/hotfix`
**Expected behavior:**
1. Skill completes the hotfix workflow
2. No director agents are spawned during execution
3. No gate IDs appear in output
4. Post-hoc director review (if needed) is a manual follow-up, not invoked here
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED — no gate verdict
---
## Protocol Compliance
- [ ] Creates hotfix branch before making any code changes
- [ ] Asks "May I write" before modifying any source files
- [ ] Runs `/smoke-check` after applying the fix
- [ ] Requires explicit user confirmation before merging
- [ ] HOTFIX BLOCKED when smoke check fails — no automatic merge
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED
---
## Coverage Notes
- The case where multiple files need to be modified for one fix follows the same
"May I write" per-file pattern and is not separately tested.
- The post-hotfix steps (create bug report, update changelog) are suggested in
the handoff but not tested as part of this skill's execution.
- Conflict resolution during the merge (if main has diverged) is not tested;
the skill would surface the conflict and ask the user to resolve it manually.

# Skill Test Spec: /launch-checklist
## Skill Summary
`/launch-checklist` generates and evaluates a complete launch readiness checklist
covering: legal compliance (EULA, privacy policy, ESRB/PEGI ratings), platform
certification status, store page completeness (screenshots, description, metadata),
build validation (version tag, reproducible build), analytics and crash reporting
configuration, and first-run experience verification.
The skill produces a checklist report written to `production/launch/launch-checklist-[date].md`
after a "May I write" ask. If a previous launch checklist exists, it compares the
new results against the old to highlight newly resolved and newly blocked items. No
director gates apply — `/team-release` orchestrates the full release pipeline. Verdicts:
LAUNCH READY, LAUNCH BLOCKED, or CONCERNS.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: LAUNCH READY, LAUNCH BLOCKED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language before writing the checklist
- [ ] Has a next-step handoff (e.g., `/team-release` or `/day-one-patch`)
---
## Director Gate Checks
None. `/launch-checklist` is a readiness audit utility. The full release pipeline
is managed by `/team-release`.
---
## Test Cases
### Case 1: Happy Path — All Checklist Items Verified, LAUNCH READY
**Fixture:**
- Legal docs present: EULA, privacy policy in `production/legal/`
- Platform certification: marked as submitted and approved in production notes
- Store page assets: screenshots, description, metadata all present in `production/store/`
- Build: version tag `v1.0.0` exists, reproducible build confirmed
- Crash reporting: configured in `technical-preferences.md`
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all checklist categories
2. All items pass their verification checks
3. Skill produces checklist report with all items marked PASS
4. Skill asks "May I write to `production/launch/launch-checklist-2026-04-06.md`?"
5. Report written on approval; verdict is LAUNCH READY
**Assertions:**
- [ ] All checklist categories are checked (legal, platform, store, build, analytics, UX)
- [ ] All items appear in the report with PASS markers
- [ ] Verdict is LAUNCH READY
- [ ] "May I write" is asked with the correct dated filename
---
### Case 2: Platform Certification Not Submitted — LAUNCH BLOCKED
**Fixture:**
- All other checklist items pass
- Platform certification section: "not submitted" (no submission record found)
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all items
2. Platform certification check fails: no submission record
3. Skill reports: "LAUNCH BLOCKED — Platform certification not submitted"
4. Specific platform(s) missing certification are named
5. Verdict is LAUNCH BLOCKED
**Assertions:**
- [ ] Verdict is LAUNCH BLOCKED (not CONCERNS)
- [ ] Platform certification is identified as the blocking item
- [ ] Missing platform names are specified
- [ ] All other passing items are still shown in the report
---
### Case 3: Manual Check Required — CONCERNS Verdict
**Fixture:**
- All critical checklist items pass
- First-run experience item: "MANUAL CHECK NEEDED — human must play the first 5
minutes and verify tutorial completion flow"
- Store screenshots item: "MANUAL CHECK NEEDED — art team must verify screenshot
quality matches current build"
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all items
2. 2 items are flagged as requiring human verification
3. Skill reports: "CONCERNS — 2 items require manual verification before launch"
4. Both items are listed with instructions for what to manually verify
5. Verdict is CONCERNS (not LAUNCH BLOCKED, since these are advisory)
**Assertions:**
- [ ] Verdict is CONCERNS (not LAUNCH READY or LAUNCH BLOCKED)
- [ ] Both manual check items are listed with verification instructions
- [ ] Skill does not auto-block on MANUAL CHECK items
---
### Case 4: Previous Checklist Exists — Delta Comparison
**Fixture:**
- `production/launch/launch-checklist-2026-03-25.md` exists with previous results:
- 2 items were BLOCKED (platform cert, crash reporting)
- 1 item had a MANUAL CHECK
- New checklist: platform cert is now PASS, crash reporting is now PASS,
manual check still open; 1 new item flagged (EULA last updated date)
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill finds the previous checklist and loads it for comparison
2. Skill produces the new checklist and compares:
- Newly resolved: "Platform cert — was BLOCKED, now PASS"
- Newly resolved: "Crash reporting — was BLOCKED, now PASS"
- Still open: manual check (unchanged)
- New issue: EULA last updated date (not in previous checklist)
3. Delta is shown prominently in the report
4. Verdict is CONCERNS (manual check + new EULA question)
**Assertions:**
- [ ] Delta section shows newly resolved items
- [ ] Delta section shows new issues (not present in previous checklist)
- [ ] Still-open items from the previous checklist are noted as persistent
- [ ] Verdict reflects the current state (not the previous state)
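The delta logic in this case reduces to comparing two item-to-status maps. This is a sketch: the item names and statuses mirror the fixture, and the real checklist format carries more detail than a flat map.

```python
def checklist_delta(previous, current):
    """Compare two {item: status} maps and bucket the differences."""
    resolved = [k for k, s in current.items()
                if s == "PASS" and previous.get(k) not in (None, "PASS")]
    new_issues = [k for k, s in current.items()
                  if k not in previous and s != "PASS"]
    still_open = [k for k, s in current.items()
                  if s != "PASS" and previous.get(k) == s]
    return {"resolved": resolved, "new": new_issues, "still_open": still_open}

previous = {"platform-cert": "BLOCKED", "crash-reporting": "BLOCKED",
            "first-run": "MANUAL CHECK"}
current = {"platform-cert": "PASS", "crash-reporting": "PASS",
           "first-run": "MANUAL CHECK", "eula-date": "MANUAL CHECK"}
print(checklist_delta(previous, current))
```

The three buckets map one-to-one onto the delta assertions above: newly resolved, new issues, and persistent open items; the verdict is then derived from `current` alone.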
---
### Case 5: Director Gate Check — No gate; launch-checklist is an audit utility
**Fixture:**
- All checklist dependencies present
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill runs the full checklist and writes the report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS — no gate verdict
---
## Protocol Compliance
- [ ] Checks all required categories (legal, platform, store, build, analytics, UX)
- [ ] LAUNCH BLOCKED for hard failures (uncompleted certifications, missing legal docs)
- [ ] CONCERNS for advisory items requiring manual verification
- [ ] Compares against previous checklist when one exists
- [ ] Asks "May I write" before creating the checklist report
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS
---
## Coverage Notes
- Region-specific compliance (GDPR data handling, COPPA for under-13 audiences)
is checked but the specific requirements are not enumerated in test assertions.
- The store page completeness check (screenshots, description) relies on the
presence of files in `production/store/`; it cannot verify visual quality.
- Build reproducibility check validates the presence of a version tag and build
configuration but does not execute the build process.

# Skill Test Spec: /localize
## Skill Summary
`/localize` manages the full localization pipeline: it extracts all player-facing
strings from source files, manages translation files in `assets/localization/`,
and validates completeness across all locale files. For new languages, it creates
a locale file skeleton with all current strings as keys and empty values. For
existing locale files, it produces a diff showing additions, removals, and
changed keys.
Translation files are written to `assets/localization/[locale-code].csv` (or
engine-appropriate format) after a "May I write" ask. No director gates apply.
Verdicts: LOCALIZATION COMPLETE (all locales are complete) or GAPS FOUND (at
least one locale is missing string keys).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: LOCALIZATION COMPLETE, GAPS FOUND
- [ ] Contains "May I write" collaborative protocol language before writing locale files
- [ ] Has a next-step handoff (e.g., send locale skeletons to translators)
---
## Director Gate Checks
None. `/localize` is a pipeline utility. No director gates apply. Localization
lead agent may review separately but is not invoked within this skill.
---
## Test Cases
### Case 1: New Language — String Extraction and Locale Skeleton Created
**Fixture:**
- Source code in `src/` contains player-facing strings (UI text, tutorial messages)
- Existing locale: `assets/localization/en.csv`
- No French locale exists
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts all player-facing strings from source files
2. Skill finds the same strings in `en.csv` as a reference
3. Skill generates `fr.csv` skeleton with all string keys and empty values
4. Skill asks "May I write to `assets/localization/fr.csv`?"
5. File written on approval; verdict is GAPS FOUND (the file exists but its values are empty)
6. Skill notes: "fr.csv created — send to translator to fill values"
**Assertions:**
- [ ] All string keys from `en.csv` are present in `fr.csv`
- [ ] All values in `fr.csv` are empty (not copied from English)
- [ ] "May I write" is asked before creating the file
- [ ] Verdict is GAPS FOUND (file is created but untranslated)
---
### Case 2: Existing Locale Diff — Additions, Removals, and Changes Listed
**Fixture:**
- `assets/localization/fr.csv` exists with 20 string keys translated
- Source code has changed: 3 new strings added, 1 string removed, 2 strings
with changed English source text
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts current strings from source
2. Skill diffs against existing `fr.csv`
3. Skill produces diff report:
- 3 new keys (need translation — listed as empty in fr.csv)
- 1 removed key (marked as obsolete — suggest removal)
- 2 changed keys (English source changed — French may need update, flagged)
4. Skill asks "May I update `assets/localization/fr.csv`?"
5. File updated with new empty keys added, obsolete keys marked; verdict is GAPS FOUND
**Assertions:**
- [ ] New keys appear as empty in the updated file (not auto-translated)
- [ ] Removed keys are flagged as obsolete (not silently deleted)
- [ ] Changed source strings are flagged for translator review
- [ ] Verdict is GAPS FOUND (new empty keys exist)
---
### Case 3: String Missing in One Locale — GAPS FOUND With Missing Key List
**Fixture:**
- 3 locale files exist: `en.csv`, `fr.csv`, `de.csv`
- `de.csv` is missing 4 keys that exist in both `en.csv` and `fr.csv`
**Input:** `/localize`
**Expected behavior:**
1. Skill reads all 3 locale files and cross-references keys
2. `de.csv` is missing 4 keys
3. Skill produces GAPS FOUND report listing the 4 missing keys by locale:
"de.csv missing: [key1], [key2], [key3], [key4]"
4. Skill offers to add the missing keys as empty values to `de.csv`
5. After approval: file updated; verdict remains GAPS FOUND (values still empty)
**Assertions:**
- [ ] Missing keys are listed explicitly (not just a count)
- [ ] Missing keys are attributed to the specific locale file
- [ ] Verdict is GAPS FOUND (not LOCALIZATION COMPLETE)
- [ ] Missing keys are added as empty (not auto-translated from English)
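The cross-locale completeness check in this case can be sketched as a set difference against the union of all keys. Locale data is modeled here as plain dicts; the key names are illustrative assumptions, and engine-specific file formats are out of scope, per the coverage notes.

```python
def missing_keys(locales):
    """Given {locale: {key: value}}, report the keys absent from each locale
    relative to the union of all keys seen across every locale."""
    all_keys = set().union(*locales.values())
    return {loc: sorted(all_keys - set(strings))
            for loc, strings in locales.items()
            if all_keys - set(strings)}

locales = {
    "en": {"ui.start": "Start", "ui.quit": "Quit", "tut.jump": "Press A to jump"},
    "fr": {"ui.start": "Démarrer", "ui.quit": "Quitter", "tut.jump": "Sautez avec A"},
    "de": {"ui.start": "Start"},  # missing ui.quit and tut.jump
}
print(missing_keys(locales))  # → {'de': ['tut.jump', 'ui.quit']}
```

Reporting sorted key names per locale, rather than a count, is what satisfies the "listed explicitly" and "attributed to the specific locale file" assertions.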
---
### Case 4: Translation File Has Syntax Error — Error With Line Reference
**Fixture:**
- `assets/localization/fr.csv` has a malformed line at line 47
(missing quote closure)
**Input:** `/localize fr`
**Expected behavior:**
1. Skill reads `fr.csv` and encounters a parse error at line 47
2. Skill outputs: "Parse error in fr.csv at line 47: [error detail]"
3. Skill cannot diff or validate the file until the error is fixed
4. Skill does NOT attempt to overwrite or auto-fix the malformed file
5. Skill suggests fixing the file manually and re-running `/localize`
**Assertions:**
- [ ] Error message includes line number (line 47)
- [ ] Error detail describes the nature of the parse error
- [ ] Skill does NOT overwrite or modify the malformed file
- [ ] Manual fix + re-run is suggested as remediation
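
The parse-error-with-line-reference behavior can be sketched with Python's `csv` module, whose reader tracks the physical line number. This is a sketch only: exactly how a missing quote closure surfaces depends on the parser, so a structurally short row stands in for the malformed line here:

```python
import csv


def parse_locale(path):
    """Parse a locale CSV, surfacing the line number on failure.

    On a malformed row the function reports the file and line and
    stops; it never rewrites or auto-fixes the file.
    """
    rows = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        try:
            for row in reader:
                if len(row) < 2:
                    raise ValueError("expected at least key,value columns")
                rows[row[0]] = row[1:]
        except (csv.Error, ValueError) as exc:
            # reader.line_num is the physical line being parsed
            raise SystemExit(
                f"Parse error in {path} at line {reader.line_num}: {exc}"
            )
    return rows
```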
---
### Case 5: Director Gate Check — No gate; localization is a pipeline utility
**Fixture:**
- Source code with player-facing strings
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts strings and manages locale files
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is LOCALIZATION COMPLETE or GAPS FOUND — no gate verdict
---
## Protocol Compliance
- [ ] Extracts strings from source before operating on locale files
- [ ] Creates new locale files with all keys as empty values (not auto-translated)
- [ ] Diffs existing locale files against current source strings
- [ ] Flags missing keys by locale and by key name
- [ ] Asks "May I write" before creating or updating any locale file
- [ ] Verdict is LOCALIZATION COMPLETE (all locales fully translated) or GAPS FOUND
---
## Coverage Notes
- LOCALIZATION COMPLETE is only achievable when all locale files have all keys
with non-empty values; new-language skeleton creation always results in GAPS FOUND.
- Engine-specific locale formats (Godot `.translation`, Unity `.po` files) are
handled by the skill body; `.csv` is used as the canonical format in tests.
- The case where source strings change at a very high rate (continuous integration
of new UI text) is not tested; the diff logic handles this case.

# Skill Test Spec: /onboard
## Skill Summary
`/onboard` generates a contextual project onboarding summary tailored for a new
team member. It reads CLAUDE.md, `technical-preferences.md`, the active sprint
file, recent git commits, and `production/stage.txt` to produce a structured
orientation document. The skill runs on the Haiku model (read-only, formatting
task) and produces no file writes — all output is conversational.
The skill optionally accepts a role argument (e.g., `/onboard artist`) to tailor
the summary to a specific discipline. When the project is in an early stage or
unconfigured, the output adapts to reflect what little is known. The verdict is
always ONBOARDING COMPLETE — the skill is purely informational.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: ONBOARDING COMPLETE
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff suggesting a relevant follow-on skill
---
## Director Gate Checks
None. `/onboard` is a read-only orientation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Configured project in Production stage with active sprint
**Fixture:**
- `production/stage.txt` contains `Production`
- `technical-preferences.md` has engine, language, and specialists populated
- `production/sprints/sprint-005.md` exists with stories in progress
- Git log contains 5 recent commits
**Input:** `/onboard`
**Expected behavior:**
1. Skill reads stage.txt, technical-preferences.md, active sprint, and git log
2. Skill produces an onboarding summary with sections: Project Overview, Tech Stack,
Current Stage, Active Sprint Summary, Recent Activity
3. Summary is formatted for readability (headers, bullet points)
4. Next-step suggestions are appropriate for Production stage (e.g., `/sprint-status`,
`/dev-story`)
5. Verdict ONBOARDING COMPLETE is stated
**Assertions:**
- [ ] Output includes current stage name from stage.txt
- [ ] Output includes engine and language from technical-preferences.md
- [ ] Active sprint stories are summarized (not just the sprint file name)
- [ ] Recent commit context is present
- [ ] Verdict is ONBOARDING COMPLETE
- [ ] No files are written
---
### Case 2: Fresh Project — No engine, no sprint, suggests /start
**Fixture:**
- `technical-preferences.md` contains only placeholders (`[TO BE CONFIGURED]`)
- No `production/stage.txt`
- No sprint files
- No CLAUDE.md overrides beyond defaults
**Input:** `/onboard`
**Expected behavior:**
1. Skill reads all config files and detects unconfigured state
2. Skill produces a minimal summary: "This project has not been configured yet"
3. Output explains the onboarding workflow: `/start` → `/setup-engine` → `/brainstorm`
4. Skill suggests running `/start` as the immediate next step
5. Verdict is ONBOARDING COMPLETE (informational, not a failure)
**Assertions:**
- [ ] Output explicitly mentions the project is not yet configured
- [ ] `/start` is recommended as the next step
- [ ] Skill does NOT error out — it gracefully handles an empty project state
- [ ] Verdict is still ONBOARDING COMPLETE
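
The unconfigured-state detection in steps 1-2 can be sketched as a pair of file checks; the placeholder marker is taken from this case's fixture, and the function name is hypothetical:

```python
import os


def fresh_project(root):
    """Heuristic fresh-project check mirroring Case 2's fixture:
    technical-preferences.md still holds placeholders and no
    production/stage.txt exists yet."""
    prefs = os.path.join(root, "technical-preferences.md")
    stage = os.path.join(root, "production", "stage.txt")
    placeholders = False
    if os.path.exists(prefs):
        with open(prefs, encoding="utf-8") as f:
            placeholders = "[TO BE CONFIGURED]" in f.read()
    return placeholders and not os.path.exists(stage)
```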
---
### Case 3: No CLAUDE.md Found — Error with remediation
**Fixture:**
- `CLAUDE.md` file does not exist (deleted or never created)
- All other files may or may not exist
**Input:** `/onboard`
**Expected behavior:**
1. Skill attempts to read CLAUDE.md and fails
2. Skill outputs an error: "CLAUDE.md not found — cannot generate onboarding summary"
3. Skill provides remediation: "Run `/start` to initialize the project configuration"
4. No partial summary is generated
**Assertions:**
- [ ] Error message clearly identifies the missing file as CLAUDE.md
- [ ] Remediation step (`/start`) is explicitly named
- [ ] Skill does NOT produce a partial output when the root config is missing
- [ ] Verdict is ONBOARDING COMPLETE (with error context, not a crash)
---
### Case 4: Role-Specific Onboarding — User specifies "artist" role
**Fixture:**
- Fully configured project in Production stage
- `art-bible.md` exists in `design/`
- Active sprint has visual story types (animation, VFX)
**Input:** `/onboard artist`
**Expected behavior:**
1. Skill reads all standard files plus any art-relevant docs (art bible, asset specs)
2. Summary is tailored to the artist role: art bible overview, asset pipeline,
current visual stories in the active sprint
3. Technical architecture details (code structure, ADRs) are de-emphasized
4. Specialist agents for art/audio are highlighted in the summary
5. Verdict is ONBOARDING COMPLETE
**Assertions:**
- [ ] Role argument is acknowledged in the output ("Onboarding for: Artist")
- [ ] Art bible summary is included if the file exists
- [ ] Current visual stories from the active sprint are shown
- [ ] Technical implementation details are not the primary focus
- [ ] Verdict is ONBOARDING COMPLETE
---
### Case 5: Director Gate Check — No gate; onboard is read-only orientation
**Fixture:**
- Any configured project state
**Input:** `/onboard`
**Expected behavior:**
1. Skill completes the full onboarding summary
2. No director agents are spawned at any point
3. No gate IDs appear in the output
4. No "May I write" prompts appear
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is ONBOARDING COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads all source files before generating output (no hallucinated project state)
- [ ] Adapts output to project stage (Production ≠ Concept)
- [ ] Respects role argument when provided
- [ ] Does not write any files
- [ ] Ends with ONBOARDING COMPLETE verdict in all paths
---
## Coverage Notes
- The case where `technical-preferences.md` is missing entirely (as opposed to
having placeholders) is not separately tested; behavior follows the graceful
error pattern of Case 3.
- Git history reading is assumed available; offline/no-git scenarios are not
tested here.
- Discipline roles beyond "artist" (e.g., programmer, designer, producer) follow
the same tailoring pattern as Case 4 and are not separately tested.

# Skill Test Spec: /playtest-report
## Skill Summary
`/playtest-report` generates a structured playtest report from session notes or
user input. The report is organized into four sections: Feel/Accessibility,
Bugs Observed, Design Feedback, and Next Steps. When multiple testers participated,
the skill aggregates feedback and distinguishes majority opinions from minority
ones. The skill links to existing bug reports when a reported bug matches a file
in `production/bugs/`.
Reports are written to `production/qa/playtest-[date].md` after a "May I write"
ask. No director gates apply here — the CD-PLAYTEST director gate (if needed) is
a separate invocation. The verdict is COMPLETE when the report is written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-report` for new issues found, `/design-review` for feedback)
---
## Director Gate Checks
None. `/playtest-report` is a documentation utility. The CD-PLAYTEST gate is a
separate invocation and not part of this skill.
---
## Test Cases
### Case 1: Happy Path — User provides playtest notes, structured report produced
**Fixture:**
- User provides typed playtest notes from a single session
- Notes cover: game feel, one bug (framerate drop), and a design concern
(tutorial too long)
- `production/bugs/` exists but is empty (bug not yet reported)
**Input:** `/playtest-report` (user pastes session notes)
**Expected behavior:**
1. Skill reads the provided notes and structures them into the 4-section template
2. Feel/Accessibility: extracts feel observations
3. Bugs: notes the framerate drop with available repro details
4. Design Feedback: notes the tutorial length concern
5. Next Steps: suggests `/bug-report` for the framerate issue and `/design-review`
for the tutorial feedback
6. Skill asks "May I write to `production/qa/playtest-2026-04-06.md`?"
7. Report is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 4 sections are present in the report
- [ ] Bug is listed in the Bugs section (not the Design Feedback section)
- [ ] Next Steps are appropriate (bug report for the framerate issue, design review for feedback)
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
---
### Case 2: Empty Input — Guided prompting through each section
**Fixture:**
- No notes provided by user at invocation
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill detects empty input
2. Skill prompts through each section:
a. "Describe the overall feel and any accessibility observations"
b. "Were any bugs observed? Describe them"
c. "What design feedback did testers provide?"
3. User answers each prompt
4. Skill compiles report from answers and asks "May I write"
5. Report written on approval; verdict is COMPLETE
**Assertions:**
- [ ] At least 3 guiding questions are asked (one per main section)
- [ ] Report is not created until all sections have input (or user explicitly skips one)
- [ ] Verdict is COMPLETE after file is written
---
### Case 3: Multiple Testers — Aggregated feedback with majority/minority notes
**Fixture:**
- User provides notes from 3 testers
- 2/3 testers found the controls "intuitive"
- 1/3 tester found the UI font too small
- All 3 noted the same bug (player stuck on ledge)
**Input:** `/playtest-report` (3-tester session)
**Expected behavior:**
1. Skill identifies 3 distinct tester perspectives in the input
2. Control intuitiveness → noted as "Majority (2/3): controls intuitive"
3. Font size → noted as "Minority (1/3): UI font size concern"
4. Stuck-on-ledge bug → noted as "All testers: player stuck on ledge (confirmed)"
5. Skill generates aggregated report with majority/minority labels
6. Report written after "May I write" approval; verdict is COMPLETE
**Assertions:**
- [ ] Majority opinion (2/3) is labeled as majority
- [ ] Minority opinion (1/3) is labeled as minority
- [ ] Unanimously reported bug is noted as confirmed by all testers
- [ ] Verdict is COMPLETE
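
The majority/minority labeling in steps 2-4 reduces to a threshold over tester counts; a minimal sketch:

```python
def label_opinions(opinions, tester_count):
    """Label each opinion by how many testers raised it.

    opinions: {"controls intuitive": 2, ...} mapping each distinct
    observation to the number of testers who reported it.
    """
    labels = {}
    for text, n in opinions.items():
        if n == tester_count:
            labels[text] = f"All testers ({n}/{tester_count})"
        elif n > tester_count / 2:
            labels[text] = f"Majority ({n}/{tester_count})"
        else:
            labels[text] = f"Minority ({n}/{tester_count})"
    return labels
```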
---
### Case 4: Bug Matches Existing Report — Links to existing file
**Fixture:**
- `production/bugs/bug-2026-03-30-player-stuck-ledge.md` exists
- User's playtest notes describe "player gets stuck on ledges near walls"
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill structures the report and identifies the stuck-on-ledge bug
2. Skill scans `production/bugs/` and finds `bug-2026-03-30-player-stuck-ledge.md`
3. In the Bugs section, the report includes: "See existing report:
production/bugs/bug-2026-03-30-player-stuck-ledge.md"
4. Skill does NOT suggest creating a new bug report for this issue
5. Report written; verdict is COMPLETE
**Assertions:**
- [ ] Existing bug report is found and linked in the playtest report
- [ ] `/bug-report` is NOT suggested for the already-reported issue
- [ ] Cross-reference to existing file appears in the Bugs section
- [ ] Verdict is COMPLETE
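
One way to sketch the cross-reference in step 2 is keyword overlap between the bug description and each filename slug in `production/bugs/`. The overlap threshold and matching strategy here are assumptions for illustration; the skill body defines the real heuristic:

```python
import os
import re


def find_existing_bug(description, bugs_dir):
    """Return the path of the best-matching existing bug report,
    or None when nothing overlaps enough to count as a match."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    best = None
    for name in sorted(os.listdir(bugs_dir)):
        slug_words = set(re.findall(r"[a-z]+", name.lower()))
        overlap = len(words & slug_words)
        # require at least two shared keywords before linking
        if overlap >= 2 and (best is None or overlap > best[0]):
            best = (overlap, os.path.join(bugs_dir, name))
    return best[1] if best else None
```

A match produces the "See existing report: ..." cross-reference instead of a `/bug-report` suggestion.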
---
### Case 5: Director Gate Check — No gate; CD-PLAYTEST is a separate invocation
**Fixture:**
- Playtest notes provided
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill generates and writes the playtest report
2. No director agents are spawned (CD-PLAYTEST is not invoked here)
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No CD-PLAYTEST gate skip message appears
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Structures output into all 4 sections (Feel, Bugs, Design Feedback, Next Steps)
- [ ] Labels majority vs. minority opinions when multiple testers are involved
- [ ] Cross-references existing bug reports when bugs match
- [ ] Asks "May I write to `production/qa/playtest-[date].md`?" before writing
- [ ] Verdict is COMPLETE when report is written
---
## Coverage Notes
- The CD-PLAYTEST director gate (creative director reviews playtest insights
for design implications) is a separate invocation and is not tested here.
- Video recording or screenshot attachments are not tested; the report is a
text-only document.
- The case where a tester's identity is unknown (anonymous feedback) follows
the same aggregation pattern as Case 3 without tester labels.

# Skill Test Spec: /project-stage-detect
## Skill Summary
`/project-stage-detect` automatically analyzes project artifacts to determine
the current development stage. It runs on the Haiku model (read-only) and
examines `production/stage.txt` (if present), design documents in `design/`,
source code in `src/`, sprint and milestone files in `production/`, and the
presence of engine configuration to classify the project into one of seven
stages: Concept, Systems Design, Technical Setup, Pre-Production, Production,
Polish, or Release.
The skill is advisory — it never writes `stage.txt`. That file is only updated
when `/gate-check` passes and the user confirms advancement. The skill reports
its confidence level (HIGH if stage.txt was read directly, MEDIUM if inferred
from artifacts, LOW if conflicting signals were found).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains all seven stage names: Concept, Systems Design, Technical Setup, Pre-Production, Production, Polish, Release
- [ ] Does NOT contain "May I write" language (skill is detection-only)
- [ ] Has a next-step handoff (e.g., `/gate-check` to formally advance stage)
---
## Director Gate Checks
None. `/project-stage-detect` is a read-only detection utility. No director
gates apply.
---
## Test Cases
### Case 1: stage.txt Exists — Reads directly and cross-checks artifacts
**Fixture:**
- `production/stage.txt` contains `Production`
- `design/gdd/` has 4 GDD files
- `src/` has source code files
- `production/sprints/sprint-002.md` exists
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill reads `production/stage.txt` — detects stage `Production`
2. Skill cross-checks artifacts: GDDs present, source code present, sprint present
3. Artifacts are consistent with Production stage
4. Skill reports: Stage = Production, Confidence = HIGH (from stage.txt, confirmed by artifacts)
5. Next step: continue with `/sprint-plan` or `/dev-story`
**Assertions:**
- [ ] Detected stage is Production
- [ ] Confidence is reported as HIGH when stage.txt is present
- [ ] Cross-check result (consistent vs. discrepant) is noted
- [ ] No files are written
- [ ] Verdict clearly states the detected stage
---
### Case 2: No stage.txt but GDDs and Epics Exist — Infers Production
**Fixture:**
- No `production/stage.txt`
- `design/gdd/` has 3 GDD files
- `production/epics/` has 2 epic files
- `src/` has source code files
- `production/sprints/sprint-001.md` exists
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill finds no stage.txt — switches to artifact inference mode
2. Skill finds GDDs (Systems Design complete), epics (Pre-Production complete),
source code and sprints (Production active)
3. Skill infers: Stage = Production
4. Confidence is MEDIUM (inferred from artifacts, not from stage.txt)
5. Skill recommends running `/gate-check` to formalize and write stage.txt
**Assertions:**
- [ ] Inferred stage is Production
- [ ] Confidence is MEDIUM (not HIGH, since stage.txt is absent)
- [ ] Recommendation to run `/gate-check` is present
- [ ] No stage.txt is written by this skill
---
### Case 3: No stage.txt, No Docs, No Source — Infers Concept
**Fixture:**
- No `production/stage.txt`
- `design/` directory exists but is empty
- `src/` exists but contains no code files
- `technical-preferences.md` has placeholders only
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill finds no stage.txt
2. Artifact scan: no GDDs, no source, no epics, no sprints, engine unconfigured
3. Skill infers: Stage = Concept
4. Confidence is MEDIUM
5. Skill suggests `/start` to begin the onboarding workflow
**Assertions:**
- [ ] Inferred stage is Concept
- [ ] Output lists the artifacts that were checked (and found absent)
- [ ] `/start` is suggested as the next step
- [ ] No files are written
---
### Case 4: Discrepancy — stage.txt says Production but no source code
**Fixture:**
- `production/stage.txt` contains `Production`
- `design/gdd/` has GDD files
- `src/` directory exists but contains no source code files
- No sprint files exist
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill reads stage.txt — detects `Production`
2. Cross-check finds: no source code, no sprints — inconsistent with Production
3. Skill flags discrepancy: "stage.txt says Production but no source code or sprints found"
4. Skill reports detected stage as Production (honoring stage.txt) but
confidence drops to LOW due to artifact mismatch
5. Skill suggests reviewing stage.txt manually or running `/gate-check`
**Assertions:**
- [ ] Discrepancy is flagged explicitly in the output
- [ ] Confidence is LOW when artifacts contradict stage.txt
- [ ] stage.txt value is not silently overridden
- [ ] User is advised to verify the discrepancy manually
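
The detection and confidence logic across Cases 1-4 can be condensed into one advisory function. This sketch is simplified to the stages these fixtures exercise and honors stage.txt rather than overriding it:

```python
def detect_stage(stage_txt, has_src, has_sprints, has_gdds):
    """Advisory stage detection. Returns (stage, confidence) and
    never writes anything.

    stage_txt: contents of production/stage.txt, or None if absent.
    """
    artifacts_say_production = has_src and has_sprints
    if stage_txt:
        stage = stage_txt.strip()
        if stage == "Production" and not artifacts_say_production:
            # Honor stage.txt but flag the artifact mismatch (Case 4)
            return stage, "LOW"
        return stage, "HIGH"
    # Inference mode: no stage.txt (Cases 2 and 3)
    if artifacts_say_production and has_gdds:
        return "Production", "MEDIUM"
    if not (has_gdds or has_src or has_sprints):
        return "Concept", "MEDIUM"
    return "Pre-Production", "MEDIUM"
```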
---
### Case 5: Director Gate Check — No gate; detection is advisory
**Fixture:**
- Any project state with or without stage.txt
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill completes full stage detection
2. No director agents are spawned at any point
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] Detection output is purely advisory
- [ ] Verdict names the detected stage without triggering any gate
---
## Protocol Compliance
- [ ] Reads stage.txt if present; falls back to artifact inference if absent
- [ ] Always reports a confidence level (HIGH / MEDIUM / LOW)
- [ ] Cross-checks stage.txt against artifacts and flags discrepancies
- [ ] Does not write stage.txt (that is `/gate-check`'s responsibility)
- [ ] Ends with a next-step recommendation appropriate to the detected stage
---
## Coverage Notes
- The Technical Setup stage (engine configured, no GDDs yet) and Pre-Production
stage (GDDs complete, no epics yet) follow the same artifact-inference pattern
as Cases 2 and 3 and are not separately fixture-tested.
- The Polish and Release stages are not fixture-tested here; they follow the
same high-confidence (stage.txt present) or inference logic.
- Confidence levels are advisory — the skill does not gate any actions on them.

# Skill Test Spec: /prototype
## Skill Summary
`/prototype` manages a rapid prototyping workflow for validating a game mechanic
before committing to full production implementation. Prototypes are created in
`prototypes/[mechanic-name]/` and are intentionally disposable — coding standards
are relaxed (no ADR required, AC can be minimal, hardcoded values acceptable).
After implementation, the skill produces a findings document summarizing what
was learned and recommending next steps.
The skill asks "May I write to `prototypes/[name]/`?" before creating files. If a
prototype already exists, the skill offers to extend, replace, or archive. No
director gates apply. Verdicts: PROTOTYPE COMPLETE (prototype built and findings
documented) or PROTOTYPE ABANDONED (mechanic found to be unworkable).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PROTOTYPE COMPLETE, PROTOTYPE ABANDONED
- [ ] Contains "May I write" language before creating prototype files
- [ ] Has a next-step handoff (e.g., `/design-system` to formalize, or archive)
---
## Director Gate Checks
None. Prototypes are throwaway validation artifacts. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Mechanic concept prototyped, findings documented
**Fixture:**
- `prototypes/` directory exists
- No existing prototype for "grapple-hook"
**Input:** `/prototype grapple-hook`
**Expected behavior:**
1. Skill asks "May I write to `prototypes/grapple-hook/`?"
2. After approval: creates `prototypes/grapple-hook/` directory and basic
implementation skeleton (main scene, player controller extension)
3. Skill implements a minimal grapple hook mechanic (intentionally rough — no
polish, hardcoded values acceptable)
4. Skill produces `prototypes/grapple-hook/findings.md` with:
- What was tested
- What worked
- What didn't work
- Recommendation (proceed / abandon / revise concept)
5. Verdict is PROTOTYPE COMPLETE
**Assertions:**
- [ ] "May I write to `prototypes/grapple-hook/`?" is asked before any files are created
- [ ] Implementation is isolated to `prototypes/` (not `src/`)
- [ ] `findings.md` is created with at minimum: tested/worked/didn't-work/recommendation
- [ ] Verdict is PROTOTYPE COMPLETE
---
### Case 2: Prototype Already Exists — Offers Extend, Replace, or Archive
**Fixture:**
- `prototypes/grapple-hook/` already exists from a previous prototype session
- It contains a basic implementation and a findings.md
**Input:** `/prototype grapple-hook`
**Expected behavior:**
1. Skill detects existing `prototypes/grapple-hook/` directory
2. Skill reports: "Prototype already exists for grapple-hook"
3. Skill presents 3 options:
- Extend: add new features to the existing prototype
- Replace: start fresh (asks "May I replace `prototypes/grapple-hook/`?")
- Archive: move to `prototypes/archive/grapple-hook/` and start fresh
4. User selects; skill proceeds accordingly
**Assertions:**
- [ ] Existing prototype is detected and reported
- [ ] Exactly 3 options are presented (extend, replace, archive)
- [ ] Replace path includes a "May I replace" confirmation
- [ ] Archive path moves (not deletes) the existing prototype
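
The archive option's move-not-delete behavior can be sketched as:

```python
import shutil
from pathlib import Path


def archive_prototype(name, root="prototypes"):
    """Move (never delete) an existing prototype to
    prototypes/archive/[name]/ so a fresh attempt can start."""
    src = Path(root) / name
    dest = Path(root) / "archive" / name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    return dest
```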
---
### Case 3: Prototype Validates Mechanic — Recommends Proceeding to Production
**Fixture:**
- Prototype implementation complete
- Findings: grapple hook mechanic is fun and technically feasible
**Input:** `/prototype grapple-hook` (prototype session complete)
**Expected behavior:**
1. After prototype is built and tested, findings are summarized
2. Recommendation in findings.md: "Mechanic validated — recommend proceeding
to `/design-system` for full specification"
3. Skill handoff message explicitly suggests `/design-system grapple-hook`
4. Verdict is PROTOTYPE COMPLETE
**Assertions:**
- [ ] `findings.md` contains an explicit recommendation
- [ ] Recommendation references `/design-system` when mechanic is validated
- [ ] Handoff message echoes the recommendation
- [ ] Verdict is PROTOTYPE COMPLETE (not PROTOTYPE ABANDONED)
---
### Case 4: Prototype Reveals Mechanic is Unworkable — PROTOTYPE ABANDONED
**Fixture:**
- Prototype implemented for "procedural-dialogue"
- After testing: the mechanic creates incoherent dialogue trees and is
frustrating to play
**Input:** `/prototype procedural-dialogue`
**Expected behavior:**
1. Prototype is built
2. Findings document the failure: incoherent output, player confusion, technical complexity
3. Recommendation in findings.md: "Mechanic not viable — abandoning"
4. `findings.md` documents the specific reasons the mechanic failed
5. Skill suggests alternatives in the handoff (e.g., curated dialogue instead)
6. Verdict is PROTOTYPE ABANDONED
**Assertions:**
- [ ] Verdict is PROTOTYPE ABANDONED (not PROTOTYPE COMPLETE)
- [ ] `findings.md` documents specific failure reasons (not vague)
- [ ] Alternative approaches are suggested in the handoff
- [ ] Prototype files are retained (not deleted) for reference
---
### Case 5: Director Gate Check — No gate; prototypes are validation artifacts
**Fixture:**
- Mechanic concept provided
**Input:** `/prototype wall-jump`
**Expected behavior:**
1. Skill creates and documents the prototype
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED — no gate verdict
---
## Protocol Compliance
- [ ] Asks "May I write to `prototypes/[name]/`?" before creating any files
- [ ] Creates all files under `prototypes/` (not `src/`)
- [ ] Produces `findings.md` with tested/worked/didn't-work/recommendation
- [ ] Notes that production coding standards are intentionally relaxed
- [ ] Offers extend/replace/archive when prototype already exists
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED
---
## Coverage Notes
- Prototype implementation quality (code style) is intentionally not tested —
prototypes are throwaway artifacts and quality standards do not apply.
- The archiving mechanism is mentioned in Case 2 but the archive format is
not assertion-tested in detail.
- Engine-specific prototype scaffolding (GDScript scenes vs. C# MonoBehaviour)
follows the same flow with engine-appropriate file types.

# Skill Test Spec: /qa-plan
## Skill Summary
`/qa-plan` generates a structured QA test plan for a feature or sprint milestone.
It reads story files for the specified sprint, extracts acceptance criteria from
each story, cross-references test standards from `coding-standards.md` to assign
the appropriate test type (unit, integration, visual, UI, or config/data), and
produces a prioritized QA plan document.
The skill asks "May I write to `production/qa/qa-plan-sprint-NNN.md`?" before
persisting the output. If an existing test plan for the same sprint is found, the
skill offers to update rather than replace. The verdict is COMPLETE when the plan
is written. No director gates are used — gate-level story readiness is handled by
`/story-readiness`.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the plan
- [ ] Has a next-step handoff (e.g., `/smoke-check` or `/story-readiness`)
---
## Director Gate Checks
None. `/qa-plan` is a planning utility. Story readiness gates are separate.
---
## Test Cases
### Case 1: Happy Path — Sprint with 4 stories generates full test plan
**Fixture:**
- `production/sprints/sprint-003.md` lists 4 stories with defined acceptance criteria
- Stories span types: 1 logic (formula), 1 integration, 1 visual, 1 UI
- `coding-standards.md` is present with test evidence table
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill reads sprint-003.md and identifies 4 stories
2. Skill reads each story's acceptance criteria
3. Skill assigns test types per coding-standards.md table:
- Logic story → Unit test (BLOCKING)
- Integration story → Integration test (BLOCKING)
- Visual story → Screenshot + lead sign-off (ADVISORY)
- UI story → Manual walkthrough doc (ADVISORY)
4. Skill drafts QA plan with story-by-story test type breakdown
5. Skill asks "May I write to `production/qa/qa-plan-sprint-003.md`?"
6. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 4 stories are included in the plan
- [ ] Test type is assigned per coding-standards.md (not guessed)
- [ ] Gate level (BLOCKING vs ADVISORY) is noted for each story
- [ ] "May I write" is asked with the correct file path
- [ ] Verdict is COMPLETE
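
The assignment in step 3, plus the UNTESTABLE flag from Case 2, can be sketched as a table lookup. The mapping below mirrors this case's table; coding-standards.md remains the source of truth:

```python
# Story type -> (test type, gate level), per the Case 1 table.
TEST_MATRIX = {
    "logic":       ("Unit test", "BLOCKING"),
    "integration": ("Integration test", "BLOCKING"),
    "visual":      ("Screenshot + lead sign-off", "ADVISORY"),
    "ui":          ("Manual walkthrough doc", "ADVISORY"),
}


def assign_test(story_type, acceptance_criteria):
    """Assign a test type, or flag the story as untestable when its
    acceptance criteria section is empty (Case 2)."""
    if not acceptance_criteria:
        return ("UNTESTABLE — Acceptance Criteria required", None)
    return TEST_MATRIX[story_type]
```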
---
### Case 2: Story With No Acceptance Criteria — Flagged as UNTESTABLE
**Fixture:**
- `production/sprints/sprint-004.md` lists 3 stories; one story has empty
acceptance criteria section
**Input:** `/qa-plan sprint-004`
**Expected behavior:**
1. Skill reads all 3 stories
2. Skill detects the story with no AC
3. Story is flagged as `UNTESTABLE — Acceptance Criteria required` in the plan
4. Other 2 stories receive normal test type assignments
5. Plan is written with the UNTESTABLE story flagged; verdict is COMPLETE
**Assertions:**
- [ ] UNTESTABLE label appears for the story with no AC
- [ ] Plan is not blocked — the other stories are still planned
- [ ] Output suggests adding AC to the flagged story (next step)
- [ ] Verdict is COMPLETE (the plan is still generated)
---
### Case 3: Existing Test Plan Found — Offers update rather than replace
**Fixture:**
- `production/qa/qa-plan-sprint-003.md` already exists from a previous run
- Sprint-003 has 2 new stories added since the last plan
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill reads sprint-003.md and detects 2 stories not in the existing plan
2. Skill reports: "Existing QA plan found for sprint-003 — offering to update"
3. Skill presents the 2 new stories and their proposed test assignments
4. Skill asks "May I update `production/qa/qa-plan-sprint-003.md`?" (not overwrite)
5. Updated plan is written on approval
**Assertions:**
- [ ] Skill detects the existing plan file
- [ ] "update" language is used (not "overwrite")
- [ ] Only new stories are proposed for addition — existing entries preserved
- [ ] Verdict is COMPLETE
---
### Case 4: No Stories Found for Sprint — Error with guidance
**Fixture:**
- `production/sprints/sprint-007.md` does not exist
- No other sprint file matching sprint-007
**Input:** `/qa-plan sprint-007`
**Expected behavior:**
1. Skill attempts to read sprint-007.md — file not found
2. Skill outputs: "No sprint file found for sprint-007"
3. Skill suggests running `/sprint-plan` to create the sprint first
4. No plan is written; no "May I write" is asked
**Assertions:**
- [ ] Error message names the missing sprint file
- [ ] `/sprint-plan` is suggested as the remediation step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (error state)
---
### Case 5: Director Gate Check — No gate; QA planning is a utility
**Fixture:**
- Sprint with valid stories and AC
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill generates and writes QA plan
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads coding-standards.md test evidence table before assigning test types
- [ ] Assigns BLOCKING or ADVISORY gate level per story type
- [ ] Flags stories with no AC as UNTESTABLE (does not silently skip them)
- [ ] Detects existing plan and offers update path
- [ ] Asks "May I write" before creating or updating the plan file
- [ ] Verdict is COMPLETE when plan is written
---
## Coverage Notes
- The case where `coding-standards.md` is missing (skill cannot assign test types)
is not fixture-tested; behavior would follow the BLOCKED pattern with a note
to restore the standards file.
- Multi-sprint planning (spanning 2 sprints) is not tested; the skill is designed
for one sprint at a time.
- Config/data story type (balance tuning → smoke check) follows the same
assignment pattern as other types in Case 1 and is not separately tested.


# Skill Test Spec: /regression-suite
## Skill Summary
`/regression-suite` maps test coverage to GDD requirements: it reads the
acceptance criteria from story files in the current sprint (or a specified epic),
then scans `tests/` for corresponding test files and checks whether each AC has
a matching assertion. It produces a coverage report identifying which ACs are
fully covered, partially covered, or untested, and which test files have no
matching AC (orphan tests).
The skill may write a coverage report to `production/qa/` after a "May I write"
ask. No director gates apply. Verdicts: FULL COVERAGE (all ACs have tests),
GAPS FOUND (some ACs are untested), or CRITICAL GAPS (a critical-priority AC
has no test).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: FULL COVERAGE, GAPS FOUND, CRITICAL GAPS
- [ ] Contains "May I write" language (skill may write coverage report)
- [ ] Has a next-step handoff (e.g., `/test-setup` if framework missing, `/qa-plan` if plan missing)
---
## Director Gate Checks
None. `/regression-suite` is a QA analysis utility. No director gates apply.
---
## Test Cases
### Case 1: Full Coverage — All ACs in sprint have corresponding tests
**Fixture:**
- `production/sprints/sprint-004.md` lists 3 stories with 2 ACs each (6 total)
- `tests/unit/` and `tests/integration/` contain test files that match all 6 ACs
(by system name and scenario description)
**Input:** `/regression-suite sprint-004`
**Expected behavior:**
1. Skill reads all 6 ACs from sprint-004 stories
2. Skill scans test files and matches each AC to at least one test assertion
3. All 6 ACs have coverage
4. Skill produces coverage report: "6/6 ACs covered"
5. Skill asks "May I write to `production/qa/regression-sprint-004.md`?"
6. File is written on approval; verdict is FULL COVERAGE
**Assertions:**
- [ ] All 6 ACs appear in the coverage report
- [ ] Each AC is marked as covered with the matching test file referenced
- [ ] Verdict is FULL COVERAGE
- [ ] "May I write" is asked before writing the report
---
### Case 2: Gaps Found — 3 ACs have no tests
**Fixture:**
- Sprint has 5 stories with 8 total ACs
- Tests exist for 5 of the 8 ACs; 3 ACs have no corresponding test file or assertion
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill reads all 8 ACs
2. Skill scans tests — 5 matched, 3 unmatched
3. Coverage report lists the 3 untested ACs by story and AC text
4. Skill asks "May I write to `production/qa/regression-[sprint]-[date].md`?"
5. Report is written; verdict is GAPS FOUND
**Assertions:**
- [ ] The 3 untested ACs are listed by name in the report
- [ ] Matched ACs are also shown (not only the gaps)
- [ ] Verdict is GAPS FOUND (not FULL COVERAGE)
- [ ] Report is written after "May I write" approval
---
### Case 3: Critical AC Untested — CRITICAL GAPS verdict, flagged prominently
**Fixture:**
- Sprint has 4 stories; one story is Priority: Critical with 2 ACs
- One of the critical-priority ACs has no test
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill reads all stories and ACs, noting which stories are critical priority
2. Skill scans tests — the critical AC has no match
3. Report prominently flags: "CRITICAL GAP: [AC text] — no test found (Critical priority story)"
4. Skill recommends blocking story completion until test is added
5. Verdict is CRITICAL GAPS
**Assertions:**
- [ ] Verdict is CRITICAL GAPS (not GAPS FOUND)
- [ ] Critical priority AC is flagged more prominently than normal gaps
- [ ] Recommendation to block story completion is included
- [ ] Non-critical gaps (if any) are also listed
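The verdict ordering across Cases 1 to 3 can be sketched as a single precedence rule; the input structure below is a hypothetical simplification of the skill's real coverage data.

```python
# Sketch of the regression-suite verdict rule: any untested AC that belongs
# to a Critical-priority story outranks ordinary gaps. Structures are
# illustrative, not the skill's actual data model.
def coverage_verdict(acs):
    untested = [ac for ac in acs if not ac["covered"]]
    if any(ac["critical"] for ac in untested):
        return "CRITICAL GAPS"
    if untested:
        return "GAPS FOUND"
    return "FULL COVERAGE"
```

Note that a covered critical AC contributes nothing to the verdict; only *untested* critical ACs escalate it.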
---
### Case 4: Orphan Tests — Test file has no matching AC
**Fixture:**
- `tests/unit/save_system_test.gd` exists with assertions for scenarios
not present in any current story's AC list
- Current sprint stories do not reference save system
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill scans tests and cross-references ACs
2. `save_system_test.gd` assertions do not match any current AC
3. Test file is flagged as ORPHAN TEST in the coverage report
4. Report notes: "Orphan tests may belong to a past or future sprint, or AC was renamed"
5. Verdict is FULL COVERAGE or GAPS FOUND depending on overall AC coverage
(orphan tests do not affect verdict, they are advisory)
**Assertions:**
- [ ] Orphan test is flagged in the report
- [ ] Orphan flag includes the filename and suggestion (past sprint / renamed AC)
- [ ] Orphan tests do not cause a GAPS FOUND verdict on their own
- [ ] Overall verdict reflects AC coverage only
---
### Case 5: Director Gate Check — No gate; regression-suite is a QA utility
**Fixture:**
- Sprint with stories and test files
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill produces coverage report and writes it
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS — no gate verdict
---
## Protocol Compliance
- [ ] Reads story ACs from sprint files before scanning tests
- [ ] Matches ACs to tests by system name and scenario (not file name alone)
- [ ] Flags critical-priority untested ACs as CRITICAL GAPS
- [ ] Flags orphan tests (exist in tests/ but no AC matches)
- [ ] Asks "May I write" before persisting the coverage report
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS
---
## Coverage Notes
- The heuristic for matching an AC to a test (by system name + scenario keywords)
is approximate; exact matching logic is defined in the skill body.
- Integration test coverage is mapped the same way as unit test coverage; no
distinction in verdicts is made between the two.
- This skill does not run the tests — it maps AC text to test assertions. Test
execution is handled by the CI pipeline.
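One way to approximate that matching heuristic (purely illustrative; the skill body defines the real logic) is keyword overlap between the AC text and the test assertion name:

```python
import re

# Illustrative approximation of AC-to-test matching: tokenize both strings
# and require a minimum share of the AC's keywords to appear in the test
# name. The 0.5 threshold is an assumption, not a value from the skill.
def matches(ac_text, test_name, threshold=0.5):
    tokenize = lambda s: set(re.findall(r"[a-z]+", s.lower()))
    ac_words = tokenize(ac_text)
    overlap = ac_words & tokenize(test_name)
    return len(overlap) >= threshold * len(ac_words)
```

A scheme like this matches by system name and scenario wording rather than file name alone, which is why a renamed AC can turn an existing test into an orphan.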

# Skill Test Spec: /release-checklist
## Skill Summary
`/release-checklist` generates an internal release readiness checklist covering:
sprint story completion, open bug severity, QA sign-off status, build stability,
and changelog readiness. It is an internal gate — not a platform/store checklist
(that is `/launch-checklist`). When a previous release checklist exists, it shows
a delta of resolved and newly introduced issues.
The skill writes its checklist report to `production/releases/release-checklist-[date].md`
after a "May I write" ask. No director gates apply — `/gate-check` handles
formal phase gate logic. Verdicts: RELEASE READY, RELEASE BLOCKED, or CONCERNS.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: RELEASE READY, RELEASE BLOCKED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/launch-checklist` for external or `/gate-check` for phase)
---
## Director Gate Checks
None. `/release-checklist` is an internal audit utility. Formal phase advancement
is managed by `/gate-check`.
---
## Test Cases
### Case 1: Happy Path — All Sprint Stories Complete, QA Passed, RELEASE READY
**Fixture:**
- `production/sprints/sprint-008.md` — all stories are `Status: Done`
- No open bugs with severity HIGH or CRITICAL in `production/bugs/`
- `production/qa/qa-plan-sprint-008.md` has QA sign-off annotation
- Changelog entry for this version exists
- `production/stage.txt` contains `Polish`
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill reads sprint-008: all stories Done
2. Skill reads bugs: no HIGH or CRITICAL open bugs
3. Skill confirms QA plan has sign-off
4. Skill confirms changelog entry exists
5. All checks pass; skill asks "May I write to
`production/releases/release-checklist-2026-04-06.md`?"
6. Report written; verdict is RELEASE READY
**Assertions:**
- [ ] All 4 check categories are evaluated (stories, bugs, QA, changelog)
- [ ] All items appear with PASS markers
- [ ] Verdict is RELEASE READY
- [ ] "May I write" is asked before writing
---
### Case 2: Open HIGH Severity Bugs — RELEASE BLOCKED
**Fixture:**
- All sprint stories are Done
- `production/bugs/` contains 2 open bugs with severity HIGH
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill reads sprint — stories complete
2. Skill reads bugs — 2 HIGH severity bugs open
3. Skill reports: "RELEASE BLOCKED — 2 open HIGH severity bugs must be resolved"
4. Both bug filenames are listed in the report
5. Verdict is RELEASE BLOCKED
**Assertions:**
- [ ] Verdict is RELEASE BLOCKED (not CONCERNS)
- [ ] Both bug filenames are listed explicitly
- [ ] Skill makes clear HIGH severity bugs are blocking (not advisory)
---
### Case 3: Changelog Not Generated — CONCERNS
**Fixture:**
- All stories Done, no HIGH/CRITICAL bugs
- No changelog entry found for the current version/sprint
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill checks all items
2. Changelog check fails: no changelog entry found
3. Skill reports: "CONCERNS — Changelog not generated for this release"
4. Skill suggests running `/changelog` to generate it
5. Verdict is CONCERNS (advisory — not a hard block)
**Assertions:**
- [ ] Verdict is CONCERNS (not RELEASE BLOCKED — changelog is advisory)
- [ ] `/changelog` is suggested as the remediation
- [ ] Other passing checks are shown in the report
- [ ] Missing changelog is described as advisory, not blocking
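The severity rule behind Cases 2 and 3 can be sketched as a short precedence check; the inputs are hypothetical stand-ins for the skill's real bug and checklist data.

```python
# Sketch of the internal release verdict: CRITICAL/HIGH open bugs block the
# release, while MEDIUM/LOW bugs and advisory misses (such as a missing
# changelog) only downgrade the verdict to CONCERNS. Inputs are illustrative.
BLOCKING = {"CRITICAL", "HIGH"}

def release_verdict(open_bug_severities, advisory_misses=0):
    if BLOCKING & set(open_bug_severities):
        return "RELEASE BLOCKED"
    if open_bug_severities or advisory_misses:
        return "CONCERNS"
    return "RELEASE READY"
```

The split between blocking and advisory is the crux of both cases: the same check never produces RELEASE BLOCKED for a changelog miss or a MEDIUM bug alone.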
---
### Case 4: Previous Release Checklist Exists — Delta From Last Release
**Fixture:**
- `production/releases/release-checklist-2026-03-20.md` exists
- Previous: 1 story was incomplete, 1 HIGH bug open
- Current: all stories Done, HIGH bug resolved, but now 1 MEDIUM bug appeared
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill finds the previous checklist and loads it
2. New checklist is generated and compared:
- Newly resolved: "Story [X] — was open, now Done"
- Newly resolved: "HIGH bug [filename] — was open, now closed"
- New item: "1 MEDIUM bug appeared (advisory)"
3. Delta section shows all changes prominently
4. Verdict is CONCERNS (MEDIUM bug is advisory, not blocking)
**Assertions:**
- [ ] Delta section appears in the report with resolved and new items
- [ ] Newly resolved items from the previous checklist are noted
- [ ] New items not present in the previous checklist are highlighted
- [ ] Verdict reflects current state (not previous state)
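The delta in this case is essentially a set comparison between the previous and current checklists; the item identifiers below are hypothetical.

```python
# Sketch of the checklist delta: items open last time but not now are
# "newly resolved"; items open now but absent last time are "new".
# Identifiers are illustrative only.
def checklist_delta(previous_open, current_open):
    return {
        "resolved": sorted(set(previous_open) - set(current_open)),
        "new": sorted(set(current_open) - set(previous_open)),
    }

# Hypothetical open-item lists mirroring Case 4's fixture.
previous_open = ["story-12-incomplete", "bug-high-0042"]
current_open = ["bug-medium-0057"]
```

Because the verdict is computed from `current_open` alone, the delta is reporting only; it never resurrects the previous run's state.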
---
### Case 5: Director Gate Check — No gate; release-checklist is an internal audit
**Fixture:**
- Active sprint with stories and bug reports
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill runs the full checklist and writes the report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS — no gate verdict
---
## Protocol Compliance
- [ ] Checks sprint story completion status
- [ ] Checks open bug severity (CRITICAL/HIGH = BLOCKED; MEDIUM/LOW = CONCERNS)
- [ ] Checks QA plan sign-off status
- [ ] Checks changelog existence
- [ ] Compares against previous checklist when one exists
- [ ] Asks "May I write" before writing the report
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS
---
## Coverage Notes
- Build stability verification (no failed CI runs) is listed as a check category
but relies on external CI system state; the skill notes this as a MANUAL CHECK
if CI integration is not configured.
- CRITICAL bugs always result in RELEASE BLOCKED regardless of other items;
this is equivalent to the HIGH severity case in Case 2.
- Stories with `Status: In Review` (not Done) are treated as incomplete
and result in RELEASE BLOCKED; this edge case follows the same pattern
as the HIGH bug case.

# Skill Test Spec: /reverse-document
## Skill Summary
`/reverse-document` generates design or architecture documentation from existing
source code. It reads the specified source file(s), infers design intent from
class structure, method names, constants, and comments, and produces either a
GDD skeleton (for gameplay systems) or an architecture overview (for technical
systems). The output is a best-effort inference — magic numbers and undocumented
logic may result in a PARTIAL verdict.
The skill asks "May I write to [inferred path]?" before creating the document.
No director gates apply. Verdicts: COMPLETE (clean inference), PARTIAL (some
fields are ambiguous and need human review).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, PARTIAL
- [ ] Contains "May I write" collaborative protocol language before writing the doc
- [ ] Has a next-step handoff (e.g., `/design-review` to validate the generated doc)
---
## Director Gate Checks
None. `/reverse-document` is a documentation utility. No director gates apply.
---
## Test Cases
### Case 1: Well-Structured Source — Accurate design doc skeleton produced
**Fixture:**
- `src/gameplay/health_system.gd` exists with:
- `@export var max_health: int = 100`
- `func take_damage(amount: int)` with clamping logic
- `signal health_changed(new_value: int)`
- Docstrings on all public methods
**Input:** `/reverse-document src/gameplay/health_system.gd`
**Expected behavior:**
1. Skill reads the source file and identifies the health system
2. Skill infers design intent: max health, take_damage behavior, health signal
3. Skill produces GDD skeleton for health system with 8 required sections:
Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases, Dependencies,
Tuning Knobs, Acceptance Criteria
4. Formulas section includes the inferred clamping formula
5. Tuning Knobs notes `max_health = 100` as a configurable value
6. Skill asks "May I write to `design/gdd/health-system.md`?"
7. File written; verdict is COMPLETE
**Assertions:**
- [ ] All 8 required GDD sections are present in the output
- [ ] `max_health = 100` appears as a Tuning Knob
- [ ] Clamping formula is captured in the Formulas section
- [ ] "May I write" is asked with the inferred path
- [ ] Verdict is COMPLETE
---
### Case 2: Ambiguous Source — Magic Numbers, PARTIAL Verdict
**Fixture:**
- `src/gameplay/enemy_ai.gd` exists with:
- Inline magic numbers: `if distance < 150:`, `speed = 3.5`
- No comments or docstrings
- Complex state machine logic that is not self-explanatory
**Input:** `/reverse-document src/gameplay/enemy_ai.gd`
**Expected behavior:**
1. Skill reads the file and detects magic numbers with no context
2. Skill produces a GDD skeleton with notes: "AMBIGUOUS VALUE: 150 (unknown units —
is this pixels, world units, or tiles?)"
3. Skill marks the Formulas and Tuning Knobs sections as requiring human review
4. Skill asks "May I write to `design/gdd/enemy-ai.md`?" with PARTIAL advisory
5. File written with PARTIAL markers; verdict is PARTIAL
**Assertions:**
- [ ] AMBIGUOUS VALUE annotations appear for magic numbers
- [ ] Sections needing human review are marked explicitly
- [ ] Verdict is PARTIAL (not COMPLETE)
- [ ] File is still written — PARTIAL is not a blocking failure
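The AMBIGUOUS VALUE detection can be approximated with a pattern scan for bare numeric literals; the regex, skip rules, and sample lines below are illustrative, not the skill's real parser.

```python
import re

# Illustrative magic-number scan: flag numeric literals used in comparisons
# or assignments, skipping exported values (those are tuning knobs, not
# magic numbers) and comment lines.
def find_magic_numbers(source):
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "@export" in line or line.strip().startswith("#"):
            continue
        for match in re.finditer(r"[<>=]=?\s*(\d+(?:\.\d+)?)", line):
            flagged.append((lineno, match.group(1)))
    return flagged

# Hypothetical GDScript excerpt mirroring Case 2's fixture.
sample = """@export var max_health: int = 100
if distance < 150:
    speed = 3.5"""
```

Each flagged value would then be annotated in the generated GDD with an AMBIGUOUS VALUE note asking for units and intent, which is what drives the PARTIAL verdict.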
---
### Case 3: Multiple Interdependent Files — Cross-System Overview Produced
**Fixture:**
- User provides 2 source files: `combat_system.gd` and `damage_resolver.gd`
- The files reference each other (combat calls damage_resolver)
**Input:** `/reverse-document src/gameplay/combat_system.gd src/gameplay/damage_resolver.gd`
**Expected behavior:**
1. Skill reads both files and detects the dependency relationship
2. Skill produces a cross-system architecture overview (not individual GDDs)
3. Overview describes: Combat System → Damage Resolver interaction, shared
interfaces, data flow between the two
4. Skill asks "May I write to `docs/architecture/combat-damage-overview.md`?"
5. Overview written after approval; verdict is COMPLETE (or PARTIAL if ambiguous)
**Assertions:**
- [ ] Both files are analyzed together (not as two separate docs)
- [ ] Cross-system dependency is documented in the output
- [ ] Output file is written to `docs/architecture/` (not `design/gdd/`)
- [ ] Verdict is COMPLETE or PARTIAL
---
### Case 4: Source File Not Found — Error
**Fixture:**
- `src/gameplay/inventory_system.gd` does not exist
**Input:** `/reverse-document src/gameplay/inventory_system.gd`
**Expected behavior:**
1. Skill attempts to read the specified file — not found
2. Skill outputs: "Source file not found: src/gameplay/inventory_system.gd"
3. Skill suggests checking the path or running `/map-systems` to identify
the correct source file
4. No document is created
**Assertions:**
- [ ] Error message names the missing file with the full path
- [ ] Alternative suggestion (check path or `/map-systems`) is provided
- [ ] No write tool is called
- [ ] No verdict is issued (error state)
---
### Case 5: Director Gate Check — No gate; reverse-document is a utility
**Fixture:**
- Well-structured source file exists
**Input:** `/reverse-document src/gameplay/health_system.gd`
**Expected behavior:**
1. Skill generates and writes the design doc
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE or PARTIAL — no gate verdict involved
---
## Protocol Compliance
- [ ] Reads source file(s) before generating any content
- [ ] Produces all 8 required GDD sections when target is a gameplay system
- [ ] Annotates ambiguous values with AMBIGUOUS VALUE markers
- [ ] Produces cross-system overview (not individual GDDs) for multiple files
- [ ] Asks "May I write" before creating any output file
- [ ] Verdict is COMPLETE (clean inference) or PARTIAL (ambiguous fields)
---
## Coverage Notes
- Architecture overview format (for technical/infrastructure systems) differs
from GDD format; the inferred output type is determined by the nature of the
source file (gameplay logic → GDD; engine/infra code → architecture doc).
- The case where a source file is readable but contains only auto-generated
boilerplate with no meaningful logic is not tested; skill would likely produce
a near-empty skeleton with a PARTIAL verdict.
- C# and Blueprint source files follow the same inference pattern as GDScript;
language-specific differences are handled in the skill body.

# Skill Test Spec: /setup-engine
## Skill Summary
`/setup-engine` configures the project's engine, language, rendering backend,
physics engine, specialist agent assignments, and naming conventions by
populating `technical-preferences.md`. It accepts an optional engine argument
(e.g., `/setup-engine godot`) to skip the engine-selection step. For each
section of `technical-preferences.md`, the skill presents a draft and asks
"May I write to `technical-preferences.md`?" before updating.
The skill also populates the specialist routing table (file extension → agent
mappings) based on the chosen engine. It has no director gates — configuration
is a technical utility task. The verdict is always COMPLETE when the file is
fully written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before updating technical-preferences.md
- [ ] Has a next-step handoff (e.g., `/brainstorm` or `/start` depending on flow)
---
## Director Gate Checks
None. `/setup-engine` is a technical configuration skill. No director gates apply.
---
## Test Cases
### Case 1: Godot 4 + GDScript — Full engine configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `godot`
**Input:** `/setup-engine godot`
**Expected behavior:**
1. Skill skips engine-selection step (argument provided)
2. Skill presents language options for Godot: GDScript or C#
3. User selects GDScript
4. Skill drafts all engine sections: engine/language/rendering/physics fields,
naming conventions (snake_case for GDScript), specialist assignments
(godot-specialist, gdscript-specialist, godot-shader-specialist, etc.)
5. Skill populates the routing table: `.gd` → gdscript-specialist, `.gdshader` → godot-shader-specialist, `.tscn` → godot-specialist
6. Skill asks "May I write to `technical-preferences.md`?"
7. File is written after approval; verdict is COMPLETE
**Assertions:**
- [ ] Engine field is set to Godot 4 (not a placeholder)
- [ ] Language field is set to GDScript
- [ ] Naming conventions are GDScript-appropriate (snake_case)
- [ ] Routing table includes `.gd`, `.gdshader`, and `.tscn` entries
- [ ] Specialists are assigned (not placeholders)
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
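The engine-to-routing-table mapping can be sketched as static data keyed by engine. Specialist names follow the examples given in these cases; the authoritative mapping is whatever the skill writes into `technical-preferences.md`.

```python
# Illustrative routing tables (file extension -> specialist agent) for the
# engines covered in Cases 1-3. Entries mirror this spec's examples and are
# not an exhaustive list.
ROUTING = {
    "godot": {".gd": "gdscript-specialist",
              ".gdshader": "godot-shader-specialist",
              ".tscn": "godot-specialist"},
    "unity": {".cs": "csharp-specialist",
              ".asmdef": "unity-specialist",
              ".unity": "unity-specialist"},
    "unreal": {".uasset": "blueprint-specialist",
               ".umap": "unreal-specialist"},
}

def specialist_for(engine, path):
    ext = path[path.rfind("."):]
    return ROUTING[engine].get(ext)
```

A lookup like this is what the routing-table assertions in each case are probing: every key file type for the chosen engine resolves to a concrete specialist, not a placeholder.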
---
### Case 2: Unity + C# — Unity-specific configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `unity`
**Input:** `/setup-engine unity`
**Expected behavior:**
1. Skill sets engine to Unity, language to C#
2. Naming conventions are C#-appropriate (PascalCase for classes, camelCase for fields)
3. Specialist assignments reference unity-specialist, csharp-specialist
4. Routing table: `.cs` → csharp-specialist, `.asmdef` → unity-specialist,
`.unity` (scene) → unity-specialist
5. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
**Assertions:**
- [ ] Engine field is set to Unity (not Godot or Unreal)
- [ ] Language field is set to C#
- [ ] Naming conventions reflect C# conventions
- [ ] Routing table includes `.cs` and `.unity` entries
- [ ] Verdict is COMPLETE
---
### Case 3: Unreal + Blueprint — Unreal-specific configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `unreal`
**Input:** `/setup-engine unreal`
**Expected behavior:**
1. Skill sets engine to Unreal Engine 5, primary language to Blueprint (Visual Scripting)
2. Specialist assignments reference unreal-specialist, blueprint-specialist
3. Routing table: `.uasset` → blueprint-specialist or unreal-specialist,
`.umap` → unreal-specialist
4. Performance budgets are pre-set with Unreal defaults (e.g., higher draw call budget)
5. Skill asks "May I write" and writes on approval; verdict is COMPLETE
**Assertions:**
- [ ] Engine field is set to Unreal Engine 5
- [ ] Routing table includes `.uasset` and `.umap` entries
- [ ] Blueprint specialist is assigned
- [ ] Verdict is COMPLETE
---
### Case 4: Engine Already Configured — Offers to reconfigure specific sections
**Fixture:**
- `technical-preferences.md` has engine set to Godot 4 with all fields populated
- No engine argument provided
**Input:** `/setup-engine`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects fully configured engine (Godot 4)
2. Skill reports: "Engine already configured as Godot 4 + GDScript"
3. Skill presents options: reconfigure all, reconfigure specific section only
(Engine/Language, Naming Conventions, Specialists, Performance Budgets)
4. User selects "Reconfigure Performance Budgets only"
5. Only the performance budget section is updated; all other fields unchanged
6. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
**Assertions:**
- [ ] Skill does NOT overwrite all fields when only a section update was requested
- [ ] User is offered section-specific reconfiguration
- [ ] Only the selected section is modified in the written file
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; setup-engine is a utility skill
**Fixture:**
- Fresh project with no engine configured
**Input:** `/setup-engine godot`
**Expected behavior:**
1. Skill completes full engine configuration
2. No director agents are spawned at any point
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Presents draft configuration before asking to write
- [ ] Asks "May I write to `technical-preferences.md`?" before writing
- [ ] Respects engine argument when provided (skips selection step)
- [ ] Detects existing config and offers partial reconfigure
- [ ] Routing table is populated for all key file types for the chosen engine
- [ ] Verdict is COMPLETE after file is written
---
## Coverage Notes
- Godot 4 + C# (instead of GDScript) follows the same flow as Case 1 with
different naming conventions and the godot-csharp-specialist assignment.
This variant is not separately tested.
- The engine-version-specific guidance (e.g., Godot 4.6 knowledge gap warning
from VERSION.md) is surfaced by the skill but not assertion-tested here.
- Performance budget defaults per engine are noted as engine-specific but
exact default values are not assertion-tested.

# Skill Test Spec: /skill-improve
## Skill Summary
`/skill-improve` runs an automated test-fix-retest improvement loop on a skill
file. It invokes `/skill-test static` (and optionally `/skill-test category`) to
establish a baseline score, diagnoses the failing checks, proposes targeted fixes
to the SKILL.md file, asks "May I write the improvements to [skill path]?", applies
the fixes, and re-runs the tests to confirm improvement.
If the proposed fix makes the skill worse (regression), the fix is reverted (with
user confirmation) rather than applied. If the skill is already perfect (0 failures),
the skill exits immediately without making changes. No director gates apply. Verdicts:
IMPROVED (score went up), NO CHANGE (no improvements possible or user declined), or
REVERTED (fix was applied but caused regression and was reverted).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: IMPROVED, NO CHANGE, REVERTED
- [ ] Contains "May I write" collaborative protocol language before applying fixes
- [ ] Has a next-step handoff (e.g., run `/skill-test spec` to validate behavioral compliance)
---
## Director Gate Checks
None. `/skill-improve` is a meta-utility skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Skill With 2 Static Failures, Both Fixed, IMPROVED
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 2 static failures:
- Check 4: no "May I write" language despite having Write in allowed-tools
- Check 5: no next-step handoff at the end
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Skill runs `/skill-test static some-skill` — baseline: 5/7 checks pass
2. Skill diagnoses the 2 failing checks (4 and 5)
3. Skill proposes fixes:
- Add "May I write" language to the appropriate phase
- Add a next-step handoff section at the end
4. Skill asks "May I write improvements to `.claude/skills/some-skill/SKILL.md`?"
5. Fixes applied; `/skill-test static some-skill` re-run — now 7/7 checks pass
6. Verdict is IMPROVED (5→7)
**Assertions:**
- [ ] Baseline score is established before any changes (5/7)
- [ ] Both failing checks are diagnosed and addressed in the proposed fix
- [ ] "May I write" is asked before applying the fix
- [ ] Re-test confirms improvement (7/7)
- [ ] Verdict is IMPROVED with before/after score shown
---
### Case 2: Fix Causes Regression — Score Comparison Shows Regression, REVERTED
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 1 static failure (missing handoff)
- Proposed fix inadvertently removes the verdict keywords section
(introducing a new failure)
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Baseline: 6/7 checks pass (1 failure: missing handoff)
2. Skill proposes fix and asks "May I write improvements?"
3. Fix is applied; re-test runs
4. Re-test result: 5/7 (fixed the handoff but broke verdict keywords)
5. Skill detects regression: score went DOWN
6. Skill asks user: "Fix caused a regression (6→5). May I revert the changes?"
7. User confirms; changes are reverted; verdict is REVERTED
**Assertions:**
- [ ] Re-test score is compared to baseline before finalizing
- [ ] Regression is detected when score decreases
- [ ] User is asked to confirm revert (not automatic)
- [ ] File is reverted on user confirmation
- [ ] Verdict is REVERTED
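The score comparison driving the IMPROVED/REVERTED decision can be sketched as a three-way branch; the revert itself still requires user confirmation, as the assertions above note. Scores here are simple pass counts, an assumption for illustration.

```python
# Sketch of the improvement-loop decision: compare the re-test score to the
# baseline. A lower score proposes a revert (applied only after the user
# confirms); an equal score means no effective change.
def improvement_outcome(baseline, retest):
    if retest > baseline:
        return "IMPROVED"
    if retest < baseline:
        return "REVERTED"
    return "NO CHANGE"
```

Comparing against the baseline captured *before* any edit is what makes regression detection possible at all, which is why the spec insists the baseline is established first.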
---
### Case 3: Skill With Category Assignment — Baseline Captures Both Scores
**Fixture:**
- `.claude/skills/gate-check/SKILL.md` is a gate skill with 1 static failure
and 2 category (G-criteria) failures
- `tests/skills/quality-rubric.md` has Gate Skills section
**Input:** `/skill-improve gate-check`
**Expected behavior:**
1. Skill runs both static and category tests for the baseline:
- Static: 6/7 checks pass
- Category: 3/5 G-criteria pass
2. Combined baseline: 9/12
3. Skill diagnoses all 3 failures and proposes fixes
4. "May I write improvements to `.claude/skills/gate-check/SKILL.md`?"
5. Fixes applied; both test types re-run
6. Re-test: static 7/7, category 5/5 = 12/12
7. Verdict is IMPROVED (9→12)
**Assertions:**
- [ ] Both static and category scores are captured in the baseline
- [ ] Combined score is used for comparison (not just one type)
- [ ] All 3 failures are addressed in the proposed fix
- [ ] Re-test confirms improvement in both score types
- [ ] Verdict is IMPROVED with combined before/after
---
### Case 4: Skill Already Perfect — No Improvements Needed
**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` has no static failures
- Category score is also 5/5 (if applicable)
**Input:** `/skill-improve brainstorm`
**Expected behavior:**
1. Skill runs `/skill-test static brainstorm` — 7/7 checks pass
2. If category applies: 5/5 criteria pass
3. Skill outputs: "No improvements needed — brainstorm is fully compliant"
4. Skill exits without proposing any changes
5. No "May I write" is asked; no files are modified
6. Verdict is NO CHANGE
**Assertions:**
- [ ] Skill exits immediately after confirming 0 failures
- [ ] "No improvements needed" message is shown
- [ ] No changes are proposed
- [ ] No "May I write" is asked
- [ ] Verdict is NO CHANGE
---
### Case 5: Director Gate Check — No gate; skill-improve is a meta utility
**Fixture:**
- Skill with at least 1 static failure
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Skill runs the test-fix-retest loop
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is IMPROVED, NO CHANGE, or REVERTED — no gate verdict
---
## Protocol Compliance
- [ ] Always establishes a baseline score before proposing any changes
- [ ] Shows before/after score comparison in the output
- [ ] Asks "May I write" before applying any fix
- [ ] Detects regressions by comparing re-test score to baseline
- [ ] Asks for user confirmation before reverting (not automatic)
- [ ] Ends with IMPROVED, NO CHANGE, or REVERTED verdict
---
## Coverage Notes
- The improvement loop is designed to run only one fix-retest cycle per
invocation; running multiple iterations requires re-invoking `/skill-improve`.
- Behavioral compliance (spec-mode test results) is not included in the
improvement loop — only structural (static) and category scores are automated.
- The case where the skill file cannot be read (permissions error or missing file)
is not tested; this would result in an error before the baseline is established.

# Skill Test Spec: /skill-test
## Skill Summary
`/skill-test` validates skill files for structural correctness, behavioral
compliance, and category-rubric scoring. It operates in three modes:
- **static**: Checks a single skill file for structural requirements
(frontmatter fields, phase headings, verdict keywords, "May I write" language,
next-step handoff) without needing a fixture. Produces a per-check PASS/FAIL
table.
- **spec**: Reads a test spec file from `tests/skills/` and evaluates the skill
against each test case assertion, producing a case-by-case verdict.
- **audit**: Produces a coverage table of all skills in `.claude/skills/` and
all agents in `.claude/agents/`, showing which have spec files and which do not.
An additional **category** mode reads the quality rubric for a skill category
(e.g., gate skills) and scores the skill against rubric criteria. The verdict
system differs by mode.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdicts: COMPLIANT, NON-COMPLIANT, WARNINGS (static mode); PASS, FAIL, PARTIAL (spec mode); COMPLETE (audit mode)
- [ ] Does NOT contain "May I write" language (skill is read-only in all modes)
- [ ] Has a next-step handoff (e.g., `/skill-improve` to fix issues found)
---
## Director Gate Checks
None. `/skill-test` is a meta-utility skill. No director gates apply.
---
## Test Cases
### Case 1: Static Mode — Well-formed skill, all 7 checks pass, COMPLIANT
**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` exists and is well-formed:
- Has all required frontmatter fields
- Has ≥2 phase headings
- Has verdict keywords
- Has "May I write" language
- Has a next-step handoff
- Documents director gates
- Documents gate mode behavior (lean/solo skips)
**Input:** `/skill-test static brainstorm`
**Expected behavior:**
1. Skill reads `.claude/skills/brainstorm/SKILL.md`
2. Skill runs all 7 structural checks
3. All 7 checks pass
4. Skill outputs a PASS/FAIL table with all 7 checks marked PASS
5. Verdict is COMPLIANT
**Assertions:**
- [ ] Exactly 7 structural checks are reported
- [ ] All 7 are marked PASS
- [ ] Verdict is COMPLIANT
- [ ] No files are written
---
### Case 2: Static Mode — Skill Missing "May I Write" Despite Write Tool in allowed-tools
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has `Write` in `allowed-tools` frontmatter
- The skill body has no "May I write" or "May I update" language
**Input:** `/skill-test static some-skill`
**Expected behavior:**
1. Skill reads `some-skill/SKILL.md`
2. Check 4 (collaborative write protocol) fails: `Write` in allowed-tools but no
"May I write" language found
3. All other checks may pass
4. Verdict is NON-COMPLIANT with Check 4 as the failing assertion
5. Output lists Check 4 as FAIL with explanation
**Assertions:**
- [ ] Check 4 is marked FAIL
- [ ] Explanation identifies the specific mismatch (Write tool without "May I write" language)
- [ ] Verdict is NON-COMPLIANT
- [ ] Other passing checks are shown (not only the failure)
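Check 4's mismatch logic can be sketched as a single predicate. The frontmatter delimiter handling and regexes below are assumptions about how the check might be implemented, not the skill's actual code:

```python
import re

def check4_collaborative_write(skill_md: str) -> bool:
    """Pass unless Write is in allowed-tools without 'May I write' language."""
    # Split YAML frontmatter from the body at the closing '---' delimiter
    frontmatter, _, body = skill_md.partition("\n---\n")
    has_write_tool = bool(re.search(r"allowed-tools:.*\bWrite\b", frontmatter))
    has_ask = bool(re.search(r"May I (write|update)", body, re.IGNORECASE))
    return (not has_write_tool) or has_ask
```

A read-only skill (no `Write` in allowed-tools) passes regardless, which is why only the combination of write capability and missing ask language fails.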
---
### Case 3: Spec Mode — gate-check Skill Evaluated Against Spec
**Fixture:**
- `tests/skills/gate-check.md` exists with 5 test cases
- `.claude/skills/gate-check/SKILL.md` exists
**Input:** `/skill-test spec gate-check`
**Expected behavior:**
1. Skill reads both the skill file and the spec file
2. Skill evaluates each of the 5 test case assertions against the skill's behavior
3. For each case: PASS if skill behavior matches spec assertions, FAIL if not
4. Skill produces a case-by-case result table
5. Overall verdict: PASS (all 5), PARTIAL (some), or FAIL (majority failing)
**Assertions:**
- [ ] All 5 test cases from the spec are evaluated
- [ ] Each case has an individual PASS/FAIL result
- [ ] Overall verdict is PASS, PARTIAL, or FAIL based on case results
- [ ] No files are written
---
### Case 4: Audit Mode — Coverage Table of All Skills and Agents
**Fixture:**
- `.claude/skills/` contains 72+ skill directories
- `.claude/agents/` contains 49+ agent files
- `tests/skills/` contains spec files for a subset of skills
**Input:** `/skill-test audit`
**Expected behavior:**
1. Skill enumerates all skills in `.claude/skills/` and all agents in `.claude/agents/`
2. Skill checks `tests/skills/` for a corresponding spec file for each
3. Skill produces a coverage table:
- Each skill/agent listed
- "Has Spec" column: YES or NO
- Summary: "X of Y skills have specs; A of B agents have specs"
4. Verdict is COMPLETE
**Assertions:**
- [ ] All skill directories are enumerated (not just a sample)
- [ ] "Has Spec" column is accurate for each entry
- [ ] Summary counts are correct
- [ ] Verdict is COMPLETE
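The coverage-table construction in steps 1-3 can be sketched like this; the directory layout below is a tiny stand-in fixture, not the real repository:

```python
import tempfile
from pathlib import Path

def audit_coverage(skills_dir: Path, specs_dir: Path) -> dict:
    """Map each skill directory name to whether a matching spec file exists."""
    return {
        d.name: (specs_dir / f"{d.name}.md").exists()
        for d in sorted(skills_dir.iterdir()) if d.is_dir()
    }

# Tiny fixture standing in for .claude/skills/ and tests/skills/
root = Path(tempfile.mkdtemp())
for name in ("adopt", "brainstorm"):
    (root / "skills" / name).mkdir(parents=True)
(root / "specs").mkdir()
(root / "specs" / "adopt.md").write_text("# Skill Test Spec: /adopt\n")

coverage = audit_coverage(root / "skills", root / "specs")
summary = f"{sum(coverage.values())} of {len(coverage)} skills have specs"
```

The same enumeration would run a second time over `.claude/agents/`, producing the "A of B agents" half of the summary.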
---
### Case 5: Category Mode — Gate Skill Evaluated Against Quality Rubric
**Fixture:**
- `tests/skills/quality-rubric.md` exists with a "Gate Skills" section defining
criteria G1-G5 (e.g., G1: has mode guard, G2: has verdict table, etc.)
- `.claude/skills/gate-check/SKILL.md` is a gate skill
**Input:** `/skill-test category gate-check`
**Expected behavior:**
1. Skill reads `quality-rubric.md` and identifies the Gate Skills section
2. Skill evaluates `gate-check/SKILL.md` against criteria G1-G5
3. Each criterion is scored: PASS, PARTIAL, or FAIL
4. Overall category score is computed (e.g., 4/5 criteria pass)
5. Verdict is COMPLIANT (all pass), WARNINGS (some partial), or NON-COMPLIANT (failures)
**Assertions:**
- [ ] All gate criteria (G1-G5) from quality-rubric.md are evaluated
- [ ] Each criterion has an individual score
- [ ] Overall verdict reflects the score distribution
- [ ] No files are written
---
## Protocol Compliance
- [ ] Static mode checks exactly 7 structural assertions
- [ ] Spec mode evaluates each test case from the spec file individually
- [ ] Audit mode covers all skills AND agents (not just one category)
- [ ] Category mode reads quality-rubric.md to get criteria (not hardcoded)
- [ ] Does not write any files in any mode
- [ ] Suggests `/skill-improve` as the next step when issues are found
---
## Coverage Notes
- The skill-test skill is self-referential (it can test itself). The static
mode case for skill-test's own SKILL.md is not separately fixture-tested to
avoid infinite recursion in test design.
- The specific 7 structural checks are defined in the skill body; only Check 4
(May I write) is individually tested here because it has the most nuanced logic.
- Audit mode counts are approximate — the exact number of skills and agents will
change as the system grows; assertions use "all" rather than fixed counts.

# Skill Test Spec: /smoke-check
## Skill Summary
`/smoke-check` is the gate between implementation and QA hand-off. It detects the
test environment, runs the automated test suite (via Bash), scans test coverage
against sprint stories, and uses `AskUserQuestion` to batch-verify manual smoke
checks with the developer. It writes a report to `production/qa/smoke-[date].md`
after explicit user approval.
Verdicts: PASS (tests pass, all smoke checks pass, no missing test evidence),
PASS WITH WARNINGS (tests pass or NOT RUN, all critical checks pass, but advisory
gaps exist such as missing test coverage), or FAIL (any automated test failure or
any Batch 1/Batch 2 smoke check returns FAIL).
No director gates apply. The skill does NOT invoke any director agents.
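The verdict rules above can be sketched as a small decision function. Argument names and types are assumptions for illustration, not the skill's real interface:

```python
def smoke_verdict(tests: str, smoke_fail: bool, missing_coverage: int) -> str:
    """tests is 'PASS', 'FAIL', or 'NOT RUN'; smoke_fail covers Batch 1/2."""
    if tests == "FAIL" or smoke_fail:
        return "FAIL"
    if tests == "NOT RUN" or missing_coverage > 0:
        return "PASS WITH WARNINGS"  # advisory gaps, no critical failures
    return "PASS"
```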
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PASS, PASS WITH WARNINGS, FAIL
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-report` on FAIL, QA hand-off guidance on PASS)
---
## Director Gate Checks
None. `/smoke-check` is a pre-QA utility skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Automated tests pass, manual items confirmed, PASS
**Fixture:**
- `tests/` directory exists with a GDUnit4 runner script
- Engine detected as Godot from `technical-preferences.md`
- `production/qa/qa-plan-sprint-005.md` exists
- Automated test runner reports 12 tests, 12 passing, 0 failing
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
- All sprint stories have matching test files (no MISSING coverage)
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill detects test directory and engine, notes QA plan found
2. Runs `godot --headless --script tests/gdunit4_runner.gd` via Bash
3. Parses output: 12/12 passing
4. Scans test coverage — all stories COVERED or EXPECTED
5. Uses `AskUserQuestion` for Batch 1 (core stability) and Batch 2 (sprint mechanics)
6. Developer selects PASS for all items
7. Report assembled: automated tests PASS, all smoke checks PASS, no MISSING coverage
8. Asks "May I write this smoke check report to `production/qa/smoke-[date].md`?"
9. Writes report after approval
10. Delivers verdict: PASS
**Assertions:**
- [ ] Automated test runner is invoked via Bash
- [ ] `AskUserQuestion` is used for manual smoke check batches
- [ ] "May I write" is asked before writing the report file
- [ ] Report is written to `production/qa/smoke-[date].md`
- [ ] Verdict is PASS
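Step 3's parsing of the runner output could look like this minimal sketch. The summary-line format `Passed: N, Failed: M` is an assumption; real GDUnit4 output may differ:

```python
import re

def parse_runner_summary(output: str):
    """Extract pass/fail counts from a runner summary line, or None."""
    m = re.search(r"Passed:\s*(\d+),\s*Failed:\s*(\d+)", output)
    if m is None:
        return None  # unparseable output is treated as NOT RUN
    passed, failed = int(m.group(1)), int(m.group(2))
    return {"passed": passed, "failed": failed, "ok": failed == 0}
```

Returning `None` rather than raising keeps the NOT RUN path (engine binary unavailable or unrecognized output) on the warning track instead of FAIL.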
---
### Case 2: Failure Path — Automated test fails, FAIL verdict
**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports 10 tests run: 8 passing, 2 failing
- Failing tests: `test_health_clamp_at_zero`, `test_damage_calculation_negative`
- QA plan exists
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill runs automated tests via Bash
2. Parses output — 2 failures detected
3. Records failing test names
4. Proceeds through manual smoke check batches
5. Report shows automated tests as FAIL with failing test names listed
6. Asks to write report; writes after approval
7. Delivers FAIL verdict with message: "The smoke check failed. Do not hand off to
QA until these failures are resolved." Lists failing tests and suggests fixing
then re-running `/smoke-check`
**Assertions:**
- [ ] Failing test names are listed in the report
- [ ] Verdict is FAIL
- [ ] Post-verdict message directs developer to fix failures before QA hand-off
- [ ] `/smoke-check` re-run is suggested after fixing
---
### Case 3: Manual Confirmation — AskUserQuestion used, PASS WITH WARNINGS
**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports all tests passing (8/8)
- One Logic story has no matching test file (MISSING coverage)
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
**Input:** `/smoke-check`
**Expected behavior:**
1. Automated tests PASS
2. Coverage scan finds 1 MISSING entry for a Logic story
3. `AskUserQuestion` is used for Batch 1 and Batch 2 — developer confirms all PASS
4. Report shows: automated tests PASS, manual checks all PASS, 1 MISSING coverage entry
5. Verdict is PASS WITH WARNINGS — build ready for QA, but MISSING entry must be
resolved before `/story-done` closes the affected story
6. Asks to write report; writes after approval
**Assertions:**
- [ ] `AskUserQuestion` is used for manual smoke check batches (not inline text prompts)
- [ ] MISSING test coverage entry appears in the report
- [ ] Verdict is PASS WITH WARNINGS (not PASS, not FAIL)
- [ ] Advisory note explains MISSING entry must be resolved before `/story-done`
- [ ] Report file is written to `production/qa/smoke-[date].md`
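The coverage scan that produces the MISSING entry in this case can be sketched as a slug-to-filename match. The matching rule (slug substring, hyphens mapped to underscores) is an assumption for illustration:

```python
def scan_coverage(story_slugs, test_files):
    """Mark each story COVERED if any test filename contains its slug."""
    return {
        slug: "COVERED"
        if any(slug.replace("-", "_") in name for name in test_files)
        else "MISSING"
        for slug in story_slugs
    }
```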
---
### Case 4: No Test Directory — Skill stops with guidance
**Fixture:**
- `tests/` directory does not exist
- Engine is configured as Godot
**Input:** `/smoke-check`
**Expected behavior:**
1. Phase 1 checks for `tests/` directory — not found
2. Skill outputs: "No test directory found at `tests/`. Run `/test-setup` to
scaffold the testing infrastructure, or create the directory manually if
tests live elsewhere."
3. Skill stops — no automated tests run, no manual smoke checks, no report written
**Assertions:**
- [ ] Error message references the missing `tests/` directory
- [ ] `/test-setup` is suggested as the remediation step
- [ ] Skill stops after this message (no further phases run)
- [ ] No report file is written
---
### Case 5: Director Gate Check — No gate; smoke-check is a QA pre-check utility
**Fixture:**
- Valid test setup, automated tests pass, manual smoke checks confirmed
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill runs all phases and produces a PASS or PASS WITH WARNINGS verdict
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is PASS, PASS WITH WARNINGS, or FAIL — no gate verdict involved
---
## Protocol Compliance
- [ ] Uses `AskUserQuestion` for all manual smoke check batches (Batch 1, Batch 2, Batch 3)
- [ ] Runs automated tests via Bash before asking any manual questions
- [ ] Asks "May I write" before creating the report file — never writes without approval
- [ ] Verdict vocabulary is strictly PASS / PASS WITH WARNINGS / FAIL — no other verdicts
- [ ] FAIL is triggered by automated test failures or Batch 1/Batch 2 FAIL responses
- [ ] PASS WITH WARNINGS is triggered when MISSING test coverage exists but no critical failures
- [ ] NOT RUN (engine binary unavailable) is recorded as a warning, not a FAIL
- [ ] Does not invoke director gates at any point
---
## Coverage Notes
- The `quick` argument (skips Phase 3 coverage scan and Batch 3) is not separately
fixture-tested; it follows the same pattern as Case 1 with a coverage-skip note in output.
- The `--platform` argument adds platform-specific AskUserQuestion batches and a
per-platform verdict table; not separately tested here.
- The case where the engine binary is not on PATH (NOT RUN) follows the PASS WITH
WARNINGS pattern and is covered by the protocol compliance assertions above.

# Skill Test Spec: /soak-test
## Skill Summary
`/soak-test` generates a structured soak test protocol — an extended runtime
test plan designed to surface memory leaks, performance drift, and stability
issues that only appear under sustained gameplay. The skill produces a document
specifying the test duration, system under test, monitoring checkpoints (e.g.,
memory sample every 30 minutes), pass/fail thresholds, and conditions for early
termination.
The skill asks "May I write to `production/qa/soak-[slug]-[date].md`?" before
persisting. If a previous soak test for the same system exists, the skill offers
to extend the duration or add new conditions. No director gates apply. The verdict
is COMPLETE when the soak test protocol is written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the protocol
- [ ] Has a next-step handoff (e.g., `/regression-suite` or `/release-checklist`)
---
## Director Gate Checks
None. `/soak-test` is a QA planning utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Online gameplay feature, 2-hour soak protocol
**Fixture:**
- User specifies: system = "online multiplayer lobby", duration = "2 hours"
- `technical-preferences.md` has engine configured
**Input:** `/soak-test online-lobby 2h`
**Expected behavior:**
1. Skill generates a 2-hour soak test protocol for the online lobby system
2. Protocol includes: monitoring checkpoints every 30 minutes, metrics to track
(memory usage, connection count, packet loss), pass thresholds, early termination
conditions (crash or >20% memory growth)
3. Networking-specific checks are included (session drop rate, reconnect handling)
4. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
5. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] Protocol duration matches the requested 2 hours
- [ ] Monitoring checkpoints are at reasonable intervals (e.g., every 30 minutes)
- [ ] Network-specific checks are included (not just generic memory checks)
- [ ] "May I write" is asked with the correct file path
- [ ] Verdict is COMPLETE
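The duration and checkpoint arithmetic in this case can be sketched as follows; the `2h`/`30m` argument format is taken from this spec, and the always-sample-at-end rule is an assumption:

```python
def parse_duration(arg: str) -> int:
    """Convert '2h' or '30m' to minutes."""
    return int(arg[:-1]) * (60 if arg.endswith("h") else 1)

def checkpoints(duration_min: int, interval_min: int = 30) -> list:
    """Checkpoint times in minutes, one per interval plus the end of the run."""
    ticks = list(range(interval_min, duration_min + 1, interval_min))
    if not ticks or ticks[-1] != duration_min:
        ticks.append(duration_min)  # always sample at termination
    return ticks
```

For the 2-hour soak here, samples land at 30, 60, 90, and 120 minutes.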
---
### Case 2: No Target Defined — Prompts for system, duration, and conditions
**Fixture:**
- No arguments provided
- No soak test config in session state
**Input:** `/soak-test`
**Expected behavior:**
1. Skill detects no target system or duration specified
2. Skill asks: "What system or feature should be soak-tested?"
3. After user responds with system: Skill asks: "What duration? (e.g., 1h, 4h, 8h)"
4. After user responds with duration: Skill asks for specific conditions or
uses defaults (normal gameplay loop, default player count)
5. Skill generates protocol from collected inputs and asks "May I write"
**Assertions:**
- [ ] At minimum 2 follow-up questions are asked (system + duration)
- [ ] Default conditions are applied when user doesn't specify custom ones
- [ ] Protocol is not generated until system and duration are known
- [ ] Verdict is COMPLETE after file is written
---
### Case 3: Previous Soak Test Exists — Offers to extend or add conditions
**Fixture:**
- `production/qa/soak-online-lobby-2026-03-15.md` exists with a 1-hour protocol
- User wants to extend to 4 hours with new memory threshold conditions
**Input:** `/soak-test online-lobby 4h`
**Expected behavior:**
1. Skill finds existing soak test for online-lobby
2. Skill reports: "Previous soak test found: soak-online-lobby-2026-03-15.md (1h)"
3. Skill presents options: create new protocol (4h standalone), or extend the
existing protocol to 4h and add new conditions
4. User selects extend; existing checkpoints are preserved, new ones added
5. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
(new file, not overwriting old one)
**Assertions:**
- [ ] Existing soak test is surfaced and referenced
- [ ] User is offered extend vs. new options
- [ ] New file is created (old file is not overwritten)
- [ ] Extended protocol includes both old and new checkpoints
- [ ] Verdict is COMPLETE
---
### Case 4: Mobile Target Platform — Memory-specific checkpoints added
**Fixture:**
- `technical-preferences.md` specifies target platform: Mobile
- User requests soak test for "gameplay session" at 30 minutes
**Input:** `/soak-test gameplay 30m`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects mobile target platform
2. Soak test protocol includes mobile-specific memory checkpoints:
- Check heap memory growth vs. device baseline
- Check texture memory at checkpoint intervals
- Add warning threshold at 300MB (mobile ceiling)
3. Protocol also includes thermal/battery drain advisory notes
4. Skill asks "May I write?" and writes on approval; verdict is COMPLETE
**Assertions:**
- [ ] Mobile platform is detected from technical-preferences.md
- [ ] Memory checkpoints include mobile-appropriate thresholds (not desktop)
- [ ] Thermal/battery notes are present in the protocol
- [ ] Verdict is COMPLETE
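The platform-dependent threshold selection can be sketched as a lookup. The 300MB mobile warning comes from this case; the desktop figures and the fallback rule are example assumptions, not project values:

```python
THRESHOLDS_MB = {
    "Mobile": {"warn": 300, "fail": 400},      # mobile ceiling per this case
    "Desktop": {"warn": 2048, "fail": 3072},   # assumed example numbers
}

def memory_thresholds(platform: str) -> dict:
    """Pick memory thresholds for the target platform, defaulting to Desktop."""
    return THRESHOLDS_MB.get(platform, THRESHOLDS_MB["Desktop"])
```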
---
### Case 5: Director Gate Check — No gate; soak-test is a planning utility
**Fixture:**
- Valid system and duration provided
**Input:** `/soak-test combat 1h`
**Expected behavior:**
1. Skill generates and writes the soak test protocol
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Collects system, duration, and conditions before generating protocol
- [ ] Includes monitoring checkpoints at regular intervals
- [ ] Includes pass/fail thresholds and early termination conditions
- [ ] Adapts checkpoints to target platform (mobile vs. desktop)
- [ ] Asks "May I write" before creating the protocol file
- [ ] Verdict is COMPLETE when file is written
---
## Coverage Notes
- Soak tests for specific engine subsystems (rendering pipeline, physics
simulation) follow the same protocol structure and are not separately tested.
- The case where the user provides a duration shorter than the minimum useful
soak period (e.g., 5 minutes) is not tested; the skill would note this is
too short for meaningful results.
- Automated execution of the soak test protocol is outside this skill's scope —
this skill generates the plan, not the runner.

# Skill Test Spec: /start
## Skill Summary
`/start` is the first-time onboarding skill for new projects. It guides the
user through naming the project, choosing a game engine, and setting up the
initial directory structure. It creates stub configuration files (CLAUDE.md,
technical-preferences.md) and then routes to `/setup-engine` with the chosen
engine as an argument. Each file or directory created is gated behind a
"May I write" ask, following the collaborative protocol.
The skill detects whether a project is already configured and whether a
partial setup exists, offering to resume or restart as appropriate. It has
no director gates — it is a utility setup skill that runs before any agent
hierarchy exists.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" collaborative protocol language for each config file
- [ ] Has a next-step handoff at the end (routes to `/setup-engine`)
---
## Director Gate Checks
None. `/start` is a utility setup skill. No director agents exist yet at the
point this skill runs.
---
## Test Cases
### Case 1: Happy Path — Fresh repo, no engine, full onboarding flow
**Fixture:**
- Empty repository: no CLAUDE.md overrides, no `production/stage.txt`, no
`technical-preferences.md` content beyond placeholders
- No existing design docs or source code
**Input:** `/start`
**Expected behavior:**
1. Skill detects no existing configuration and begins fresh onboarding
2. Skill asks for project name
3. Skill presents 3 engine options: Godot 4, Unity, Unreal Engine 5
4. User selects an engine
5. Skill asks "May I write the initial directory structure?"
6. Skill creates all directories defined in `directory-structure.md`
7. Skill asks "May I write CLAUDE.md stub?" and writes it on approval
8. Skill routes to `/setup-engine [chosen-engine]` to complete technical config
**Assertions:**
- [ ] Project name is captured before any file is written
- [ ] Exactly 3 engine options are presented
- [ ] "May I write" is asked for each config file individually
- [ ] No file is written without explicit user approval
- [ ] Handoff to `/setup-engine` occurs at the end with the chosen engine argument
- [ ] Verdict is COMPLETE after all files are written and handoff is issued
---
### Case 2: Already Configured — Detects existing config, offers to skip or reconfigure
**Fixture:**
- `technical-preferences.md` has engine already set (not placeholder)
- `production/stage.txt` exists with `Concept`
**Input:** `/start`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects configured engine
2. Skill reports: "This project is already configured with [engine]"
3. Skill presents options: skip (exit), reconfigure engine, or reconfigure specific sections
4. If user selects skip: skill exits cleanly with a summary of current config
5. If user selects reconfigure: skill proceeds to the engine-selection step
**Assertions:**
- [ ] Skill does NOT overwrite existing config without user choosing reconfigure
- [ ] Detected engine name is shown to the user in the status message
- [ ] User is offered at least 2 options (skip or reconfigure)
- [ ] Verdict is COMPLETE whether user skips or reconfigures
---
### Case 3: Engine Choice — User picks Godot 4, routes to /setup-engine godot
**Fixture:**
- Fresh repo — no existing configuration
**Input:** `/start`
**Expected behavior:**
1. Skill presents engine options and user selects Godot 4
2. Skill writes initial stubs (directory structure, CLAUDE.md) after approval
3. Skill explicitly routes to `/setup-engine godot` as the next step
4. Handoff message clearly names the engine and the next skill invocation
**Assertions:**
- [ ] Handoff command is `/setup-engine godot` (not generic `/setup-engine`)
- [ ] Handoff is issued after all initial stubs are written, not before
- [ ] Engine choice is echoed back to user before writing begins
---
### Case 4: Interrupted Setup — Partial config detected, offers resume or restart
**Fixture:**
- Directory structure exists (was created) but `technical-preferences.md` is
still all placeholders (engine was never chosen — setup was interrupted)
- No `production/stage.txt`
**Input:** `/start`
**Expected behavior:**
1. Skill detects partial state: directories exist but engine is unconfigured
2. Skill reports: "A partial setup was detected — directories exist but engine is not configured"
3. Skill offers: resume from engine selection, or restart from scratch
4. If resume: skill skips directory creation, proceeds to engine choice
5. If restart: skill asks "May I overwrite existing structure?" before proceeding
**Assertions:**
- [ ] Partial state is correctly identified (directories present, engine absent)
- [ ] User is offered resume vs. restart choice — not forced into one path
- [ ] Resume path skips re-creating directories (no redundant "May I write" for structure)
- [ ] Restart path asks for permission to overwrite before touching any files
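The partial-state detection in step 1 can be sketched as a three-way classifier. The `[TBD]` placeholder marker and the use of `production/` as the directory sentinel are assumptions for this sketch:

```python
import tempfile
from pathlib import Path

def detect_setup_state(root: Path) -> str:
    """Return 'fresh', 'partial', or 'configured' for the onboarding flow."""
    prefs = root / "technical-preferences.md"
    dirs_exist = (root / "production").is_dir()
    engine_set = prefs.exists() and "[TBD]" not in prefs.read_text()
    if engine_set:
        return "configured"
    return "partial" if dirs_exist else "fresh"

# Tiny fixture: empty repo, then directories created but engine unconfigured
root = Path(tempfile.mkdtemp())
state_fresh = detect_setup_state(root)
(root / "production").mkdir()
(root / "technical-preferences.md").write_text("engine: [TBD]")
state_partial = detect_setup_state(root)
```

The "partial" result is what triggers the resume-or-restart choice in this case.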
---
### Case 5: Director Gate Check — No gate; start is a utility setup skill
**Fixture:**
- Any fixture
**Input:** `/start`
**Expected behavior:**
1. Skill completes full onboarding flow
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in the output
**Assertions:**
- [ ] No director gate is invoked during the skill execution
- [ ] No gate skip messages appear (gates are absent, not suppressed)
- [ ] Skill reaches COMPLETE without any gate verdict
---
## Protocol Compliance
- [ ] Asks for project name before any file is written
- [ ] Presents engine options as a structured choice (not free text)
- [ ] Asks "May I write" separately for directory structure and for CLAUDE.md stub
- [ ] Ends with a handoff to `/setup-engine` with the engine name as argument
- [ ] Verdict is clearly stated (COMPLETE or BLOCKED) at end of output
---
## Coverage Notes
- The case where the user rejects all engine options and provides a custom
engine name is not tested — the skill is designed for the three supported
engines only.
- Git initialization (if any) is not tested here; that is an infrastructure
concern outside the skill boundary.
- Solo vs. lean mode behavior is not applicable — this skill has no gates and
mode selection is irrelevant.

# Skill Test Spec: /test-helpers
## Skill Summary
`/test-helpers` generates engine-specific test helper utilities for the project's
test suite. Helpers include factory functions (for creating test entities with
known state), fixture loaders, assertion helpers, and mock stubs for external
dependencies. Generated helpers follow the naming and structure conventions in
`coding-standards.md` and are written to `tests/helpers/`.
Each helper file is gated behind a "May I write" ask. If a helper file already
exists, the skill offers to extend it rather than replace. No director gates
apply. The verdict is COMPLETE when helper files are written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing helpers
- [ ] Has a next-step handoff (e.g., write a test using the generated helper)
---
## Director Gate Checks
None. `/test-helpers` is a scaffolding utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Player factory helper generated for Godot/GDScript
**Fixture:**
- `technical-preferences.md` has engine Godot 4, language GDScript
- `tests/` directory exists (test-setup has been run)
- `design/gdd/player.md` exists with defined player properties
- No existing helpers in `tests/helpers/`
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill reads engine (Godot 4 / GDScript) and player GDD for property context
2. Skill generates a deterministic `PlayerFactory` helper in GDScript:
- `create_player(health: int = 100, speed: float = 200.0)` function
- Returns a player node pre-configured to a known state
- Uses dependency injection (no singletons)
3. Skill asks "May I write to `tests/helpers/player_factory.gd`?"
4. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] Generated helper is in GDScript (not C# or Blueprint)
- [ ] Factory function parameters use defaults matching GDD values
- [ ] Helper uses dependency injection (no Autoload/singleton references)
- [ ] Filename follows snake_case convention for GDScript
- [ ] Verdict is COMPLETE
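The fixture above calls for a GDScript helper; this Python sketch only illustrates the shape the case asserts, a deterministic factory with known defaults and no global state:

```python
from dataclasses import dataclass

@dataclass
class Player:
    health: int
    speed: float

def create_player(health: int = 100, speed: float = 200.0) -> Player:
    """Defaults mirror the GDD values cited in this case (100 HP, 200.0 speed)."""
    return Player(health=health, speed=speed)
```

Tests then construct players in a known state and override only the field under test, rather than reaching into a singleton.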
---
### Case 2: No Test Setup Exists — Redirects to /test-setup
**Fixture:**
- `tests/` directory does not exist
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill checks for `tests/` directory — not found
2. Skill reports: "Test directory not found — test framework must be set up first"
3. Skill suggests running `/test-setup` before generating helpers
4. No helper file is created
**Assertions:**
- [ ] Error message identifies the missing tests/ directory
- [ ] `/test-setup` is suggested as the prerequisite step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)
---
### Case 3: Helper Already Exists — Offers to extend rather than replace
**Fixture:**
- `tests/helpers/player_factory.gd` already exists with a `create_player()` function
- User requests a new `create_enemy()` function be added to the factory
**Input:** `/test-helpers enemy-factory`
**Expected behavior:**
1. Skill finds an existing `player_factory.gd` and checks if it's the right file
to extend (or if a separate `enemy_factory.gd` should be created)
2. Skill presents options: add `create_enemy()` to existing factory or create
`tests/helpers/enemy_factory.gd`
3. User selects extend; skill drafts the `create_enemy()` function
4. Skill asks "May I extend `tests/helpers/player_factory.gd`?"
5. Function is added on approval; verdict is COMPLETE
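After step 5, the extended file might look like the following. `create_player()` is preserved unchanged; the `Enemy` class and its defaults are illustrative assumptions, not values from any GDD:

```gdscript
# tests/helpers/player_factory.gd (extended)
class_name PlayerFactory

# Pre-existing function, preserved as-is.
static func create_player(health: int = 100, speed: float = 200.0) -> Player:
	var player := Player.new()
	player.health = health
	player.speed = speed
	return player

# Added by this case. `Enemy` and its defaults are hypothetical.
static func create_enemy(health: int = 50, aggro_range: float = 300.0) -> Enemy:
	var enemy := Enemy.new()
	enemy.health = health
	enemy.aggro_range = aggro_range
	return enemy
```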
**Assertions:**
- [ ] Existing helper is detected and surfaced
- [ ] User is given extend vs. new file choice
- [ ] "May I extend" language is used (not "May I write" for replacement)
- [ ] Existing `create_player()` is preserved in the extended file
- [ ] Verdict is COMPLETE
---
### Case 4: System Has No GDD — Notes missing design context in helper
**Fixture:**
- `technical-preferences.md` has Godot 4 / GDScript
- `tests/` exists
- User requests a helper for the "inventory system" but no `design/gdd/inventory.md` exists
**Input:** `/test-helpers inventory-factory`
**Expected behavior:**
1. Skill looks for `design/gdd/inventory.md` — not found
2. Skill notes: "No GDD found for inventory — generating helper with placeholder defaults"
3. Skill generates an `inventory_factory.gd` with generic placeholder values
(item_count = 0, max_capacity = 20) and a comment: "# TODO: align defaults
with inventory GDD when written"
4. Skill asks "May I write to `tests/helpers/inventory_factory.gd`?"
5. File is written; verdict is COMPLETE with advisory note
**Assertions:**
- [ ] Skill proceeds without GDD (does not block)
- [ ] Generated helper has placeholder defaults with TODO comment
- [ ] Missing GDD is noted in the output (advisory warning)
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; test-helpers is a scaffolding utility
**Fixture:**
- Engine configured, tests/ exists
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill generates and writes the helper file
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads engine before generating any helper (helpers are engine-specific)
- [ ] Reads GDD for default values when available
- [ ] Notes missing GDD context rather than blocking
- [ ] Detects existing helper files and offers extend rather than replace
- [ ] Asks "May I write" (or "May I extend") before any file operation
- [ ] Verdict is COMPLETE when helper is written
---
## Coverage Notes
- Mock/stub helper generation (for dependencies like save systems or audio buses)
follows the same pattern as factory helpers and is not separately tested.
- Unity C# helper generation (using NSubstitute or custom mocks) follows the
same logic as Case 1 with language-appropriate output.
- The case where the requested helper type is not recognized is not tested;
the skill would ask the user to clarify the helper type.

# Skill Test Spec: /test-setup
## Skill Summary
`/test-setup` scaffolds the test framework for the project based on the
configured engine. It creates the `tests/` directory structure defined in
`coding-standards.md` (unit/, integration/, performance/, playtest/) and
generates the appropriate test runner configuration for the detected engine:
GdUnit4 config for Godot, Unity Test Runner asmdef for Unity, or Unreal headless
runner for Unreal Engine.
Each file or directory created is gated behind a "May I write" ask. If the test
framework already exists, the skill verifies the configuration rather than
reinitializing. No director gates apply. The verdict is COMPLETE when the
scaffold is in place.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before creating files
- [ ] Has a next-step handoff (e.g., `/test-helpers` to generate helper utilities)
---
## Director Gate Checks
None. `/test-setup` is a scaffolding utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Godot project, scaffolds GdUnit4 test structure
**Fixture:**
- `technical-preferences.md` has engine set to Godot 4, language GDScript
- `tests/` directory does not exist yet
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads engine from `technical-preferences.md` → Godot 4 + GDScript
2. Skill drafts the test directory structure: tests/unit/, tests/integration/,
tests/performance/, tests/playtest/, and a GdUnit4 runner config file
3. Skill asks "May I write the tests/ directory structure?"
4. Directories and GdUnit4 runner script created on approval
5. Skill confirms the runner script matches the CI command in coding-standards.md:
`godot --headless --script tests/gdunit4_runner.gd`
6. Verdict is COMPLETE
**Assertions:**
- [ ] All 4 subdirectories (unit/, integration/, performance/, playtest/) are created
- [ ] GdUnit4 runner config is generated
- [ ] Runner script path matches coding-standards.md CI command
- [ ] "May I write" is asked before creating any files
- [ ] Verdict is COMPLETE
---
### Case 2: Unity Project — Scaffolds Unity Test Runner with asmdef
**Fixture:**
- `technical-preferences.md` has engine set to Unity, language C#
- `tests/` directory does not exist
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads engine → Unity + C#
2. Skill creates `Tests/` directory with Unity conventions (capitalized)
3. Skill generates `Tests/Tests.asmdef` and `Tests/Editor/EditorTests.asmdef`
4. EditMode and PlayMode test runner modes are configured
5. Skill asks "May I write the Tests/ directory structure?"
6. Verdict is COMPLETE
**Assertions:**
- [ ] Unity-specific `Tests/` structure is created (not the Godot structure)
- [ ] `.asmdef` files are generated
- [ ] EditMode and PlayMode runner config is present
- [ ] Verdict is COMPLETE
---
### Case 3: Test Framework Already Exists — Verifies config, not re-initialized
**Fixture:**
- `tests/unit/`, `tests/integration/` exist
- GdUnit4 runner script exists (Godot project)
**Input:** `/test-setup`
**Expected behavior:**
1. Skill detects existing tests/ structure
2. Skill reports: "Test framework already exists — verifying configuration"
3. Skill checks: runner script path, directory completeness, CI command alignment
4. If all checks pass: reports "Configuration verified — no changes needed"
5. If checks fail (e.g., missing tests/performance/): reports specific gap and
asks "May I add the missing directories?"
**Assertions:**
- [ ] Skill does NOT reinitialize when framework exists
- [ ] Verification checks are performed on existing structure
- [ ] Only missing parts trigger a "May I write" ask
- [ ] Verdict is COMPLETE whether everything was OK or gaps were fixed
---
### Case 4: No Engine Configured — Redirects to /setup-engine
**Fixture:**
- `technical-preferences.md` contains only placeholders (engine not set)
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and finds engine placeholder
2. Skill reports: "Engine not configured — cannot scaffold engine-specific test framework"
3. Skill suggests running `/setup-engine` first
4. No directories or files are created
**Assertions:**
- [ ] Error message explicitly states engine is not configured
- [ ] `/setup-engine` is suggested as the next step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)
---
### Case 5: Director Gate Check — No gate; test-setup is a scaffolding utility
**Fixture:**
- Engine configured, tests/ does not exist
**Input:** `/test-setup`
**Expected behavior:**
1. Skill scaffolds and writes all test framework files
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads engine from `technical-preferences.md` before generating any scaffold
- [ ] Generates engine-appropriate test runner config (not generic)
- [ ] Creates all 4 subdirectories from coding-standards.md
- [ ] Asks "May I write" before creating files
- [ ] Detects existing framework and offers verification (not reinitialization)
- [ ] Verdict is COMPLETE when scaffold is in place
---
## Coverage Notes
- Unreal Engine test scaffolding (headless runner with `-nullrhi`) follows the
same pattern as Cases 1 and 2 and is not separately fixture-tested.
- CI integration file generation (e.g., `.github/workflows/test.yml`) is
referenced but not assertion-tested here — it may be a separate skill concern.
- The case where tests/ exists but is from a different engine (e.g., Unity tests
in a now-Godot project) is not tested; the skill would detect the mismatch
and offer to reconcile.