Add claude code game studios to the project

This commit is contained in:
panw
2026-05-15 14:52:29 +08:00
parent dff559462d
commit a16fe4bff7
415 changed files with 78609 additions and 0 deletions

@@ -0,0 +1,214 @@
# Skill Test Spec: /adopt
## Skill Summary
`/adopt` audits an existing project's artifacts — GDDs, ADRs, stories, infrastructure
files, and `technical-preferences.md` — for format compliance with the template's
skill pipeline. It classifies every gap by severity (BLOCKING / HIGH / MEDIUM / LOW),
composes a numbered, ordered migration plan, and writes it to `docs/adoption-plan-[date].md`
after explicit user approval via `AskUserQuestion`.
This skill is distinct from `/project-stage-detect` (which checks what exists).
`/adopt` checks whether what exists will actually work with the template's skills.
No director gates apply. The skill does NOT invoke any director agents.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains severity tier keywords: BLOCKING, HIGH, MEDIUM, LOW
- [ ] Contains "May I write" or `AskUserQuestion` language before writing the adoption plan
- [ ] Has a next-step handoff at the end (e.g., offering to fix the highest-priority gap immediately)
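The frontmatter-field assertion could be checked mechanically. A minimal sketch, assuming the frontmatter is a simple `---`-fenced block of `key: value` lines (the parsing approach is an assumption, not the template's actual checker):

```python
import re

# Required fields named in this spec's static assertions
REQUIRED_FIELDS = {"name", "description", "argument-hint", "user-invocable", "allowed-tools"}

def check_frontmatter(skill_md: str) -> set:
    """Return the set of required frontmatter fields missing from a skill file."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return set(REQUIRED_FIELDS)  # no frontmatter block at all
    # Collect top-level keys (lines like "key: value", not indented continuations)
    keys = {
        line.split(":", 1)[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line and not line.startswith((" ", "\t"))
    }
    return REQUIRED_FIELDS - keys

skill = "---\nname: adopt\ndescription: audit\nargument-hint: none\n---\n# Body"
print(sorted(check_frontmatter(skill)))  # → ['allowed-tools', 'user-invocable']
```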
---
## Director Gate Checks
None. `/adopt` is a brownfield audit utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — All GDDs compliant, no gaps, COMPLIANT
**Fixture:**
- `design/gdd/` contains 3 GDD files; each has all 8 required sections with content
- `docs/architecture/adr-0001.md` exists with `## Status`, `## Engine Compatibility`,
and all other required sections
- `production/stage.txt` exists
- `docs/architecture/tr-registry.yaml` and `docs/architecture/control-manifest.md` exist
- Engine configured in `technical-preferences.md`
**Input:** `/adopt`
**Expected behavior:**
1. Skill emits "Scanning project artifacts..." then reads all artifacts silently
2. Reports detected phase, GDD count, ADR count, story count
3. Phase 2 audit: all 3 GDDs have all 8 sections, Status field present and valid
4. ADR audit: all required sections present
5. Infrastructure audit: all critical files exist
6. Phase 3: zero BLOCKING, zero HIGH, zero MEDIUM, zero LOW gaps
7. Summary reports: "No blocking gaps — this project is template-compatible"
8. Uses `AskUserQuestion` to ask about writing the plan; user selects write
9. Adoption plan is written to `docs/adoption-plan-[date].md`
10. Phase 7 offers next action: no blocking gaps, offers options for next steps
**Assertions:**
- [ ] Skill reads silently before presenting any output
- [ ] "Scanning project artifacts..." appears before the silent read phase
- [ ] Gap counts show 0 BLOCKING, 0 HIGH, 0 MEDIUM (or only LOW)
- [ ] `AskUserQuestion` is used before writing the adoption plan
- [ ] Adoption plan file is written to `docs/adoption-plan-[date].md`
- [ ] Phase 7 offers a specific next action (not just a list)
---
### Case 2: Non-Compliant Documents — GDDs missing sections, NEEDS MIGRATION
**Fixture:**
- `design/gdd/` contains 2 GDD files:
- `combat.md` — missing `## Acceptance Criteria` and `## Formulas` sections
- `movement.md` — all 8 sections present
- One ADR (`adr-0001.md`) is missing `## Status` section
- `docs/architecture/tr-registry.yaml` does not exist
**Input:** `/adopt`
**Expected behavior:**
1. Skill scans all artifacts
2. Phase 2 audit finds:
- `combat.md`: 2 missing sections (Acceptance Criteria, Formulas)
- `adr-0001.md`: missing `## Status` — BLOCKING impact
- `tr-registry.yaml`: missing — HIGH impact
3. Phase 3 classifies:
   - BLOCKING: `adr-0001.md` missing `## Status` (without it, story-readiness checks silently pass)
- HIGH: `tr-registry.yaml` missing; `combat.md` missing Acceptance Criteria (can't generate stories)
- MEDIUM: `combat.md` missing Formulas
4. Phase 4 builds ordered migration plan:
- Step 1 (BLOCKING): Add `## Status` to `adr-0001.md` — command: `/architecture-decision retrofit`
- Step 2 (HIGH): Run `/architecture-review` to bootstrap tr-registry.yaml
- Step 3 (HIGH): Add Acceptance Criteria to `combat.md` — command: `/design-system retrofit`
- Step 4 (MEDIUM): Add Formulas to `combat.md`
5. Gap Preview shows BLOCKING items as bullets (actual file names), HIGH/MEDIUM as counts
6. `AskUserQuestion` asks to write the plan; writes after approval
7. Phase 7 offers to fix the highest-priority gap (ADR Status) immediately
**Assertions:**
- [ ] BLOCKING gaps are listed as explicit file-name bullets in the Gap Preview
- [ ] HIGH and MEDIUM shown as counts in Gap Preview
- [ ] Migration plan items are in BLOCKING-first order
- [ ] Each plan item includes the fix command or manual steps
- [ ] `AskUserQuestion` is used before writing
- [ ] Phase 7 offers to immediately retrofit the first BLOCKING item
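The BLOCKING-first plan ordering asserted above can be sketched as a stable severity sort; the gap records below are hypothetical, but the tier order and plan-item format come from this spec:

```python
SEVERITY_ORDER = {"BLOCKING": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def order_migration_plan(gaps):
    """Sort gaps BLOCKING-first; Python's sort is stable, so discovery
    order is preserved within a tier."""
    return sorted(gaps, key=lambda g: SEVERITY_ORDER[g["severity"]])

gaps = [
    {"file": "combat.md", "severity": "MEDIUM", "fix": "Add Formulas"},
    {"file": "adr-0001.md", "severity": "BLOCKING", "fix": "/architecture-decision retrofit"},
    {"file": "tr-registry.yaml", "severity": "HIGH", "fix": "/architecture-review"},
]
for step, gap in enumerate(order_migration_plan(gaps), start=1):
    print(f"Step {step} ({gap['severity']}): {gap['file']} — {gap['fix']}")
```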
---
### Case 3: Mixed State — Some docs compliant, some not, partial report
**Fixture:**
- 4 GDD files: 2 fully compliant, 2 with gaps (one missing Tuning Knobs, one missing Edge Cases)
- ADRs: 3 files — 2 compliant, 1 missing `## ADR Dependencies`
- Stories: 5 files — 3 have TR-ID references, 2 do not
- Infrastructure: all critical files present; `technical-preferences.md` fully configured
**Input:** `/adopt`
**Expected behavior:**
1. Skill audits all artifact types
2. Audit summary shows totals: "4 GDDs (2 fully compliant, 2 with gaps); 3 ADRs
(2 fully compliant, 1 with gaps); 5 stories (3 with TR-IDs, 2 without)"
3. Gap classification:
- No BLOCKING gaps
- HIGH: 1 ADR missing `## ADR Dependencies`
- MEDIUM: 2 GDDs with missing sections; 2 stories missing TR-IDs
- LOW: none
4. Migration plan lists HIGH gap first, then MEDIUM gaps in order
5. Note included: "Existing stories continue to work — do not regenerate stories
that are in progress or done"
6. `AskUserQuestion` to write plan; writes after approval
**Assertions:**
- [ ] Per-artifact compliance tallies are shown (N compliant, M with gaps)
- [ ] Existing story compatibility note is included in the plan
- [ ] No BLOCKING gaps results in no BLOCKING section in migration plan
- [ ] HIGH gap precedes MEDIUM gaps in plan ordering
- [ ] `AskUserQuestion` is used before writing
---
### Case 4: No Artifacts Found — Fresh project, guidance to run /start
**Fixture:**
- Repository has no files in `design/gdd/`, `docs/architecture/`, `production/epics/`
- `production/stage.txt` does not exist
- `src/` directory does not exist or has fewer than 10 files
- No game-concept.md, no systems-index.md
**Input:** `/adopt`
**Expected behavior:**
1. Phase 1 existence check finds no artifacts
2. Skill infers "Fresh" — no brownfield work to migrate
3. Uses `AskUserQuestion`:
- "This looks like a fresh project — no existing artifacts found. `/adopt` is for
projects with work to migrate. What would you like to do?"
- Options: "Run `/start`", "My artifacts are in a non-standard location", "Cancel"
4. Skill stops — does not proceed to audit regardless of user selection
**Assertions:**
- [ ] `AskUserQuestion` is used (not a plain text message) when no artifacts are found
- [ ] `/start` is presented as a named option
- [ ] Skill stops after the question — no audit phases run
- [ ] No adoption plan file is written
---
### Case 5: Director Gate Check — No gate; adopt is a utility audit skill
**Fixture:**
- Project with a mix of compliant and non-compliant GDDs
**Input:** `/adopt`
**Expected behavior:**
1. Skill completes full audit and produces migration plan
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked during the skill run
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches plan-writing or cancellation without any gate verdict
---
## Protocol Compliance
- [ ] Emits "Scanning project artifacts..." before silent read phase
- [ ] Reads all artifacts silently before presenting any results
- [ ] Shows Adoption Audit Summary and Gap Preview before asking to write
- [ ] Uses `AskUserQuestion` before writing the adoption plan file
- [ ] Adoption plan written to `docs/adoption-plan-[date].md` — not to any other path
- [ ] Migration plan items ordered: BLOCKING first, HIGH second, MEDIUM third, LOW last
- [ ] Phase 7 always offers a single specific next action (not a generic list)
- [ ] Never regenerates existing artifacts — only fills gaps in what exists
- [ ] Does not invoke director gates at any point
---
## Coverage Notes
- The `gdds`, `adrs`, `stories`, and `infra` argument modes narrow the audit scope;
each follows the same pattern as the full audit but limited to that artifact type.
Not separately fixture-tested here.
- The systems-index.md parenthetical status value check (BLOCKING) is a special case
that triggers an immediate fix offer before writing the plan; not separately tested.
- The review-mode.txt prompt (Phase 6b) runs after plan writing if `production/review-mode.txt`
does not exist; not separately tested here.

@@ -0,0 +1,179 @@
# Skill Test Spec: /asset-spec
## Skill Summary
`/asset-spec` generates per-asset visual specification documents from design
requirements. It reads the relevant GDD, art bible, and design system to produce
a structured asset spec sheet that defines: dimensions, animation states (if
applicable), color palette reference, style notes, technical constraints
(format, file size budget), and deliverable checklist.
Spec sheets are written to `assets/specs/[asset-name]-spec.md` after a "May I write"
ask. If a spec already exists, the skill offers to update it. When multiple assets
are requested in a single invocation, a "May I write" ask is made per asset. No
director gates apply. The verdict is COMPLETE when all requested specs are written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language (per asset)
- [ ] Has a next-step handoff (e.g., assign to an artist, or `/asset-audit` later)
---
## Director Gate Checks
None. `/asset-spec` is a design documentation utility. Technical artists may
review specs separately but this is not a gate within this skill.
---
## Test Cases
### Case 1: Happy Path — Enemy sprite spec with full GDD and art bible
**Fixture:**
- `design/gdd/enemies.md` exists with enemy variants defined
- `design/art-bible.md` exists with color palette and style notes
- No existing asset spec for "goblin-enemy"
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill reads enemies GDD and art bible
2. Skill generates a spec for the goblin enemy sprite:
- Dimensions: inferred from engine defaults or explicitly from GDD
- Animation states: idle, walk, attack, hurt, death
- Color palette reference: links to art-bible palette section
- Style notes: from art bible character design rules
- Technical constraints: format (PNG), size budget
- Deliverable checklist
3. Skill asks "May I write to `assets/specs/goblin-enemy-spec.md`?"
4. File written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 6 spec components are present (dimensions, animations, palette, style, tech, checklist)
- [ ] Color palette reference links to art bible (not duplicated)
- [ ] Animation states are drawn from GDD (not invented)
- [ ] "May I write" is asked with the correct path
- [ ] Verdict is COMPLETE
---
### Case 2: No Art Bible Found — Spec with Placeholder Style Notes, Dependency Flagged
**Fixture:**
- `design/gdd/player.md` exists
- `design/art-bible.md` does NOT exist
**Input:** `/asset-spec player-sprite`
**Expected behavior:**
1. Skill reads player GDD but cannot find the art bible
2. Skill generates spec with placeholder style notes: "DEPENDENCY GAP: art bible
not found — style notes are placeholders"
3. Color palette section uses: "TBD — see art bible when created"
4. Skill asks "May I write to `assets/specs/player-sprite-spec.md`?"
5. File written with placeholders and dependency flag; verdict is COMPLETE with advisory
**Assertions:**
- [ ] DEPENDENCY GAP is flagged for the missing art bible
- [ ] Spec is still generated (not blocked)
- [ ] Style notes contain placeholder markers, not invented styles
- [ ] Verdict is COMPLETE with advisory note
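The six spec components and the Case 2 placeholder behavior could be assembled roughly as follows; field names and fallback values beyond the strings quoted in this spec are assumptions:

```python
def draft_asset_spec(asset_name, gdd, art_bible=None):
    """Assemble the 6 spec components; a missing art bible yields
    placeholder style notes and a DEPENDENCY GAP flag instead of blocking."""
    has_bible = art_bible is not None
    animations = gdd.get("animations", [])
    return {
        "asset": asset_name,
        "dimensions": gdd.get("dimensions", "engine default"),
        "animation_states": animations,
        "palette_ref": art_bible["palette_section"] if has_bible
            else "TBD — see art bible when created",
        "style_notes": art_bible["style_rules"] if has_bible
            else "DEPENDENCY GAP: art bible not found — style notes are placeholders",
        "tech_constraints": {"format": "PNG", "size_budget": gdd.get("size_budget", "unspecified")},
        "checklist": [f"{state} frame set delivered" for state in animations],
    }
```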
---
### Case 3: Asset Spec Already Exists — Offers to Update
**Fixture:**
- `assets/specs/goblin-enemy-spec.md` already exists
- GDD has been updated since the spec was written (new attack animation added)
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill detects existing spec file
2. Skill reports: "Asset spec already exists for goblin-enemy — checking for updates"
3. Skill diffs GDD against existing spec and identifies: new "charge-attack" animation
state added in GDD but not in spec
4. Skill presents the diff: "1 new animation state found — offering to update spec"
5. Skill asks "May I update `assets/specs/goblin-enemy-spec.md`?" (not overwrite)
6. Spec is updated; verdict is COMPLETE
**Assertions:**
- [ ] Existing spec is detected and "update" path is offered
- [ ] Diff between GDD and existing spec is shown
- [ ] "May I update" language is used (not "May I write")
- [ ] Existing spec content is preserved; only the diff is applied
- [ ] Verdict is COMPLETE
---
### Case 4: Multiple Assets Requested — May-I-Write Per Asset
**Fixture:**
- GDD and art bible exist
- User requests specs for 3 assets: goblin-enemy, orc-enemy, treasure-chest
**Input:** `/asset-spec goblin-enemy orc-enemy treasure-chest`
**Expected behavior:**
1. Skill generates all 3 specs in sequence
2. For each asset, skill shows the draft and asks "May I write to
`assets/specs/[name]-spec.md`?" individually
3. User can approve all 3 or skip individual assets
4. All approved specs are written; verdict is COMPLETE
**Assertions:**
- [ ] "May I write" is asked 3 times (once per asset), not once for all
- [ ] User can decline one asset without blocking the others
- [ ] All 3 spec files are written for approved assets
- [ ] Verdict is COMPLETE when all approved specs are written
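The per-asset confirmation loop above can be sketched as follows; `ask_user` and `write_file` are hypothetical stand-ins for the real `AskUserQuestion` and write tools:

```python
def write_specs(asset_names, ask_user, write_file):
    """Ask 'May I write' once per asset; a decline skips only that asset."""
    written = []
    for name in asset_names:
        path = f"assets/specs/{name}-spec.md"
        if ask_user(f"May I write to `{path}`?"):
            write_file(path, f"# Asset Spec: {name}\n")
            written.append(path)
    return written
```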
---
### Case 5: Director Gate Check — No gate; asset-spec is a design utility
**Fixture:**
- GDD and art bible exist
**Input:** `/asset-spec goblin-enemy`
**Expected behavior:**
1. Skill generates and writes the asset spec
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads GDD, art bible, and design system before generating spec
- [ ] Includes all 6 spec components (dimensions, animations, palette, style, tech, checklist)
- [ ] Flags missing dependencies (art bible, GDD) with DEPENDENCY GAP notes
- [ ] Asks "May I write" (or "May I update") per asset
- [ ] Handles multiple assets with individual write confirmations
- [ ] Verdict is COMPLETE when all approved specs are written
---
## Coverage Notes
- Audio asset specs (sound effects, music) follow the same structure with
different fields (duration, sample rate, looping) and are not separately tested.
- UI asset specs (icons, button states) follow the same flow with interaction
state requirements aligned to the UX spec.
- The case where GDD is also missing (neither GDD nor art bible exists) is not
separately tested; spec would be generated with both dependency gaps flagged.

@@ -0,0 +1,189 @@
# Skill Test Spec: /brainstorm
## Skill Summary
`/brainstorm` facilitates guided game concept ideation. It presents 2-4 concept
options with pros/cons, lets the user choose and refine a concept, and produces
a structured `design/gdd/game-concept.md` document. The skill is collaborative —
it asks questions before proposing options and iterates until the user approves
a concept direction.
In `full` review mode, four director gates spawn in parallel after the concept
is drafted: CD-PILLARS (creative-director), AD-CONCEPT-VISUAL (art-director),
TD-FEASIBILITY (technical-director), and PR-SCOPE (producer). In `lean` mode,
all 4 inline gates are skipped (lean mode only runs PHASE-GATEs, and brainstorm
has none). In `solo` mode, all gates are skipped. The skill asks "May I write"
before writing `design/gdd/game-concept.md`.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: APPROVED, REJECTED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language (for game-concept.md)
- [ ] Has a next-step handoff at the end (`/map-systems`)
- [ ] Documents 4 director gates in full mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
- [ ] Documents that all 4 gates are skipped in lean and solo modes
---
## Director Gate Checks
In `full` mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, and PR-SCOPE
spawn in parallel after the concept draft is approved by the user.
In `lean` mode: all 4 inline gates are skipped (brainstorm has no PHASE-GATEs,
so lean mode skips everything). Output notes all 4 as: "[GATE-ID] skipped — lean mode".
In `solo` mode: all 4 gates are skipped. Output notes all 4 as: "[GATE-ID] skipped — solo mode".
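The mode-dependent gate behavior above can be sketched as follows; `spawn_gate` is a stand-in for spawning a director agent, and where the real skill spawns the four gates in parallel, this sketch calls them sequentially for simplicity:

```python
# Gate IDs and skip-note format are quoted from this spec
GATES = ["CD-PILLARS", "AD-CONCEPT-VISUAL", "TD-FEASIBILITY", "PR-SCOPE"]

def run_concept_gates(review_mode, spawn_gate):
    """Return (verdicts, skip_notes). In lean and solo modes all four
    inline gates are skipped and each is noted by name."""
    if review_mode in ("lean", "solo"):
        return {}, [f"{gate} skipped — {review_mode} mode" for gate in GATES]
    # full mode: collect a verdict (APPROVED/REJECTED/CONCERNS) per gate
    verdicts = {gate: spawn_gate(gate) for gate in GATES}
    return verdicts, []
```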
---
## Test Cases
### Case 1: Happy Path — Full mode, 3 concepts, user picks one, all 4 directors approve
**Fixture:**
- No existing `design/gdd/game-concept.md`
- `production/session-state/review-mode.txt` contains `full`
**Input:** `/brainstorm`
**Expected behavior:**
1. Skill asks the user questions about genre, scope, and target feeling
2. Skill presents 3 concept options with pros/cons each
3. User selects one concept
4. Skill elaborates the chosen concept into a structured draft
5. All 4 director gates spawn in parallel: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
6. All 4 return APPROVED
7. Skill asks "May I write `design/gdd/game-concept.md`?"
8. Concept written after approval
**Assertions:**
- [ ] Exactly 3 concept options are presented (not 1, not 5+)
- [ ] All 4 director gates spawn in parallel (not sequentially)
- [ ] All 4 gates complete before the "May I write" ask
- [ ] "May I write `design/gdd/game-concept.md`?" is asked before writing
- [ ] Concept file is NOT written without user approval
- [ ] Next-step handoff to `/map-systems` is present
---
### Case 2: Failure Path — CD-PILLARS returns REJECT
**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- CD-PILLARS gate returns REJECT: "The concept has no identifiable creative pillar"
**Input:** `/brainstorm`
**Expected behavior:**
1. CD-PILLARS gate returns REJECT with specific feedback
2. Skill surfaces the rejection to the user
3. Concept is NOT written to file
4. User is asked: rethink the concept direction, or override the rejection
5. If rethinking: skill returns to the concept options phase
**Assertions:**
- [ ] Concept is NOT written when CD-PILLARS returns REJECT
- [ ] Rejection feedback is shown to the user verbatim
- [ ] User is given the option to rethink or override
- [ ] Skill returns to concept ideation phase if user chooses to rethink
---
### Case 3: Lean Mode — All 4 gates skipped; concept written after user confirms
**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `lean`
**Input:** `/brainstorm`
**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept is elaborated into a structured draft
3. All 4 director gates are skipped — each noted: "[GATE-ID] skipped — lean mode"
4. Skill asks user to confirm the concept is ready to write
5. "May I write `design/gdd/game-concept.md`?" asked after confirmation
6. Concept written after approval
**Assertions:**
- [ ] All 4 gate skip notes appear: "CD-PILLARS skipped — lean mode", "AD-CONCEPT-VISUAL skipped — lean mode", "TD-FEASIBILITY skipped — lean mode", "PR-SCOPE skipped — lean mode"
- [ ] Concept is written after user confirmation only (no director approval needed in lean)
- [ ] "May I write" is still asked before writing
---
### Case 4: Solo Mode — All gates skipped; concept written with only user approval
**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `solo`
**Input:** `/brainstorm`
**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept draft is shown to user
3. All 4 director gates are skipped — each noted with "solo mode"
4. "May I write `design/gdd/game-concept.md`?" asked
5. Concept written after user approval
**Assertions:**
- [ ] All 4 skip notes appear with "solo mode" label
- [ ] No director agents are spawned
- [ ] Concept is written with only user approval
- [ ] Behavior is otherwise equivalent to lean mode for this skill
---
### Case 5: Director Gate — PR-SCOPE returns CONCERNS (scope too large)
**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- PR-SCOPE gate returns CONCERNS: "The concept scope would require 18+ months for a solo developer"
**Input:** `/brainstorm`
**Expected behavior:**
1. PR-SCOPE gate returns CONCERNS with specific scope feedback
2. Skill surfaces the scope concerns to the user
3. Scope concerns are documented in the concept draft before writing
4. User is asked: reduce scope, accept concerns and document them, or rethink
5. If concerns are accepted: concept is written with a "Scope Risk" note embedded
**Assertions:**
- [ ] PR-SCOPE concerns are shown to the user before the "May I write" ask
- [ ] Skill does NOT write concept without surfacing scope concerns
- [ ] If user accepts: scope concerns are documented in the concept file
- [ ] Skill does NOT auto-reject a concept due to PR-SCOPE CONCERNS (user decides)
---
## Protocol Compliance
- [ ] Presents 2-4 concept options with pros/cons before user commits
- [ ] User confirms concept direction before director gates are invoked
- [ ] All 4 director gates spawn in parallel in full mode
- [ ] All 4 gates skipped in lean AND solo mode — each noted by name
- [ ] "May I write `design/gdd/game-concept.md`?" asked before writing
- [ ] Ends with next-step handoff: `/map-systems`
---
## Coverage Notes
- AD-CONCEPT-VISUAL gate (art director feasibility) is grouped with the other
3 gates in the parallel spawn — not independently fixture-tested.
- The iterative concept refinement loop (user rejects all options, skill
generates new ones) is not fixture-tested — it follows the same pattern as
the option selection phase.
- The game-concept.md document structure (required sections) is defined in the
skill body and not re-enumerated in test assertions.

@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-report
## Skill Summary
`/bug-report` creates a structured bug report document from a user description.
It produces a report with the following required fields: Title, Repro Steps,
Expected Behavior, Actual Behavior, Severity (CRITICAL/HIGH/MEDIUM/LOW), Affected
System(s), and Build/Version. If the user's initial description is missing any
required field, the skill asks follow-up questions to fill the gaps before
producing the draft.
The skill checks for possibly duplicate reports (by comparing to existing files
in `production/bugs/`) and offers to link rather than create a new report. Each
report is written to `production/bugs/bug-[date]-[slug].md` after a "May I write"
ask. No director gates are used — bug reporting is an operational utility.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-triage` to reprioritize, `/hotfix` for critical)
---
## Director Gate Checks
None. `/bug-report` is an operational documentation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — User describes a crash, full report produced
**Fixture:**
- `production/bugs/` directory exists and is empty
- No similar existing reports
**Input:** `/bug-report` (user describes: "Game crashes when player enters the boss arena")
**Expected behavior:**
1. Skill extracts: Title = "Game crashes when entering boss arena"
2. Skill recognizes crash reports as CRITICAL severity
3. Skill confirms repro steps, expected (no crash), actual (crash), affected system
(arena/boss), and build version with the user
4. Skill drafts the full structured report
5. Skill asks "May I write to `production/bugs/bug-2026-04-06-game-crashes-boss-arena.md`?"
6. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 7 required fields are present in the report
- [ ] Severity is CRITICAL for a crash report
- [ ] Filename follows the `bug-[date]-[slug].md` convention
- [ ] "May I write" is asked with the full file path
- [ ] Verdict is COMPLETE
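The `bug-[date]-[slug].md` convention asserted above could be sketched like this; the slug rule (lowercase, runs of non-alphanumerics collapsed to hyphens) is an assumption, since the spec treats slug generation as an implementation detail:

```python
import re
from datetime import date

def bug_report_path(title, on=None):
    """Build production/bugs/bug-[date]-[slug].md from a bug title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    day = (on or date.today()).isoformat()  # ISO date, e.g. 2026-04-06
    return f"production/bugs/bug-{day}-{slug}.md"

print(bug_report_path("Game crashes when entering boss arena", on=date(2026, 4, 6)))
# → production/bugs/bug-2026-04-06-game-crashes-when-entering-boss-arena.md
```

The real skill may shorten the slug further (the example filename in Case 1 drops filler words); that truncation is not modeled here.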
---
### Case 2: Minimal Input — Skill asks follow-up questions for missing fields
**Fixture:**
- User provides: "Sometimes the audio cuts out"
- No existing reports
**Input:** `/bug-report`
**Expected behavior:**
1. Skill identifies missing required fields: repro steps, expected vs. actual,
severity, affected system, build
2. Skill asks targeted follow-up questions for each missing field (one at a time
or in a structured prompt)
3. User provides answers
4. Skill compiles complete report from answers
5. Skill asks "May I write?" and writes on approval
**Assertions:**
- [ ] At least 3 follow-up questions are asked to fill missing fields
- [ ] Each required field is filled before the report is finalized
- [ ] Report is not written until all required fields are present
- [ ] Verdict is COMPLETE after all fields are filled and file is written
---
### Case 3: Possible Duplicate — Offers to link rather than create new
**Fixture:**
- `production/bugs/bug-2026-03-20-audio-cut-out.md` already exists with
similar title and MEDIUM severity
**Input:** `/bug-report` (user describes: "Audio randomly stops working")
**Expected behavior:**
1. Skill scans existing reports and finds the similar audio bug
2. Skill reports: "A similar bug report exists: bug-2026-03-20-audio-cut-out.md"
3. Skill presents options: link as duplicate (add note to existing), create new anyway
4. If user chooses link: skill adds a cross-reference note to the existing file
(asks "May I update the existing report?")
5. If user chooses create new: normal report creation proceeds
**Assertions:**
- [ ] Existing similar report is surfaced before creating a new one
- [ ] User is given the choice (not forced to link or create)
- [ ] If linking: "May I update" is asked before modifying the existing file
- [ ] Verdict is COMPLETE in either path
---
### Case 4: Multi-System Bug — Report created with multiple system tags
**Fixture:**
- No existing reports
**Input:** `/bug-report` (user describes: "After finishing a level, the save system
freezes and the UI doesn't show the completion screen")
**Expected behavior:**
1. Skill identifies 2 affected systems from the description: Save System and UI
2. Report is drafted with both systems listed under Affected System(s)
3. Severity is assessed (likely HIGH — data loss risk from save freeze)
4. Skill asks "May I write" with the appropriate filename
5. Report is written with both systems tagged; verdict is COMPLETE
**Assertions:**
- [ ] Both affected systems are listed in the report
- [ ] Single report is created (not one per system)
- [ ] Severity reflects the most impactful component (save freeze → HIGH or CRITICAL)
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; bug reporting is operational
**Fixture:**
- Any bug description provided
**Input:** `/bug-report`
**Expected behavior:**
1. Skill creates and writes the bug report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Collects all 7 required fields before drafting the report
- [ ] Asks follow-up questions for any missing required fields
- [ ] Checks for similar existing reports before creating a new one
- [ ] Asks "May I write to `production/bugs/bug-[date]-[slug].md`?" before writing
- [ ] Verdict is COMPLETE when the report file is written
---
## Coverage Notes
- The case where the user provides a severity that seems too low for the
described impact (e.g., LOW for a crash) is not tested; the skill may suggest
a higher severity but ultimately respects user input.
- Build/version field is required but may be "unknown" if the user doesn't know —
this is accepted as a valid value and not tested separately.
- Report slug generation (sanitizing the title into a filename) is an
implementation detail not assertion-tested here.

@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-triage
## Skill Summary
`/bug-triage` reads all open bug reports in `production/bugs/` and produces a
prioritized triage table sorted by severity (CRITICAL → HIGH → MEDIUM → LOW).
It runs on the Haiku model (read-only, formatting/sorting task) and produces no
file writes — the triage output is conversational. The skill flags bugs missing
reproduction steps and identifies possible duplicates by comparing titles and
affected systems.
The verdict is always TRIAGED — the skill is advisory and informational. No
director gates apply. The output is intended to help a producer or QA lead
prioritize which bugs to address next.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: TRIAGED
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (e.g., `/bug-report` to create new reports, `/hotfix` for critical bugs)
---
## Director Gate Checks
None. `/bug-triage` is a read-only advisory skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — 5 bugs of varying severity, sorted table produced
**Fixture:**
- `production/bugs/` contains 5 bug report files:
- bug-2026-03-10-audio-crash.md (CRITICAL)
- bug-2026-03-12-score-overflow.md (HIGH)
- bug-2026-03-14-ui-overlap.md (MEDIUM)
- bug-2026-03-15-typo-tutorial.md (LOW)
- bug-2026-03-16-vfx-flicker.md (HIGH)
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads all 5 bug report files
2. Skill extracts severity, title, system, and repro status from each
3. Skill produces a triage table sorted: CRITICAL first, then HIGH, MEDIUM, LOW
4. Within the same severity, bugs are ordered by date (oldest first)
5. Verdict is TRIAGED
**Assertions:**
- [ ] Triage table has exactly 5 rows
- [ ] CRITICAL bug appears before both HIGH bugs
- [ ] HIGH bugs appear before MEDIUM and LOW bugs
- [ ] Verdict is TRIAGED
- [ ] No files are written
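The ordering asserted above can be sketched as a two-key sort, with the date tiebreak from step 4 (oldest first within a tier):

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def triage_order(bugs):
    """Sort CRITICAL→HIGH→MEDIUM→LOW; within a tier, oldest report first.
    ISO dates compare correctly as strings, so no date parsing is needed."""
    return sorted(bugs, key=lambda b: (SEVERITY_RANK[b["severity"]], b["date"]))

bugs = [
    {"file": "bug-2026-03-16-vfx-flicker.md", "severity": "HIGH", "date": "2026-03-16"},
    {"file": "bug-2026-03-10-audio-crash.md", "severity": "CRITICAL", "date": "2026-03-10"},
    {"file": "bug-2026-03-12-score-overflow.md", "severity": "HIGH", "date": "2026-03-12"},
]
for row in triage_order(bugs):
    print(f"{row['severity']:<8} {row['file']}")
```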
---
### Case 2: No Bug Reports Found — Guidance to run /bug-report
**Fixture:**
- `production/bugs/` directory exists but is empty (or does not exist)
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill scans `production/bugs/` and finds no reports
2. Skill outputs: "No open bug reports found in production/bugs/"
3. Skill suggests running `/bug-report` to create a bug report
4. No triage table is produced
**Assertions:**
- [ ] Output explicitly states no bugs were found
- [ ] `/bug-report` is suggested as the next step
- [ ] Skill does not error out — it handles empty directory gracefully
- [ ] Verdict is TRIAGED (with "no bugs found" context)
---
### Case 3: Bug Missing Reproduction Steps — Flagged as NEEDS REPRO INFO
**Fixture:**
- `production/bugs/` contains 3 bug reports; one has an empty "Repro Steps" section
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads all 3 reports
2. Skill detects the report with no repro steps
3. That bug appears in the triage table with a `NEEDS REPRO INFO` tag
4. Other bugs are triaged normally
5. Verdict is TRIAGED
**Assertions:**
- [ ] `NEEDS REPRO INFO` tag appears next to the bug missing repro steps
- [ ] The flagged bug is still included in the table (not excluded)
- [ ] Other bugs are unaffected
- [ ] Verdict is TRIAGED
---
### Case 4: Possible Duplicate Bugs — Flagged in triage output
**Fixture:**
- `production/bugs/` contains 2 bug reports with similar titles:
- bug-2026-03-18-player-fall-through-floor.md
- bug-2026-03-20-player-clips-through-floor.md
- Both affect the "Physics" system with identical severity
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill reads both reports and detects similar title + same system + same severity
2. Both bugs are included in the triage table
3. Each is tagged with `POSSIBLE DUPLICATE` and cross-references the other report
4. No bugs are merged or deleted — flagging is advisory
5. Verdict is TRIAGED
**Assertions:**
- [ ] Both bugs appear in the table (not merged)
- [ ] Both are tagged `POSSIBLE DUPLICATE`
- [ ] Each cross-references the other (by filename or title)
- [ ] Verdict is TRIAGED
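The advisory duplicate check described above can be approximated with title token overlap plus a same-system, same-severity match. This is a hedged sketch: the exact matching logic is defined in the skill body, and the overlap threshold here is an assumption.

```python
def title_tokens(title):
    """Lowercase word set from a bug title, ignoring very short filler words."""
    return {w for w in title.lower().replace("-", " ").split() if len(w) > 2}

def possible_duplicate(a, b, threshold=0.5):
    """Flag two bug records as possible duplicates: same system, same
    severity, and Jaccard overlap of title tokens above the threshold."""
    if a["system"] != b["system"] or a["severity"] != b["severity"]:
        return False
    ta, tb = title_tokens(a["title"]), title_tokens(b["title"])
    return len(ta & tb) / len(ta | tb) >= threshold

a = {"title": "player fall through floor",  "system": "Physics", "severity": "HIGH"}
b = {"title": "player clips through floor", "system": "Physics", "severity": "HIGH"}
print(possible_duplicate(a, b))  # → True
```

The check stays advisory by design: it returns a flag for the triage table rather than merging or deleting anything.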
---
### Case 5: Director Gate Check — No gate; triage is advisory
**Fixture:**
- `production/bugs/` contains any number of reports
**Input:** `/bug-triage`
**Expected behavior:**
1. Skill produces the triage table
2. No director agents are spawned
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is TRIAGED without any gate check
---
## Protocol Compliance
- [ ] Reads all files in `production/bugs/` before generating the table
- [ ] Sorts by severity (CRITICAL → HIGH → MEDIUM → LOW)
- [ ] Flags bugs missing repro steps
- [ ] Flags possible duplicates by title/system similarity
- [ ] Does not write any files
- [ ] Verdict is TRIAGED in all cases (even empty)
---
## Coverage Notes
- The case where a bug report is malformed (missing severity field entirely)
is not fixture-tested; the skill would flag it as `UNKNOWN SEVERITY` and sort it
last in the table.
- Status transitions (marking bugs as resolved) are outside this skill's scope —
bug-triage is read-only.
- The duplicate detection heuristic (title similarity + same system) is
approximate; exact matching logic is defined in the skill body.

# Skill Test Spec: /day-one-patch
## Skill Summary
`/day-one-patch` prepares a day-one patch plan for issues that are known at
launch but deferred from the v1.0 release. It reads open bug reports in
`production/bugs/`, deferred acceptance criteria from story files (stories
marked `Status: Done` but with noted deferred ACs), and produces a prioritized
patch plan with estimated fix timelines per issue.
The patch plan is written to `production/releases/day-one-patch.md` after a
"May I write" ask. If a P0 (critical post-ship) issue is discovered, the skill
triggers guidance to run `/hotfix` before the patch. No director gates apply.
The verdict is always COMPLETE.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the plan
- [ ] Has a next-step handoff (e.g., `/hotfix` for P0 issues, `/release-checklist` for follow-up)
---
## Director Gate Checks
None. `/day-one-patch` is a release planning utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — 3 Known Issues, Patch Plan With Fix Estimates
**Fixture:**
- `production/bugs/` contains 3 open bugs with severities: 1 MEDIUM, 2 LOW
- No deferred ACs in sprint stories
- All bugs have repro steps and system identifications
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads all 3 open bugs
2. Skill assigns fix effort estimates: MEDIUM bug = 1-2 days, LOW bugs = 4 hours each
3. Skill produces a patch plan prioritizing MEDIUM bug first
4. Plan includes: priority order, estimated timeline, responsible system, fix description
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE
**Assertions:**
- [ ] All 3 bugs appear in the plan
- [ ] Bugs are prioritized by severity (MEDIUM before LOW)
- [ ] Fix estimates are provided per issue
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
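The severity-based effort estimates in step 2 amount to a lookup table plus a severity sort. This sketch mirrors the fixture's numbers (MEDIUM = 1-2 days, LOW = 4 hours); the HIGH estimate is an assumption, and per the coverage notes these are rough values, not team-velocity figures.

```python
# Rough fix-effort estimates keyed by severity. The HIGH value is an
# illustrative assumption; MEDIUM and LOW come from the fixture.
FIX_ESTIMATES = {"HIGH": "1 day", "MEDIUM": "1-2 days", "LOW": "4 hours"}

def plan_entries(bugs):
    """Patch-plan rows ordered by severity. CRITICAL (P0) bugs are excluded:
    the skill escalates those to /hotfix instead of scheduling them."""
    order = {"HIGH": 1, "MEDIUM": 2, "LOW": 3}
    planned = [b for b in bugs if b["severity"] != "CRITICAL"]
    ranked = sorted(planned, key=lambda b: order[b["severity"]])
    return [(b["title"], b["severity"], FIX_ESTIMATES[b["severity"]]) for b in ranked]

bugs = [
    {"title": "menu-typo",    "severity": "LOW"},
    {"title": "save-wipe",    "severity": "CRITICAL"},  # escalated, not planned
    {"title": "save-corrupt", "severity": "MEDIUM"},
    {"title": "icon-blur",    "severity": "LOW"},
]
for title, sev, est in plan_entries(bugs):
    print(f"{sev:8} {title}: {est}")
```

Note that filtering CRITICAL out of the plan matches Case 2: P0 issues get `/hotfix` guidance and never appear in the patch timeline.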
---
### Case 2: Critical Issue Discovered Post-Ship — P0, Triggers /hotfix Guidance
**Fixture:**
- A CRITICAL severity bug is found in `production/bugs/` after the v1.0 release
- The bug causes data loss for all save files
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads bugs and identifies the CRITICAL severity issue
2. Skill escalates: "P0 ISSUE DETECTED — data loss bug requires immediate hotfix
before patch planning can proceed"
3. Skill does NOT include the P0 issue in the patch plan timeline
4. Skill explicitly directs: "Run `/hotfix` to resolve this issue first"
5. After P0 guidance is issued: plan for remaining lower-severity bugs is still
generated and written; verdict is COMPLETE
**Assertions:**
- [ ] P0 escalation message appears prominently before the patch plan
- [ ] `/hotfix` is explicitly directed for the P0 issue
- [ ] P0 issue is NOT scheduled in the patch plan timeline (it needs immediate action)
- [ ] Non-P0 issues are still planned; verdict is COMPLETE
---
### Case 3: Deferred AC From Story-Done — Pulled Into Patch Plan Automatically
**Fixture:**
- `production/sprints/sprint-008.md` has a story with `Status: Done` and a note:
"DEFERRED AC: Gamepad vibration on damage — deferred to post-launch patch"
- No open bugs for the same system
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads sprint stories and detects the deferred AC note
2. Deferred AC is automatically included in the patch plan as a work item
3. Plan entry: "Deferred from sprint-008: Gamepad vibration on damage"
4. Fix estimate is assigned; patch plan written after "May I write" approval
5. Verdict is COMPLETE
**Assertions:**
- [ ] Deferred ACs from story files are automatically pulled into the plan
- [ ] Deferred items are labeled by their source story (sprint-008)
- [ ] Deferred AC gets a fix estimate like bug entries
- [ ] Verdict is COMPLETE
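Detecting deferred-AC notes in Done stories can be sketched as a marker scan over the story text. The `DEFERRED AC:` marker and the `Status: Done` line format are taken from the fixture; real story parsing in the skill body may differ.

```python
import re

def deferred_acs(story_text, source):
    """Extract 'DEFERRED AC: <description>' notes from a story file's text,
    but only when the story is marked Status: Done."""
    if not re.search(r"^Status:\s*Done\s*$", story_text, re.MULTILINE):
        return []
    return [
        {"source": source, "item": m.group(1).strip()}
        for m in re.finditer(r"DEFERRED AC:\s*(.+)", story_text)
    ]

story = """# Story: Damage feedback
Status: Done
Notes: DEFERRED AC: Gamepad vibration on damage - deferred to post-launch patch
"""
print(deferred_acs(story, "sprint-008"))
```

Gating on `Status: Done` keeps in-progress stories out of the patch plan: an unfinished story's remaining ACs are sprint work, not deferred launch debt.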
---
### Case 4: No Known Issues — Empty Plan With Template Note
**Fixture:**
- `production/bugs/` is empty
- No stories have deferred ACs
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill reads bugs — none found
2. Skill reads story deferred ACs — none found
3. Skill produces an empty patch plan with a note: "No known issues at launch"
4. Template structure is preserved (headers intact) for future use
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE
**Assertions:**
- [ ] "No known issues at launch" note appears in the written file
- [ ] Template headers are present in the empty plan
- [ ] Skill does NOT error out when there are no issues to plan
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; day-one-patch is a planning utility
**Fixture:**
- Known issues present in production/bugs/
**Input:** `/day-one-patch`
**Expected behavior:**
1. Skill generates and writes the patch plan
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads open bugs from `production/bugs/` before generating the plan
- [ ] Scans story files for deferred AC notes
- [ ] Escalates CRITICAL (P0) bugs with explicit `/hotfix` guidance
- [ ] Produces an empty plan with note when no issues exist (not an error)
- [ ] Asks "May I write to `production/releases/day-one-patch.md`?" before writing
- [ ] Verdict is COMPLETE in all paths
---
## Coverage Notes
- The case where multiple CRITICAL bugs exist is handled the same as Case 2;
all P0 issues are escalated together.
- Timeline estimation for the patch (e.g., "patch available in 3 days")
requires manual QA and build time estimates; the skill uses rough estimates
based on severity, not actual team velocity.
- The patch notes player communication document (`/patch-notes`) is a separate
skill invoked after the patch plan is executed.

# Skill Test Spec: /help
## Skill Summary
`/help` analyzes what has been done and what comes next in the project workflow.
It runs on the Haiku model (read-only, formatting task) and reads `production/stage.txt`,
the active sprint file, and recent session state to produce a concise situational
guidance summary. The skill optionally accepts a context query (e.g., `/help testing`)
to surface relevant skills for a specific topic.
The output is always informational — no files are written and no director gates
are invoked. The verdict is always HELP COMPLETE. The skill serves as a workflow
navigator, suggesting 2-3 next skills based on the current project state.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: HELP COMPLETE
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (suggests 2-3 relevant skills based on state)
---
## Director Gate Checks
None. `/help` is a read-only navigation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Production stage with active sprint
**Fixture:**
- `production/stage.txt` contains `Production`
- `production/sprints/sprint-004.md` exists with in-progress stories
- `production/session-state/active.md` has a recent checkpoint
**Input:** `/help`
**Expected behavior:**
1. Skill reads stage.txt and active sprint
2. Skill identifies current sprint number and in-progress story count
3. Skill outputs: current stage, sprint summary, and 3 suggested next skills
(e.g., `/sprint-status`, `/dev-story`, `/story-done`)
4. Suggestions are ranked by relevance to current sprint state
5. Verdict is HELP COMPLETE
**Assertions:**
- [ ] Current stage is shown (Production)
- [ ] Active sprint number and story count are mentioned
- [ ] Exactly 2-3 next-skill suggestions are given (not a list of all skills)
- [ ] Suggestions are appropriate for Production stage
- [ ] Verdict is HELP COMPLETE
- [ ] No files are written
---
### Case 2: Concept Stage — Shows concept-to-systems-design workflow path
**Fixture:**
- `production/stage.txt` contains `Concept`
- No sprint files, no GDD files
- `technical-preferences.md` is configured (engine selected)
**Input:** `/help`
**Expected behavior:**
1. Skill reads stage.txt — detects Concept stage
2. Skill outputs the Concept-stage workflow: brainstorm → map-systems → design-system
3. Suggested skills are: `/brainstorm`, `/map-systems` (if concept exists)
4. Current progress is noted: "Engine configured, concept not yet created"
**Assertions:**
- [ ] Stage is identified as Concept
- [ ] Workflow path shows the expected sequence for this stage
- [ ] Suggestions do not include Production-stage skills (e.g., `/dev-story`)
- [ ] Verdict is HELP COMPLETE
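The stage-appropriate suggestion filtering asserted in Cases 1 and 2 can be sketched as a static stage-to-skills map capped at three entries. The skill lists here are illustrative assumptions drawn only from the suggestions these cases name, not the full catalog.

```python
# Illustrative stage → suggested-skills map; entries are assumptions
# based on the suggestions named in Cases 1 and 2.
STAGE_SUGGESTIONS = {
    "Concept": ["/brainstorm", "/map-systems"],
    "Production": ["/sprint-status", "/dev-story", "/story-done"],
}

def suggest(stage, limit=3):
    """Return at most `limit` stage-appropriate suggestions; fall back to
    /start when the stage is unknown (mirrors Case 3's behavior)."""
    return STAGE_SUGGESTIONS.get(stage, ["/start"])[:limit]

print(suggest("Concept"))  # → ['/brainstorm', '/map-systems']
print(suggest("Unknown"))  # → ['/start']
```

Keeping the map stage-keyed is what guarantees the Case 2 assertion that Production skills such as `/dev-story` never surface in Concept-stage output.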
---
### Case 3: No stage.txt — Shows full workflow overview
**Fixture:**
- No `production/stage.txt`
- No sprint files
- `technical-preferences.md` has placeholders
**Input:** `/help`
**Expected behavior:**
1. Skill cannot determine stage from stage.txt
2. Skill runs project-stage-detect logic to infer stage from artifacts
3. If stage cannot be inferred: outputs the full workflow overview from
Concept through Release as a reference map
4. Primary suggestion is `/start` to begin configuration
**Assertions:**
- [ ] Skill does not crash when stage.txt is absent
- [ ] Full workflow overview is shown when stage cannot be determined
- [ ] `/start` or `/project-stage-detect` is a top suggestion
- [ ] Verdict is HELP COMPLETE
---
### Case 4: Context Query — User asks for help with testing
**Fixture:**
- `production/stage.txt` contains `Production`
- Active sprint has a story with `Status: In Review`
**Input:** `/help testing`
**Expected behavior:**
1. Skill reads context query: "testing"
2. Skill surfaces skills relevant to testing: `/qa-plan`, `/smoke-check`,
`/regression-suite`, `/test-setup`, `/test-evidence-review`
3. Output is focused on testing workflow, not general sprint navigation
4. Currently in-review story is highlighted as a testing candidate
**Assertions:**
- [ ] Context query is acknowledged in output ("Help topic: testing")
- [ ] At least 3 testing-relevant skills are listed
- [ ] General sprint skills (e.g., `/sprint-plan`) are not the primary suggestions
- [ ] Verdict is HELP COMPLETE
---
### Case 5: Director Gate Check — No gate; help is read-only navigation
**Fixture:**
- Any project state
**Input:** `/help`
**Expected behavior:**
1. Skill produces workflow guidance summary
2. No director agents are spawned
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is HELP COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads stage, sprint, and session state before generating suggestions
- [ ] Suggestions are specific to the current project state (not generic)
- [ ] Context query (if provided) narrows the suggestion set
- [ ] Does not write any files
- [ ] Verdict is HELP COMPLETE in all cases
---
## Coverage Notes
- The case where the active sprint is complete (all stories Done) is not
separately tested; the skill would suggest `/sprint-plan` for the next sprint.
- The `/help` skill does not validate whether suggested skills are available —
it assumes standard skill catalog availability.
- Stage detection fallback (when stage.txt is absent) delegates to the same
logic as `/project-stage-detect` and is not re-tested here in detail.

# Skill Test Spec: /hotfix
## Skill Summary
`/hotfix` manages an emergency fix workflow: it creates a hotfix branch from
main, applies a targeted fix to the identified file(s), runs `/smoke-check` to
validate the fix doesn't introduce regressions, and prompts the user to confirm
merge back to main. Each code change requires a "May I write to [filepath]?" ask.
Git operations (branch creation, merge) are presented as Bash commands for user
confirmation before execution.
The skill is time-sensitive — director review is optional post-hoc, not a
blocking gate. Verdicts: HOTFIX COMPLETE (fix applied, smoke check passed, merged)
or HOTFIX BLOCKED (fix introduced regression or user declined).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: HOTFIX COMPLETE, HOTFIX BLOCKED
- [ ] Contains "May I write" language for code changes
- [ ] Has a next-step handoff (e.g., `/bug-report` to document the issue, or version bump)
---
## Director Gate Checks
None. Hotfixes are time-critical. Director review may follow separately as a
post-hoc step. No gate is invoked within this skill.
---
## Test Cases
### Case 1: Happy Path — Critical crash bug fixed, smoke check passes
**Fixture:**
- `main` branch is clean
- Bug is identified in `src/gameplay/arena.gd` (crash on boss arena entry)
- Repro steps are provided by user
**Input:** `/hotfix` (user describes the crash and affected file)
**Expected behavior:**
1. Skill proposes creating a hotfix branch: `hotfix/boss-arena-crash`
2. User confirms; Bash command for branch creation is shown and confirmed
3. Skill identifies the fix location in `arena.gd` and drafts the change
4. Skill asks "May I write to `src/gameplay/arena.gd`?" and applies fix on approval
5. Skill runs `/smoke-check` — PASS
6. Skill presents the merge command and asks user to confirm merge to `main`
7. User confirms; merge executes; verdict is HOTFIX COMPLETE
**Assertions:**
- [ ] Hotfix branch is created before any code changes
- [ ] "May I write" is asked before modifying any source file
- [ ] `/smoke-check` runs after the fix is applied
- [ ] Merge requires explicit user confirmation (not automatic)
- [ ] Verdict is HOTFIX COMPLETE after successful merge
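The branch-before-change flow in steps 1 and 2 reduces to constructing commands for user confirmation. This sketch only builds the strings the skill would show; nothing is executed. The branch naming convention is inferred from the example `hotfix/boss-arena-crash`.

```python
import re

def hotfix_branch(description):
    """Slugify a short bug description into a hotfix branch name."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"hotfix/{slug}"

def proposed_commands(description):
    """Git commands the skill would present for confirmation, never run silently."""
    branch = hotfix_branch(description)
    return [
        f"git checkout -b {branch} main",  # create the hotfix branch first
        "git checkout main",               # after the fix and smoke check pass
        f"git merge --no-ff {branch}",     # merge only on explicit confirmation
    ]

print(hotfix_branch("Boss arena crash"))  # → hotfix/boss-arena-crash
```

Separating command construction from execution is what makes the "merge requires explicit user confirmation" assertion testable.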
---
### Case 2: Smoke Check Fails — HOTFIX BLOCKED
**Fixture:**
- Fix has been applied to `src/gameplay/arena.gd`
- `/smoke-check` returns FAIL: "Player health clamping regression detected"
**Input:** `/hotfix`
**Expected behavior:**
1. Skill applies the fix and runs `/smoke-check`
2. Smoke check returns FAIL with specific regression identified
3. Skill reports: "HOTFIX BLOCKED — smoke check failed: [regression detail]"
4. Skill presents options: attempt revised fix, revert changes, or merge with
known regression (user acknowledges risk)
5. No automatic merge occurs when smoke check fails
**Assertions:**
- [ ] Verdict is HOTFIX BLOCKED
- [ ] Smoke check failure is shown verbatim to user
- [ ] Merge is NOT performed automatically when smoke check fails
- [ ] User is given explicit options for how to proceed
---
### Case 3: Fix to Already-Released Build — Version tag noted, patch bump prompted
**Fixture:**
- Latest git tag is `v1.2.0`
- Hotfix targets a bug in the v1.2.0 release
**Input:** `/hotfix`
**Expected behavior:**
1. Skill detects that the current HEAD is a tagged release (v1.2.0)
2. Skill notes: "Hotfix targeting tagged release v1.2.0"
3. After smoke check passes, skill prompts: "Should version be bumped to v1.2.1?"
4. If user confirms the version bump: skill asks "May I write to VERSION (or the engine's equivalent file)?"
5. After version update and merge: verdict is HOTFIX COMPLETE with version noted
**Assertions:**
- [ ] Version tag context is detected and surfaced to user
- [ ] Patch version bump is suggested (not required) after merge
- [ ] Version bump requires its own "May I write" confirmation
- [ ] Verdict is HOTFIX COMPLETE
---
### Case 4: No Repro Steps — Skill Asks Before Applying Fix
**Fixture:**
- User invokes `/hotfix` with a vague description: "something is broken on level 3"
- No repro steps provided
**Input:** `/hotfix` (vague description)
**Expected behavior:**
1. Skill detects insufficient information to identify the fix location
2. Skill asks: "Please provide reproduction steps and the affected file or system"
3. Skill does NOT create a branch or modify any file until repro steps are provided
4. After user provides repro steps: normal hotfix flow begins
**Assertions:**
- [ ] No branch is created without repro steps
- [ ] No code changes are made without a clearly identified fix location
- [ ] Repro step request is specific (not a generic "please provide more info")
- [ ] Normal hotfix flow resumes after user provides repro steps
---
### Case 5: Director Gate Check — No gate; hotfixes are time-critical
**Fixture:**
- Critical bug with repro steps identified
**Input:** `/hotfix`
**Expected behavior:**
1. Skill completes the hotfix workflow
2. No director agents are spawned during execution
3. No gate IDs appear in output
4. Post-hoc director review (if needed) is a manual follow-up, not invoked here
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED — no gate verdict
---
## Protocol Compliance
- [ ] Creates hotfix branch before making any code changes
- [ ] Asks "May I write" before modifying any source files
- [ ] Runs `/smoke-check` after applying the fix
- [ ] Requires explicit user confirmation before merging
- [ ] HOTFIX BLOCKED when smoke check fails — no automatic merge
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED
---
## Coverage Notes
- The case where multiple files need to be modified for one fix follows the same
"May I write" per-file pattern and is not separately tested.
- The post-hotfix steps (create bug report, update changelog) are suggested in
the handoff but not tested as part of this skill's execution.
- Conflict resolution during the merge (if main has diverged) is not tested;
the skill would surface the conflict and ask the user to resolve it manually.

# Skill Test Spec: /launch-checklist
## Skill Summary
`/launch-checklist` generates and evaluates a complete launch readiness checklist
covering: legal compliance (EULA, privacy policy, ESRB/PEGI ratings), platform
certification status, store page completeness (screenshots, description, metadata),
build validation (version tag, reproducible build), analytics and crash reporting
configuration, and first-run experience verification.
The skill produces a checklist report written to `production/launch/launch-checklist-[date].md`
after a "May I write" ask. If a previous launch checklist exists, it compares the
new results against the old to highlight newly resolved and newly blocked items. No
director gates apply — `/team-release` orchestrates the full release pipeline. Verdicts:
LAUNCH READY, LAUNCH BLOCKED, or CONCERNS.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: LAUNCH READY, LAUNCH BLOCKED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language before writing the checklist
- [ ] Has a next-step handoff (e.g., `/team-release` or `/day-one-patch`)
---
## Director Gate Checks
None. `/launch-checklist` is a readiness audit utility. The full release pipeline
is managed by `/team-release`.
---
## Test Cases
### Case 1: Happy Path — All Checklist Items Verified, LAUNCH READY
**Fixture:**
- Legal docs present: EULA, privacy policy in `production/legal/`
- Platform certification: marked as submitted and approved in production notes
- Store page assets: screenshots, description, metadata all present in `production/store/`
- Build: version tag `v1.0.0` exists, reproducible build confirmed
- Crash reporting: configured in `technical-preferences.md`
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all checklist categories
2. All items pass their verification checks
3. Skill produces checklist report with all items marked PASS
4. Skill asks "May I write to `production/launch/launch-checklist-2026-04-06.md`?"
5. Report written on approval; verdict is LAUNCH READY
**Assertions:**
- [ ] All checklist categories are checked (legal, platform, store, build, analytics, UX)
- [ ] All items appear in the report with PASS markers
- [ ] Verdict is LAUNCH READY
- [ ] "May I write" is asked with the correct dated filename
---
### Case 2: Platform Certification Not Submitted — LAUNCH BLOCKED
**Fixture:**
- All other checklist items pass
- Platform certification section: "not submitted" (no submission record found)
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all items
2. Platform certification check fails: no submission record
3. Skill reports: "LAUNCH BLOCKED — Platform certification not submitted"
4. Specific platform(s) missing certification are named
5. Verdict is LAUNCH BLOCKED
**Assertions:**
- [ ] Verdict is LAUNCH BLOCKED (not CONCERNS)
- [ ] Platform certification is identified as the blocking item
- [ ] Missing platform names are specified
- [ ] All other passing items are still shown in the report
---
### Case 3: Manual Check Required — CONCERNS Verdict
**Fixture:**
- All critical checklist items pass
- First-run experience item: "MANUAL CHECK NEEDED — human must play the first 5
minutes and verify tutorial completion flow"
- Store screenshots item: "MANUAL CHECK NEEDED — art team must verify screenshot
quality matches current build"
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill checks all items
2. 2 items are flagged as requiring human verification
3. Skill reports: "CONCERNS — 2 items require manual verification before launch"
4. Both items are listed with instructions for what to manually verify
5. Verdict is CONCERNS (not LAUNCH BLOCKED, since these are advisory)
**Assertions:**
- [ ] Verdict is CONCERNS (not LAUNCH READY or LAUNCH BLOCKED)
- [ ] Both manual check items are listed with verification instructions
- [ ] Skill does not auto-block on MANUAL CHECK items
---
### Case 4: Previous Checklist Exists — Delta Comparison
**Fixture:**
- `production/launch/launch-checklist-2026-03-25.md` exists with previous results:
- 2 items were BLOCKED (platform cert, crash reporting)
- 1 item had a MANUAL CHECK
- New checklist: platform cert is now PASS, crash reporting is now PASS,
manual check still open; 1 new item flagged (EULA last updated date)
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill finds the previous checklist and loads it for comparison
2. Skill produces the new checklist and compares:
- Newly resolved: "Platform cert — was BLOCKED, now PASS"
- Newly resolved: "Crash reporting — was BLOCKED, now PASS"
- Still open: manual check (unchanged)
- New issue: EULA last updated date (not in previous checklist)
3. Delta is shown prominently in the report
4. Verdict is CONCERNS (manual check + new EULA question)
**Assertions:**
- [ ] Delta section shows newly resolved items
- [ ] Delta section shows new issues (not present in previous checklist)
- [ ] Still-open items from the previous checklist are noted as persistent
- [ ] Verdict reflects the current state (not the previous state)
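The delta logic in this case reduces to comparing two item-to-status maps. This is a sketch: the item names and statuses mirror the fixture, and the real checklist format carries more detail than a flat map.

```python
def checklist_delta(previous, current):
    """Compare two {item: status} maps and bucket the differences."""
    resolved = [k for k, s in current.items()
                if s == "PASS" and previous.get(k) not in (None, "PASS")]
    new_issues = [k for k, s in current.items()
                  if k not in previous and s != "PASS"]
    still_open = [k for k, s in current.items()
                  if s != "PASS" and previous.get(k) == s]
    return {"resolved": resolved, "new": new_issues, "still_open": still_open}

previous = {"platform-cert": "BLOCKED", "crash-reporting": "BLOCKED",
            "first-run": "MANUAL CHECK"}
current = {"platform-cert": "PASS", "crash-reporting": "PASS",
           "first-run": "MANUAL CHECK", "eula-date": "MANUAL CHECK"}
print(checklist_delta(previous, current))
```

The three buckets map one-to-one onto the delta assertions above: newly resolved, new issues, and persistent open items; the verdict is then derived from `current` alone.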
---
### Case 5: Director Gate Check — No gate; launch-checklist is an audit utility
**Fixture:**
- All checklist dependencies present
**Input:** `/launch-checklist`
**Expected behavior:**
1. Skill runs the full checklist and writes the report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS — no gate verdict
---
## Protocol Compliance
- [ ] Checks all required categories (legal, platform, store, build, analytics, UX)
- [ ] LAUNCH BLOCKED for hard failures (uncompleted certifications, missing legal docs)
- [ ] CONCERNS for advisory items requiring manual verification
- [ ] Compares against previous checklist when one exists
- [ ] Asks "May I write" before creating the checklist report
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS
---
## Coverage Notes
- Region-specific compliance (GDPR data handling, COPPA for under-13 audiences)
is checked but the specific requirements are not enumerated in test assertions.
- The store page completeness check (screenshots, description) relies on the
presence of files in `production/store/`; it cannot verify visual quality.
- Build reproducibility check validates the presence of a version tag and build
configuration but does not execute the build process.

# Skill Test Spec: /localize
## Skill Summary
`/localize` manages the full localization pipeline: it extracts all player-facing
strings from source files, manages translation files in `assets/localization/`,
and validates completeness across all locale files. For new languages, it creates
a locale file skeleton with all current strings as keys and empty values. For
existing locale files, it produces a diff showing additions, removals, and
changed keys.
Translation files are written to `assets/localization/[locale-code].csv` (or
engine-appropriate format) after a "May I write" ask. No director gates apply.
Verdicts: LOCALIZATION COMPLETE (all locales are complete) or GAPS FOUND (at
least one locale is missing string keys).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: LOCALIZATION COMPLETE, GAPS FOUND
- [ ] Contains "May I write" collaborative protocol language before writing locale files
- [ ] Has a next-step handoff (e.g., send locale skeletons to translators)
---
## Director Gate Checks
None. `/localize` is a pipeline utility. No director gates apply. Localization
lead agent may review separately but is not invoked within this skill.
---
## Test Cases
### Case 1: New Language — String Extraction and Locale Skeleton Created
**Fixture:**
- Source code in `src/` contains player-facing strings (UI text, tutorial messages)
- Existing locale: `assets/localization/en.csv`
- No French locale exists
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts all player-facing strings from source files
2. Skill finds the same strings in `en.csv` as a reference
3. Skill generates `fr.csv` skeleton with all string keys and empty values
4. Skill asks "May I write to `assets/localization/fr.csv`?"
5. File written on approval; verdict is GAPS FOUND (the file exists but its values are empty)
6. Skill notes: "fr.csv created — send to translator to fill values"
**Assertions:**
- [ ] All string keys from `en.csv` are present in `fr.csv`
- [ ] All values in `fr.csv` are empty (not copied from English)
- [ ] "May I write" is asked before creating the file
- [ ] Verdict is GAPS FOUND (file is created but untranslated)
---
### Case 2: Existing Locale Diff — Additions, Removals, and Changes Listed
**Fixture:**
- `assets/localization/fr.csv` exists with 20 string keys translated
- Source code has changed: 3 new strings added, 1 string removed, 2 strings
with changed English source text
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts current strings from source
2. Skill diffs against existing `fr.csv`
3. Skill produces diff report:
- 3 new keys (need translation — listed as empty in fr.csv)
- 1 removed key (marked as obsolete — suggest removal)
- 2 changed keys (English source changed — French may need update, flagged)
4. Skill asks "May I update `assets/localization/fr.csv`?"
5. File updated with new empty keys added, obsolete keys marked; verdict is GAPS FOUND
**Assertions:**
- [ ] New keys appear as empty in the updated file (not auto-translated)
- [ ] Removed keys are flagged as obsolete (not silently deleted)
- [ ] Changed source strings are flagged for translator review
- [ ] Verdict is GAPS FOUND (new empty keys exist)
---
### Case 3: String Missing in One Locale — GAPS FOUND With Missing Key List
**Fixture:**
- 3 locale files exist: `en.csv`, `fr.csv`, `de.csv`
- `de.csv` is missing 4 keys that exist in both `en.csv` and `fr.csv`
**Input:** `/localize`
**Expected behavior:**
1. Skill reads all 3 locale files and cross-references keys
2. `de.csv` is missing 4 keys
3. Skill produces GAPS FOUND report listing the 4 missing keys by locale:
"de.csv missing: [key1], [key2], [key3], [key4]"
4. Skill offers to add the missing keys as empty values to `de.csv`
5. After approval: file updated; verdict remains GAPS FOUND (values still empty)
**Assertions:**
- [ ] Missing keys are listed explicitly (not just a count)
- [ ] Missing keys are attributed to the specific locale file
- [ ] Verdict is GAPS FOUND (not LOCALIZATION COMPLETE)
- [ ] Missing keys are added as empty (not auto-translated from English)
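The cross-locale completeness check in this case can be sketched as a set difference against the union of all keys. Locale data is modeled here as plain dicts; the key names are illustrative assumptions, and engine-specific file formats are out of scope, per the coverage notes.

```python
def missing_keys(locales):
    """Given {locale: {key: value}}, report the keys absent from each locale
    relative to the union of all keys seen across every locale."""
    all_keys = set().union(*locales.values())
    return {loc: sorted(all_keys - set(strings))
            for loc, strings in locales.items()
            if all_keys - set(strings)}

locales = {
    "en": {"ui.start": "Start", "ui.quit": "Quit", "tut.jump": "Press A to jump"},
    "fr": {"ui.start": "Démarrer", "ui.quit": "Quitter", "tut.jump": "Sautez avec A"},
    "de": {"ui.start": "Start"},  # missing ui.quit and tut.jump
}
print(missing_keys(locales))  # → {'de': ['tut.jump', 'ui.quit']}
```

Reporting sorted key names per locale, rather than a count, is what satisfies the "listed explicitly" and "attributed to the specific locale file" assertions.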
---
### Case 4: Translation File Has Syntax Error — Error With Line Reference
**Fixture:**
- `assets/localization/fr.csv` has a malformed line at line 47
(missing quote closure)
**Input:** `/localize fr`
**Expected behavior:**
1. Skill reads `fr.csv` and encounters a parse error at line 47
2. Skill outputs: "Parse error in fr.csv at line 47: [error detail]"
3. Skill cannot diff or validate the file until the error is fixed
4. Skill does NOT attempt to overwrite or auto-fix the malformed file
5. Skill suggests fixing the file manually and re-running `/localize`
**Assertions:**
- [ ] Error message includes line number (line 47)
- [ ] Error detail describes the nature of the parse error
- [ ] Skill does NOT overwrite or modify the malformed file
- [ ] Manual fix + re-run is suggested as remediation
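
The parse-error-with-line-reference behavior can be sketched with Python's `csv` module, whose reader tracks the physical line number. This is a sketch only: exactly how a missing quote closure surfaces depends on the parser, so a structurally short row stands in for the malformed line here:

```python
import csv


def parse_locale(path):
    """Parse a locale CSV, surfacing the line number on failure.

    On a malformed row the function reports the file and line and
    stops; it never rewrites or auto-fixes the file.
    """
    rows = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        try:
            for row in reader:
                if len(row) < 2:
                    raise ValueError("expected at least key,value columns")
                rows[row[0]] = row[1:]
        except (csv.Error, ValueError) as exc:
            # reader.line_num is the physical line being parsed
            raise SystemExit(
                f"Parse error in {path} at line {reader.line_num}: {exc}"
            )
    return rows
```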
---
### Case 5: Director Gate Check — No gate; localization is a pipeline utility
**Fixture:**
- Source code with player-facing strings
**Input:** `/localize fr`
**Expected behavior:**
1. Skill extracts strings and manages locale files
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is LOCALIZATION COMPLETE or GAPS FOUND — no gate verdict
---
## Protocol Compliance
- [ ] Extracts strings from source before operating on locale files
- [ ] Creates new locale files with all keys as empty values (not auto-translated)
- [ ] Diffs existing locale files against current source strings
- [ ] Flags missing keys by locale and by key name
- [ ] Asks "May I write" before creating or updating any locale file
- [ ] Verdict is LOCALIZATION COMPLETE (all locales fully translated) or GAPS FOUND
---
## Coverage Notes
- LOCALIZATION COMPLETE is only achievable when all locale files have all keys
with non-empty values; new-language skeleton creation always results in GAPS FOUND.
- Engine-specific locale formats (Godot `.translation`, Unity `.po` files) are
handled by the skill body; `.csv` is used as the canonical format in tests.
- The case where source strings change at a very high rate (continuous integration
of new UI text) is not tested; the diff logic handles this case.

# Skill Test Spec: /onboard
## Skill Summary
`/onboard` generates a contextual project onboarding summary tailored for a new
team member. It reads CLAUDE.md, `technical-preferences.md`, the active sprint
file, recent git commits, and `production/stage.txt` to produce a structured
orientation document. The skill runs on the Haiku model (read-only, formatting
task) and produces no file writes — all output is conversational.
The skill optionally accepts a role argument (e.g., `/onboard artist`) to tailor
the summary to a specific discipline. When the project is in an early stage or
unconfigured, the output adapts to reflect what little is known. The verdict is
always ONBOARDING COMPLETE — the skill is purely informational.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: ONBOARDING COMPLETE
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff suggesting a relevant follow-on skill
---
## Director Gate Checks
None. `/onboard` is a read-only orientation skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Configured project in Production stage with active sprint
**Fixture:**
- `production/stage.txt` contains `Production`
- `technical-preferences.md` has engine, language, and specialists populated
- `production/sprints/sprint-005.md` exists with stories in progress
- Git log contains 5 recent commits
**Input:** `/onboard`
**Expected behavior:**
1. Skill reads stage.txt, technical-preferences.md, active sprint, and git log
2. Skill produces an onboarding summary with sections: Project Overview, Tech Stack,
Current Stage, Active Sprint Summary, Recent Activity
3. Summary is formatted for readability (headers, bullet points)
4. Next-step suggestions are appropriate for Production stage (e.g., `/sprint-status`,
`/dev-story`)
5. Verdict ONBOARDING COMPLETE is stated
**Assertions:**
- [ ] Output includes current stage name from stage.txt
- [ ] Output includes engine and language from technical-preferences.md
- [ ] Active sprint stories are summarized (not just the sprint file name)
- [ ] Recent commit context is present
- [ ] Verdict is ONBOARDING COMPLETE
- [ ] No files are written
---
### Case 2: Fresh Project — No engine, no sprint, suggests /start
**Fixture:**
- `technical-preferences.md` contains only placeholders (`[TO BE CONFIGURED]`)
- No `production/stage.txt`
- No sprint files
- No CLAUDE.md overrides beyond defaults
**Input:** `/onboard`
**Expected behavior:**
1. Skill reads all config files and detects unconfigured state
2. Skill produces a minimal summary: "This project has not been configured yet"
3. Output explains the onboarding workflow: `/start` → `/setup-engine` → `/brainstorm`
4. Skill suggests running `/start` as the immediate next step
5. Verdict is ONBOARDING COMPLETE (informational, not a failure)
**Assertions:**
- [ ] Output explicitly mentions the project is not yet configured
- [ ] `/start` is recommended as the next step
- [ ] Skill does NOT error out — it gracefully handles an empty project state
- [ ] Verdict is still ONBOARDING COMPLETE
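
The unconfigured-state detection in steps 1-2 can be sketched as a pair of file checks; the placeholder marker is taken from this case's fixture, and the function name is hypothetical:

```python
import os


def fresh_project(root):
    """Heuristic fresh-project check mirroring Case 2's fixture:
    technical-preferences.md still holds placeholders and no
    production/stage.txt exists yet."""
    prefs = os.path.join(root, "technical-preferences.md")
    stage = os.path.join(root, "production", "stage.txt")
    placeholders = False
    if os.path.exists(prefs):
        with open(prefs, encoding="utf-8") as f:
            placeholders = "[TO BE CONFIGURED]" in f.read()
    return placeholders and not os.path.exists(stage)
```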
---
### Case 3: No CLAUDE.md Found — Error with remediation
**Fixture:**
- `CLAUDE.md` file does not exist (deleted or never created)
- All other files may or may not exist
**Input:** `/onboard`
**Expected behavior:**
1. Skill attempts to read CLAUDE.md and fails
2. Skill outputs an error: "CLAUDE.md not found — cannot generate onboarding summary"
3. Skill provides remediation: "Run `/start` to initialize the project configuration"
4. No partial summary is generated
**Assertions:**
- [ ] Error message clearly identifies the missing file as CLAUDE.md
- [ ] Remediation step (`/start`) is explicitly named
- [ ] Skill does NOT produce a partial output when the root config is missing
- [ ] Verdict is ONBOARDING COMPLETE (with error context, not a crash)
---
### Case 4: Role-Specific Onboarding — User specifies "artist" role
**Fixture:**
- Fully configured project in Production stage
- `art-bible.md` exists in `design/`
- Active sprint has visual story types (animation, VFX)
**Input:** `/onboard artist`
**Expected behavior:**
1. Skill reads all standard files plus any art-relevant docs (art bible, asset specs)
2. Summary is tailored to the artist role: art bible overview, asset pipeline,
current visual stories in the active sprint
3. Technical architecture details (code structure, ADRs) are de-emphasized
4. Specialist agents for art/audio are highlighted in the summary
5. Verdict is ONBOARDING COMPLETE
**Assertions:**
- [ ] Role argument is acknowledged in the output ("Onboarding for: Artist")
- [ ] Art bible summary is included if the file exists
- [ ] Current visual stories from the active sprint are shown
- [ ] Technical implementation details are not the primary focus
- [ ] Verdict is ONBOARDING COMPLETE
---
### Case 5: Director Gate Check — No gate; onboard is read-only orientation
**Fixture:**
- Any configured project state
**Input:** `/onboard`
**Expected behavior:**
1. Skill completes the full onboarding summary
2. No director agents are spawned at any point
3. No gate IDs appear in the output
4. No "May I write" prompts appear
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is ONBOARDING COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads all source files before generating output (no hallucinated project state)
- [ ] Adapts output to project stage (Production ≠ Concept)
- [ ] Respects role argument when provided
- [ ] Does not write any files
- [ ] Ends with ONBOARDING COMPLETE verdict in all paths
---
## Coverage Notes
- The case where `technical-preferences.md` is missing entirely (as opposed to
having placeholders) is not separately tested; behavior follows the graceful
error pattern of Case 3.
- Git history reading is assumed available; offline/no-git scenarios are not
tested here.
- Discipline roles beyond "artist" (e.g., programmer, designer, producer) follow
the same tailoring pattern as Case 4 and are not separately tested.

# Skill Test Spec: /playtest-report
## Skill Summary
`/playtest-report` generates a structured playtest report from session notes or
user input. The report is organized into four sections: Feel/Accessibility,
Bugs Observed, Design Feedback, and Next Steps. When multiple testers participated,
the skill aggregates feedback and distinguishes majority opinions from minority
ones. The skill links to existing bug reports when a reported bug matches a file
in `production/bugs/`.
Reports are written to `production/qa/playtest-[date].md` after a "May I write"
ask. No director gates apply here — the CD-PLAYTEST director gate (if needed) is
a separate invocation. The verdict is COMPLETE when the report is written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-report` for new issues found, `/design-review` for feedback)
---
## Director Gate Checks
None. `/playtest-report` is a documentation utility. The CD-PLAYTEST gate is a
separate invocation and not part of this skill.
---
## Test Cases
### Case 1: Happy Path — User provides playtest notes, structured report produced
**Fixture:**
- User provides typed playtest notes from a single session
- Notes cover: game feel, one bug (framerate drop), and a design concern
(tutorial too long)
- `production/bugs/` exists but is empty (bug not yet reported)
**Input:** `/playtest-report` (user pastes session notes)
**Expected behavior:**
1. Skill reads the provided notes and structures them into the 4-section template
2. Feel/Accessibility: extracts feel observations
3. Bugs: notes the framerate drop with available repro details
4. Design Feedback: notes the tutorial length concern
5. Next Steps: suggests `/bug-report` for the framerate issue and `/design-review`
for the tutorial feedback
6. Skill asks "May I write to `production/qa/playtest-2026-04-06.md`?"
7. Report is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 4 sections are present in the report
- [ ] Bug is listed in the Bugs section (not the Design Feedback section)
- [ ] Next Steps are appropriate (bug report for the framerate issue, design review for feedback)
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
---
### Case 2: Empty Input — Guided prompting through each section
**Fixture:**
- No notes provided by user at invocation
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill detects empty input
2. Skill prompts through each section:
a. "Describe the overall feel and any accessibility observations"
b. "Were any bugs observed? Describe them"
c. "What design feedback did testers provide?"
3. User answers each prompt
4. Skill compiles report from answers and asks "May I write"
5. Report written on approval; verdict is COMPLETE
**Assertions:**
- [ ] At least 3 guiding questions are asked (one per main section)
- [ ] Report is not created until all sections have input (or user explicitly skips one)
- [ ] Verdict is COMPLETE after file is written
---
### Case 3: Multiple Testers — Aggregated feedback with majority/minority notes
**Fixture:**
- User provides notes from 3 testers
- 2/3 testers found the controls "intuitive"
- 1/3 tester found the UI font too small
- All 3 noted the same bug (player stuck on ledge)
**Input:** `/playtest-report` (3-tester session)
**Expected behavior:**
1. Skill identifies 3 distinct tester perspectives in the input
2. Control intuitiveness → noted as "Majority (2/3): controls intuitive"
3. Font size → noted as "Minority (1/3): UI font size concern"
4. Stuck-on-ledge bug → noted as "All testers: player stuck on ledge (confirmed)"
5. Skill generates aggregated report with majority/minority labels
6. Report written after "May I write" approval; verdict is COMPLETE
**Assertions:**
- [ ] Majority opinion (2/3) is labeled as majority
- [ ] Minority opinion (1/3) is labeled as minority
- [ ] Unanimously reported bug is noted as confirmed by all testers
- [ ] Verdict is COMPLETE
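
The majority/minority labeling in steps 2-4 reduces to a threshold over tester counts; a minimal sketch:

```python
def label_opinions(opinions, tester_count):
    """Label each opinion by how many testers raised it.

    opinions: {"controls intuitive": 2, ...} mapping each distinct
    observation to the number of testers who reported it.
    """
    labels = {}
    for text, n in opinions.items():
        if n == tester_count:
            labels[text] = f"All testers ({n}/{tester_count})"
        elif n > tester_count / 2:
            labels[text] = f"Majority ({n}/{tester_count})"
        else:
            labels[text] = f"Minority ({n}/{tester_count})"
    return labels
```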
---
### Case 4: Bug Matches Existing Report — Links to existing file
**Fixture:**
- `production/bugs/bug-2026-03-30-player-stuck-ledge.md` exists
- User's playtest notes describe "player gets stuck on ledges near walls"
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill structures the report and identifies the stuck-on-ledge bug
2. Skill scans `production/bugs/` and finds `bug-2026-03-30-player-stuck-ledge.md`
3. In the Bugs section, the report includes: "See existing report:
production/bugs/bug-2026-03-30-player-stuck-ledge.md"
4. Skill does NOT suggest creating a new bug report for this issue
5. Report written; verdict is COMPLETE
**Assertions:**
- [ ] Existing bug report is found and linked in the playtest report
- [ ] `/bug-report` is NOT suggested for the already-reported issue
- [ ] Cross-reference to existing file appears in the Bugs section
- [ ] Verdict is COMPLETE
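
One way to sketch the cross-reference in step 2 is keyword overlap between the bug description and each filename slug in `production/bugs/`. The overlap threshold and matching strategy here are assumptions for illustration; the skill body defines the real heuristic:

```python
import os
import re


def find_existing_bug(description, bugs_dir):
    """Return the path of the best-matching existing bug report,
    or None when nothing overlaps enough to count as a match."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    best = None
    for name in sorted(os.listdir(bugs_dir)):
        slug_words = set(re.findall(r"[a-z]+", name.lower()))
        overlap = len(words & slug_words)
        # require at least two shared keywords before linking
        if overlap >= 2 and (best is None or overlap > best[0]):
            best = (overlap, os.path.join(bugs_dir, name))
    return best[1] if best else None
```

A match produces the "See existing report: ..." cross-reference instead of a `/bug-report` suggestion.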
---
### Case 5: Director Gate Check — No gate; CD-PLAYTEST is a separate invocation
**Fixture:**
- Playtest notes provided
**Input:** `/playtest-report`
**Expected behavior:**
1. Skill generates and writes the playtest report
2. No director agents are spawned (CD-PLAYTEST is not invoked here)
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No CD-PLAYTEST gate skip message appears
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Structures output into all 4 sections (Feel, Bugs, Design Feedback, Next Steps)
- [ ] Labels majority vs. minority opinions when multiple testers are involved
- [ ] Cross-references existing bug reports when bugs match
- [ ] Asks "May I write to `production/qa/playtest-[date].md`?" before writing
- [ ] Verdict is COMPLETE when report is written
---
## Coverage Notes
- The CD-PLAYTEST director gate (creative director reviews playtest insights
for design implications) is a separate invocation and is not tested here.
- Video recording or screenshot attachments are not tested; the report is a
text-only document.
- The case where a tester's identity is unknown (anonymous feedback) follows
the same aggregation pattern as Case 3 without tester labels.

# Skill Test Spec: /project-stage-detect
## Skill Summary
`/project-stage-detect` automatically analyzes project artifacts to determine
the current development stage. It runs on the Haiku model (read-only) and
examines `production/stage.txt` (if present), design documents in `design/`,
source code in `src/`, sprint and milestone files in `production/`, and the
presence of engine configuration to classify the project into one of seven
stages: Concept, Systems Design, Technical Setup, Pre-Production, Production,
Polish, or Release.
The skill is advisory — it never writes `stage.txt`. That file is only updated
when `/gate-check` passes and the user confirms advancement. The skill reports
its confidence level (HIGH if stage.txt was read directly, MEDIUM if inferred
from artifacts, LOW if conflicting signals were found).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains all seven stage names: Concept, Systems Design, Technical Setup, Pre-Production, Production, Polish, Release
- [ ] Does NOT contain "May I write" language (skill is detection-only)
- [ ] Has a next-step handoff (e.g., `/gate-check` to formally advance stage)
---
## Director Gate Checks
None. `/project-stage-detect` is a read-only detection utility. No director
gates apply.
---
## Test Cases
### Case 1: stage.txt Exists — Reads directly and cross-checks artifacts
**Fixture:**
- `production/stage.txt` contains `Production`
- `design/gdd/` has 4 GDD files
- `src/` has source code files
- `production/sprints/sprint-002.md` exists
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill reads `production/stage.txt` — detects stage `Production`
2. Skill cross-checks artifacts: GDDs present, source code present, sprint present
3. Artifacts are consistent with Production stage
4. Skill reports: Stage = Production, Confidence = HIGH (from stage.txt, confirmed by artifacts)
5. Next step: continue with `/sprint-plan` or `/dev-story`
**Assertions:**
- [ ] Detected stage is Production
- [ ] Confidence is reported as HIGH when stage.txt is present
- [ ] Cross-check result (consistent vs. discrepant) is noted
- [ ] No files are written
- [ ] Verdict clearly states the detected stage
---
### Case 2: No stage.txt but GDDs and Epics Exist — Infers Production
**Fixture:**
- No `production/stage.txt`
- `design/gdd/` has 3 GDD files
- `production/epics/` has 2 epic files
- `src/` has source code files
- `production/sprints/sprint-001.md` exists
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill finds no stage.txt — switches to artifact inference mode
2. Skill finds GDDs (Systems Design complete), epics (Pre-Production complete),
source code and sprints (Production active)
3. Skill infers: Stage = Production
4. Confidence is MEDIUM (inferred from artifacts, not from stage.txt)
5. Skill recommends running `/gate-check` to formalize and write stage.txt
**Assertions:**
- [ ] Inferred stage is Production
- [ ] Confidence is MEDIUM (not HIGH, since stage.txt is absent)
- [ ] Recommendation to run `/gate-check` is present
- [ ] No stage.txt is written by this skill
---
### Case 3: No stage.txt, No Docs, No Source — Infers Concept
**Fixture:**
- No `production/stage.txt`
- `design/` directory exists but is empty
- `src/` exists but contains no code files
- `technical-preferences.md` has placeholders only
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill finds no stage.txt
2. Artifact scan: no GDDs, no source, no epics, no sprints, engine unconfigured
3. Skill infers: Stage = Concept
4. Confidence is MEDIUM
5. Skill suggests `/start` to begin the onboarding workflow
**Assertions:**
- [ ] Inferred stage is Concept
- [ ] Output lists the artifacts that were checked (and found absent)
- [ ] `/start` is suggested as the next step
- [ ] No files are written
---
### Case 4: Discrepancy — stage.txt says Production but no source code
**Fixture:**
- `production/stage.txt` contains `Production`
- `design/gdd/` has GDD files
- `src/` directory exists but contains no source code files
- No sprint files exist
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill reads stage.txt — detects `Production`
2. Cross-check finds: no source code, no sprints — inconsistent with Production
3. Skill flags discrepancy: "stage.txt says Production but no source code or sprints found"
4. Skill reports detected stage as Production (honoring stage.txt) but
confidence drops to LOW due to artifact mismatch
5. Skill suggests reviewing stage.txt manually or running `/gate-check`
**Assertions:**
- [ ] Discrepancy is flagged explicitly in the output
- [ ] Confidence is LOW when artifacts contradict stage.txt
- [ ] stage.txt value is not silently overridden
- [ ] User is advised to verify the discrepancy manually
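
The detection and confidence logic across Cases 1-4 can be condensed into one advisory function. This sketch is simplified to the stages these fixtures exercise and honors stage.txt rather than overriding it:

```python
def detect_stage(stage_txt, has_src, has_sprints, has_gdds):
    """Advisory stage detection. Returns (stage, confidence) and
    never writes anything.

    stage_txt: contents of production/stage.txt, or None if absent.
    """
    artifacts_say_production = has_src and has_sprints
    if stage_txt:
        stage = stage_txt.strip()
        if stage == "Production" and not artifacts_say_production:
            # Honor stage.txt but flag the artifact mismatch (Case 4)
            return stage, "LOW"
        return stage, "HIGH"
    # Inference mode: no stage.txt (Cases 2 and 3)
    if artifacts_say_production and has_gdds:
        return "Production", "MEDIUM"
    if not (has_gdds or has_src or has_sprints):
        return "Concept", "MEDIUM"
    return "Pre-Production", "MEDIUM"
```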
---
### Case 5: Director Gate Check — No gate; detection is advisory
**Fixture:**
- Any project state with or without stage.txt
**Input:** `/project-stage-detect`
**Expected behavior:**
1. Skill completes full stage detection
2. No director agents are spawned at any point
3. No gate IDs appear in output
4. No write tool is called
**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] Detection output is purely advisory
- [ ] Verdict names the detected stage without triggering any gate
---
## Protocol Compliance
- [ ] Reads stage.txt if present; falls back to artifact inference if absent
- [ ] Always reports a confidence level (HIGH / MEDIUM / LOW)
- [ ] Cross-checks stage.txt against artifacts and flags discrepancies
- [ ] Does not write stage.txt (that is `/gate-check`'s responsibility)
- [ ] Ends with a next-step recommendation appropriate to the detected stage
---
## Coverage Notes
- The Technical Setup stage (engine configured, no GDDs yet) and Pre-Production
stage (GDDs complete, no epics yet) follow the same artifact-inference pattern
as Cases 2 and 3 and are not separately fixture-tested.
- The Polish and Release stages are not fixture-tested here; they follow the
same high-confidence (stage.txt present) or inference logic.
- Confidence levels are advisory — the skill does not gate any actions on them.

# Skill Test Spec: /prototype
## Skill Summary
`/prototype` manages a rapid prototyping workflow for validating a game mechanic
before committing to full production implementation. Prototypes are created in
`prototypes/[mechanic-name]/` and are intentionally disposable — coding standards
are relaxed (no ADR required, AC can be minimal, hardcoded values acceptable).
After implementation, the skill produces a findings document summarizing what
was learned and recommending next steps.
The skill asks "May I write to `prototypes/[name]/`?" before creating files. If a
prototype already exists, the skill offers to extend, replace, or archive. No
director gates apply. Verdicts: PROTOTYPE COMPLETE (prototype built and findings
documented) or PROTOTYPE ABANDONED (mechanic found to be unworkable).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PROTOTYPE COMPLETE, PROTOTYPE ABANDONED
- [ ] Contains "May I write" language before creating prototype files
- [ ] Has a next-step handoff (e.g., `/design-system` to formalize, or archive)
---
## Director Gate Checks
None. Prototypes are throwaway validation artifacts. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Mechanic concept prototyped, findings documented
**Fixture:**
- `prototypes/` directory exists
- No existing prototype for "grapple-hook"
**Input:** `/prototype grapple-hook`
**Expected behavior:**
1. Skill asks "May I write to `prototypes/grapple-hook/`?"
2. After approval: creates `prototypes/grapple-hook/` directory and basic
implementation skeleton (main scene, player controller extension)
3. Skill implements a minimal grapple hook mechanic (intentionally rough — no
polish, hardcoded values acceptable)
4. Skill produces `prototypes/grapple-hook/findings.md` with:
- What was tested
- What worked
- What didn't work
- Recommendation (proceed / abandon / revise concept)
5. Verdict is PROTOTYPE COMPLETE
**Assertions:**
- [ ] "May I write to `prototypes/grapple-hook/`?" is asked before any files are created
- [ ] Implementation is isolated to `prototypes/` (not `src/`)
- [ ] `findings.md` is created with at minimum: tested/worked/didn't-work/recommendation
- [ ] Verdict is PROTOTYPE COMPLETE
---
### Case 2: Prototype Already Exists — Offers Extend, Replace, or Archive
**Fixture:**
- `prototypes/grapple-hook/` already exists from a previous prototype session
- It contains a basic implementation and a findings.md
**Input:** `/prototype grapple-hook`
**Expected behavior:**
1. Skill detects existing `prototypes/grapple-hook/` directory
2. Skill reports: "Prototype already exists for grapple-hook"
3. Skill presents 3 options:
- Extend: add new features to the existing prototype
- Replace: start fresh (asks "May I replace `prototypes/grapple-hook/`?")
- Archive: move to `prototypes/archive/grapple-hook/` and start fresh
4. User selects; skill proceeds accordingly
**Assertions:**
- [ ] Existing prototype is detected and reported
- [ ] Exactly 3 options are presented (extend, replace, archive)
- [ ] Replace path includes a "May I replace" confirmation
- [ ] Archive path moves (not deletes) the existing prototype
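
The archive option's move-not-delete behavior can be sketched as:

```python
import shutil
from pathlib import Path


def archive_prototype(name, root="prototypes"):
    """Move (never delete) an existing prototype to
    prototypes/archive/[name]/ so a fresh attempt can start."""
    src = Path(root) / name
    dest = Path(root) / "archive" / name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    return dest
```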
---
### Case 3: Prototype Validates Mechanic — Recommends Proceeding to Production
**Fixture:**
- Prototype implementation complete
- Findings: grapple hook mechanic is fun and technically feasible
**Input:** `/prototype grapple-hook` (prototype session complete)
**Expected behavior:**
1. After prototype is built and tested, findings are summarized
2. Recommendation in findings.md: "Mechanic validated — recommend proceeding
to `/design-system` for full specification"
3. Skill handoff message explicitly suggests `/design-system grapple-hook`
4. Verdict is PROTOTYPE COMPLETE
**Assertions:**
- [ ] `findings.md` contains an explicit recommendation
- [ ] Recommendation references `/design-system` when mechanic is validated
- [ ] Handoff message echoes the recommendation
- [ ] Verdict is PROTOTYPE COMPLETE (not PROTOTYPE ABANDONED)
---
### Case 4: Prototype Reveals Mechanic is Unworkable — PROTOTYPE ABANDONED
**Fixture:**
- Prototype implemented for "procedural-dialogue"
- After testing: the mechanic creates incoherent dialogue trees and is
frustrating to play
**Input:** `/prototype procedural-dialogue`
**Expected behavior:**
1. Prototype is built
2. Findings document the failure: incoherent output, player confusion, technical complexity
3. Recommendation in findings.md: "Mechanic not viable — abandoning"
4. `findings.md` documents the specific reasons the mechanic failed
5. Skill suggests alternatives in the handoff (e.g., curated dialogue instead)
6. Verdict is PROTOTYPE ABANDONED
**Assertions:**
- [ ] Verdict is PROTOTYPE ABANDONED (not PROTOTYPE COMPLETE)
- [ ] `findings.md` documents specific failure reasons (not vague)
- [ ] Alternative approaches are suggested in the handoff
- [ ] Prototype files are retained (not deleted) for reference
---
### Case 5: Director Gate Check — No gate; prototypes are validation artifacts
**Fixture:**
- Mechanic concept provided
**Input:** `/prototype wall-jump`
**Expected behavior:**
1. Skill creates and documents the prototype
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED — no gate verdict
---
## Protocol Compliance
- [ ] Asks "May I write to `prototypes/[name]/`?" before creating any files
- [ ] Creates all files under `prototypes/` (not `src/`)
- [ ] Produces `findings.md` with tested/worked/didn't-work/recommendation
- [ ] Notes that production coding standards are intentionally relaxed
- [ ] Offers extend/replace/archive when prototype already exists
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED
---
## Coverage Notes
- Prototype implementation quality (code style) is intentionally not tested —
prototypes are throwaway artifacts and quality standards do not apply.
- The archiving mechanism is mentioned in Case 2 but the archive format is
not assertion-tested in detail.
- Engine-specific prototype scaffolding (GDScript scenes vs. C# MonoBehaviour)
follows the same flow with engine-appropriate file types.

# Skill Test Spec: /qa-plan
## Skill Summary
`/qa-plan` generates a structured QA test plan for a feature or sprint milestone.
It reads story files for the specified sprint, extracts acceptance criteria from
each story, cross-references test standards from `coding-standards.md` to assign
the appropriate test type (unit, integration, visual, UI, or config/data), and
produces a prioritized QA plan document.
The skill asks "May I write to `production/qa/qa-plan-sprint-NNN.md`?" before
persisting the output. If an existing test plan for the same sprint is found, the
skill offers to update rather than replace. The verdict is COMPLETE when the plan
is written. No director gates are used — gate-level story readiness is handled by
`/story-readiness`.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the plan
- [ ] Has a next-step handoff (e.g., `/smoke-check` or `/story-readiness`)
---
## Director Gate Checks
None. `/qa-plan` is a planning utility. Story readiness gates are separate.
---
## Test Cases
### Case 1: Happy Path — Sprint with 4 stories generates full test plan
**Fixture:**
- `production/sprints/sprint-003.md` lists 4 stories with defined acceptance criteria
- Stories span types: 1 logic (formula), 1 integration, 1 visual, 1 UI
- `coding-standards.md` is present with test evidence table
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill reads sprint-003.md and identifies 4 stories
2. Skill reads each story's acceptance criteria
3. Skill assigns test types per coding-standards.md table:
- Logic story → Unit test (BLOCKING)
- Integration story → Integration test (BLOCKING)
- Visual story → Screenshot + lead sign-off (ADVISORY)
- UI story → Manual walkthrough doc (ADVISORY)
4. Skill drafts QA plan with story-by-story test type breakdown
5. Skill asks "May I write to `production/qa/qa-plan-sprint-003.md`?"
6. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] All 4 stories are included in the plan
- [ ] Test type is assigned per coding-standards.md (not guessed)
- [ ] Gate level (BLOCKING vs ADVISORY) is noted for each story
- [ ] "May I write" is asked with the correct file path
- [ ] Verdict is COMPLETE
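
The assignment in step 3, plus the UNTESTABLE flag from Case 2, can be sketched as a table lookup. The mapping below mirrors this case's table; coding-standards.md remains the source of truth:

```python
# Story type -> (test type, gate level), per the Case 1 table.
TEST_MATRIX = {
    "logic":       ("Unit test", "BLOCKING"),
    "integration": ("Integration test", "BLOCKING"),
    "visual":      ("Screenshot + lead sign-off", "ADVISORY"),
    "ui":          ("Manual walkthrough doc", "ADVISORY"),
}


def assign_test(story_type, acceptance_criteria):
    """Assign a test type, or flag the story as untestable when its
    acceptance criteria section is empty (Case 2)."""
    if not acceptance_criteria:
        return ("UNTESTABLE — Acceptance Criteria required", None)
    return TEST_MATRIX[story_type]
```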
---
### Case 2: Story With No Acceptance Criteria — Flagged as UNTESTABLE
**Fixture:**
- `production/sprints/sprint-004.md` lists 3 stories; one story has empty
acceptance criteria section
**Input:** `/qa-plan sprint-004`
**Expected behavior:**
1. Skill reads all 3 stories
2. Skill detects the story with no AC
3. Story is flagged as `UNTESTABLE — Acceptance Criteria required` in the plan
4. Other 2 stories receive normal test type assignments
5. Plan is written with the UNTESTABLE story flagged; verdict is COMPLETE
**Assertions:**
- [ ] UNTESTABLE label appears for the story with no AC
- [ ] Plan is not blocked — the other stories are still planned
- [ ] Output suggests adding AC to the flagged story (next step)
- [ ] Verdict is COMPLETE (the plan is still generated)
---
### Case 3: Existing Test Plan Found — Offers update rather than replace
**Fixture:**
- `production/qa/qa-plan-sprint-003.md` already exists from a previous run
- Sprint-003 has 2 new stories added since the last plan
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill reads sprint-003.md and detects 2 stories not in the existing plan
2. Skill reports: "Existing QA plan found for sprint-003 — offering to update"
3. Skill presents the 2 new stories and their proposed test assignments
4. Skill asks "May I update `production/qa/qa-plan-sprint-003.md`?" (not overwrite)
5. Updated plan is written on approval
**Assertions:**
- [ ] Skill detects the existing plan file
- [ ] "update" language is used (not "overwrite")
- [ ] Only new stories are proposed for addition — existing entries preserved
- [ ] Verdict is COMPLETE
---
### Case 4: No Stories Found for Sprint — Error with guidance
**Fixture:**
- `production/sprints/sprint-007.md` does not exist
- No other sprint file matching sprint-007
**Input:** `/qa-plan sprint-007`
**Expected behavior:**
1. Skill attempts to read sprint-007.md — file not found
2. Skill outputs: "No sprint file found for sprint-007"
3. Skill suggests running `/sprint-plan` to create the sprint first
4. No plan is written; no "May I write" is asked
**Assertions:**
- [ ] Error message names the missing sprint file
- [ ] `/sprint-plan` is suggested as the remediation step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (error state)
---
### Case 5: Director Gate Check — No gate; QA planning is a utility
**Fixture:**
- Sprint with valid stories and AC
**Input:** `/qa-plan sprint-003`
**Expected behavior:**
1. Skill generates and writes QA plan
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads coding-standards.md test evidence table before assigning test types
- [ ] Assigns BLOCKING or ADVISORY gate level per story type
- [ ] Flags stories with no AC as UNTESTABLE (does not silently skip them)
- [ ] Detects existing plan and offers update path
- [ ] Asks "May I write" before creating or updating the plan file
- [ ] Verdict is COMPLETE when plan is written
---
## Coverage Notes
- The case where `coding-standards.md` is missing (skill cannot assign test types)
is not fixture-tested; behavior would follow the BLOCKED pattern with a note
to restore the standards file.
- Multi-sprint planning (spanning 2 sprints) is not tested; the skill is designed
for one sprint at a time.
- Config/data story type (balance tuning → smoke check) follows the same
assignment pattern as other types in Case 1 and is not separately tested.


# Skill Test Spec: /regression-suite
## Skill Summary
`/regression-suite` maps test coverage to GDD requirements: it reads the
acceptance criteria from story files in the current sprint (or a specified epic),
then scans `tests/` for corresponding test files and checks whether each AC has
a matching assertion. It produces a coverage report identifying which ACs are
fully covered, partially covered, or untested, and which test files have no
matching AC (orphan tests).
The skill may write a coverage report to `production/qa/` after a "May I write"
ask. No director gates apply. Verdicts: FULL COVERAGE (all ACs have tests),
GAPS FOUND (some ACs are untested), or CRITICAL GAPS (a critical-priority AC
has no test).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: FULL COVERAGE, GAPS FOUND, CRITICAL GAPS
- [ ] Contains "May I write" language (skill may write coverage report)
- [ ] Has a next-step handoff (e.g., `/test-setup` if framework missing, `/qa-plan` if plan missing)
---
## Director Gate Checks
None. `/regression-suite` is a QA analysis utility. No director gates apply.
---
## Test Cases
### Case 1: Full Coverage — All ACs in sprint have corresponding tests
**Fixture:**
- `production/sprints/sprint-004.md` lists 3 stories with 2 ACs each (6 total)
- `tests/unit/` and `tests/integration/` contain test files that match all 6 ACs
(by system name and scenario description)
**Input:** `/regression-suite sprint-004`
**Expected behavior:**
1. Skill reads all 6 ACs from sprint-004 stories
2. Skill scans test files and matches each AC to at least one test assertion
3. All 6 ACs have coverage
4. Skill produces coverage report: "6/6 ACs covered"
5. Skill asks "May I write to `production/qa/regression-sprint-004.md`?"
6. File is written on approval; verdict is FULL COVERAGE
**Assertions:**
- [ ] All 6 ACs appear in the coverage report
- [ ] Each AC is marked as covered with the matching test file referenced
- [ ] Verdict is FULL COVERAGE
- [ ] "May I write" is asked before writing the report
---
### Case 2: Gaps Found — 3 ACs have no tests
**Fixture:**
- Sprint has 5 stories with 8 total ACs
- Tests exist for 5 of the 8 ACs; 3 ACs have no corresponding test file or assertion
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill reads all 8 ACs
2. Skill scans tests — 5 matched, 3 unmatched
3. Coverage report lists the 3 untested ACs by story and AC text
4. Skill asks "May I write to `production/qa/regression-[sprint]-[date].md`?"
5. Report is written; verdict is GAPS FOUND
**Assertions:**
- [ ] The 3 untested ACs are listed by name in the report
- [ ] Matched ACs are also shown (not only the gaps)
- [ ] Verdict is GAPS FOUND (not FULL COVERAGE)
- [ ] Report is written after "May I write" approval
---
### Case 3: Critical AC Untested — CRITICAL GAPS verdict, flagged prominently
**Fixture:**
- Sprint has 4 stories; one story is Priority: Critical with 2 ACs
- One of the critical-priority ACs has no test
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill reads all stories and ACs, noting which stories are critical priority
2. Skill scans tests — the critical AC has no match
3. Report prominently flags: "CRITICAL GAP: [AC text] — no test found (Critical priority story)"
4. Skill recommends blocking story completion until test is added
5. Verdict is CRITICAL GAPS
**Assertions:**
- [ ] Verdict is CRITICAL GAPS (not GAPS FOUND)
- [ ] Critical priority AC is flagged more prominently than normal gaps
- [ ] Recommendation to block story completion is included
- [ ] Non-critical gaps (if any) are also listed
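The verdict ordering across Cases 1 to 3 can be sketched as a single precedence rule; the input structure below is a hypothetical simplification of the skill's real coverage data.

```python
# Sketch of the regression-suite verdict rule: any untested AC that belongs
# to a Critical-priority story outranks ordinary gaps. Structures are
# illustrative, not the skill's actual data model.
def coverage_verdict(acs):
    untested = [ac for ac in acs if not ac["covered"]]
    if any(ac["critical"] for ac in untested):
        return "CRITICAL GAPS"
    if untested:
        return "GAPS FOUND"
    return "FULL COVERAGE"
```

Note that a covered critical AC contributes nothing to the verdict; only *untested* critical ACs escalate it.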
---
### Case 4: Orphan Tests — Test file has no matching AC
**Fixture:**
- `tests/unit/save_system_test.gd` exists with assertions for scenarios
not present in any current story's AC list
- Current sprint stories do not reference save system
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill scans tests and cross-references ACs
2. `save_system_test.gd` assertions do not match any current AC
3. Test file is flagged as ORPHAN TEST in the coverage report
4. Report notes: "Orphan tests may belong to a past or future sprint, or AC was renamed"
5. Verdict is FULL COVERAGE or GAPS FOUND depending on overall AC coverage
(orphan tests do not affect verdict, they are advisory)
**Assertions:**
- [ ] Orphan test is flagged in the report
- [ ] Orphan flag includes the filename and suggestion (past sprint / renamed AC)
- [ ] Orphan tests do not cause a GAPS FOUND verdict on their own
- [ ] Overall verdict reflects AC coverage only
---
### Case 5: Director Gate Check — No gate; regression-suite is a QA utility
**Fixture:**
- Sprint with stories and test files
**Input:** `/regression-suite`
**Expected behavior:**
1. Skill produces coverage report and writes it
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS — no gate verdict
---
## Protocol Compliance
- [ ] Reads story ACs from sprint files before scanning tests
- [ ] Matches ACs to tests by system name and scenario (not file name alone)
- [ ] Flags critical-priority untested ACs as CRITICAL GAPS
- [ ] Flags orphan tests (exist in tests/ but no AC matches)
- [ ] Asks "May I write" before persisting the coverage report
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS
---
## Coverage Notes
- The heuristic for matching an AC to a test (by system name + scenario keywords)
is approximate; exact matching logic is defined in the skill body.
- Integration test coverage is mapped the same way as unit test coverage; no
distinction in verdicts is made between the two.
- This skill does not run the tests — it maps AC text to test assertions. Test
execution is handled by the CI pipeline.
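One way to approximate that matching heuristic (purely illustrative; the skill body defines the real logic) is keyword overlap between the AC text and the test assertion name:

```python
import re

# Illustrative approximation of AC-to-test matching: tokenize both strings
# and require a minimum share of the AC's keywords to appear in the test
# name. The 0.5 threshold is an assumption, not a value from the skill.
def matches(ac_text, test_name, threshold=0.5):
    tokenize = lambda s: set(re.findall(r"[a-z]+", s.lower()))
    ac_words = tokenize(ac_text)
    overlap = ac_words & tokenize(test_name)
    return len(overlap) >= threshold * len(ac_words)
```

A scheme like this matches by system name and scenario wording rather than file name alone, which is why a renamed AC can turn an existing test into an orphan.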

# Skill Test Spec: /release-checklist
## Skill Summary
`/release-checklist` generates an internal release readiness checklist covering:
sprint story completion, open bug severity, QA sign-off status, build stability,
and changelog readiness. It is an internal gate — not a platform/store checklist
(that is `/launch-checklist`). When a previous release checklist exists, it shows
a delta of resolved and newly introduced issues.
The skill writes its checklist report to `production/releases/release-checklist-[date].md`
after a "May I write" ask. No director gates apply — `/gate-check` handles
formal phase gate logic. Verdicts: RELEASE READY, RELEASE BLOCKED, or CONCERNS.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: RELEASE READY, RELEASE BLOCKED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/launch-checklist` for external or `/gate-check` for phase)
---
## Director Gate Checks
None. `/release-checklist` is an internal audit utility. Formal phase advancement
is managed by `/gate-check`.
---
## Test Cases
### Case 1: Happy Path — All Sprint Stories Complete, QA Passed, RELEASE READY
**Fixture:**
- `production/sprints/sprint-008.md` — all stories are `Status: Done`
- No open bugs with severity HIGH or CRITICAL in `production/bugs/`
- `production/qa/qa-plan-sprint-008.md` has QA sign-off annotation
- Changelog entry for this version exists
- `production/stage.txt` contains `Polish`
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill reads sprint-008: all stories Done
2. Skill reads bugs: no HIGH or CRITICAL open bugs
3. Skill confirms QA plan has sign-off
4. Skill confirms changelog entry exists
5. All checks pass; skill asks "May I write to
`production/releases/release-checklist-2026-04-06.md`?"
6. Report written; verdict is RELEASE READY
**Assertions:**
- [ ] All 4 check categories are evaluated (stories, bugs, QA, changelog)
- [ ] All items appear with PASS markers
- [ ] Verdict is RELEASE READY
- [ ] "May I write" is asked before writing
---
### Case 2: Open HIGH Severity Bugs — RELEASE BLOCKED
**Fixture:**
- All sprint stories are Done
- `production/bugs/` contains 2 open bugs with severity HIGH
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill reads sprint — stories complete
2. Skill reads bugs — 2 HIGH severity bugs open
3. Skill reports: "RELEASE BLOCKED — 2 open HIGH severity bugs must be resolved"
4. Both bug filenames are listed in the report
5. Verdict is RELEASE BLOCKED
**Assertions:**
- [ ] Verdict is RELEASE BLOCKED (not CONCERNS)
- [ ] Both bug filenames are listed explicitly
- [ ] Skill makes clear HIGH severity bugs are blocking (not advisory)
---
### Case 3: Changelog Not Generated — CONCERNS
**Fixture:**
- All stories Done, no HIGH/CRITICAL bugs
- No changelog entry found for the current version/sprint
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill checks all items
2. Changelog check fails: no changelog entry found
3. Skill reports: "CONCERNS — Changelog not generated for this release"
4. Skill suggests running `/changelog` to generate it
5. Verdict is CONCERNS (advisory — not a hard block)
**Assertions:**
- [ ] Verdict is CONCERNS (not RELEASE BLOCKED — changelog is advisory)
- [ ] `/changelog` is suggested as the remediation
- [ ] Other passing checks are shown in the report
- [ ] Missing changelog is described as advisory, not blocking
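The severity rule behind Cases 2 and 3 can be sketched as a short precedence check; the inputs are hypothetical stand-ins for the skill's real bug and checklist data.

```python
# Sketch of the internal release verdict: CRITICAL/HIGH open bugs block the
# release, while MEDIUM/LOW bugs and advisory misses (such as a missing
# changelog) only downgrade the verdict to CONCERNS. Inputs are illustrative.
BLOCKING = {"CRITICAL", "HIGH"}

def release_verdict(open_bug_severities, advisory_misses=0):
    if BLOCKING & set(open_bug_severities):
        return "RELEASE BLOCKED"
    if open_bug_severities or advisory_misses:
        return "CONCERNS"
    return "RELEASE READY"
```

The split between blocking and advisory is the crux of both cases: the same check never produces RELEASE BLOCKED for a changelog miss or a MEDIUM bug alone.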
---
### Case 4: Previous Release Checklist Exists — Delta From Last Release
**Fixture:**
- `production/releases/release-checklist-2026-03-20.md` exists
- Previous: 1 story was incomplete, 1 HIGH bug open
- Current: all stories Done, HIGH bug resolved, but now 1 MEDIUM bug appeared
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill finds the previous checklist and loads it
2. New checklist is generated and compared:
- Newly resolved: "Story [X] — was open, now Done"
- Newly resolved: "HIGH bug [filename] — was open, now closed"
- New item: "1 MEDIUM bug appeared (advisory)"
3. Delta section shows all changes prominently
4. Verdict is CONCERNS (MEDIUM bug is advisory, not blocking)
**Assertions:**
- [ ] Delta section appears in the report with resolved and new items
- [ ] Newly resolved items from the previous checklist are noted
- [ ] New items not present in the previous checklist are highlighted
- [ ] Verdict reflects current state (not previous state)
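The delta in this case is essentially a set comparison between the previous and current checklists; the item identifiers below are hypothetical.

```python
# Sketch of the checklist delta: items open last time but not now are
# "newly resolved"; items open now but absent last time are "new".
# Identifiers are illustrative only.
def checklist_delta(previous_open, current_open):
    return {
        "resolved": sorted(set(previous_open) - set(current_open)),
        "new": sorted(set(current_open) - set(previous_open)),
    }

# Hypothetical open-item lists mirroring Case 4's fixture.
previous_open = ["story-12-incomplete", "bug-high-0042"]
current_open = ["bug-medium-0057"]
```

Because the verdict is computed from `current_open` alone, the delta is reporting only; it never resurrects the previous run's state.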
---
### Case 5: Director Gate Check — No gate; release-checklist is an internal audit
**Fixture:**
- Active sprint with stories and bug reports
**Input:** `/release-checklist`
**Expected behavior:**
1. Skill runs the full checklist and writes the report
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS — no gate verdict
---
## Protocol Compliance
- [ ] Checks sprint story completion status
- [ ] Checks open bug severity (CRITICAL/HIGH = BLOCKED; MEDIUM/LOW = CONCERNS)
- [ ] Checks QA plan sign-off status
- [ ] Checks changelog existence
- [ ] Compares against previous checklist when one exists
- [ ] Asks "May I write" before writing the report
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS
---
## Coverage Notes
- Build stability verification (no failed CI runs) is listed as a check category
but relies on external CI system state; the skill notes this as a MANUAL CHECK
if CI integration is not configured.
- CRITICAL bugs always result in RELEASE BLOCKED regardless of other items;
this is equivalent to the HIGH severity case in Case 2.
- Stories with `Status: In Review` (not Done) are treated as incomplete
and result in RELEASE BLOCKED; this edge case follows the same pattern
as the HIGH bug case.

# Skill Test Spec: /reverse-document
## Skill Summary
`/reverse-document` generates design or architecture documentation from existing
source code. It reads the specified source file(s), infers design intent from
class structure, method names, constants, and comments, and produces either a
GDD skeleton (for gameplay systems) or an architecture overview (for technical
systems). The output is a best-effort inference — magic numbers and undocumented
logic may result in a PARTIAL verdict.
The skill asks "May I write to [inferred path]?" before creating the document.
No director gates apply. Verdicts: COMPLETE (clean inference), PARTIAL (some
fields are ambiguous and need human review).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, PARTIAL
- [ ] Contains "May I write" collaborative protocol language before writing the doc
- [ ] Has a next-step handoff (e.g., `/design-review` to validate the generated doc)
---
## Director Gate Checks
None. `/reverse-document` is a documentation utility. No director gates apply.
---
## Test Cases
### Case 1: Well-Structured Source — Accurate design doc skeleton produced
**Fixture:**
- `src/gameplay/health_system.gd` exists with:
- `@export var max_health: int = 100`
- `func take_damage(amount: int)` with clamping logic
- `signal health_changed(new_value: int)`
- Docstrings on all public methods
**Input:** `/reverse-document src/gameplay/health_system.gd`
**Expected behavior:**
1. Skill reads the source file and identifies the health system
2. Skill infers design intent: max health, take_damage behavior, health signal
3. Skill produces GDD skeleton for health system with 8 required sections:
Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases, Dependencies,
Tuning Knobs, Acceptance Criteria
4. Formulas section includes the inferred clamping formula
5. Tuning Knobs notes `max_health = 100` as a configurable value
6. Skill asks "May I write to `design/gdd/health-system.md`?"
7. File written; verdict is COMPLETE
**Assertions:**
- [ ] All 8 required GDD sections are present in the output
- [ ] `max_health = 100` appears as a Tuning Knob
- [ ] Clamping formula is captured in the Formulas section
- [ ] "May I write" is asked with the inferred path
- [ ] Verdict is COMPLETE
---
### Case 2: Ambiguous Source — Magic Numbers, PARTIAL Verdict
**Fixture:**
- `src/gameplay/enemy_ai.gd` exists with:
- Inline magic numbers: `if distance < 150:`, `speed = 3.5`
- No comments or docstrings
- Complex state machine logic that is not self-explanatory
**Input:** `/reverse-document src/gameplay/enemy_ai.gd`
**Expected behavior:**
1. Skill reads the file and detects magic numbers with no context
2. Skill produces a GDD skeleton with notes: "AMBIGUOUS VALUE: 150 (unknown units —
is this pixels, world units, or tiles?)"
3. Skill marks the Formulas and Tuning Knobs sections as requiring human review
4. Skill asks "May I write to `design/gdd/enemy-ai.md`?" with PARTIAL advisory
5. File written with PARTIAL markers; verdict is PARTIAL
**Assertions:**
- [ ] AMBIGUOUS VALUE annotations appear for magic numbers
- [ ] Sections needing human review are marked explicitly
- [ ] Verdict is PARTIAL (not COMPLETE)
- [ ] File is still written — PARTIAL is not a blocking failure
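The AMBIGUOUS VALUE detection can be approximated with a pattern scan for bare numeric literals; the regex, skip rules, and sample lines below are illustrative, not the skill's real parser.

```python
import re

# Illustrative magic-number scan: flag numeric literals used in comparisons
# or assignments, skipping exported values (those are tuning knobs, not
# magic numbers) and comment lines.
def find_magic_numbers(source):
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "@export" in line or line.strip().startswith("#"):
            continue
        for match in re.finditer(r"[<>=]=?\s*(\d+(?:\.\d+)?)", line):
            flagged.append((lineno, match.group(1)))
    return flagged

# Hypothetical GDScript excerpt mirroring Case 2's fixture.
sample = """@export var max_health: int = 100
if distance < 150:
    speed = 3.5"""
```

Each flagged value would then be annotated in the generated GDD with an AMBIGUOUS VALUE note asking for units and intent, which is what drives the PARTIAL verdict.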
---
### Case 3: Multiple Interdependent Files — Cross-System Overview Produced
**Fixture:**
- User provides 2 source files: `combat_system.gd` and `damage_resolver.gd`
- The files reference each other (combat calls damage_resolver)
**Input:** `/reverse-document src/gameplay/combat_system.gd src/gameplay/damage_resolver.gd`
**Expected behavior:**
1. Skill reads both files and detects the dependency relationship
2. Skill produces a cross-system architecture overview (not individual GDDs)
3. Overview describes: Combat System → Damage Resolver interaction, shared
interfaces, data flow between the two
4. Skill asks "May I write to `docs/architecture/combat-damage-overview.md`?"
5. Overview written after approval; verdict is COMPLETE (or PARTIAL if ambiguous)
**Assertions:**
- [ ] Both files are analyzed together (not as two separate docs)
- [ ] Cross-system dependency is documented in the output
- [ ] Output file is written to `docs/architecture/` (not `design/gdd/`)
- [ ] Verdict is COMPLETE or PARTIAL
---
### Case 4: Source File Not Found — Error
**Fixture:**
- `src/gameplay/inventory_system.gd` does not exist
**Input:** `/reverse-document src/gameplay/inventory_system.gd`
**Expected behavior:**
1. Skill attempts to read the specified file — not found
2. Skill outputs: "Source file not found: src/gameplay/inventory_system.gd"
3. Skill suggests checking the path or running `/map-systems` to identify
the correct source file
4. No document is created
**Assertions:**
- [ ] Error message names the missing file with the full path
- [ ] Alternative suggestion (check path or `/map-systems`) is provided
- [ ] No write tool is called
- [ ] No verdict is issued (error state)
---
### Case 5: Director Gate Check — No gate; reverse-document is a utility
**Fixture:**
- Well-structured source file exists
**Input:** `/reverse-document src/gameplay/health_system.gd`
**Expected behavior:**
1. Skill generates and writes the design doc
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE or PARTIAL — no gate verdict involved
---
## Protocol Compliance
- [ ] Reads source file(s) before generating any content
- [ ] Produces all 8 required GDD sections when target is a gameplay system
- [ ] Annotates ambiguous values with AMBIGUOUS VALUE markers
- [ ] Produces cross-system overview (not individual GDDs) for multiple files
- [ ] Asks "May I write" before creating any output file
- [ ] Verdict is COMPLETE (clean inference) or PARTIAL (ambiguous fields)
---
## Coverage Notes
- Architecture overview format (for technical/infrastructure systems) differs
from GDD format; the inferred output type is determined by the nature of the
source file (gameplay logic → GDD; engine/infra code → architecture doc).
- The case where a source file is readable but contains only auto-generated
boilerplate with no meaningful logic is not tested; skill would likely produce
a near-empty skeleton with a PARTIAL verdict.
- C# and Blueprint source files follow the same inference pattern as GDScript;
language-specific differences are handled in the skill body.

# Skill Test Spec: /setup-engine
## Skill Summary
`/setup-engine` configures the project's engine, language, rendering backend,
physics engine, specialist agent assignments, and naming conventions by
populating `technical-preferences.md`. It accepts an optional engine argument
(e.g., `/setup-engine godot`) to skip the engine-selection step. For each
section of `technical-preferences.md`, the skill presents a draft and asks
"May I write to `technical-preferences.md`?" before updating.
The skill also populates the specialist routing table (file extension → agent
mappings) based on the chosen engine. It has no director gates — configuration
is a technical utility task. The verdict is always COMPLETE when the file is
fully written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before updating technical-preferences.md
- [ ] Has a next-step handoff (e.g., `/brainstorm` or `/start` depending on flow)
---
## Director Gate Checks
None. `/setup-engine` is a technical configuration skill. No director gates apply.
---
## Test Cases
### Case 1: Godot 4 + GDScript — Full engine configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `godot`
**Input:** `/setup-engine godot`
**Expected behavior:**
1. Skill skips engine-selection step (argument provided)
2. Skill presents language options for Godot: GDScript or C#
3. User selects GDScript
4. Skill drafts all engine sections: engine/language/rendering/physics fields,
naming conventions (snake_case for GDScript), specialist assignments
(godot-specialist, gdscript-specialist, godot-shader-specialist, etc.)
5. Skill populates the routing table: `.gd` → gdscript-specialist, `.gdshader` → godot-shader-specialist, `.tscn` → godot-specialist
6. Skill asks "May I write to `technical-preferences.md`?"
7. File is written after approval; verdict is COMPLETE
**Assertions:**
- [ ] Engine field is set to Godot 4 (not a placeholder)
- [ ] Language field is set to GDScript
- [ ] Naming conventions are GDScript-appropriate (snake_case)
- [ ] Routing table includes `.gd`, `.gdshader`, and `.tscn` entries
- [ ] Specialists are assigned (not placeholders)
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE
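The engine-to-routing-table mapping can be sketched as static data keyed by engine. Specialist names follow the examples given in these cases; the authoritative mapping is whatever the skill writes into `technical-preferences.md`.

```python
# Illustrative routing tables (file extension -> specialist agent) for the
# engines covered in Cases 1-3. Entries mirror this spec's examples and are
# not an exhaustive list.
ROUTING = {
    "godot": {".gd": "gdscript-specialist",
              ".gdshader": "godot-shader-specialist",
              ".tscn": "godot-specialist"},
    "unity": {".cs": "csharp-specialist",
              ".asmdef": "unity-specialist",
              ".unity": "unity-specialist"},
    "unreal": {".uasset": "blueprint-specialist",
               ".umap": "unreal-specialist"},
}

def specialist_for(engine, path):
    ext = path[path.rfind("."):]
    return ROUTING[engine].get(ext)
```

A lookup like this is what the routing-table assertions in each case are probing: every key file type for the chosen engine resolves to a concrete specialist, not a placeholder.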
---
### Case 2: Unity + C# — Unity-specific configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `unity`
**Input:** `/setup-engine unity`
**Expected behavior:**
1. Skill sets engine to Unity, language to C#
2. Naming conventions are C#-appropriate (PascalCase for classes, camelCase for fields)
3. Specialist assignments reference unity-specialist, csharp-specialist
4. Routing table: `.cs` → csharp-specialist, `.asmdef` → unity-specialist,
`.unity` (scene) → unity-specialist
5. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
**Assertions:**
- [ ] Engine field is set to Unity (not Godot or Unreal)
- [ ] Language field is set to C#
- [ ] Naming conventions reflect C# conventions
- [ ] Routing table includes `.cs` and `.unity` entries
- [ ] Verdict is COMPLETE
---
### Case 3: Unreal + Blueprint — Unreal-specific configuration
**Fixture:**
- `technical-preferences.md` contains only placeholders
- Engine argument provided: `unreal`
**Input:** `/setup-engine unreal`
**Expected behavior:**
1. Skill sets engine to Unreal Engine 5, primary language to Blueprint (Visual Scripting)
2. Specialist assignments reference unreal-specialist, blueprint-specialist
3. Routing table: `.uasset` → blueprint-specialist or unreal-specialist,
`.umap` → unreal-specialist
4. Performance budgets are pre-set with Unreal defaults (e.g., higher draw call budget)
5. Skill asks "May I write" and writes on approval; verdict is COMPLETE
**Assertions:**
- [ ] Engine field is set to Unreal Engine 5
- [ ] Routing table includes `.uasset` and `.umap` entries
- [ ] Blueprint specialist is assigned
- [ ] Verdict is COMPLETE
---
### Case 4: Engine Already Configured — Offers to reconfigure specific sections
**Fixture:**
- `technical-preferences.md` has engine set to Godot 4 with all fields populated
- No engine argument provided
**Input:** `/setup-engine`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects fully configured engine (Godot 4)
2. Skill reports: "Engine already configured as Godot 4 + GDScript"
3. Skill presents options: reconfigure all, reconfigure specific section only
(Engine/Language, Naming Conventions, Specialists, Performance Budgets)
4. User selects "Reconfigure Performance Budgets only"
5. Only the performance budget section is updated; all other fields unchanged
6. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
**Assertions:**
- [ ] Skill does NOT overwrite all fields when only a section update was requested
- [ ] User is offered section-specific reconfiguration
- [ ] Only the selected section is modified in the written file
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; setup-engine is a utility skill
**Fixture:**
- Fresh project with no engine configured
**Input:** `/setup-engine godot`
**Expected behavior:**
1. Skill completes full engine configuration
2. No director agents are spawned at any point
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Presents draft configuration before asking to write
- [ ] Asks "May I write to `technical-preferences.md`?" before writing
- [ ] Respects engine argument when provided (skips selection step)
- [ ] Detects existing config and offers partial reconfigure
- [ ] Routing table is populated for all key file types for the chosen engine
- [ ] Verdict is COMPLETE after file is written
---
## Coverage Notes
- Godot 4 + C# (instead of GDScript) follows the same flow as Case 1 with
different naming conventions and the godot-csharp-specialist assignment.
This variant is not separately tested.
- The engine-version-specific guidance (e.g., Godot 4.6 knowledge gap warning
from VERSION.md) is surfaced by the skill but not assertion-tested here.
- Performance budget defaults per engine are noted as engine-specific but
exact default values are not assertion-tested.

# Skill Test Spec: /skill-improve
## Skill Summary
`/skill-improve` runs an automated test-fix-retest improvement loop on a skill
file. It invokes `/skill-test static` (and optionally `/skill-test category`) to
establish a baseline score, diagnoses the failing checks, proposes targeted fixes
to the SKILL.md file, asks "May I write the improvements to [skill path]?", applies
the fixes, and re-runs the tests to confirm improvement.
If the proposed fix makes the skill worse (regression), the fix is reverted (with
user confirmation) rather than applied. If the skill is already perfect (0 failures),
the skill exits immediately without making changes. No director gates apply. Verdicts:
IMPROVED (score went up), NO CHANGE (no improvements possible or user declined), or
REVERTED (fix was applied but caused regression and was reverted).
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: IMPROVED, NO CHANGE, REVERTED
- [ ] Contains "May I write" collaborative protocol language before applying fixes
- [ ] Has a next-step handoff (e.g., run `/skill-test spec` to validate behavioral compliance)
---
## Director Gate Checks
None. `/skill-improve` is a meta-utility skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Skill With 2 Static Failures, Both Fixed, IMPROVED
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 2 static failures:
- Check 4: no "May I write" language despite having Write in allowed-tools
- Check 5: no next-step handoff at the end
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Skill runs `/skill-test static some-skill` — baseline: 5/7 checks pass
2. Skill diagnoses the 2 failing checks (4 and 5)
3. Skill proposes fixes:
- Add "May I write" language to the appropriate phase
- Add a next-step handoff section at the end
4. Skill asks "May I write improvements to `.claude/skills/some-skill/SKILL.md`?"
5. Fixes applied; `/skill-test static some-skill` re-run — now 7/7 checks pass
6. Verdict is IMPROVED (5→7)
**Assertions:**
- [ ] Baseline score is established before any changes (5/7)
- [ ] Both failing checks are diagnosed and addressed in the proposed fix
- [ ] "May I write" is asked before applying the fix
- [ ] Re-test confirms improvement (7/7)
- [ ] Verdict is IMPROVED with before/after score shown
---
### Case 2: Fix Causes Regression — Score Comparison Shows Regression, REVERTED
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 1 static failure (missing handoff)
- Proposed fix inadvertently removes the verdict keywords section
(introducing a new failure)
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Baseline: 6/7 checks pass (1 failure: missing handoff)
2. Skill proposes fix and asks "May I write improvements?"
3. Fix is applied; re-test runs
4. Re-test result: 5/7 (fixed the handoff but broke verdict keywords)
5. Skill detects regression: score went DOWN
6. Skill asks user: "Fix caused a regression (6→5). May I revert the changes?"
7. User confirms; changes are reverted; verdict is REVERTED
**Assertions:**
- [ ] Re-test score is compared to baseline before finalizing
- [ ] Regression is detected when score decreases
- [ ] User is asked to confirm revert (not automatic)
- [ ] File is reverted on user confirmation
- [ ] Verdict is REVERTED
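The score comparison driving the IMPROVED/REVERTED decision can be sketched as a three-way branch; the revert itself still requires user confirmation, as the assertions above note. Scores here are simple pass counts, an assumption for illustration.

```python
# Sketch of the improvement-loop decision: compare the re-test score to the
# baseline. A lower score proposes a revert (applied only after the user
# confirms); an equal score means no effective change.
def improvement_outcome(baseline, retest):
    if retest > baseline:
        return "IMPROVED"
    if retest < baseline:
        return "REVERTED"
    return "NO CHANGE"
```

Comparing against the baseline captured *before* any edit is what makes regression detection possible at all, which is why the spec insists the baseline is established first.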
---
### Case 3: Skill With Category Assignment — Baseline Captures Both Scores
**Fixture:**
- `.claude/skills/gate-check/SKILL.md` is a gate skill with 1 static failure
and 2 category (G-criteria) failures
- `tests/skills/quality-rubric.md` has Gate Skills section
**Input:** `/skill-improve gate-check`
**Expected behavior:**
1. Skill runs both static and category tests for the baseline:
- Static: 6/7 checks pass
- Category: 3/5 G-criteria pass
2. Combined baseline: 9/12
3. Skill diagnoses all 3 failures and proposes fixes
4. "May I write improvements to `.claude/skills/gate-check/SKILL.md`?"
5. Fixes applied; both test types re-run
6. Re-test: static 7/7, category 5/5 = 12/12
7. Verdict is IMPROVED (9→12)
**Assertions:**
- [ ] Both static and category scores are captured in the baseline
- [ ] Combined score is used for comparison (not just one type)
- [ ] All 3 failures are addressed in the proposed fix
- [ ] Re-test confirms improvement in both score types
- [ ] Verdict is IMPROVED with combined before/after
---
### Case 4: Skill Already Perfect — No Improvements Needed
**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` has no static failures
- Category score is also 5/5 (if applicable)
**Input:** `/skill-improve brainstorm`
**Expected behavior:**
1. Skill runs `/skill-test static brainstorm` — 7/7 checks pass
2. If category applies: 5/5 criteria pass
3. Skill outputs: "No improvements needed — brainstorm is fully compliant"
4. Skill exits without proposing any changes
5. No "May I write" is asked; no files are modified
6. Verdict is NO CHANGE
**Assertions:**
- [ ] Skill exits immediately after confirming 0 failures
- [ ] "No improvements needed" message is shown
- [ ] No changes are proposed
- [ ] No "May I write" is asked
- [ ] Verdict is NO CHANGE
---
### Case 5: Director Gate Check — No gate; skill-improve is a meta utility
**Fixture:**
- Skill with at least 1 static failure
**Input:** `/skill-improve some-skill`
**Expected behavior:**
1. Skill runs the test-fix-retest loop
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is IMPROVED, NO CHANGE, or REVERTED — no gate verdict
---
## Protocol Compliance
- [ ] Always establishes a baseline score before proposing any changes
- [ ] Shows before/after score comparison in the output
- [ ] Asks "May I write" before applying any fix
- [ ] Detects regressions by comparing re-test score to baseline
- [ ] Asks for user confirmation before reverting (not automatic)
- [ ] Ends with IMPROVED, NO CHANGE, or REVERTED verdict
---
## Coverage Notes
- The improvement loop is designed to run only one fix-retest cycle per
invocation; running multiple iterations requires re-invoking `/skill-improve`.
- Behavioral compliance (spec-mode test results) is not included in the
improvement loop — only structural (static) and category scores are automated.
- The case where the skill file cannot be read (permissions error or missing file)
is not tested; this would result in an error before the baseline is established.

# Skill Test Spec: /skill-test
## Skill Summary
`/skill-test` validates skill files for structural correctness, behavioral
compliance, and category-rubric scoring. It operates in three modes:
- **static**: Checks a single skill file for structural requirements
(frontmatter fields, phase headings, verdict keywords, "May I write" language,
next-step handoff) without needing a fixture. Produces a per-check PASS/FAIL
table.
- **spec**: Reads a test spec file from `tests/skills/` and evaluates the skill
against each test case assertion, producing a case-by-case verdict.
- **audit**: Produces a coverage table of all skills in `.claude/skills/` and
all agents in `.claude/agents/`, showing which have spec files and which do not.
An additional **category** mode reads the quality rubric for a skill category
(e.g., gate skills) and scores the skill against rubric criteria. The verdict
system differs by mode.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdicts: COMPLIANT, NON-COMPLIANT, WARNINGS (static mode); PASS, FAIL, PARTIAL (spec mode); COMPLETE (audit mode)
- [ ] Does NOT contain "May I write" language (skill is read-only in all modes)
- [ ] Has a next-step handoff (e.g., `/skill-improve` to fix issues found)
---
## Director Gate Checks
None. `/skill-test` is a meta-utility skill. No director gates apply.
---
## Test Cases
### Case 1: Static Mode — Well-formed skill, all 7 checks pass, COMPLIANT
**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` exists and is well-formed:
- Has all required frontmatter fields
- Has ≥2 phase headings
- Has verdict keywords
- Has "May I write" language
- Has a next-step handoff
- Documents director gates
- Documents gate mode behavior (lean/solo skips)
**Input:** `/skill-test static brainstorm`
**Expected behavior:**
1. Skill reads `.claude/skills/brainstorm/SKILL.md`
2. Skill runs all 7 structural checks
3. All 7 checks pass
4. Skill outputs a PASS/FAIL table with all 7 checks marked PASS
5. Verdict is COMPLIANT
**Assertions:**
- [ ] Exactly 7 structural checks are reported
- [ ] All 7 are marked PASS
- [ ] Verdict is COMPLIANT
- [ ] No files are written
---
### Case 2: Static Mode — Skill Missing "May I Write" Despite Write Tool in allowed-tools
**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has `Write` in `allowed-tools` frontmatter
- The skill body has no "May I write" or "May I update" language
**Input:** `/skill-test static some-skill`
**Expected behavior:**
1. Skill reads `some-skill/SKILL.md`
2. Check 4 (collaborative write protocol) fails: `Write` in allowed-tools but no
"May I write" language found
3. All other checks may pass
4. Verdict is NON-COMPLIANT with Check 4 as the failing assertion
5. Output lists Check 4 as FAIL with explanation
**Assertions:**
- [ ] Check 4 is marked FAIL
- [ ] Explanation identifies the specific mismatch (Write tool without "May I write" language)
- [ ] Verdict is NON-COMPLIANT
- [ ] Other passing checks are shown (not only the failure)
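Check 4's mismatch logic can be sketched as a single predicate. The frontmatter delimiter handling and regexes below are assumptions about how the check might be implemented, not the skill's actual code:

```python
import re

def check4_collaborative_write(skill_md: str) -> bool:
    """Pass unless Write is in allowed-tools without 'May I write' language."""
    # Split YAML frontmatter from the body at the closing '---' delimiter
    frontmatter, _, body = skill_md.partition("\n---\n")
    has_write_tool = bool(re.search(r"allowed-tools:.*\bWrite\b", frontmatter))
    has_ask = bool(re.search(r"May I (write|update)", body, re.IGNORECASE))
    return (not has_write_tool) or has_ask
```

A read-only skill (no `Write` in allowed-tools) passes regardless, which is why only the combination of write capability and missing ask language fails.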
---
### Case 3: Spec Mode — gate-check Skill Evaluated Against Spec
**Fixture:**
- `tests/skills/gate-check.md` exists with 5 test cases
- `.claude/skills/gate-check/SKILL.md` exists
**Input:** `/skill-test spec gate-check`
**Expected behavior:**
1. Skill reads both the skill file and the spec file
2. Skill evaluates each of the 5 test case assertions against the skill's behavior
3. For each case: PASS if skill behavior matches spec assertions, FAIL if not
4. Skill produces a case-by-case result table
5. Overall verdict: PASS (all 5), PARTIAL (some), or FAIL (majority failing)
**Assertions:**
- [ ] All 5 test cases from the spec are evaluated
- [ ] Each case has an individual PASS/FAIL result
- [ ] Overall verdict is PASS, PARTIAL, or FAIL based on case results
- [ ] No files are written
---
### Case 4: Audit Mode — Coverage Table of All Skills and Agents
**Fixture:**
- `.claude/skills/` contains 72+ skill directories
- `.claude/agents/` contains 49+ agent files
- `tests/skills/` contains spec files for a subset of skills
**Input:** `/skill-test audit`
**Expected behavior:**
1. Skill enumerates all skills in `.claude/skills/` and all agents in `.claude/agents/`
2. Skill checks `tests/skills/` for a corresponding spec file for each
3. Skill produces a coverage table:
- Each skill/agent listed
- "Has Spec" column: YES or NO
- Summary: "X of Y skills have specs; A of B agents have specs"
4. Verdict is COMPLETE
**Assertions:**
- [ ] All skill directories are enumerated (not just a sample)
- [ ] "Has Spec" column is accurate for each entry
- [ ] Summary counts are correct
- [ ] Verdict is COMPLETE
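The coverage-table construction in steps 1-3 can be sketched like this; the directory layout below is a tiny stand-in fixture, not the real repository:

```python
import tempfile
from pathlib import Path

def audit_coverage(skills_dir: Path, specs_dir: Path) -> dict:
    """Map each skill directory name to whether a matching spec file exists."""
    return {
        d.name: (specs_dir / f"{d.name}.md").exists()
        for d in sorted(skills_dir.iterdir()) if d.is_dir()
    }

# Tiny fixture standing in for .claude/skills/ and tests/skills/
root = Path(tempfile.mkdtemp())
for name in ("adopt", "brainstorm"):
    (root / "skills" / name).mkdir(parents=True)
(root / "specs").mkdir()
(root / "specs" / "adopt.md").write_text("# Skill Test Spec: /adopt\n")

coverage = audit_coverage(root / "skills", root / "specs")
summary = f"{sum(coverage.values())} of {len(coverage)} skills have specs"
```

The same enumeration would run a second time over `.claude/agents/`, producing the "A of B agents" half of the summary.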
---
### Case 5: Category Mode — Gate Skill Evaluated Against Quality Rubric
**Fixture:**
- `tests/skills/quality-rubric.md` exists with a "Gate Skills" section defining
criteria G1-G5 (e.g., G1: has mode guard, G2: has verdict table, etc.)
- `.claude/skills/gate-check/SKILL.md` is a gate skill
**Input:** `/skill-test category gate-check`
**Expected behavior:**
1. Skill reads `quality-rubric.md` and identifies the Gate Skills section
2. Skill evaluates `gate-check/SKILL.md` against criteria G1-G5
3. Each criterion is scored: PASS, PARTIAL, or FAIL
4. Overall category score is computed (e.g., 4/5 criteria pass)
5. Verdict is COMPLIANT (all pass), WARNINGS (some partial), or NON-COMPLIANT (failures)
**Assertions:**
- [ ] All gate criteria (G1-G5) from quality-rubric.md are evaluated
- [ ] Each criterion has an individual score
- [ ] Overall verdict reflects the score distribution
- [ ] No files are written
---
## Protocol Compliance
- [ ] Static mode checks exactly 7 structural assertions
- [ ] Spec mode evaluates each test case from the spec file individually
- [ ] Audit mode covers all skills AND agents (not just one category)
- [ ] Category mode reads quality-rubric.md to get criteria (not hardcoded)
- [ ] Does not write any files in any mode
- [ ] Suggests `/skill-improve` as the next step when issues are found
---
## Coverage Notes
- The skill-test skill is self-referential (it can test itself). The static
mode case for skill-test's own SKILL.md is not separately fixture-tested to
avoid infinite recursion in test design.
- The specific 7 structural checks are defined in the skill body; only Check 4
(May I write) is individually tested here because it has the most nuanced logic.
- Audit mode counts are approximate — the exact number of skills and agents will
change as the system grows; assertions use "all" rather than fixed counts.

# Skill Test Spec: /smoke-check
## Skill Summary
`/smoke-check` is the gate between implementation and QA hand-off. It detects the
test environment, runs the automated test suite (via Bash), scans test coverage
against sprint stories, and uses `AskUserQuestion` to batch-verify manual smoke
checks with the developer. It writes a report to `production/qa/smoke-[date].md`
after explicit user approval.
Verdicts: PASS (tests pass, all smoke checks pass, no missing test evidence),
PASS WITH WARNINGS (tests pass or NOT RUN, all critical checks pass, but advisory
gaps exist such as missing test coverage), or FAIL (any automated test failure or
any Batch 1/Batch 2 smoke check returns FAIL).
No director gates apply. The skill does NOT invoke any director agents.
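The verdict rules above can be sketched as a small decision function. Argument names and types are assumptions for illustration, not the skill's real interface:

```python
def smoke_verdict(tests: str, smoke_fail: bool, missing_coverage: int) -> str:
    """tests is 'PASS', 'FAIL', or 'NOT RUN'; smoke_fail covers Batch 1/2."""
    if tests == "FAIL" or smoke_fail:
        return "FAIL"
    if tests == "NOT RUN" or missing_coverage > 0:
        return "PASS WITH WARNINGS"  # advisory gaps, no critical failures
    return "PASS"
```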
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PASS, PASS WITH WARNINGS, FAIL
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-report` on FAIL, QA hand-off guidance on PASS)
---
## Director Gate Checks
None. `/smoke-check` is a pre-QA utility skill. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Automated tests pass, manual items confirmed, PASS
**Fixture:**
- `tests/` directory exists with a GDUnit4 runner script
- Engine detected as Godot from `technical-preferences.md`
- `production/qa/qa-plan-sprint-005.md` exists
- Automated test runner reports 12 tests, 12 passing, 0 failing
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
- All sprint stories have matching test files (no MISSING coverage)
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill detects test directory and engine, notes QA plan found
2. Runs `godot --headless --script tests/gdunit4_runner.gd` via Bash
3. Parses output: 12/12 passing
4. Scans test coverage — all stories COVERED or EXPECTED
5. Uses `AskUserQuestion` for Batch 1 (core stability) and Batch 2 (sprint mechanics)
6. Developer selects PASS for all items
7. Report assembled: automated tests PASS, all smoke checks PASS, no MISSING coverage
8. Asks "May I write this smoke check report to `production/qa/smoke-[date].md`?"
9. Writes report after approval
10. Delivers verdict: PASS
**Assertions:**
- [ ] Automated test runner is invoked via Bash
- [ ] `AskUserQuestion` is used for manual smoke check batches
- [ ] "May I write" is asked before writing the report file
- [ ] Report is written to `production/qa/smoke-[date].md`
- [ ] Verdict is PASS
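Step 3's parsing of the runner output could look like this minimal sketch. The summary-line format `Passed: N, Failed: M` is an assumption; real GDUnit4 output may differ:

```python
import re

def parse_runner_summary(output: str):
    """Extract pass/fail counts from a runner summary line, or None."""
    m = re.search(r"Passed:\s*(\d+),\s*Failed:\s*(\d+)", output)
    if m is None:
        return None  # unparseable output is treated as NOT RUN
    passed, failed = int(m.group(1)), int(m.group(2))
    return {"passed": passed, "failed": failed, "ok": failed == 0}
```

Returning `None` rather than raising keeps the NOT RUN path (engine binary unavailable or unrecognized output) on the warning track instead of FAIL.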
---
### Case 2: Failure Path — Automated test fails, FAIL verdict
**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports 10 tests run: 8 passing, 2 failing
- Failing tests: `test_health_clamp_at_zero`, `test_damage_calculation_negative`
- QA plan exists
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill runs automated tests via Bash
2. Parses output — 2 failures detected
3. Records failing test names
4. Proceeds through manual smoke check batches
5. Report shows automated tests as FAIL with failing test names listed
6. Asks to write report; writes after approval
7. Delivers FAIL verdict with message: "The smoke check failed. Do not hand off to
QA until these failures are resolved." Lists failing tests and suggests fixing
then re-running `/smoke-check`
**Assertions:**
- [ ] Failing test names are listed in the report
- [ ] Verdict is FAIL
- [ ] Post-verdict message directs developer to fix failures before QA hand-off
- [ ] `/smoke-check` re-run is suggested after fixing
---
### Case 3: Manual Confirmation — AskUserQuestion used, PASS WITH WARNINGS
**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports all tests passing (8/8)
- One Logic story has no matching test file (MISSING coverage)
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
**Input:** `/smoke-check`
**Expected behavior:**
1. Automated tests PASS
2. Coverage scan finds 1 MISSING entry for a Logic story
3. `AskUserQuestion` is used for Batch 1 and Batch 2 — developer confirms all PASS
4. Report shows: automated tests PASS, manual checks all PASS, 1 MISSING coverage entry
5. Verdict is PASS WITH WARNINGS — build ready for QA, but MISSING entry must be
resolved before `/story-done` closes the affected story
6. Asks to write report; writes after approval
**Assertions:**
- [ ] `AskUserQuestion` is used for manual smoke check batches (not inline text prompts)
- [ ] MISSING test coverage entry appears in the report
- [ ] Verdict is PASS WITH WARNINGS (not PASS, not FAIL)
- [ ] Advisory note explains MISSING entry must be resolved before `/story-done`
- [ ] Report file is written to `production/qa/smoke-[date].md`
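The coverage scan that produces the MISSING entry in this case can be sketched as a slug-to-filename match. The matching rule (slug substring, hyphens mapped to underscores) is an assumption for illustration:

```python
def scan_coverage(story_slugs, test_files):
    """Mark each story COVERED if any test filename contains its slug."""
    return {
        slug: "COVERED"
        if any(slug.replace("-", "_") in name for name in test_files)
        else "MISSING"
        for slug in story_slugs
    }
```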
---
### Case 4: No Test Directory — Skill stops with guidance
**Fixture:**
- `tests/` directory does not exist
- Engine is configured as Godot
**Input:** `/smoke-check`
**Expected behavior:**
1. Phase 1 checks for `tests/` directory — not found
2. Skill outputs: "No test directory found at `tests/`. Run `/test-setup` to
scaffold the testing infrastructure, or create the directory manually if
tests live elsewhere."
3. Skill stops — no automated tests run, no manual smoke checks, no report written
**Assertions:**
- [ ] Error message references the missing `tests/` directory
- [ ] `/test-setup` is suggested as the remediation step
- [ ] Skill stops after this message (no further phases run)
- [ ] No report file is written
---
### Case 5: Director Gate Check — No gate; smoke-check is a QA pre-check utility
**Fixture:**
- Valid test setup, automated tests pass, manual smoke checks confirmed
**Input:** `/smoke-check`
**Expected behavior:**
1. Skill runs all phases and produces a PASS or PASS WITH WARNINGS verdict
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is PASS, PASS WITH WARNINGS, or FAIL — no gate verdict involved
---
## Protocol Compliance
- [ ] Uses `AskUserQuestion` for all manual smoke check batches (Batch 1, Batch 2, Batch 3)
- [ ] Runs automated tests via Bash before asking any manual questions
- [ ] Asks "May I write" before creating the report file — never writes without approval
- [ ] Verdict vocabulary is strictly PASS / PASS WITH WARNINGS / FAIL — no other verdicts
- [ ] FAIL is triggered by automated test failures or Batch 1/Batch 2 FAIL responses
- [ ] PASS WITH WARNINGS is triggered when MISSING test coverage exists but no critical failures
- [ ] NOT RUN (engine binary unavailable) is recorded as a warning, not a FAIL
- [ ] Does not invoke director gates at any point
---
## Coverage Notes
- The `quick` argument (skips Phase 3 coverage scan and Batch 3) is not separately
fixture-tested; it follows the same pattern as Case 1 with a coverage-skip note in output.
- The `--platform` argument adds platform-specific AskUserQuestion batches and a
per-platform verdict table; not separately tested here.
- The case where the engine binary is not on PATH (NOT RUN) follows the PASS WITH
WARNINGS pattern and is covered by the protocol compliance assertions above.

# Skill Test Spec: /soak-test
## Skill Summary
`/soak-test` generates a structured soak test protocol — an extended runtime
test plan designed to surface memory leaks, performance drift, and stability
issues that only appear under sustained gameplay. The skill produces a document
specifying the test duration, system under test, monitoring checkpoints (e.g.,
memory sample every 30 minutes), pass/fail thresholds, and conditions for early
termination.
The skill asks "May I write to `production/qa/soak-[slug]-[date].md`?" before
persisting. If a previous soak test for the same system exists, the skill offers
to extend the duration or add new conditions. No director gates apply. The verdict
is COMPLETE when the soak test protocol is written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the protocol
- [ ] Has a next-step handoff (e.g., `/regression-suite` or `/release-checklist`)
---
## Director Gate Checks
None. `/soak-test` is a QA planning utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Online gameplay feature, 2-hour soak protocol
**Fixture:**
- User specifies: system = "online multiplayer lobby", duration = "2 hours"
- `technical-preferences.md` has engine configured
**Input:** `/soak-test online-lobby 2h`
**Expected behavior:**
1. Skill generates a 2-hour soak test protocol for the online lobby system
2. Protocol includes: monitoring checkpoints every 30 minutes, metrics to track
(memory usage, connection count, packet loss), pass thresholds, early termination
conditions (crash or >20% memory growth)
3. Networking-specific checks are included (session drop rate, reconnect handling)
4. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
5. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] Protocol duration matches the requested 2 hours
- [ ] Monitoring checkpoints are at reasonable intervals (e.g., every 30 minutes)
- [ ] Network-specific checks are included (not just generic memory checks)
- [ ] "May I write" is asked with the correct file path
- [ ] Verdict is COMPLETE
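The duration and checkpoint arithmetic in this case can be sketched as follows; the `2h`/`30m` argument format is taken from this spec, and the always-sample-at-end rule is an assumption:

```python
def parse_duration(arg: str) -> int:
    """Convert '2h' or '30m' to minutes."""
    return int(arg[:-1]) * (60 if arg.endswith("h") else 1)

def checkpoints(duration_min: int, interval_min: int = 30) -> list:
    """Checkpoint times in minutes, one per interval plus the end of the run."""
    ticks = list(range(interval_min, duration_min + 1, interval_min))
    if not ticks or ticks[-1] != duration_min:
        ticks.append(duration_min)  # always sample at termination
    return ticks
```

For the 2-hour soak here, samples land at 30, 60, 90, and 120 minutes.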
---
### Case 2: No Target Defined — Prompts for system, duration, and conditions
**Fixture:**
- No arguments provided
- No soak test config in session state
**Input:** `/soak-test`
**Expected behavior:**
1. Skill detects no target system or duration specified
2. Skill asks: "What system or feature should be soak-tested?"
3. After user responds with system: Skill asks: "What duration? (e.g., 1h, 4h, 8h)"
4. After user responds with duration: Skill asks for specific conditions or
uses defaults (normal gameplay loop, default player count)
5. Skill generates protocol from collected inputs and asks "May I write"
**Assertions:**
- [ ] At minimum 2 follow-up questions are asked (system + duration)
- [ ] Default conditions are applied when user doesn't specify custom ones
- [ ] Protocol is not generated until system and duration are known
- [ ] Verdict is COMPLETE after file is written
---
### Case 3: Previous Soak Test Exists — Offers to extend or add conditions
**Fixture:**
- `production/qa/soak-online-lobby-2026-03-15.md` exists with a 1-hour protocol
- User wants to extend to 4 hours with new memory threshold conditions
**Input:** `/soak-test online-lobby 4h`
**Expected behavior:**
1. Skill finds existing soak test for online-lobby
2. Skill reports: "Previous soak test found: soak-online-lobby-2026-03-15.md (1h)"
3. Skill presents options: create new protocol (4h standalone), or extend the
existing protocol to 4h and add new conditions
4. User selects extend; existing checkpoints are preserved, new ones added
5. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
(new file, not overwriting old one)
**Assertions:**
- [ ] Existing soak test is surfaced and referenced
- [ ] User is offered extend vs. new options
- [ ] New file is created (old file is not overwritten)
- [ ] Extended protocol includes both old and new checkpoints
- [ ] Verdict is COMPLETE
---
### Case 4: Mobile Target Platform — Memory-specific checkpoints added
**Fixture:**
- `technical-preferences.md` specifies target platform: Mobile
- User requests soak test for "gameplay session" at 30 minutes
**Input:** `/soak-test gameplay 30m`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects mobile target platform
2. Soak test protocol includes mobile-specific memory checkpoints:
- Check heap memory growth vs. device baseline
- Check texture memory at checkpoint intervals
- Add warning threshold at 300MB (mobile ceiling)
3. Protocol also includes thermal/battery drain advisory notes
4. Skill asks "May I write?" and writes on approval; verdict is COMPLETE
**Assertions:**
- [ ] Mobile platform is detected from technical-preferences.md
- [ ] Memory checkpoints include mobile-appropriate thresholds (not desktop)
- [ ] Thermal/battery notes are present in the protocol
- [ ] Verdict is COMPLETE
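The platform-dependent threshold selection can be sketched as a lookup. The 300MB mobile warning comes from this case; the desktop figures and the fallback rule are example assumptions, not project values:

```python
THRESHOLDS_MB = {
    "Mobile": {"warn": 300, "fail": 400},      # mobile ceiling per this case
    "Desktop": {"warn": 2048, "fail": 3072},   # assumed example numbers
}

def memory_thresholds(platform: str) -> dict:
    """Pick memory thresholds for the target platform, defaulting to Desktop."""
    return THRESHOLDS_MB.get(platform, THRESHOLDS_MB["Desktop"])
```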
---
### Case 5: Director Gate Check — No gate; soak-test is a planning utility
**Fixture:**
- Valid system and duration provided
**Input:** `/soak-test combat 1h`
**Expected behavior:**
1. Skill generates and writes the soak test protocol
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Collects system, duration, and conditions before generating protocol
- [ ] Includes monitoring checkpoints at regular intervals
- [ ] Includes pass/fail thresholds and early termination conditions
- [ ] Adapts checkpoints to target platform (mobile vs. desktop)
- [ ] Asks "May I write" before creating the protocol file
- [ ] Verdict is COMPLETE when file is written
---
## Coverage Notes
- Soak tests for specific engine subsystems (rendering pipeline, physics
simulation) follow the same protocol structure and are not separately tested.
- The case where the user provides a duration shorter than the minimum useful
soak period (e.g., 5 minutes) is not tested; the skill would note this is
too short for meaningful results.
- Automated execution of the soak test protocol is outside this skill's scope —
this skill generates the plan, not the runner.

# Skill Test Spec: /start
## Skill Summary
`/start` is the first-time onboarding skill for new projects. It guides the
user through naming the project, choosing a game engine, and setting up the
initial directory structure. It creates stub configuration files (CLAUDE.md,
technical-preferences.md) and then routes to `/setup-engine` with the chosen
engine as an argument. Each file or directory created is gated behind a
"May I write" ask, following the collaborative protocol.
The skill detects whether a project is already configured and whether a
partial setup exists, offering to resume or restart as appropriate. It has
no director gates — it is a utility setup skill that runs before any agent
hierarchy exists.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" collaborative protocol language for each config file
- [ ] Has a next-step handoff at the end (routes to `/setup-engine`)
---
## Director Gate Checks
None. `/start` is a utility setup skill. No director agents exist yet at the
point this skill runs.
---
## Test Cases
### Case 1: Happy Path — Fresh repo, no engine, full onboarding flow
**Fixture:**
- Empty repository: no CLAUDE.md overrides, no `production/stage.txt`, no
`technical-preferences.md` content beyond placeholders
- No existing design docs or source code
**Input:** `/start`
**Expected behavior:**
1. Skill detects no existing configuration and begins fresh onboarding
2. Skill asks for project name
3. Skill presents 3 engine options: Godot 4, Unity, Unreal Engine 5
4. User selects an engine
5. Skill asks "May I write the initial directory structure?"
6. Skill creates all directories defined in `directory-structure.md`
7. Skill asks "May I write CLAUDE.md stub?" and writes it on approval
8. Skill routes to `/setup-engine [chosen-engine]` to complete technical config
**Assertions:**
- [ ] Project name is captured before any file is written
- [ ] Exactly 3 engine options are presented
- [ ] "May I write" is asked for each config file individually
- [ ] No file is written without explicit user approval
- [ ] Handoff to `/setup-engine` occurs at the end with the chosen engine argument
- [ ] Verdict is COMPLETE after all files are written and handoff is issued
---
### Case 2: Already Configured — Detects existing config, offers to skip or reconfigure
**Fixture:**
- `technical-preferences.md` has engine already set (not placeholder)
- `production/stage.txt` exists with `Concept`
**Input:** `/start`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects configured engine
2. Skill reports: "This project is already configured with [engine]"
3. Skill presents options: skip (exit), reconfigure engine, or reconfigure specific sections
4. If user selects skip: skill exits cleanly with a summary of current config
5. If user selects reconfigure: skill proceeds to the engine-selection step
**Assertions:**
- [ ] Skill does NOT overwrite existing config without user choosing reconfigure
- [ ] Detected engine name is shown to the user in the status message
- [ ] User is offered at least 2 options (skip or reconfigure)
- [ ] Verdict is COMPLETE whether user skips or reconfigures
---
### Case 3: Engine Choice — User picks Godot 4, routes to /setup-engine godot
**Fixture:**
- Fresh repo — no existing configuration
**Input:** `/start`
**Expected behavior:**
1. Skill presents engine options and user selects Godot 4
2. Skill writes initial stubs (directory structure, CLAUDE.md) after approval
3. Skill explicitly routes to `/setup-engine godot` as the next step
4. Handoff message clearly names the engine and the next skill invocation
**Assertions:**
- [ ] Handoff command is `/setup-engine godot` (not generic `/setup-engine`)
- [ ] Handoff is issued after all initial stubs are written, not before
- [ ] Engine choice is echoed back to user before writing begins
---
### Case 4: Interrupted Setup — Partial config detected, offers resume or restart
**Fixture:**
- Directory structure exists (was created) but `technical-preferences.md` is
still all placeholders (engine was never chosen — setup was interrupted)
- No `production/stage.txt`
**Input:** `/start`
**Expected behavior:**
1. Skill detects partial state: directories exist but engine is unconfigured
2. Skill reports: "A partial setup was detected — directories exist but engine is not configured"
3. Skill offers: resume from engine selection, or restart from scratch
4. If resume: skill skips directory creation, proceeds to engine choice
5. If restart: skill asks "May I overwrite existing structure?" before proceeding
**Assertions:**
- [ ] Partial state is correctly identified (directories present, engine absent)
- [ ] User is offered resume vs. restart choice — not forced into one path
- [ ] Resume path skips re-creating directories (no redundant "May I write" for structure)
- [ ] Restart path asks for permission to overwrite before touching any files
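The partial-state detection in step 1 can be sketched as a three-way classifier. The `[TBD]` placeholder marker and the use of `production/` as the directory sentinel are assumptions for this sketch:

```python
import tempfile
from pathlib import Path

def detect_setup_state(root: Path) -> str:
    """Return 'fresh', 'partial', or 'configured' for the onboarding flow."""
    prefs = root / "technical-preferences.md"
    dirs_exist = (root / "production").is_dir()
    engine_set = prefs.exists() and "[TBD]" not in prefs.read_text()
    if engine_set:
        return "configured"
    return "partial" if dirs_exist else "fresh"

# Tiny fixture: empty repo, then directories created but engine unconfigured
root = Path(tempfile.mkdtemp())
state_fresh = detect_setup_state(root)
(root / "production").mkdir()
(root / "technical-preferences.md").write_text("engine: [TBD]")
state_partial = detect_setup_state(root)
```

The "partial" result is what triggers the resume-or-restart choice in this case.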
---
### Case 5: Director Gate Check — No gate; start is a utility setup skill
**Fixture:**
- Any fixture
**Input:** `/start`
**Expected behavior:**
1. Skill completes full onboarding flow
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in the output
**Assertions:**
- [ ] No director gate is invoked during the skill execution
- [ ] No gate skip messages appear (gates are absent, not suppressed)
- [ ] Skill reaches COMPLETE without any gate verdict
---
## Protocol Compliance
- [ ] Asks for project name before any file is written
- [ ] Presents engine options as a structured choice (not free text)
- [ ] Asks "May I write" separately for directory structure and for CLAUDE.md stub
- [ ] Ends with a handoff to `/setup-engine` with the engine name as argument
- [ ] Verdict is clearly stated (COMPLETE or BLOCKED) at end of output
---
## Coverage Notes
- The case where the user rejects all engine options and provides a custom
engine name is not tested — the skill is designed for the three supported
engines only.
- Git initialization (if any) is not tested here; that is an infrastructure
concern outside the skill boundary.
- Solo vs. lean mode behavior is not applicable — this skill has no gates and
mode selection is irrelevant.

# Skill Test Spec: /test-helpers
## Skill Summary
`/test-helpers` generates engine-specific test helper utilities for the project's
test suite. Helpers include factory functions (for creating test entities with
known state), fixture loaders, assertion helpers, and mock stubs for external
dependencies. Generated helpers follow the naming and structure conventions in
`coding-standards.md` and are written to `tests/helpers/`.
Each helper file is gated behind a "May I write" ask. If a helper file already
exists, the skill offers to extend it rather than replace. No director gates
apply. The verdict is COMPLETE when helper files are written.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing helpers
- [ ] Has a next-step handoff (e.g., write a test using the generated helper)
---
## Director Gate Checks
None. `/test-helpers` is a scaffolding utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Player factory helper generated for Godot/GDScript
**Fixture:**
- `technical-preferences.md` has engine Godot 4, language GDScript
- `tests/` directory exists (test-setup has been run)
- `design/gdd/player.md` exists with defined player properties
- No existing helpers in `tests/helpers/`
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill reads engine (Godot 4 / GDScript) and player GDD for property context
2. Skill generates a deterministic `PlayerFactory` helper in GDScript:
- `create_player(health: int = 100, speed: float = 200.0)` function
- Returns a player node pre-configured to a known state
- Uses dependency injection (no singletons)
3. Skill asks "May I write to `tests/helpers/player_factory.gd`?"
4. File is written on approval; verdict is COMPLETE
**Assertions:**
- [ ] Generated helper is in GDScript (not C# or Blueprint)
- [ ] Factory function parameters use defaults matching GDD values
- [ ] Helper uses dependency injection (no Autoload/singleton references)
- [ ] Filename follows snake_case convention for GDScript
- [ ] Verdict is COMPLETE
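The fixture above calls for a GDScript helper; this Python sketch only illustrates the shape the case asserts, a deterministic factory with known defaults and no global state:

```python
from dataclasses import dataclass

@dataclass
class Player:
    health: int
    speed: float

def create_player(health: int = 100, speed: float = 200.0) -> Player:
    """Defaults mirror the GDD values cited in this case (100 HP, 200.0 speed)."""
    return Player(health=health, speed=speed)
```

Tests then construct players in a known state and override only the field under test, rather than reaching into a singleton.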
---
### Case 2: No Test Setup Exists — Redirects to /test-setup
**Fixture:**
- `tests/` directory does not exist
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill checks for `tests/` directory — not found
2. Skill reports: "Test directory not found — test framework must be set up first"
3. Skill suggests running `/test-setup` before generating helpers
4. No helper file is created
**Assertions:**
- [ ] Error message identifies the missing tests/ directory
- [ ] `/test-setup` is suggested as the prerequisite step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)
---
### Case 3: Helper Already Exists — Offers to extend rather than replace
**Fixture:**
- `tests/helpers/player_factory.gd` already exists with a `create_player()` function
- User requests a new `create_enemy()` function be added to the factory
**Input:** `/test-helpers enemy-factory`
**Expected behavior:**
1. Skill finds an existing `player_factory.gd` and checks if it's the right file
to extend (or if a separate `enemy_factory.gd` should be created)
2. Skill presents options: add `create_enemy()` to existing factory or create
`tests/helpers/enemy_factory.gd`
3. User selects extend; skill drafts the `create_enemy()` function
4. Skill asks "May I extend `tests/helpers/player_factory.gd`?"
5. Function is added on approval; verdict is COMPLETE
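After step 5, the extended file might look like the following. `create_player()` is preserved unchanged; the `Enemy` class and its defaults are illustrative assumptions, not values from any GDD:

```gdscript
# tests/helpers/player_factory.gd (extended)
class_name PlayerFactory

# Pre-existing function, preserved as-is.
static func create_player(health: int = 100, speed: float = 200.0) -> Player:
	var player := Player.new()
	player.health = health
	player.speed = speed
	return player

# Added by this case. `Enemy` and its defaults are hypothetical.
static func create_enemy(health: int = 50, aggro_range: float = 300.0) -> Enemy:
	var enemy := Enemy.new()
	enemy.health = health
	enemy.aggro_range = aggro_range
	return enemy
```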
**Assertions:**
- [ ] Existing helper is detected and surfaced
- [ ] User is given extend vs. new file choice
- [ ] "May I extend" language is used (not "May I write" for replacement)
- [ ] Existing `create_player()` is preserved in the extended file
- [ ] Verdict is COMPLETE
---
### Case 4: System Has No GDD — Notes missing design context in helper
**Fixture:**
- `technical-preferences.md` has Godot 4 / GDScript
- `tests/` exists
- User requests a helper for the "inventory system" but no `design/gdd/inventory.md` exists
**Input:** `/test-helpers inventory-factory`
**Expected behavior:**
1. Skill looks for `design/gdd/inventory.md` — not found
2. Skill notes: "No GDD found for inventory — generating helper with placeholder defaults"
3. Skill generates an `inventory_factory.gd` with generic placeholder values
(item_count = 0, max_capacity = 20) and a comment: "# TODO: align defaults
with inventory GDD when written"
4. Skill asks "May I write to `tests/helpers/inventory_factory.gd`?"
5. File is written; verdict is COMPLETE with advisory note
**Assertions:**
- [ ] Skill proceeds without GDD (does not block)
- [ ] Generated helper has placeholder defaults with TODO comment
- [ ] Missing GDD is noted in the output (advisory warning)
- [ ] Verdict is COMPLETE
---
### Case 5: Director Gate Check — No gate; test-helpers is a scaffolding utility
**Fixture:**
- Engine configured, tests/ exists
**Input:** `/test-helpers player-factory`
**Expected behavior:**
1. Skill generates and writes the helper file
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads engine before generating any helper (helpers are engine-specific)
- [ ] Reads GDD for default values when available
- [ ] Notes missing GDD context rather than blocking
- [ ] Detects existing helper files and offers extend rather than replace
- [ ] Asks "May I write" (or "May I extend") before any file operation
- [ ] Verdict is COMPLETE when helper is written
---
## Coverage Notes
- Mock/stub helper generation (for dependencies like save systems or audio buses)
follows the same pattern as factory helpers and is not separately tested.
- Unity C# helper generation (using NSubstitute or custom mocks) follows the
same logic as Case 1 with language-appropriate output.
- The case where the requested helper type is not recognized is not tested;
the skill would ask the user to clarify the helper type.

# Skill Test Spec: /test-setup
## Skill Summary
`/test-setup` scaffolds the test framework for the project based on the
configured engine. It creates the `tests/` directory structure defined in
`coding-standards.md` (unit/, integration/, performance/, playtest/) and
generates the appropriate test runner configuration for the detected engine:
GdUnit4 config for Godot, Unity Test Runner asmdef for Unity, or Unreal headless
runner for Unreal Engine.
Each file or directory created is gated behind a "May I write" ask. If the test
framework already exists, the skill verifies the configuration rather than
reinitializing. No director gates apply. The verdict is COMPLETE when the
scaffold is in place.
---
## Static Assertions (Structural)
Verified automatically by `/skill-test static` — no fixture needed.
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before creating files
- [ ] Has a next-step handoff (e.g., `/test-helpers` to generate helper utilities)
---
## Director Gate Checks
None. `/test-setup` is a scaffolding utility. No director gates apply.
---
## Test Cases
### Case 1: Happy Path — Godot project, scaffolds GdUnit4 test structure
**Fixture:**
- `technical-preferences.md` has engine set to Godot 4, language GDScript
- `tests/` directory does not exist yet
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads engine from `technical-preferences.md` → Godot 4 + GDScript
2. Skill drafts the test directory structure: tests/unit/, tests/integration/,
tests/performance/, tests/playtest/, and a GdUnit4 runner config file
3. Skill asks "May I write the tests/ directory structure?"
4. Directories and GdUnit4 runner script created on approval
5. Skill confirms the runner script matches the CI command in coding-standards.md:
`godot --headless --script tests/gdunit4_runner.gd`
6. Verdict is COMPLETE
**Assertions:**
- [ ] All 4 subdirectories (unit/, integration/, performance/, playtest/) are created
- [ ] GdUnit4 runner config is generated
- [ ] Runner script path matches coding-standards.md CI command
- [ ] "May I write" is asked before creating any files
- [ ] Verdict is COMPLETE
---
### Case 2: Unity Project — Scaffolds Unity Test Runner with asmdef
**Fixture:**
- `technical-preferences.md` has engine set to Unity, language C#
- `tests/` directory does not exist
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads engine → Unity + C#
2. Skill creates `Tests/` directory with Unity conventions (capitalized)
3. Skill generates `Tests/Tests.asmdef` and `Tests/Editor/EditorTests.asmdef`
4. EditMode and PlayMode test runner modes are configured
5. Skill asks "May I write the Tests/ directory structure?"
6. Verdict is COMPLETE
**Assertions:**
- [ ] Unity-specific `Tests/` structure is created (not the Godot structure)
- [ ] `.asmdef` files are generated
- [ ] EditMode and PlayMode runner config is present
- [ ] Verdict is COMPLETE
---
### Case 3: Test Framework Already Exists — Verifies config, not re-initialized
**Fixture:**
- `tests/unit/`, `tests/integration/` exist
- GdUnit4 runner script exists (Godot project)
**Input:** `/test-setup`
**Expected behavior:**
1. Skill detects existing tests/ structure
2. Skill reports: "Test framework already exists — verifying configuration"
3. Skill checks: runner script path, directory completeness, CI command alignment
4. If all checks pass: reports "Configuration verified — no changes needed"
5. If checks fail (e.g., missing tests/performance/): reports specific gap and
asks "May I add the missing directories?"
**Assertions:**
- [ ] Skill does NOT reinitialize when framework exists
- [ ] Verification checks are performed on existing structure
- [ ] Only missing parts trigger a "May I write" ask
- [ ] Verdict is COMPLETE whether everything was OK or gaps were fixed
---
### Case 4: No Engine Configured — Redirects to /setup-engine
**Fixture:**
- `technical-preferences.md` contains only placeholders (engine not set)
**Input:** `/test-setup`
**Expected behavior:**
1. Skill reads `technical-preferences.md` and finds engine placeholder
2. Skill reports: "Engine not configured — cannot scaffold engine-specific test framework"
3. Skill suggests running `/setup-engine` first
4. No directories or files are created
**Assertions:**
- [ ] Error message explicitly states engine is not configured
- [ ] `/setup-engine` is suggested as the next step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)
---
### Case 5: Director Gate Check — No gate; test-setup is a scaffolding utility
**Fixture:**
- Engine configured, tests/ does not exist
**Input:** `/test-setup`
**Expected behavior:**
1. Skill scaffolds and writes all test framework files
2. No director agents are spawned
3. No gate IDs appear in output
**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check
---
## Protocol Compliance
- [ ] Reads engine from `technical-preferences.md` before generating any scaffold
- [ ] Generates engine-appropriate test runner config (not generic)
- [ ] Creates all 4 subdirectories from coding-standards.md
- [ ] Asks "May I write" before creating files
- [ ] Detects existing framework and offers verification (not reinitialization)
- [ ] Verdict is COMPLETE when scaffold is in place
---
## Coverage Notes
- Unreal Engine test scaffolding (headless runner with `-nullrhi`) follows the
same pattern as Cases 1 and 2 and is not separately fixture-tested.
- CI integration file generation (e.g., `.github/workflows/test.yml`) is
referenced but not assertion-tested here — it may be a separate skill concern.
- The case where tests/ exists but is from a different engine (e.g., Unity tests
in a now-Godot project) is not tested; the skill would detect the mismatch
and offer to reconcile.