Add claude code game studios to the project
170
CCGS Skill Testing Framework/skills/analysis/asset-audit.md
Normal file
@@ -0,0 +1,170 @@
|
||||
# Skill Test Spec: /asset-audit
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/asset-audit` audits the `assets/` directory for naming convention compliance,
missing metadata, and format/size issues. It reads asset files against the
conventions and budgets defined in `technical-preferences.md`. No director gates
are invoked. The skill does not write without user approval. Verdicts: COMPLIANT,
WARNINGS, or NON-COMPLIANT.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLIANT, WARNINGS, NON-COMPLIANT
|
||||
- [ ] Does NOT require "May I write" language for the audit itself (read-only); the optional report write still requires approval
|
||||
- [ ] Has a next-step handoff (what to do after audit results)
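
The structural checks above could be approximated along these lines. This is a minimal sketch, not the actual `/skill-test static` implementation: the function name `static_check`, the `---`-delimited frontmatter layout, and the `## Phase ...` heading shape are all assumptions made for illustration.

```python
import re

REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]

def static_check(skill_text, verdict_keywords):
    """Return a dict of check name -> bool for the text of one skill file."""
    # Assumption: frontmatter is a leading block delimited by '---' lines.
    match = re.match(r"^---\n(.*?)\n---\n", skill_text, re.DOTALL)
    frontmatter = match.group(1) if match else ""
    body = skill_text[match.end():] if match else skill_text
    # Collect the key of every 'key: value' line in the frontmatter.
    keys = {line.split(":", 1)[0].strip() for line in frontmatter.splitlines() if ":" in line}
    # Assumption: phase headings look like '## Phase 1: ...'.
    phase_headings = re.findall(r"^#+\s*Phase\b", body, re.MULTILINE)
    return {
        "frontmatter": all(field in keys for field in REQUIRED_FIELDS),
        "phases": len(phase_headings) >= 2,
        "verdicts": all(keyword in body for keyword in verdict_keywords),
    }
```

Each key of the returned dict corresponds to one checkbox; the "May I write" and handoff checks are left out because their exact wording rules are not given here.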
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Asset auditing is a read-only analysis skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All assets follow naming conventions
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` specifies naming convention: `snake_case`, e.g., `enemy_grunt_idle.png`
|
||||
- `assets/art/characters/` contains: `enemy_grunt_idle.png`, `enemy_sniper_run.png`
|
||||
- `assets/audio/sfx/` contains: `sfx_jump_land.ogg`, `sfx_item_pickup.ogg`
|
||||
- All files are within size budget (textures ≤2MB, audio ≤500KB)
|
||||
|
||||
**Input:** `/asset-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads naming conventions and size budgets from `technical-preferences.md`
|
||||
2. Skill scans `assets/` recursively
|
||||
3. All files match `snake_case` convention; all within budget
|
||||
4. Audit table shows all rows PASS
|
||||
5. Verdict is COMPLIANT
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Audit covers both art and audio asset directories
|
||||
- [ ] Each file is checked against naming convention and size budget
|
||||
- [ ] All rows show PASS when compliant
|
||||
- [ ] Verdict is COMPLIANT
|
||||
- [ ] No files are written
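
The per-file checks in this case can be sketched as follows. This is a hedged illustration only: `check_asset`, the extension-to-budget mapping, and the reading of "2MB" as 2 × 1024 × 1024 bytes are assumptions, since the real conventions live in `technical-preferences.md`.

```python
import re
from pathlib import PurePosixPath

# snake_case: lowercase letters/digits in groups separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")
# Budgets in bytes from the fixture above; mapping them by extension is an assumption.
BUDGETS = {".png": 2 * 1024 * 1024, ".ogg": 500 * 1024}

def check_asset(path, size_bytes):
    """Check one asset's file stem against snake_case and its size against the budget."""
    p = PurePosixPath(path)
    naming = "PASS" if SNAKE_CASE.match(p.stem) else "FAIL"
    budget = BUDGETS.get(p.suffix)
    # Files with no known budget are not size-checked in this sketch.
    size = "PASS" if budget is None or size_bytes <= budget else "FAIL"
    return {"file": p.name, "naming": naming, "size": size}
```

A row for `enemy_grunt_idle.png` at roughly 1 MB would pass both checks, while a CamelCase name or a 4 MB texture would fail the corresponding column.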
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Non-Compliant — Textures exceed size budget
|
||||
|
||||
**Fixture:**
|
||||
- `assets/art/environment/` contains 5 texture files
|
||||
- 3 texture files are 4MB each (budget: ≤2MB)
|
||||
- 2 texture files are within budget
|
||||
|
||||
**Input:** `/asset-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads size budget from `technical-preferences.md` (2MB for textures)
|
||||
2. Skill scans `assets/art/environment/` — finds 3 oversized textures
|
||||
3. Audit table lists each oversized file with actual size and budget
|
||||
4. Verdict is NON-COMPLIANT
|
||||
5. Skill recommends compression or resolution reduction for flagged files
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 3 oversized files are listed by name with actual size and budget size
|
||||
- [ ] Verdict is NON-COMPLIANT when any file exceeds its budget
|
||||
- [ ] Optimization recommendation is given for oversized files
|
||||
- [ ] Within-budget files are also listed (showing PASS) for completeness
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Format Issue — Audio in wrong format
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` specifies audio format: OGG
|
||||
- `assets/audio/music/theme_main.wav` exists (WAV format)
|
||||
- `assets/audio/sfx/sfx_footstep.ogg` exists (correct OGG format)
|
||||
|
||||
**Input:** `/asset-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads audio format requirement: OGG
|
||||
2. Skill scans `assets/audio/` — finds `theme_main.wav` in wrong format
|
||||
3. Audit table flags `theme_main.wav` as FORMAT ISSUE (expected OGG, found WAV)
|
||||
4. `sfx_footstep.ogg` shows PASS
|
||||
5. Verdict is WARNINGS (format issues are correctable)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] `theme_main.wav` is flagged as FORMAT ISSUE with expected and actual format noted
|
||||
- [ ] Verdict is WARNINGS (not NON-COMPLIANT) for format issues, which are correctable
|
||||
- [ ] Correct-format assets are shown as PASS
|
||||
- [ ] Skill does not modify or convert any asset files
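
A format check of this kind might compare the file extension with the required format. The sketch below assumes that the extension reliably indicates the format and that the listed extensions are the audio types of interest; `check_audio_format` is a hypothetical helper, not part of the skill.

```python
from pathlib import PurePosixPath

# Assumed set of extensions treated as audio files.
AUDIO_EXTENSIONS = {".wav", ".ogg", ".mp3", ".flac"}

def check_audio_format(path, expected="OGG"):
    """Return an audit row for an audio file, or None for non-audio files."""
    p = PurePosixPath(path)
    suffix = p.suffix.lower()
    if suffix not in AUDIO_EXTENSIONS:
        return None  # not an audio file; nothing to check
    actual = suffix.lstrip(".").upper()
    result = "PASS" if actual == expected else "FORMAT ISSUE"
    return {"file": p.name, "expected": expected, "actual": actual, "result": result}
```

The function only reports; it never converts or rewrites the file, which matches the read-only assertion above.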
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Missing Asset — Asset referenced by GDD but absent from assets/
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/enemies.md` references `enemy_boss_idle.png`
|
||||
- `assets/art/characters/boss/` directory is empty — file does not exist
|
||||
|
||||
**Input:** `/asset-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads GDD references to find expected assets (this overlaps with the scope of `/content-audit`)
|
||||
2. Skill scans `assets/art/characters/boss/` — file not found
|
||||
3. Audit table flags `enemy_boss_idle.png` as MISSING ASSET
|
||||
4. Verdict is NON-COMPLIANT (missing critical art asset)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks GDD references to identify expected assets
|
||||
- [ ] Missing assets are flagged as MISSING ASSET with the GDD reference noted
|
||||
- [ ] Verdict is NON-COMPLIANT when critical assets are missing
|
||||
- [ ] Skill does not create or add placeholder assets
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; technical-artist may be consulted separately
|
||||
|
||||
**Fixture:**
|
||||
- 2 files have naming convention violations (CamelCase instead of snake_case)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/asset-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans assets and finds 2 naming violations
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Verdict is WARNINGS
|
||||
4. Output notes: "Consider having a Technical Artist review naming conventions"
|
||||
5. Skill presents findings; offers optional audit report write
|
||||
6. If user opts in: "May I write to `production/qa/asset-audit-[date].md`?"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Technical artist consultation is suggested (not mandated)
|
||||
- [ ] Findings table is presented before any write prompt
|
||||
- [ ] Optional audit report write asks "May I write" before writing
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads `technical-preferences.md` for naming conventions, formats, and size budgets
|
||||
- [ ] Scans `assets/` directory recursively
|
||||
- [ ] Audit table shows file name, check type, expected value, actual value, and result
|
||||
- [ ] Does not modify any asset files
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: COMPLIANT, WARNINGS, NON-COMPLIANT
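
Taken together, Cases 1 to 5 imply a simple precedence between verdicts, sketched below. The labels `OVER BUDGET` and `NAMING VIOLATION` are hypothetical names chosen for this example; `FORMAT ISSUE` and `MISSING ASSET` come from the cases above.

```python
# Findings that force NON-COMPLIANT (Cases 2 and 4).
BLOCKING = {"OVER BUDGET", "MISSING ASSET"}
# Findings that are correctable and yield WARNINGS (Cases 3 and 5).
ADVISORY = {"FORMAT ISSUE", "NAMING VIOLATION"}

def asset_verdict(results):
    """Aggregate per-row results into a single verdict, worst finding first."""
    if any(result in BLOCKING for result in results):
        return "NON-COMPLIANT"
    if any(result in ADVISORY for result in results):
        return "WARNINGS"
    return "COMPLIANT"
```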
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Metadata checks (e.g., missing texture import settings in Godot `.import` files)
  are not explicitly tested here; they follow the same FORMAT ISSUE flagging pattern.
- The overlap between `/asset-audit` and `/content-audit` (both check GDD
  references vs. assets) is intentional; `/asset-audit` focuses on
  compliance while `/content-audit` focuses on completeness.
|
||||
172
CCGS Skill Testing Framework/skills/analysis/balance-check.md
Normal file
@@ -0,0 +1,172 @@
|
||||
# Skill Test Spec: /balance-check
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/balance-check` reads balance data files (JSON or YAML in `assets/data/`) and
checks each value against the design formulas defined in GDDs under `design/gdd/`.
It produces a findings table with columns: Value → Formula → Deviation → Severity.
No director gates are invoked (read-only analysis). The skill may optionally write
a balance report but asks "May I write" before doing so. Verdicts: BALANCED,
CONCERNS, or OUT OF BALANCE.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: BALANCED, CONCERNS, OUT OF BALANCE
|
||||
- [ ] Contains "May I write" language (optional report write)
|
||||
- [ ] Has a next-step handoff (what to do after findings are reviewed)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Balance check is a read-only analysis skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All balance values within formula tolerances
|
||||
|
||||
**Fixture:**
|
||||
- `assets/data/combat-balance.json` exists with 6 stat values
|
||||
- `design/gdd/combat-system.md` contains formulas for all 6 stats with ±10% tolerance
|
||||
- All 6 values fall within tolerance
|
||||
|
||||
**Input:** `/balance-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all balance data files in `assets/data/`
|
||||
2. Skill reads GDD formulas from `design/gdd/`
|
||||
3. Skill computes deviation for each value against its formula
|
||||
4. All deviations are within ±10% tolerance
|
||||
5. Skill outputs findings table with all rows showing PASS
|
||||
6. Verdict is BALANCED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Findings table is shown for all checked values
|
||||
- [ ] Each row shows: stat name, formula target, actual value, deviation percentage
|
||||
- [ ] All rows show PASS or equivalent when within tolerance
|
||||
- [ ] Verdict is BALANCED
|
||||
- [ ] No files are written without user approval
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Out of Balance — Player damage 40% above formula target
|
||||
|
||||
**Fixture:**
|
||||
- `assets/data/combat-balance.json` has `player_damage_base: 140`
|
||||
- `design/gdd/combat-system.md` formula specifies `player_damage_base = 100` (±10%)
|
||||
- All other stats are within tolerance
|
||||
|
||||
**Input:** `/balance-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads combat-balance.json and computes deviation for `player_damage_base`
|
||||
2. Deviation is +40% — far outside ±10% tolerance
|
||||
3. Skill flags this row as severity HIGH in the findings table
|
||||
4. Verdict is OUT OF BALANCE
|
||||
5. Skill surfaces the HIGH severity item prominently before the table
|
||||
|
||||
**Assertions:**
|
||||
- [ ] `player_damage_base` row shows deviation of +40%
|
||||
- [ ] Severity is HIGH for deviations exceeding tolerance by more than 2×
|
||||
- [ ] Verdict is OUT OF BALANCE when any stat has HIGH severity deviation
|
||||
- [ ] The HIGH severity item is called out explicitly, not buried in table rows
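
The deviation arithmetic behind this case can be worked through in a few lines. This is a sketch under stated assumptions: the intermediate severity label `CONCERNS` and the function names are illustrative, and the 2× rule is taken from the assertion above rather than from the skill body.

```python
def classify(actual, target, tolerance=0.10):
    """Return (deviation in percent, severity) for one stat."""
    deviation = (actual - target) / target
    magnitude = abs(deviation)
    if magnitude <= tolerance:
        severity = "PASS"
    elif magnitude > 2 * tolerance:
        severity = "HIGH"  # exceeds tolerance by more than 2x
    else:
        severity = "CONCERNS"  # assumed label for out-of-tolerance but not HIGH
    return round(deviation * 100, 1), severity

def balance_verdict(severities):
    """Aggregate row severities into the skill verdict."""
    if "HIGH" in severities:
        return "OUT OF BALANCE"
    if "CONCERNS" in severities:
        return "CONCERNS"
    return "BALANCED"
```

For the fixture above, `classify(140, 100)` gives a deviation of +40% against a ±10% tolerance, which is well past the 20% cutoff and therefore HIGH.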
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No GDD Formulas — Cannot validate, guidance given
|
||||
|
||||
**Fixture:**
|
||||
- `assets/data/economy-balance.yaml` exists with 10 stat values
|
||||
- No GDD in `design/gdd/` contains formula definitions for economy stats
|
||||
|
||||
**Input:** `/balance-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads balance data files
|
||||
2. Skill searches GDDs for formula definitions — finds none for economy stats
|
||||
3. Skill outputs: "Cannot validate economy stats — no formulas defined. Run /design-system first."
|
||||
4. No findings table is generated for the economy stats
|
||||
5. Verdict is CONCERNS (data exists but cannot be validated)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not fabricate formula targets when none exist in GDDs
|
||||
- [ ] Output explicitly names the missing formula source
|
||||
- [ ] Output recommends running `/design-system` to define formulas
|
||||
- [ ] Verdict is CONCERNS (not BALANCED, since validation was impossible)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Orphan Reference — Balance file references an undefined stat
|
||||
|
||||
**Fixture:**
|
||||
- `assets/data/combat-balance.json` contains a stat `legacy_armor_mult: 1.5`
|
||||
- `design/gdd/combat-system.md` has no formula for `legacy_armor_mult`
|
||||
- All other stats have formula definitions and pass validation
|
||||
|
||||
**Input:** `/balance-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all stats from combat-balance.json
|
||||
2. Skill cannot find a formula for `legacy_armor_mult` in any GDD
|
||||
3. Skill flags `legacy_armor_mult` as ORPHAN REFERENCE in the findings table
|
||||
4. Other stats are evaluated normally; those within tolerance show PASS
|
||||
5. Verdict is CONCERNS (orphan reference prevents full validation)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] `legacy_armor_mult` appears in findings table with status ORPHAN REFERENCE
|
||||
- [ ] Orphan references are distinguished from formula deviations in the table
|
||||
- [ ] Verdict is CONCERNS when any orphan references are found
|
||||
- [ ] Skill does not skip orphan stats silently
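
Orphan detection amounts to a set difference between the stats in the data file and the stats that have a formula. The sketch below assumes a flat JSON object of stat names to values, which may not match the real layout of `combat-balance.json`; `find_orphans` is a hypothetical helper.

```python
import json

def find_orphans(balance_json, formula_names):
    """Return stat names present in the balance data but absent from the GDD formulas."""
    stats = json.loads(balance_json)  # assumed flat mapping: stat name -> value
    return sorted(name for name in stats if name not in formula_names)
```

Every name returned would get an ORPHAN REFERENCE row, so nothing is dropped silently.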
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — Read-only; no gate; optional report requires approval
|
||||
|
||||
**Fixture:**
|
||||
- Balance data and GDD formulas exist; 1 stat has CONCERNS-level deviation (15% above target)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/balance-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads data and GDDs; generates findings table
|
||||
2. Verdict is CONCERNS (one stat slightly out of range)
|
||||
3. No director gate is invoked
|
||||
4. Skill presents findings table to user
|
||||
5. Skill offers to write an optional balance report
|
||||
6. If user says yes: skill asks "May I write to `production/qa/balance-report-[date].md`?"
|
||||
7. If user says no: skill ends without writing
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Findings table is presented without writing anything automatically
|
||||
- [ ] Optional report write is offered but not forced
|
||||
- [ ] "May I write" prompt appears only if user opts in to the report
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads both balance data files and GDD formulas before analysis
|
||||
- [ ] Findings table shows Value, Formula, Deviation, and Severity columns
|
||||
- [ ] Does not write any files without explicit user approval
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: BALANCED, CONCERNS, OUT OF BALANCE
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where `assets/data/` is entirely empty is not tested; behavior
  follows the CONCERNS pattern with a message that no data files were found.
- Tolerance thresholds (±10%, ±20%) are implementation details of the skill;
  the tests verify that deviations are detected and classified, not the
  exact threshold values.
|
||||
172
CCGS Skill Testing Framework/skills/analysis/code-review.md
Normal file
@@ -0,0 +1,172 @@
|
||||
# Skill Test Spec: /code-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/code-review` performs an architectural code review of source files in `src/`,
checking coding standards from `CLAUDE.md` (doc comments on public APIs,
dependency injection over singletons, data-driven values, testability). Findings
are advisory. No director gates are invoked. No code edits are made. Verdicts:
APPROVED, CONCERNS, or NEEDS CHANGES.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: APPROVED, CONCERNS, NEEDS CHANGES
|
||||
- [ ] Does NOT require "May I write" language (read-only; findings are advisory output)
|
||||
- [ ] Has a next-step handoff (what to do with findings)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Code review is a read-only advisory skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Source file follows all coding standards
|
||||
|
||||
**Fixture:**
|
||||
- `src/gameplay/health_component.gd` exists with:
|
||||
- All public methods have doc comments (`##` notation)
|
||||
- No singletons used; dependencies injected via constructor
|
||||
- No hardcoded values; all constants reference `assets/data/`
|
||||
- ADR reference in file header: `# ADR: docs/architecture/adr-004-health.md`
|
||||
- Referenced ADR has `Status: Accepted`
|
||||
|
||||
**Input:** `/code-review src/gameplay/health_component.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the source file
|
||||
2. Skill checks all coding standards: doc comments, DI, data-driven, ADR status
|
||||
3. All checks pass
|
||||
4. Skill outputs findings summary with all checks PASS
|
||||
5. Verdict is APPROVED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Each coding standard check is listed in the output
|
||||
- [ ] All checks show PASS when standards are met
|
||||
- [ ] Skill reads referenced ADR to confirm its status
|
||||
- [ ] Verdict is APPROVED
|
||||
- [ ] No edits are made to any file
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Needs Changes — Missing doc comment and singleton usage
|
||||
|
||||
**Fixture:**
|
||||
- `src/ui/inventory_ui.gd` has:
|
||||
- 2 public methods without doc comments
|
||||
- Uses `GameManager.instance` (singleton pattern)
|
||||
- All other standards met
|
||||
|
||||
**Input:** `/code-review src/ui/inventory_ui.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the source file
|
||||
2. Skill detects: 2 missing doc comments on public methods
|
||||
3. Skill detects: singleton usage at specific lines (e.g., line 42, line 87)
|
||||
4. Findings list the exact method names and line numbers
|
||||
5. Verdict is NEEDS CHANGES
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Missing doc comments are listed with method names
|
||||
- [ ] Singleton usage is flagged with file and line number
|
||||
- [ ] Verdict is NEEDS CHANGES when BLOCKING-level standard violations exist
|
||||
- [ ] Skill does not edit the file — findings are for the developer to act on
|
||||
- [ ] Output suggests replacing singleton with dependency injection
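
The two detections in this case could be approximated with a line-by-line scan. This is a rough sketch, not how the skill is known to work: it assumes public methods are top-level `func` lines whose names do not start with `_`, that a doc comment is a `##` line directly above the method, and that singleton access looks like `SomeClass.instance`.

```python
import re

# Public method: 'func' at column 0 with a name that starts with a letter.
FUNC = re.compile(r"^func\s+([A-Za-z][A-Za-z0-9_]*)\s*\(")
# Singleton access: a capitalized identifier followed by '.instance'.
SINGLETON = re.compile(r"\b[A-Z]\w*\.instance\b")

def review_gdscript(source):
    """Return (line number, finding, detail) tuples for one GDScript source text."""
    findings = []
    lines = source.splitlines()
    for number, line in enumerate(lines, start=1):
        method = FUNC.match(line)
        if method:
            previous = lines[number - 2].strip() if number > 1 else ""
            if not previous.startswith("##"):
                findings.append((number, "missing doc comment", method.group(1)))
        singleton = SINGLETON.search(line)
        if singleton:
            findings.append((number, "singleton usage", singleton.group(0)))
    return findings
```

The line numbers in the result are what would let the findings name exact locations, as the expected behavior requires.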
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Architecture Risk — ADR reference is Proposed, not Accepted
|
||||
|
||||
**Fixture:**
|
||||
- `src/core/save_system.gd` has a header comment: `# ADR: docs/architecture/adr-010-save.md`
|
||||
- `adr-010-save.md` exists but has `Status: Proposed`
|
||||
- Code itself follows all other coding standards
|
||||
|
||||
**Input:** `/code-review src/core/save_system.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the source file
|
||||
2. Skill reads referenced ADR — finds `Status: Proposed`
|
||||
3. Skill flags this as ARCHITECTURE RISK (code is implementing an unaccepted ADR)
|
||||
4. Other coding standard checks pass
|
||||
5. Verdict is CONCERNS (risk flag is advisory, not a hard NEEDS CHANGES)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads referenced ADR file to check its status
|
||||
- [ ] ARCHITECTURE RISK is flagged when ADR status is Proposed
|
||||
- [ ] Verdict is CONCERNS (not NEEDS CHANGES) for ADR risk — advisory severity
|
||||
- [ ] Output recommends resolving the ADR before the code goes to production
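
The ADR lookup can be sketched as two small pattern matches. The header format `# ADR: <path>` and the `Status: <word>` line come from the fixtures above; the helper name `adr_risk`, the injected `read_file` callable, and treating a missing status as a risk are assumptions for this example.

```python
import re

ADR_REF = re.compile(r"^#\s*ADR:\s*(\S+)", re.MULTILINE)
STATUS = re.compile(r"^Status:\s*(\w+)", re.MULTILINE)

def adr_risk(source_text, read_file):
    """Look up the ADR referenced in a source header and flag it unless Accepted.

    read_file is a callable mapping a path to that file's text.
    """
    reference = ADR_REF.search(source_text)
    if reference is None:
        return None  # no ADR header to check
    status = STATUS.search(read_file(reference.group(1)))
    state = status.group(1) if status else "Unknown"
    # Assumption: anything other than Accepted is reported as a risk.
    flag = None if state == "Accepted" else "ARCHITECTURE RISK"
    return {"adr": reference.group(1), "status": state, "flag": flag}
```

Passing the file reader in as a parameter keeps the check read-only and easy to exercise against fixture text.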
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No source files found at specified path
|
||||
|
||||
**Fixture:**
|
||||
- User calls `/code-review src/networking/`
|
||||
- `src/networking/` directory does not exist
|
||||
|
||||
**Input:** `/code-review src/networking/`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read files in `src/networking/`
|
||||
2. Directory or files not found
|
||||
3. Skill outputs an error: "No source files found at `src/networking/`"
|
||||
4. Skill suggests checking `src/` for valid directories
|
||||
5. No verdict is emitted (nothing was reviewed)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when path does not exist
|
||||
- [ ] Output names the attempted path in the error message
|
||||
- [ ] Output suggests checking `src/` for valid file paths
|
||||
- [ ] No verdict is emitted when there is nothing to review
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; LP may be consulted separately
|
||||
|
||||
**Fixture:**
|
||||
- Source file follows most standards but has 1 CONCERNS-level finding (a magic number)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/code-review src/gameplay/loot_system.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads and reviews the source file
|
||||
2. No director gate is invoked (code review findings are advisory)
|
||||
3. Skill presents findings with the CONCERNS verdict
|
||||
4. Output notes: "Consider requesting a Lead Programmer review for architecture concerns"
|
||||
5. Skill does not invoke any agent automatically
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] LP consultation is suggested (not mandated) in the output
|
||||
- [ ] No code edits are made
|
||||
- [ ] Verdict is CONCERNS for advisory-level findings
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads source file(s) and coding standards before reviewing
|
||||
- [ ] Lists each coding standard check in findings output
|
||||
- [ ] Does not edit any source files (read-only skill)
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: APPROVED, CONCERNS, NEEDS CHANGES
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Batch review of all files in a directory is not explicitly tested; behavior
  is assumed to apply the same checks file by file and aggregate the verdict.
- Test coverage checks (verifying corresponding test files exist) are a stretch
  goal not tested here; that is primarily the domain of `/test-evidence-review`.
|
||||
@@ -0,0 +1,176 @@
|
||||
# Skill Test Spec: /consistency-check
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/consistency-check` scans all GDDs in `design/gdd/` and checks for internal
conflicts across documents. It produces a structured findings table with columns:
System A vs System B, Conflict Type, Severity (HIGH / MEDIUM / LOW). Conflict
types include: formula mismatch, competing ownership, stale reference, and
dependency gap.

The skill is read-only during analysis. It has no director gates. An optional
consistency report can be written to `design/consistency-report-[date].md` if the
user requests it, but the skill asks "May I write" before doing so.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
|
||||
- [ ] Does NOT require "May I write" language during analysis (read-only scan)
|
||||
- [ ] Has a next-step handoff at the end
|
||||
- [ ] Documents that report writing is optional and requires approval
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
No director gates: this skill spawns no director gate agents. Consistency
checking is a mechanical scan; no creative or technical director review is
required as part of the scan itself.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — 4 GDDs with no conflicts
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` contains exactly 4 system GDDs
|
||||
- All GDDs have consistent formulas (no overlapping variables with different values)
|
||||
- No two GDDs claim ownership of the same game entity or mechanic
|
||||
- All dependency references point to GDDs that exist
|
||||
|
||||
**Input:** `/consistency-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all 4 GDDs in `design/gdd/`
|
||||
2. Runs cross-GDD consistency checks (formulas, ownership, references)
|
||||
3. No conflicts found
|
||||
4. Outputs structured findings table showing 0 issues
|
||||
5. Verdict: CONSISTENT
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 4 GDDs are read before producing output
|
||||
- [ ] Findings table is present (even if empty — shows "No conflicts found")
|
||||
- [ ] Verdict is CONSISTENT when no conflicts exist
|
||||
- [ ] Skill does NOT write any files without user approval
|
||||
- [ ] Next-step handoff is present
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Two GDDs with conflicting damage formulas
|
||||
|
||||
**Fixture:**
|
||||
- GDD-A defines damage formula: `damage = attack * 1.5`
|
||||
- GDD-B defines damage formula: `damage = attack * 2.0` for the same entity type
|
||||
- Both GDDs refer to the same "attack" variable
|
||||
|
||||
**Input:** `/consistency-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDDs and detects the formula mismatch
|
||||
2. Findings table includes an entry: GDD-A vs GDD-B | Formula Mismatch | HIGH
|
||||
3. Specific conflicting formulas are shown (not just "formula conflict exists")
|
||||
4. Verdict: CONFLICTS FOUND
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is CONFLICTS FOUND (not CONSISTENT)
|
||||
- [ ] Conflict entry names both GDD filenames
|
||||
- [ ] Conflict type is "Formula Mismatch"
|
||||
- [ ] Severity is HIGH for a direct formula contradiction
|
||||
- [ ] Both conflicting formulas are shown in the findings table
|
||||
- [ ] Skill does NOT auto-resolve the conflict
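
A formula mismatch scan might compare same-named variables across every pair of GDDs. The sketch below assumes each formula sits on its own line as plain `name = expression` text, which is a simplification; real GDDs may wrap formulas in backticks or prose, and `formula_conflicts` is a hypothetical helper.

```python
import re
from itertools import combinations

# One formula per line, written as 'name = expression' (an assumed notation).
FORMULA = re.compile(r"^\s*(\w+)\s*=\s*(.+?)\s*$", re.MULTILINE)

def formula_conflicts(gdds):
    """gdds maps a GDD filename to its text; returns one findings row per mismatch."""
    parsed = {name: dict(FORMULA.findall(text)) for name, text in gdds.items()}
    rows = []
    for a, b in combinations(sorted(parsed), 2):
        for variable in sorted(parsed[a].keys() & parsed[b].keys()):
            left, right = parsed[a][variable], parsed[b][variable]
            # Ignore whitespace differences; any other difference is a conflict.
            if left.replace(" ", "") != right.replace(" ", ""):
                rows.append((f"{a} vs {b}", "Formula Mismatch", "HIGH",
                             f"{variable} = {left}", f"{variable} = {right}"))
    return rows
```

Each row carries both formulas verbatim, so the report shows the contradiction itself and leaves the resolution to the designer.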
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Partial Path — GDD references a system with no GDD
|
||||
|
||||
**Fixture:**
|
||||
- GDD-A's Dependencies section lists "system-B" as a dependency
|
||||
- No GDD for system-B exists in `design/gdd/`
|
||||
- All other GDDs are consistent
|
||||
|
||||
**Input:** `/consistency-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDDs and checks dependency references
|
||||
2. GDD-A's reference to "system-B" cannot be resolved — no GDD exists for it
|
||||
3. Findings table includes: GDD-A vs (missing) | Dependency Gap | MEDIUM
|
||||
4. Verdict: DEPENDENCY GAP (not CONSISTENT, not CONFLICTS FOUND)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is DEPENDENCY GAP (distinct from CONSISTENT and CONFLICTS FOUND)
|
||||
- [ ] Findings entry names GDD-A and the missing system-B
|
||||
- [ ] Severity is MEDIUM for an unresolved dependency reference
|
||||
- [ ] Skill suggests running `/design-system system-B` to create the missing GDD
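
Resolving dependency references reduces to checking each listed dependency against the set of systems that have a GDD. In this sketch the input shapes (a dict of GDD name to dependency list, and a set of existing system names) are assumptions about how the parsed Dependencies sections might be represented.

```python
def dependency_gaps(dependencies, existing):
    """Return one findings row per dependency that has no matching GDD.

    dependencies: GDD name -> list of system names it depends on.
    existing: set of system names that have a GDD in design/gdd/.
    """
    rows = []
    for gdd, deps in sorted(dependencies.items()):
        for dep in deps:
            if dep not in existing:
                # The last column is the suggested follow-up command.
                rows.append((f"{gdd} vs (missing)", "Dependency Gap", "MEDIUM",
                             f"/design-system {dep}"))
    return rows
```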
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No GDDs found
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` directory is empty or does not exist
|
||||
|
||||
**Input:** `/consistency-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read files in `design/gdd/`
|
||||
2. No GDD files found
|
||||
3. Skill outputs an error: "No GDDs found in `design/gdd/`. Run `/design-system` to create GDDs first."
|
||||
4. No findings table is produced
|
||||
5. No verdict is issued
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill outputs a clear error message when no GDDs are found
|
||||
- [ ] No verdict is produced (CONSISTENT / CONFLICTS FOUND / DEPENDENCY GAP)
|
||||
- [ ] Skill recommends the correct next action (`/design-system`)
|
||||
- [ ] Skill does NOT crash or produce a partial report
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — No gate spawned; no review-mode.txt read
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` contains ≥2 GDDs
|
||||
- `production/session-state/review-mode.txt` exists with `full`
|
||||
|
||||
**Input:** `/consistency-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDDs and runs the consistency scan
|
||||
2. Skill does NOT read `production/session-state/review-mode.txt`
|
||||
3. No director gate agents are spawned at any point
|
||||
4. Findings table and verdict are produced normally
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
|
||||
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
|
||||
- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
|
||||
- [ ] Review mode has no effect on this skill's behavior
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads all GDDs before producing the findings table
|
||||
- [ ] Findings table is shown in full before any write prompt (if a report is requested)
|
||||
- [ ] Verdict is one of exactly: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
|
||||
- [ ] No director gates — no review-mode.txt read
|
||||
- [ ] Report writing (if requested) gated by "May I write" approval
|
||||
- [ ] Ends with next-step handoff appropriate to verdict
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- This skill checks for structural consistency between GDDs. Deep design theory
  analysis (pillar drift, dominant strategies) is handled by `/review-all-gdds`.
- Formula conflict detection relies on consistent formula notation across GDDs;
  informal descriptions of the same mechanic may not be detected.
- The conflict severity rubric (HIGH / MEDIUM / LOW) is defined in the skill body
  and not re-enumerated here.
|
||||
164
CCGS Skill Testing Framework/skills/analysis/content-audit.md
Normal file
@@ -0,0 +1,164 @@
|
||||
# Skill Test Spec: /content-audit
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/content-audit` reads GDDs in `design/gdd/` and checks whether all content
items specified there (enemies, items, levels, etc.) are accounted for in
`assets/`. It produces a gap table: Content Type → Specified Count → Found Count
→ Missing Items. No director gates are invoked. The skill does not write without
user approval. Verdicts: COMPLETE, GAPS FOUND, or MISSING CRITICAL CONTENT.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT
|
||||
- [ ] Does NOT require "May I write" language (read-only output; write is optional report)
|
||||
- [ ] Has a next-step handoff (what to do after gap table is reviewed)

---

## Director Gate Checks

None. Content audit is a read-only analysis skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — All specified content present

**Fixture:**
- `design/gdd/enemies.md` specifies 4 enemy types: Grunt, Sniper, Tank, Boss
- `assets/art/characters/` contains folders: `grunt/`, `sniper/`, `tank/`, `boss/`
- `design/gdd/items.md` specifies 3 item types; all 3 found in `assets/data/items/`

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads all GDDs in `design/gdd/`
2. Skill scans `assets/` for each specified content item
3. All 4 enemy types and 3 item types are found
4. Gap table shows: all rows have Found Count = Specified Count, no missing items
5. Verdict is COMPLETE

**Assertions:**
- [ ] Gap table covers all content types found in GDDs
- [ ] Each row shows Specified Count and Found Count
- [ ] No missing items when counts match
- [ ] Verdict is COMPLETE
- [ ] No files are written

---

### Case 2: Gaps Found — Enemy type missing from assets

**Fixture:**
- `design/gdd/enemies.md` specifies 3 enemy types: Grunt, Sniper, Boss
- `assets/art/characters/` contains: `grunt/`, `sniper/` only (Boss folder missing)

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads GDD — finds 3 enemy types specified
2. Skill scans `assets/art/characters/` — finds only 2
3. Gap table row for enemies: Specified 3, Found 2, Missing: Boss
4. Verdict is GAPS FOUND

**Assertions:**
- [ ] Gap table row identifies "Boss" as the missing item by name
- [ ] Specified Count (3) and Found Count (2) are both shown
- [ ] Verdict is GAPS FOUND when any content item is missing
- [ ] Skill does not assume the asset will be added later — it flags it now
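
The gap table in Cases 1 and 2 can be illustrated with a short sketch. It assumes, hypothetically, that each specified item corresponds to an asset folder named with the lower-cased item name, and that the GDD parsing and directory scan have already produced the two input mappings. The function name `build_gap_table` is illustrative only, and the MISSING CRITICAL CONTENT verdict is not modeled.

```python
def build_gap_table(
    specified: dict[str, list[str]],
    found: dict[str, set[str]],
) -> tuple[list[dict], str]:
    """Build gap-table rows and a verdict from already-parsed inputs."""
    rows = []
    for content_type, names in specified.items():
        present = found.get(content_type, set())
        # Assumption: an item counts as found if a folder with its
        # lower-cased name exists under the matching asset directory.
        missing = [name for name in names if name.lower() not in present]
        rows.append({
            "content_type": content_type,
            "specified": len(names),
            "found": len(names) - len(missing),
            "missing": missing,
        })
    verdict = "GAPS FOUND" if any(row["missing"] for row in rows) else "COMPLETE"
    return rows, verdict


# Case 2 fixture: Boss is specified in the GDD but has no asset folder.
rows, verdict = build_gap_table(
    {"enemies": ["Grunt", "Sniper", "Boss"]},
    {"enemies": {"grunt", "sniper"}},
)
print(rows[0]["missing"], verdict)  # ['Boss'] GAPS FOUND
```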

---

### Case 3: No GDD Content Specs Found — Guidance given

**Fixture:**
- `design/gdd/` contains only `core-loop.md` which has no content inventory section
- No other GDDs exist with content specifications

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads all GDDs — finds no content inventory sections
2. Skill outputs: "No content specifications found in GDDs — run /design-system first to define content lists"
3. No gap table is produced
4. Verdict is GAPS FOUND (cannot confirm completeness without specs)

**Assertions:**
- [ ] Skill does not produce a gap table when no GDD content specs exist
- [ ] Output recommends running `/design-system`
- [ ] Verdict reflects inability to confirm completeness

---

### Case 4: Edge Case — Asset in wrong format for target platform

**Fixture:**
- `design/gdd/audio.md` specifies audio assets as OGG format
- `assets/audio/sfx/jump.wav` exists (WAV format, not OGG)
- `assets/audio/sfx/land.ogg` exists (correct format)
- `technical-preferences.md` specifies audio format: OGG

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads GDD audio spec and technical preferences for format requirements
2. Skill finds `jump.wav` — present but in wrong format
3. Gap table row for audio: Specified 2, Found 2 (by name), but `jump.wav` flagged as FORMAT ISSUE
4. Verdict is GAPS FOUND (format compliance is part of content completeness)

**Assertions:**
- [ ] Skill checks asset format against GDD or technical preferences when format is specified
- [ ] `jump.wav` is flagged as FORMAT ISSUE with expected format (OGG) noted
- [ ] Format issues are distinct from missing content in the gap table
- [ ] Verdict is GAPS FOUND when format issues exist

---

### Case 5: Gate Compliance — Read-only; no gate; gap table for human review

**Fixture:**
- GDDs specify 10 content items; 9 are found in assets; 1 is missing
- `review-mode.txt` contains `full`

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads GDDs and scans assets; produces gap table
2. No director gate is invoked regardless of review mode
3. Skill presents gap table to user as read-only output
4. Verdict is GAPS FOUND
5. Skill offers to write an audit report but does not write automatically

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Gap table is presented without auto-writing any file
- [ ] Optional report write is offered but not forced
- [ ] Skill does not modify any asset files

---

## Protocol Compliance

- [ ] Reads GDDs and asset directory before producing gap table
- [ ] Gap table shows Content Type, Specified Count, Found Count, Missing Items
- [ ] Does not write files without explicit user approval
- [ ] No director gates are invoked
- [ ] Verdict is one of: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT

---

## Coverage Notes

- MISSING CRITICAL CONTENT verdict (vs. GAPS FOUND) is triggered when the
  missing item is tagged as critical in the GDD; this is not explicitly tested
  but follows the same detection path.
- The case where `assets/` directory does not exist is not tested; the skill
  would produce a MISSING CRITICAL CONTENT verdict for all specified items.
168
CCGS Skill Testing Framework/skills/analysis/estimate.md
Normal file
@@ -0,0 +1,168 @@
# Skill Test Spec: /estimate

## Skill Summary

`/estimate` estimates task or story effort using a relative-size scale (S / M /
L / XL) based on story complexity, acceptance criteria count, and historical
sprint velocity from past sprint files. Estimates are advisory and are never
written automatically. No director gates are invoked. Verdicts are effort ranges,
not pass/fail — every run produces an estimate.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains size labels: S, M, L, XL (the "verdict" equivalents for this skill)
- [ ] Does NOT require "May I write" language (advisory output only)
- [ ] Has a next-step handoff (how to use the estimate in sprint planning)

---

## Director Gate Checks

None. Estimation is an advisory informational skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — Clear story with known tech stack

**Fixture:**
- `production/epics/combat/story-hitbox-detection.md` exists with:
  - 4 clear Acceptance Criteria
  - ADR reference (Accepted status)
  - No "unknown" or "TBD" language in story body
- `production/sprints/sprint-003.md` through `sprint-005.md` exist with velocity data
- Tech stack is GDScript (well-understood by team per sprint history)

**Input:** `/estimate production/epics/combat/story-hitbox-detection.md`

**Expected behavior:**
1. Skill reads the story file — assesses clarity, AC count, tech stack
2. Skill reads sprint history to determine average velocity
3. Skill outputs estimate: M (1–2 days) with reasoning
4. No files are written

**Assertions:**
- [ ] Estimate is M for a clear, well-scoped story with known tech
- [ ] Reasoning references AC count, tech stack familiarity, and velocity data
- [ ] Estimate is presented as a range (e.g., "1–2 days"), not a single point
- [ ] No files are written

---

### Case 2: High Uncertainty — Unknown system, no ADR yet

**Fixture:**
- `production/epics/online/story-lobby-matchmaking.md` exists with:
  - 2 vague Acceptance Criteria (using "should" and "TBD")
  - No ADR reference — matchmaking architecture not yet decided
  - References new subsystem ("online/matchmaking") with no existing source files

**Input:** `/estimate production/epics/online/story-lobby-matchmaking.md`

**Expected behavior:**
1. Skill reads story — finds vague AC, no ADR, no existing source
2. Skill flags multiple uncertainty factors
3. Estimate is L–XL with an explicit risk note: "Estimate range is wide due to architectural unknowns"
4. Skill recommends creating an ADR before development begins

**Assertions:**
- [ ] Estimate is L or XL (not S or M) when significant unknowns exist
- [ ] Risk note explains the specific unknowns driving the wide range
- [ ] Output recommends resolving architectural questions first
- [ ] No files are written

---

### Case 3: No Sprint Velocity Data — Conservative defaults used

**Fixture:**
- Story file exists and is well-defined
- `production/sprints/` is empty — no historical sprints

**Input:** `/estimate production/epics/core/story-save-load.md`

**Expected behavior:**
1. Skill reads story — assesses complexity
2. Skill attempts to read sprint velocity data — finds none
3. Skill notes: "No sprint history found — using conservative defaults for velocity"
4. Estimate is produced using default assumptions (e.g., 1 story point = 1 day)
5. No files are written

**Assertions:**
- [ ] Skill does not error when no sprint history exists
- [ ] Output explicitly notes that conservative defaults are being used
- [ ] Estimate is still produced (not blocked by missing velocity)
- [ ] Conservative defaults produce a higher (not lower) estimate range

---

### Case 4: Multiple Stories — Each estimated individually plus sprint total

**Fixture:**
- User provides a sprint file: `production/sprints/sprint-007.md` with 4 stories
- Sprint history exists (3 previous sprints)

**Input:** `/estimate production/sprints/sprint-007.md`

**Expected behavior:**
1. Skill reads sprint file — identifies 4 stories
2. Skill estimates each story individually: S, M, M, L
3. Skill computes sprint total: approximately 6–8 story points
4. Skill presents per-story estimates followed by sprint total
5. No files are written

**Assertions:**
- [ ] Each story receives its own estimate label
- [ ] Sprint total is presented after individual estimates
- [ ] Total is a sum range derived from individual ranges
- [ ] Skill handles sprint files (not just single story files) as input
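
The "sum range derived from individual ranges" assertion can be made concrete with a small sketch. The point ranges below are hypothetical: only M = 1–2 is taken from Case 1 (read through the "1 story point = 1 day" default mentioned in Case 3), and the S, L, and XL values are assumptions chosen so that the Case 4 labels add up to the stated 6–8 points. The skill's real calibration comes from sprint history and may differ.

```python
# Hypothetical point ranges per size label. Only "M" is grounded in Case 1;
# S, L and XL are illustrative assumptions.
SIZE_RANGES = {"S": (1, 1), "M": (1, 2), "L": (3, 3), "XL": (5, 8)}


def sprint_total(labels: list[str]) -> tuple[int, int]:
    """Sum the low ends and the high ends of each story's range separately."""
    low = sum(SIZE_RANGES[label][0] for label in labels)
    high = sum(SIZE_RANGES[label][1] for label in labels)
    return low, high


# Case 4: four stories estimated as S, M, M, L.
print(sprint_total(["S", "M", "M", "L"]))  # (6, 8)
```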

---

### Case 5: Gate Compliance — No gate; estimates are informational

**Fixture:**
- Story file exists with medium complexity
- `review-mode.txt` contains `full`

**Input:** `/estimate production/epics/core/story-item-pickup.md`

**Expected behavior:**
1. Skill reads story and sprint history; computes estimate
2. No director gate is invoked in any review mode
3. Estimate is presented as advisory output only
4. Skill notes: "Use this estimate in /sprint-plan when selecting stories for the next sprint"

**Assertions:**
- [ ] No director gate is invoked regardless of review mode
- [ ] Output is purely informational — no approval or write prompt
- [ ] Next-step recommendation references `/sprint-plan`
- [ ] Estimate does not change based on review mode

---

## Protocol Compliance

- [ ] Reads story file before estimating
- [ ] Reads sprint velocity history when available
- [ ] Produces effort range (S/M/L/XL), not a single number
- [ ] Does not write any files
- [ ] No director gates are invoked
- [ ] Always produces an estimate (never blocked by missing data; uses defaults instead)

---

## Coverage Notes

- The skill does not produce PASS/FAIL verdicts; the "verdict" here is the
  effort range itself. Test assertions focus on the accuracy of the range
  and the quality of the reasoning, not a binary outcome.
- Team-specific velocity calibration (what "M" means for this team) is an
  implementation detail not tested here; it is configured via sprint history.
171
CCGS Skill Testing Framework/skills/analysis/perf-profile.md
Normal file
@@ -0,0 +1,171 @@
# Skill Test Spec: /perf-profile

## Skill Summary

`/perf-profile` is a structured performance profiling workflow that identifies
bottlenecks and recommends optimizations. If profiler data or performance logs
are provided, it analyzes them directly. If not, it guides the user through a
manual profiling checklist. No director gates are invoked. The skill asks
"May I write to `production/qa/perf-[date].md`?" before persisting a report.
Verdicts: WITHIN BUDGET, CONCERNS, or OVER BUDGET.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: WITHIN BUDGET, CONCERNS, OVER BUDGET
- [ ] Contains "May I write" language (skill writes perf report)
- [ ] Has a next-step handoff (what to do after performance findings are reviewed)

---

## Director Gate Checks

None. Performance profiling is an advisory analysis skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — Frame data provided, draw call spike found

**Fixture:**
- User provides `production/qa/profiler-export-2026-03-15.json` with frame time data
- Data shows: average frame time 14ms (within 16.6ms budget), but frames 42–48 spike to 28ms
- Spike correlates with a scene with 450 draw calls (budget: 200)

**Input:** `/perf-profile production/qa/profiler-export-2026-03-15.json`

**Expected behavior:**
1. Skill reads profiler data
2. Skill identifies average frame time is within budget
3. Skill identifies draw call spike on frames 42–48 (450 calls vs 200 budget)
4. Verdict is CONCERNS (average OK, but spikes indicate an issue)
5. Skill recommends batching or culling for the identified scene
6. Skill asks "May I write to `production/qa/perf-2026-04-06.md`?"

**Assertions:**
- [ ] Spike frames are identified by frame number
- [ ] Draw call count and budget are compared explicitly
- [ ] Verdict is CONCERNS when spikes exceed budget even if average is OK
- [ ] At least one specific optimization recommendation is given
- [ ] "May I write" prompt appears before writing report
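
The frame-time side of this case can be sketched as a small classifier over the three verdicts from the Skill Summary. This is an illustration under stated assumptions, not the skill's actual logic: the name `analyze_frames` is hypothetical, frame numbers are taken to be 0-based list indices, "most frames" is read as more than half of the capture, and the input is assumed to be non-empty (the no-data path is Case 2).

```python
def analyze_frames(frame_ms: list[float], budget_ms: float) -> dict:
    """Classify a non-empty capture against a per-frame budget (hypothetical rules)."""
    if not frame_ms:
        raise ValueError("no frame data; the manual checklist path applies instead")
    # Frames whose time exceeds the budget, identified by index.
    over = [index for index, ms in enumerate(frame_ms) if ms > budget_ms]
    average = sum(frame_ms) / len(frame_ms)
    # Assumption: "most frames" means more than half of the capture.
    if len(over) > len(frame_ms) / 2:
        verdict = "OVER BUDGET"
    elif over:
        verdict = "CONCERNS"
    else:
        verdict = "WITHIN BUDGET"
    return {"average_ms": average, "spike_frames": over, "verdict": verdict}


# Synthetic capture shaped like the Case 1 fixture: a spike on frames 42-48.
frames = [14.0] * 100
for i in range(42, 49):
    frames[i] = 28.0
report = analyze_frames(frames, budget_ms=16.6)
print(report["spike_frames"], report["verdict"])
# [42, 43, 44, 45, 46, 47, 48] CONCERNS
```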

---

### Case 2: No Profiler Data — Manual checklist output

**Fixture:**
- User runs `/perf-profile` with no arguments
- No profiler data files exist in `production/qa/`

**Input:** `/perf-profile`

**Expected behavior:**
1. Skill finds no profiler data
2. Skill outputs a manual profiling checklist for the user to work through:
   - Enable Godot profiler or target engine's profiler
   - Record a 60-second play session
   - Export frame time data
   - Note any dropped frames or hitches
3. Skill asks user to provide data once collected before running analysis

**Assertions:**
- [ ] Skill does not crash or emit a verdict when no data is provided
- [ ] Manual profiling checklist is output (actionable steps, not just an error)
- [ ] No verdict is emitted (there is nothing to assess yet)
- [ ] No files are written

---

### Case 3: Over Budget — Frame budget exceeded for target platform

**Fixture:**
- Profiler data shows consistent 22ms frame times (target: 16.6ms for 60fps)
- All frames exceed budget; no single spike — systemic issue
- `technical-preferences.md` specifies target platform: PC, 60fps

**Input:** `/perf-profile production/qa/profiler-export-2026-03-20.json`

**Expected behavior:**
1. Skill reads profiler data and technical preferences for performance budget
2. All frames are over the 16.6ms budget
3. Verdict is OVER BUDGET
4. Skill outputs a prioritized optimization list (e.g., LOD system, shader complexity, physics tick rate)
5. Skill asks "May I write" before writing report

**Assertions:**
- [ ] Verdict is OVER BUDGET when all or most frames exceed budget
- [ ] Target frame budget is read from `technical-preferences.md` (not hardcoded)
- [ ] Optimization priority list is provided, not just the raw verdict
- [ ] "May I write" prompt appears before report write

---

### Case 4: Previous Perf Report Exists — Delta comparison

**Fixture:**
- `production/qa/perf-2026-03-28.md` exists with prior results (avg 15ms, max 19ms)
- New profiler export shows: avg 13ms, max 17ms
- Both reports are for the same scene

**Input:** `/perf-profile production/qa/profiler-export-2026-04-05.json`

**Expected behavior:**
1. Skill reads new profiler data
2. Skill detects prior report for the same scene
3. Skill computes deltas: avg improved 2ms, max improved 2ms
4. Skill presents regression check: no regressions detected
5. Verdict is WITHIN BUDGET; report notes improvement since last profile

**Assertions:**
- [ ] Skill checks `production/qa/` for prior perf reports before writing
- [ ] Delta comparison is shown (prior vs. current for key metrics)
- [ ] Verdict is WITHIN BUDGET when current metrics are within budget
- [ ] Improvement trend is noted positively in the report
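
The delta comparison in this case amounts to subtracting prior metrics from current ones and flagging any that grew. A minimal sketch follows, assuming the prior report and the new export have already been reduced to metric dictionaries. The key names `avg_ms` and `max_ms` and the function `compare_runs` are hypothetical.

```python
def compare_runs(prior: dict[str, float], current: dict[str, float]) -> dict:
    """Compute per-metric deltas (current minus prior) and list regressions."""
    # Only metrics present in both runs can be compared.
    deltas = {key: current[key] - prior[key] for key in prior if key in current}
    # For frame times, a positive delta means the metric got slower.
    regressions = [key for key, delta in deltas.items() if delta > 0]
    return {"deltas": deltas, "regressions": regressions}


# Case 4 fixture: avg 15ms -> 13ms, max 19ms -> 17ms.
result = compare_runs(
    {"avg_ms": 15.0, "max_ms": 19.0},
    {"avg_ms": 13.0, "max_ms": 17.0},
)
print(result["deltas"], result["regressions"])
# {'avg_ms': -2.0, 'max_ms': -2.0} []
```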

---

### Case 5: Gate Compliance — No gate; performance-analyst separate

**Fixture:**
- Profiler data shows CONCERNS-level findings (some spikes)
- `review-mode.txt` contains `full`

**Input:** `/perf-profile production/qa/profiler-export-2026-04-01.json`

**Expected behavior:**
1. Skill analyzes profiler data; verdict is CONCERNS
2. No director gate is invoked regardless of review mode
3. Output notes: "For in-depth analysis, consider running `/perf-profile` with the performance-analyst agent"
4. Skill asks "May I write" and writes report on user approval

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Performance-analyst consultation is suggested (not mandated)
- [ ] "May I write" prompt appears before report write
- [ ] Verdict is CONCERNS for spike-based findings

---

## Protocol Compliance

- [ ] Reads profiler data when provided; outputs checklist when not
- [ ] Reads `technical-preferences.md` for target platform frame budget
- [ ] Checks for prior perf reports to enable delta comparison
- [ ] Always asks "May I write" before writing report
- [ ] No director gates are invoked
- [ ] Verdict is one of: WITHIN BUDGET, CONCERNS, OVER BUDGET

---

## Coverage Notes

- Platform-specific profiling workflows (console, mobile) are not tested here;
  the checklist output in Case 2 would be platform-specific in practice.
- The delta comparison in Case 4 assumes reports cover the same scene; cross-scene
  comparisons are not explicitly handled.
168
CCGS Skill Testing Framework/skills/analysis/scope-check.md
Normal file
@@ -0,0 +1,168 @@
# Skill Test Spec: /scope-check

## Skill Summary

`/scope-check` is a Haiku-tier read-only skill that analyzes a feature, sprint,
or story for scope creep risk. It reads sprint and story files and compares them
against the active milestone goals. It is designed for fast, low-cost checks
before or during planning. No director gates are invoked. No files are written.
Verdicts: ON SCOPE, CONCERNS, or SCOPE CREEP DETECTED.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED
- [ ] Does NOT require "May I write" language (read-only skill)
- [ ] Has a next-step handoff (what to do based on verdict)

---

## Director Gate Checks

None. Scope check is a read-only advisory skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — Sprint stories align with milestone goals

**Fixture:**
- `production/milestones/milestone-03.md` lists 3 goals: combat system, enemy AI, level loading
- `production/sprints/sprint-006.md` contains 5 stories, all tagged to one of the 3 goals
- `production/session-state/active.md` references milestone-03 as the active milestone

**Input:** `/scope-check`

**Expected behavior:**
1. Skill reads active milestone goals from milestone-03
2. Skill reads sprint-006 stories and checks each against milestone goals
3. All 5 stories map to one of the 3 goals
4. Skill outputs a mapping table: story → milestone goal
5. Verdict is ON SCOPE

**Assertions:**
- [ ] Each story is mapped to a milestone goal in the output
- [ ] Verdict is ON SCOPE when all stories map to milestone goals
- [ ] No files are written
- [ ] Skill does not modify sprint or milestone files

---

### Case 2: Scope Creep Detected — Stories introducing systems not in milestone

**Fixture:**
- `production/milestones/milestone-03.md` goals: combat, enemy AI, level loading
- `production/sprints/sprint-006.md` contains 5 stories:
  - 3 stories map to milestone goals
  - 2 stories reference "online leaderboard" and "achievement system" (not in milestone-03)

**Input:** `/scope-check`

**Expected behavior:**
1. Skill reads milestone goals and sprint stories
2. Skill identifies 2 stories with no matching milestone goal
3. Skill names the out-of-scope stories: "Online Leaderboard Feature", "Achievement System Setup"
4. Verdict is SCOPE CREEP DETECTED

**Assertions:**
- [ ] Out-of-scope stories are named explicitly in the output
- [ ] Verdict is SCOPE CREEP DETECTED when any story has no milestone goal match
- [ ] Skill does not automatically remove the stories — findings are advisory
- [ ] Output recommends deferring the out-of-scope stories to a later milestone
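
The story-to-goal mapping in Cases 1 and 2 can be sketched as a set-membership check. The sketch assumes each story carries a single goal tag (or none), which is a simplification of however the skill actually matches stories to goals. The function name `check_scope` and the three in-scope story titles in the usage example are hypothetical; the two out-of-scope titles are the ones named in Case 2. An empty goal list yields CONCERNS, mirroring Case 3.

```python
from typing import Optional


def check_scope(milestone_goals: list[str], stories: dict[str, Optional[str]]) -> dict:
    """Map each story title to a milestone goal, or flag it as out of scope."""
    if not milestone_goals:
        # No active milestone: scope cannot be validated.
        return {"mapping": {}, "out_of_scope": [], "verdict": "CONCERNS"}
    goals = set(milestone_goals)
    # Assumption: a story is in scope if its goal tag is one of the milestone goals.
    mapping = {title: goal for title, goal in stories.items() if goal in goals}
    out_of_scope = [title for title in stories if title not in mapping]
    verdict = "SCOPE CREEP DETECTED" if out_of_scope else "ON SCOPE"
    return {"mapping": mapping, "out_of_scope": out_of_scope, "verdict": verdict}


result = check_scope(
    ["combat", "enemy AI", "level loading"],
    {
        "Hitbox Detection": "combat",           # hypothetical title
        "Patrol Behaviour": "enemy AI",         # hypothetical title
        "Streaming Loader": "level loading",    # hypothetical title
        "Online Leaderboard Feature": None,
        "Achievement System Setup": None,
    },
)
print(result["out_of_scope"], result["verdict"])
# ['Online Leaderboard Feature', 'Achievement System Setup'] SCOPE CREEP DETECTED
```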

---

### Case 3: No Milestone Defined — CONCERNS; scope cannot be validated

**Fixture:**
- `production/session-state/active.md` has no milestone reference
- `production/milestones/` directory exists but is empty
- `production/sprints/sprint-006.md` has 4 stories

**Input:** `/scope-check`

**Expected behavior:**
1. Skill reads active.md — finds no milestone reference
2. Skill checks `production/milestones/` — no milestone files found
3. Skill outputs: "No active milestone defined — scope cannot be validated"
4. Verdict is CONCERNS

**Assertions:**
- [ ] Skill does not error when no milestone is defined
- [ ] Output explicitly states that scope validation requires a milestone reference
- [ ] Verdict is CONCERNS (not ON SCOPE or SCOPE CREEP DETECTED without data)
- [ ] Output suggests running `/milestone-review` or creating a milestone

---

### Case 4: Single Story Check — Evaluated against its parent epic

**Fixture:**
- User targets a single story: `production/epics/combat/story-parry-timing.md`
- Story references parent epic: `epic-combat.md`
- `production/epics/combat/epic-combat.md` has scope: "melee combat mechanics"
- Story title: "Implement parry timing window" — matches epic scope

**Input:** `/scope-check production/epics/combat/story-parry-timing.md`

**Expected behavior:**
1. Skill reads the specified story file
2. Skill reads the parent epic to get scope definition
3. Skill evaluates story against epic scope — "parry timing" matches "melee combat"
4. Verdict is ON SCOPE

**Assertions:**
- [ ] Single-file argument is accepted (story path, not sprint)
- [ ] Skill reads the parent epic referenced in the story file
- [ ] Story is evaluated against epic scope (not milestone scope) in single-story mode
- [ ] Verdict is ON SCOPE when story matches epic scope

---

### Case 5: Gate Compliance — No gate; Producer may be consulted separately

**Fixture:**
- Sprint has 2 SCOPE CREEP stories and 3 ON SCOPE stories
- `review-mode.txt` contains `full`

**Input:** `/scope-check`

**Expected behavior:**
1. Skill reads milestone and sprint; identifies 2 scope creep items
2. No director gate is invoked regardless of review mode
3. Skill presents findings with SCOPE CREEP DETECTED verdict
4. Output notes: "Consider raising scope concerns with the Producer before sprint begins"
5. Skill ends without writing any files

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Producer consultation is suggested (not mandated)
- [ ] No files are written
- [ ] Verdict is SCOPE CREEP DETECTED

---

## Protocol Compliance

- [ ] Reads milestone goals and sprint/story files before analysis
- [ ] Maps each story to a milestone goal (or flags as unmapped)
- [ ] Does not write any files
- [ ] No director gates are invoked
- [ ] Runs on Haiku model tier (fast, low-cost)
- [ ] Verdict is one of: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED

---

## Coverage Notes

- The case where the sprint file itself does not exist is not tested; the
  skill would output a CONCERNS verdict with a message about missing sprint data.
- Partial scope overlap (story touches a milestone goal but also introduces
  new scope) is not explicitly tested; implementation may classify this as
  CONCERNS rather than SCOPE CREEP DETECTED.
167
CCGS Skill Testing Framework/skills/analysis/security-audit.md
Normal file
@@ -0,0 +1,167 @@
# Skill Test Spec: /security-audit

## Skill Summary

`/security-audit` audits the game for security risks including save data
integrity, network communication, anti-cheat exposure, and data privacy. It
reads source files in `src/` for security patterns and checks whether sensitive
data is handled correctly. No director gates are invoked. The skill does not
write files (findings report only). Verdicts: SECURE, CONCERNS, or
VULNERABILITIES FOUND.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: SECURE, CONCERNS, VULNERABILITIES FOUND
- [ ] Does NOT require "May I write" language (read-only; findings report only)
- [ ] Has a next-step handoff (what to do with findings)

---

## Director Gate Checks

None. Security audit is a read-only advisory skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — Save data encrypted, no hardcoded credentials

**Fixture:**
- `src/core/save_system.gd` uses `Crypto` class to encrypt save data before writing
- No hardcoded API keys, passwords, or credentials in any `src/` file
- No version numbers or internal build IDs exposed in client-facing output

**Input:** `/security-audit`

**Expected behavior:**
1. Skill scans `src/` for security patterns: encryption usage, hardcoded credentials, exposed internals
2. All checks pass: save data encrypted, no credentials found, no exposed internals
3. Findings report shows all checks PASS
4. Verdict is SECURE

**Assertions:**
- [ ] Skill checks save data handling for encryption usage
- [ ] Skill scans for hardcoded credentials (API keys, passwords, tokens)
- [ ] Skill checks for version/build numbers exposed to players
- [ ] All checks shown in findings report
- [ ] Verdict is SECURE when all checks pass
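
One of the checks in this case, the scan for hardcoded credentials, can be approximated with a line-by-line pattern match. The sketch below is deliberately narrow and is not how the skill is known to work: the regular expression, the keyword list, and the name `find_hardcoded_credentials` are all assumptions, and a real audit would need broader rules (for example, typed GDScript declarations such as `var token: String = "..."` are not matched here).

```python
import re

# Hypothetical pattern: an identifier containing a secret-like word that is
# assigned a non-empty string literal on the same line.
CREDENTIAL_PATTERN = re.compile(
    r"""(?i)\b\w*(api_key|password|secret|token)\w*\s*[:=]\s*["'][^"']+["']"""
)


def find_hardcoded_credentials(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that look like hardcoded credentials."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if CREDENTIAL_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


sample = 'var api_key = "abc123"\nvar player_name = "hero"\n'
print(find_hardcoded_credentials(sample))
# [(1, 'var api_key = "abc123"')]
```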
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Vulnerabilities Found — Unencrypted save data and exposed version
|
||||
|
||||
**Fixture:**
|
||||
- `src/core/save_system.gd` writes save data as plain JSON (no encryption)
|
||||
- `src/ui/debug_overlay.gd` contains: `label.text = "Build: " + ProjectSettings.get("application/config/version")`
|
||||
(exposes internal build version to player)
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans `src/` — finds unencrypted save write in `save_system.gd`
|
||||
2. Skill finds exposed version string in `debug_overlay.gd`
|
||||
3. Both findings are flagged as VULNERABILITIES
|
||||
4. Verdict is VULNERABILITIES FOUND
|
||||
5. Skill provides remediation recommendations for each vulnerability
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Unencrypted save data is flagged as a vulnerability with file and approximate line
|
||||
- [ ] Exposed version string is flagged as a vulnerability
|
||||
- [ ] Remediation suggestion is given for each vulnerability
|
||||
- [ ] Verdict is VULNERABILITIES FOUND when any vulnerability is detected
|
||||
- [ ] No files are written or modified
|
||||
|
||||
---

### Case 3: Online Features Without Authentication — CONCERNS

**Fixture:**
- `src/networking/lobby.gd` exists with functions: `join_lobby()`, `send_chat()`
- No authentication check is found before `send_chat()` — players can call it without being verified
- Game has online multiplayer features (inferred from file presence)

**Input:** `/security-audit`

**Expected behavior:**
1. Skill scans `src/networking/` — detects online feature code
2. Skill checks for authentication guard before network calls — finds none on `send_chat()`
3. Flags: "Online feature without authentication check — CONCERNS"
4. Verdict is CONCERNS (not VULNERABILITIES FOUND, as this is a missing control, not an exploit)

**Assertions:**
- [ ] Skill detects online features by scanning for networking source files
- [ ] Missing authentication checks before network operations are flagged
- [ ] Verdict is CONCERNS (advisory severity) for missing authentication guards
- [ ] Output recommends adding authentication before network calls

---

### Case 4: Edge Case — No Source Files to Analyze

**Fixture:**
- `src/` directory does not exist or is completely empty

**Input:** `/security-audit`

**Expected behavior:**
1. Skill attempts to scan `src/` — no files found
2. Skill outputs an error: "No source files found in `src/` — nothing to audit"
3. No findings report is generated
4. No verdict is emitted

**Assertions:**
- [ ] Skill does not crash when `src/` is empty or absent
- [ ] Output clearly states that no source files were found
- [ ] No verdict is emitted (there is nothing to assess)
- [ ] Skill suggests verifying the `src/` directory path

---

### Case 5: Gate Compliance — No gate; security-engineer invoked separately

**Fixture:**
- Source files exist; 1 CONCERNS-level finding detected (debug logging enabled in release build)
- `review-mode.txt` contains `full`

**Input:** `/security-audit`

**Expected behavior:**
1. Skill scans source; finds debug logging active in release path
2. No director gate is invoked regardless of review mode
3. Verdict is CONCERNS
4. Output notes: "For formal security review, consider engaging a security-engineer agent"
5. Findings are presented as a read-only report; no files written

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Security-engineer consultation is suggested (not mandated)
- [ ] No files are written
- [ ] Verdict is CONCERNS for advisory-level security findings

---

## Protocol Compliance

- [ ] Reads source files in `src/` before auditing
- [ ] Checks save data encryption, hardcoded credentials, exposed internals, auth guards
- [ ] Provides remediation recommendations for each finding
- [ ] Does not write any files (read-only skill)
- [ ] No director gates are invoked
- [ ] Verdict is one of: SECURE, CONCERNS, VULNERABILITIES FOUND
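The verdict rule implied by Cases 1 to 5 can be sketched as follows. This is a minimal illustration in Python, not the skill's actual implementation: the `Finding` type, the two severity labels, and the rule that a single vulnerability outranks any number of concerns are assumptions inferred from the cases above. The empty-`src/` situation from Case 4 is assumed to be handled before this function is reached, since that case emits no verdict at all.

```python
from dataclasses import dataclass

# Hypothetical severity labels; the spec itself only names the three verdicts.
VULNERABILITY = "vulnerability"
CONCERN = "concern"


@dataclass(frozen=True)
class Finding:
    """One audit finding (hypothetical shape, for illustration only)."""
    file: str
    severity: str
    message: str


def security_verdict(findings: list[Finding]) -> str:
    """Map findings to one of SECURE, CONCERNS, VULNERABILITIES FOUND."""
    severities = {finding.severity for finding in findings}
    if VULNERABILITY in severities:
        return "VULNERABILITIES FOUND"  # any vulnerability decides the verdict
    if CONCERN in severities:
        return "CONCERNS"  # advisory findings only
    return "SECURE"  # source was scanned and every check passed
```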

---

## Coverage Notes

- Anti-cheat analysis (client-side value validation, server authority) is not
  explicitly tested here; it follows the CONCERNS or VULNERABILITIES pattern
  depending on severity.
- Data privacy compliance (GDPR, COPPA) is out of scope for this spec; those
  require legal review beyond code scanning.
171
CCGS Skill Testing Framework/skills/analysis/tech-debt.md
Normal file
@@ -0,0 +1,171 @@
# Skill Test Spec: /tech-debt

## Skill Summary

`/tech-debt` tracks, categorizes, and prioritizes technical debt across the
codebase. It reads `docs/tech-debt-register.md` for the existing debt register
and scans source files in `src/` for inline `TODO` and `FIXME` comments. It
merges and sorts items by severity. No director gates are invoked. The skill
asks "May I write to `docs/tech-debt-register.md`?" before updating. Verdicts:
REGISTER UPDATED or NO NEW DEBT FOUND.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: REGISTER UPDATED, NO NEW DEBT FOUND
- [ ] Contains "May I write" language (skill writes to debt register)
- [ ] Has a next-step handoff (what to do after register is updated)

---

## Director Gate Checks

None. Tech debt tracking is an internal codebase analysis skill; no gates are
invoked.

---

## Test Cases

### Case 1: Happy Path — Inline TODOs plus existing register items merged

**Fixture:**
- `docs/tech-debt-register.md` exists with 2 items (LOW and MEDIUM severity)
- `src/gameplay/combat.gd` has 2 `# TODO` comments and 1 `# FIXME` comment
- `src/ui/hud.gd` has 0 inline debt comments

**Input:** `/tech-debt`

**Expected behavior:**
1. Skill reads `docs/tech-debt-register.md` — finds 2 existing items
2. Skill scans `src/` — finds 3 inline comments (2 TODOs, 1 FIXME)
3. Skill checks whether inline comments already exist in the register (deduplication)
4. Skill presents combined list sorted by severity (FIXME before TODO by default)
5. Skill asks "May I write to `docs/tech-debt-register.md`?"
6. User approves; register updated; verdict REGISTER UPDATED

**Assertions:**
- [ ] Inline comments are found by scanning `src/` recursively
- [ ] Existing register items are not duplicated
- [ ] Combined list is sorted by severity
- [ ] "May I write" prompt appears before any write
- [ ] Verdict is REGISTER UPDATED
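Steps 2 to 4 of this case (scan, deduplicate, sort) can be sketched in Python as below. Everything here is a hedged illustration rather than the skill's real logic: the comment regex, the `DebtItem` shape, the `(file, text)` deduplication key, and the severity ranks are assumptions. In particular, mapping an untagged `FIXME` to MEDIUM is a guess made only so that FIXME sorts before TODO; the spec states only that untagged TODOs default to LOW. Sources are passed in as a `{path: text}` mapping to keep the sketch self-contained.

```python
import re
from dataclasses import dataclass

# Matches "# TODO: text", "# FIXME text" and "# FIXME(CRITICAL): text".
# The exact comment grammar is an assumption, not taken from the skill.
DEBT_RE = re.compile(r"#\s*(TODO|FIXME)(?:\((\w+)\))?:?\s*(.*)")

# Assumed severity scale and defaults for untagged comments.
RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
DEFAULT_SEVERITY = {"FIXME": "MEDIUM", "TODO": "LOW"}


@dataclass(frozen=True)
class DebtItem:
    file: str
    line: int
    kind: str      # "TODO" or "FIXME"
    severity: str  # a key of RANK
    text: str


def scan_source(files: dict[str, str]) -> list[DebtItem]:
    """Collect inline debt comments from a {path: source_text} mapping."""
    items = []
    for path, source in files.items():
        for number, line in enumerate(source.splitlines(), start=1):
            match = DEBT_RE.search(line)
            if not match:
                continue
            kind, tag, text = match.groups()
            tagged = tag.upper() if tag else ""
            severity = tagged if tagged in RANK else DEFAULT_SEVERITY[kind]
            items.append(DebtItem(path, number, kind, severity, text.strip()))
    return items


def merge(register: list[DebtItem], scanned: list[DebtItem]) -> list[DebtItem]:
    """Add scanned items missing from the register, then sort by severity."""
    known = {(item.file, item.text) for item in register}
    fresh = [item for item in scanned if (item.file, item.text) not in known]
    # sorted() is stable, so items of equal severity keep their relative order.
    return sorted(register + fresh, key=lambda item: RANK.get(item.severity, len(RANK)))
```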

---

### Case 2: Register Doesn't Exist — Offered to create it

**Fixture:**
- `docs/tech-debt-register.md` does NOT exist
- `src/` contains 4 inline TODO/FIXME comments

**Input:** `/tech-debt`

**Expected behavior:**
1. Skill attempts to read `docs/tech-debt-register.md` — not found
2. Skill informs user: "No tech-debt-register.md found"
3. Skill offers to create the register with the inline items it found
4. Skill asks "May I write to `docs/tech-debt-register.md`?" (create)
5. User approves; register created with 4 items; verdict REGISTER UPDATED

**Assertions:**
- [ ] Skill does not crash when register file is absent
- [ ] User is offered register creation (not silently skipping)
- [ ] "May I write" prompt reflects file creation (not update)
- [ ] Verdict is REGISTER UPDATED after creation

---

### Case 3: Resolved Item Detected — Marked resolved in register

**Fixture:**
- `docs/tech-debt-register.md` has 3 items; item 2 references `src/gameplay/legacy_input.gd`
- `src/gameplay/legacy_input.gd` has been deleted (refactored away)
- The referenced TODO comment no longer exists in source

**Input:** `/tech-debt`

**Expected behavior:**
1. Skill reads register — finds 3 items
2. Skill scans `src/` — does not find the source location referenced by item 2
3. Skill flags item 2 as RESOLVED (source is gone)
4. Skill presents the resolved item to user for confirmation
5. On approval, register is updated with item 2 marked `Status: Resolved`

**Assertions:**
- [ ] Skill checks whether each register item's source reference still exists
- [ ] Missing source locations result in items being flagged as RESOLVED
- [ ] User confirms before resolved items are written
- [ ] RESOLVED items are kept in the register (not deleted) for audit history
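The two-phase flow in this case (flag first, write only after confirmation, never delete) might look like the Python sketch below. The register entry shape (`id`, `source`, `status` keys) and both function names are hypothetical. The sketch also checks only whether the referenced file still exists; matching the individual comment inside a surviving file is left out.

```python
def flag_resolved(register: list[dict], existing_paths: set[str]) -> list[dict]:
    """Return entries whose referenced source file is gone.

    Nothing is modified here; the caller shows these candidates to the
    user and only marks them after confirmation.
    """
    return [
        entry for entry in register
        if entry["status"] != "Resolved" and entry["source"] not in existing_paths
    ]


def mark_resolved(register: list[dict], confirmed: list[dict]) -> list[dict]:
    """Return a new register with confirmed entries marked, none deleted."""
    confirmed_ids = {entry["id"] for entry in confirmed}
    return [
        {**entry, "status": "Resolved"} if entry["id"] in confirmed_ids else entry
        for entry in register
    ]
```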

---

### Case 4: Edge Case — CRITICAL debt item surfaces prominently

**Fixture:**
- `src/core/network_sync.gd` has a comment: `# FIXME(CRITICAL): race condition in sync buffer — can corrupt save data`
- `docs/tech-debt-register.md` exists with 5 lower-severity items

**Input:** `/tech-debt`

**Expected behavior:**
1. Skill scans source and finds the CRITICAL-tagged FIXME
2. Skill presents the CRITICAL item at the top of the output — before the full table
3. Skill asks user to acknowledge the critical item before proceeding
4. After acknowledgment, skill presents full debt table and asks to write
5. Register is updated with CRITICAL item at top; verdict REGISTER UPDATED

**Assertions:**
- [ ] CRITICAL items appear at the top of the output, not buried in the table
- [ ] Skill surfaces CRITICAL items before asking to write
- [ ] User acknowledgment of the CRITICAL item is requested
- [ ] CRITICAL severity is preserved in the written register entry

---

### Case 5: Gate Compliance — No gate; register updated only with approval

**Fixture:**
- Inline scan finds 2 new TODOs; register has 3 existing items
- `review-mode.txt` contains `full`

**Input:** `/tech-debt`

**Expected behavior:**
1. Skill scans source and reads register; compiles combined debt list
2. No director gate is invoked regardless of review mode
3. Skill presents sorted debt table to user
4. Skill asks "May I write to `docs/tech-debt-register.md`?"
5. User approves; register updated; verdict REGISTER UPDATED

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Debt table is presented before any write prompt
- [ ] "May I write" prompt appears before file update
- [ ] Write only occurs with explicit user approval

---

## Protocol Compliance

- [ ] Reads `docs/tech-debt-register.md` and scans `src/` before compiling
- [ ] Deduplicates inline comments against existing register items
- [ ] Sorts combined list by severity
- [ ] Always asks "May I write" before updating register
- [ ] No director gates are invoked
- [ ] Verdict is REGISTER UPDATED or NO NEW DEBT FOUND

---

## Coverage Notes

- The case where `src/` is empty or absent is not tested; behavior follows
  the NO NEW DEBT FOUND path for the inline scan, but register items would
  still be read and presented.
- TODO comments without severity tags are treated as LOW severity by default;
  this classification detail is an implementation concern, not tested here.
@@ -0,0 +1,175 @@
# Skill Test Spec: /test-evidence-review

## Skill Summary

`/test-evidence-review` performs a quality review of test files in `tests/`,
checking test naming conventions, determinism, isolation, and absence of
hardcoded magic numbers — all against the project's test standards defined in
`coding-standards.md`. Findings may be flagged for qa-lead review. No director
gates are invoked. The skill does not write without user approval. Verdicts:
PASS, WARNINGS, or FAIL.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PASS, WARNINGS, FAIL
- [ ] Does NOT require "May I write" language (read-only; write is optional flagging report)
- [ ] Has a next-step handoff (what to do after findings are reviewed)

---

## Director Gate Checks

None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate
is a separate skill invocation and is NOT triggered here.

---

## Test Cases

### Case 1: Happy Path — Tests follow all standards

**Fixture:**
- `tests/unit/combat/health_system_take_damage_test.gd` exists with:
  - Naming: `test_health_system_take_damage_reduces_health()` (follows `test_[system]_[scenario]_[expected]`)
  - Arrange/Act/Assert structure present
  - No `sleep()`, `await` with time values, or random seeds
  - No calls to external APIs or file I/O
  - No inline magic numbers (uses constants from `tests/unit/combat/fixtures/`)

**Input:** `/test-evidence-review tests/unit/combat/`

**Expected behavior:**
1. Skill reads test standards from `coding-standards.md`
2. Skill reads the test file; checks all 5 standards
3. All checks pass: naming, structure, determinism, isolation, no hardcoded data
4. Verdict is PASS

**Assertions:**
- [ ] Each of the 5 test standards is checked and reported
- [ ] All checks show PASS when standards are met
- [ ] Verdict is PASS
- [ ] No files are written
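As an illustration of the naming check only, the Python sketch below tests GDScript function names against the `test_[system]_[scenario]_[expected]` shape. Both regular expressions and the function name `misnamed_tests` are assumptions. A regex cannot tell where `[system]` ends and `[scenario]` begins, so the sketch settles for a shape check: the `test_` prefix followed by at least three lowercase segments.

```python
import re

# "test_" plus at least three snake_case segments (assumed minimum shape).
TEST_NAME_RE = re.compile(r"^test_[a-z0-9]+(?:_[a-z0-9]+){2,}$")

# GDScript function declarations such as "func test_foo_bar_baz():".
FUNC_RE = re.compile(r"^\s*func\s+(test\w*)\s*\(", re.MULTILINE)


def misnamed_tests(source: str) -> list[str]:
    """Return test function names that do not fit the naming shape."""
    return [name for name in FUNC_RE.findall(source) if not TEST_NAME_RE.match(name)]
```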

---

### Case 2: Fail — Timing dependency detected

**Fixture:**
- `tests/unit/ui/hud_update_test.gd` contains:
  ```gdscript
  await get_tree().create_timer(1.0).timeout
  assert_eq(label.text, "Ready")
  ```
- Real-time wait of 1 second used instead of mock or signal-based assertion

**Input:** `/test-evidence-review tests/unit/ui/hud_update_test.gd`

**Expected behavior:**
1. Skill reads the test file
2. Skill detects real-time wait (`create_timer(1.0)`) — non-deterministic timing dependency
3. Skill flags this as a FAIL-level finding
4. Verdict is FAIL
5. Skill recommends replacing the timer with a signal-based assertion or mock

**Assertions:**
- [ ] Real-time wait usage is detected as a non-deterministic timing dependency
- [ ] Finding is classified as FAIL severity (blocking — violates determinism standard)
- [ ] Verdict is FAIL
- [ ] Remediation suggestion references signal-based or mock-based approach
- [ ] Skill does not edit the test file

---

### Case 3: Fail — Test calls external API directly

**Fixture:**
- `tests/unit/networking/auth_test.gd` contains:
  ```gdscript
  var result = HTTPRequest.new().request("https://api.example.com/auth")
  ```
- Direct HTTP call to external API without a mock

**Input:** `/test-evidence-review tests/unit/networking/auth_test.gd`

**Expected behavior:**
1. Skill reads the test file
2. Skill detects direct external API call (HTTPRequest to live URL)
3. Skill flags this as a FAIL-level finding — violates isolation standard
4. Verdict is FAIL
5. Skill recommends injecting a mock HTTP client

**Assertions:**
- [ ] Direct external API call is detected and flagged
- [ ] Finding is classified as FAIL severity (violates isolation standard)
- [ ] Verdict is FAIL
- [ ] Remediation references dependency injection with a mock HTTP client
- [ ] Skill does not modify the test file

---

### Case 4: Edge Case — No Test Files Found

**Fixture:**
- User calls `/test-evidence-review tests/unit/audio/`
- `tests/unit/audio/` directory does not exist

**Input:** `/test-evidence-review tests/unit/audio/`

**Expected behavior:**
1. Skill attempts to read files in `tests/unit/audio/` — not found
2. Skill outputs: "No test files found at `tests/unit/audio/` — run `/test-setup` to scaffold test directories"
3. No verdict is emitted

**Assertions:**
- [ ] Skill does not crash when path does not exist
- [ ] Output names the attempted path in the message
- [ ] Output recommends `/test-setup` for scaffolding
- [ ] No verdict is emitted when there is nothing to review

---

### Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill

**Fixture:**
- Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
- `review-mode.txt` contains `full`

**Input:** `/test-evidence-review tests/unit/combat/`

**Expected behavior:**
1. Skill reviews tests; finds 1 WARNINGS-level finding
2. No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
3. Verdict is WARNINGS
4. Output notes: "For full test coverage gate, run `/gate-check` which invokes QL-TEST-COVERAGE"
5. Skill offers optional report write; asks "May I write" if user opts in

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
- [ ] Optional report requires "May I write" before writing
- [ ] Verdict is WARNINGS for advisory-level test quality issues

---

## Protocol Compliance

- [ ] Reads `coding-standards.md` test standards before reviewing test files
- [ ] Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
- [ ] Does not edit any test files (read-only skill)
- [ ] No director gates are invoked
- [ ] Verdict is one of: PASS, WARNINGS, FAIL

---

## Coverage Notes

- Batch review of all test files in `tests/` is not explicitly tested; behavior
  is assumed to apply the same checks file by file and aggregate the verdict.
- The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is
  a separate concern and is intentionally NOT invoked by this skill.
177
CCGS Skill Testing Framework/skills/analysis/test-flakiness.md
Normal file
@@ -0,0 +1,177 @@
# Skill Test Spec: /test-flakiness

## Skill Summary

`/test-flakiness` detects non-deterministic tests by analyzing test history logs
(if available) or scanning test source code for common flakiness patterns (random
numbers without seeds, real-time waits, external I/O). No director gates are
invoked. The skill does not write without user approval. Verdicts: NO FLAKINESS,
SUSPECT TESTS FOUND, or CONFIRMED FLAKY.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY
- [ ] Does NOT require "May I write" language (read-only; optional report requires approval)
- [ ] Has a next-step handoff (what to do after flakiness findings)

---

## Director Gate Checks

None. Flakiness detection is an advisory quality skill for the QA lead; no gates
are invoked.

---

## Test Cases

### Case 1: Happy Path — Clean test history, no flakiness

**Fixture:**
- `production/qa/test-history/` contains logs for 10 test runs
- All tests pass consistently across all 10 runs (100% pass rate per test)
- No test has a failure pattern

**Input:** `/test-flakiness`

**Expected behavior:**
1. Skill reads test history logs from `production/qa/test-history/`
2. Skill computes per-test pass rate across 10 runs
3. All tests pass all 10 runs — no inconsistency detected
4. Verdict is NO FLAKINESS

**Assertions:**
- [ ] Skill reads test history logs when available
- [ ] Per-test pass rate is computed across all available runs
- [ ] Verdict is NO FLAKINESS when all tests pass consistently
- [ ] No files are written

---

### Case 2: Suspect Tests Found — Test fails intermittently in history

**Fixture:**
- `production/qa/test-history/` contains logs for 10 test runs
- `test_combat_damage_applies_crit_multiplier` passes 7 times, fails 3 times
- Failure messages differ (sometimes timeout, sometimes wrong value)

**Input:** `/test-flakiness`

**Expected behavior:**
1. Skill reads test history logs — computes pass rates
2. `test_combat_damage_applies_crit_multiplier` has 70% pass rate (threshold: 95%)
3. Skill flags it as SUSPECT with pass rate (7/10) and failure pattern noted
4. Verdict is SUSPECT TESTS FOUND
5. Skill recommends investigating the test for timing or state dependencies

**Assertions:**
- [ ] Tests below the pass-rate threshold are flagged by name
- [ ] Pass rate (fraction and percentage) is shown for each suspect test
- [ ] Failure pattern (e.g., inconsistent error messages) is noted if detectable
- [ ] Verdict is SUSPECT TESTS FOUND
- [ ] Skill recommends investigation steps
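The pass-rate arithmetic in this case (7 passes out of 10 runs is 70%, which is below the 95% threshold) can be sketched in Python as follows. The log format is not specified by the spec, so each run is assumed to be already parsed into a `{test_name: passed}` mapping; the function names are hypothetical. The sketch covers only the SUSPECT threshold and makes no attempt to separate SUSPECT from CONFIRMED FLAKY, because the spec does not give a cut-off for that.

```python
from collections import defaultdict

SUSPECT_THRESHOLD = 0.95  # the 95% figure quoted in this case; treat as configurable


def pass_rates(runs: list[dict[str, bool]]) -> dict[str, tuple[int, int]]:
    """Return {test_name: (passes, total_runs)} over all recorded runs."""
    counts = defaultdict(lambda: [0, 0])
    for run in runs:
        for test_name, passed in run.items():
            counts[test_name][0] += int(passed)
            counts[test_name][1] += 1
    return {name: (passes, total) for name, (passes, total) in counts.items()}


def suspect_report(runs: list[dict[str, bool]], threshold: float = SUSPECT_THRESHOLD) -> list[str]:
    """List tests below the threshold as 'name: passes/total (percent)'."""
    lines = []
    for name, (passes, total) in sorted(pass_rates(runs).items()):
        rate = passes / total
        if rate < threshold:
            lines.append(f"{name}: {passes}/{total} ({rate:.0%})")
    return lines
```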

---

### Case 3: Source Pattern — Random number used without seed

**Fixture:**
- No test history logs exist
- `tests/unit/loot/loot_drop_test.gd` contains:
  ```gdscript
  var roll = randf() # unseeded random — non-deterministic
  assert_gt(roll, 0.5, "Loot should drop above 50%")
  ```

**Input:** `/test-flakiness`

**Expected behavior:**
1. Skill finds no test history logs
2. Skill falls back to source code analysis
3. Skill detects `randf()` call without a preceding `seed()` call
4. Skill flags the test as FLAKINESS RISK (source pattern, not confirmed)
5. Verdict is SUSPECT TESTS FOUND (pattern detected, not confirmed by history)
6. Skill recommends seeding random before the call or mocking the random function

**Assertions:**
- [ ] Source code analysis is used as fallback when no history logs exist
- [ ] Unseeded random number usage is detected as a flakiness risk
- [ ] Verdict is SUSPECT TESTS FOUND (not CONFIRMED FLAKY — no history to confirm)
- [ ] Remediation recommends seeding or mocking

---

### Case 4: No Test History — Source-only analysis with common patterns

**Fixture:**
- `production/qa/test-history/` does not exist
- `tests/` contains 15 test files
- Scan finds 2 tests using `OS.get_ticks_msec()` for timing assertions
- No other flakiness patterns found

**Input:** `/test-flakiness`

**Expected behavior:**
1. Skill checks for test history — not found
2. Skill notes: "No test history available — analyzing source code for flakiness patterns only"
3. Skill scans all test files for known patterns: unseeded random, real-time waits, system clock usage
4. Finds 2 tests using `OS.get_ticks_msec()` — flags as FLAKINESS RISK
5. Verdict is SUSPECT TESTS FOUND

**Assertions:**
- [ ] Skill notes clearly that source-only analysis is being performed (no history)
- [ ] Common flakiness patterns are scanned: random, time-based assertions, external I/O
- [ ] `OS.get_ticks_msec()` usage for assertions is flagged as a flakiness risk
- [ ] Verdict is SUSPECT TESTS FOUND when source patterns are found
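A rough Python sketch of the source-only fallback follows. The three regular expressions are heuristics assumed for illustration and are not the skill's real pattern list; a faithful scan would need GDScript-aware parsing. Two simplifications are worth naming: any earlier `seed()` call in the same file is treated as seeding every later random call, and everything after a `#` is dropped as a comment, which would misread a `#` inside a string literal.

```python
import re

# Assumed heuristics for the three pattern families named in step 3.
PATTERNS = {
    "unseeded random": re.compile(r"\b(?:randf|randi|randf_range|randi_range)\s*\("),
    "real-time wait": re.compile(r"create_timer\s*\("),
    "system clock": re.compile(r"\b(?:OS|Time)\.get_ticks_msec\s*\("),
}
SEED_RE = re.compile(r"\bseed\s*\(")


def flakiness_risks(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line_number, pattern_name) for each suspicious line."""
    risks = []
    for path, source in files.items():
        seeded = False
        for number, line in enumerate(source.splitlines(), start=1):
            code = line.split("#", 1)[0]  # crude removal of trailing comments
            if SEED_RE.search(code):
                seeded = True
            for name, pattern in PATTERNS.items():
                if name == "unseeded random" and seeded:
                    continue  # a seed() call was seen earlier in this file
                if pattern.search(code):
                    risks.append((path, number, name))
    return risks
```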

---

### Case 5: Gate Compliance — No gate; flakiness report is advisory

**Fixture:**
- Test history shows 1 CONFIRMED FLAKY test (fails 6 out of 10 runs)
- `review-mode.txt` contains `full`

**Input:** `/test-flakiness`

**Expected behavior:**
1. Skill analyzes test history; identifies 1 confirmed flaky test
2. No director gate is invoked regardless of review mode
3. Verdict is CONFIRMED FLAKY
4. Skill presents findings and offers optional written report
5. If user opts in: "May I write to `production/qa/flakiness-report-[date].md`?"

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] CONFIRMED FLAKY verdict requires history-based evidence (not just source patterns)
- [ ] Optional report requires "May I write" before writing
- [ ] Flakiness report is advisory for qa-lead; skill does not auto-disable tests

---

## Protocol Compliance

- [ ] Reads test history logs when available; falls back to source analysis when not
- [ ] Notes clearly which analysis mode is being used (history vs. source-only)
- [ ] Flakiness threshold (e.g., 95% pass rate) is used for SUSPECT classification
- [ ] CONFIRMED FLAKY requires history evidence; SUSPECT covers source patterns only
- [ ] Does not disable or modify any test files
- [ ] No director gates are invoked
- [ ] Verdict is one of: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY

---

## Coverage Notes

- The pass-rate threshold for SUSPECT classification (95% suggested above) is an
  implementation detail; the tests verify that intermittent failures are flagged,
  not the exact threshold value.
- Tests that fail due to environment issues (missing assets, wrong platform) are
  not flakiness — the skill distinguishes environment failures from non-determinism
  in the test itself; this distinction is not explicitly tested here.
@@ -0,0 +1,197 @@
# Skill Test Spec: /architecture-decision

## Skill Summary

`/architecture-decision` guides the user through section-by-section authoring of
a new Architecture Decision Record (ADR). Required sections are: Status, Context,
Decision, Consequences, Alternatives, and Related ADRs. The skill also stamps the
engine version reference from `docs/engine-reference/` into the ADR for traceability.

In `full` review mode, TD-ADR (technical-director) and LP-FEASIBILITY
(lead-programmer) gate agents spawn after the draft is complete. If both gates
return APPROVED, the ADR status is set to Accepted. In `lean` or `solo` mode,
both gates are skipped and the ADR is written with Status: Proposed. The skill
asks "May I write" per section during authoring. ADRs are written to
`docs/architecture/adr-NNN-[name].md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: ACCEPTED, PROPOSED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language (per-section approval)
- [ ] Has a next-step handoff at the end
- [ ] Documents gate behavior: TD-ADR + LP-FEASIBILITY in full mode; skipped in lean/solo
- [ ] Documents that ADR status is Accepted (full, gates approve) or Proposed (otherwise)
- [ ] Mentions engine version stamp from `docs/engine-reference/`

---

## Director Gate Checks

In `full` mode: TD-ADR (technical-director) and LP-FEASIBILITY (lead-programmer)
spawn after the ADR draft is complete. If both return APPROVED, ADR Status is set
to Accepted. If either returns CONCERNS or FAIL, ADR stays Proposed.

In `lean` mode: both gates are skipped. ADR is written with Status: Proposed.
Output notes: "TD-ADR skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".

In `solo` mode: both gates are skipped. ADR is written with Status: Proposed.

---

## Test Cases

### Case 1: Happy Path — New ADR for rendering approach, full mode, gates approve

**Fixture:**
- `docs/architecture/` exists with no existing ADR for rendering
- `docs/engine-reference/[engine]/VERSION.md` exists
- `production/session-state/review-mode.txt` contains `full`

**Input:** `/architecture-decision rendering-approach`

**Expected behavior:**
1. Skill guides user through each required section (Status, Context, Decision, Consequences, Alternatives, Related ADRs)
2. Engine version is stamped into the ADR from `docs/engine-reference/`
3. For each section: draft shown, "May I write this section?" asked, approved
4. After all sections: TD-ADR and LP-FEASIBILITY gates spawn in parallel
5. Both gates return APPROVED
6. ADR Status is set to Accepted
7. Skill writes `docs/architecture/adr-NNN-rendering-approach.md`
8. `docs/architecture/tr-registry.yaml` updated if new TR-IDs are defined

**Assertions:**
- [ ] All 6 required sections are authored and written
- [ ] Engine version reference is stamped in the ADR
- [ ] TD-ADR and LP-FEASIBILITY spawn in parallel (not sequentially)
- [ ] ADR Status is Accepted when both gates return APPROVED in full mode
- [ ] "May I write" is asked per section during authoring
- [ ] File is written to `docs/architecture/adr-NNN-[name].md`

---

### Case 2: Failure Path — TD-ADR returns CONCERNS

**Fixture:**
- ADR draft is complete (all sections filled)
- `production/session-state/review-mode.txt` contains `full`
- TD-ADR gate returns CONCERNS: "The decision does not address [specific concern]"

**Input:** `/architecture-decision [topic]`

**Expected behavior:**
1. TD-ADR gate spawns and returns CONCERNS with specific feedback
2. Skill surfaces the concerns to the user
3. ADR Status remains Proposed (not Accepted)
4. User is asked: revise the decision to address concerns, or accept as Proposed
5. ADR is written with Status: Proposed if concerns are not resolved

**Assertions:**
- [ ] TD-ADR concerns are shown to the user verbatim
- [ ] ADR Status is Proposed (not Accepted) when TD-ADR returns CONCERNS
- [ ] Skill does NOT set Status: Accepted while CONCERNS are unresolved
- [ ] User is given the option to revise and re-run the gate

---

### Case 3: Lean Mode — Both gates skipped; ADR written as Proposed

**Fixture:**
- `production/session-state/review-mode.txt` contains `lean`
- ADR draft is authored for a new technical decision

**Input:** `/architecture-decision [topic]`

**Expected behavior:**
1. Skill guides user through all 6 sections
2. After draft is complete: both TD-ADR and LP-FEASIBILITY are skipped
3. Output notes: "TD-ADR skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode"
4. ADR is written with Status: Proposed (not Accepted, since gates did not approve)
5. "May I write" is still asked before the final file write

**Assertions:**
- [ ] Both gate skip notes appear in output
- [ ] ADR Status is Proposed (not Accepted) in lean mode
- [ ] "May I write" is still asked before writing the file
- [ ] Skill writes the ADR after user approval

---

### Case 4: Edge Case — ADR already exists for this topic

**Fixture:**
- `docs/architecture/` contains an existing ADR covering the same topic
- The existing ADR has Status: Accepted

**Input:** `/architecture-decision [same-topic]`

**Expected behavior:**
1. Skill detects an existing ADR covering the same topic
2. Skill asks: "An ADR for [topic] already exists ([filename]). Update it, or create a new superseding ADR?"
3. User selects update or supersede
4. Skill does NOT silently create a duplicate ADR

**Assertions:**
- [ ] Skill detects the existing ADR before authoring begins
- [ ] User is offered update or supersede options — no silent duplicate
- [ ] If update: skill opens the existing ADR for section-by-section revision
- [ ] If supersede: new ADR references the superseded one in Related ADRs section

---
|
||||
|
||||
### Case 5: Director Gate — Status set correctly based on mode and gate outcome
|
||||
|
||||
**Fixture:**
|
||||
- ADR draft is complete
|
||||
- Three scenarios: (a) full mode, both gates APPROVED; (b) full mode, one gate CONCERNS; (c) lean or solo mode, gates skipped
|
||||
|
||||
**Full mode, both APPROVED:**
|
||||
- ADR Status is set to Accepted
|
||||
|
||||
**Assertions (both approved):**
|
||||
- [ ] ADR frontmatter/header shows `Status: Accepted`
|
||||
- [ ] Both TD-ADR and LP-FEASIBILITY appear as APPROVED in output
|
||||
|
||||
**Full mode, one gate returns CONCERNS:**
|
||||
- ADR Status stays Proposed
|
||||
|
||||
**Assertions (CONCERNS):**
|
||||
- [ ] ADR frontmatter/header shows `Status: Proposed`
|
||||
- [ ] Concerns are listed in output
|
||||
- [ ] Skill does NOT set Status: Accepted when any gate returns CONCERNS
|
||||
|
||||
**Lean/solo mode:**
|
||||
- ADR Status is always Proposed regardless of content quality
|
||||
|
||||
**Assertions (lean/solo):**
|
||||
- [ ] ADR Status is Proposed in lean mode
|
||||
- [ ] ADR Status is Proposed in solo mode
|
||||
- [ ] No gate output appears in lean or solo mode
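
The status rule exercised by Case 5 can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the name `resolve_adr_status` and the dictionary shape used for gate verdicts are hypothetical, and only the mode names, gate IDs, and status values come from this spec.

```python
from typing import Mapping

REQUIRED_GATES = ("TD-ADR", "LP-FEASIBILITY")  # gate IDs named in this spec


def resolve_adr_status(mode: str, verdicts: Mapping[str, str]) -> str:
    """Return the ADR status implied by the review mode and gate verdicts.

    Hypothetical helper: `verdicts` maps a gate ID to its verdict string.
    """
    if mode != "full":
        # Lean and solo modes skip both gates, so nothing can approve the ADR.
        return "Proposed"
    all_approved = all(verdicts.get(gate) == "APPROVED" for gate in REQUIRED_GATES)
    return "Accepted" if all_approved else "Proposed"
```

A missing verdict is treated the same as a non-APPROVED one, so the function can only return Accepted when both gates have explicitly approved.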
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] All 6 required sections authored before gate review
|
||||
- [ ] Engine version stamped in ADR from `docs/engine-reference/`
|
||||
- [ ] "May I write" asked per section during authoring
|
||||
- [ ] TD-ADR and LP-FEASIBILITY spawn in parallel in full mode
|
||||
- [ ] Skipped gates noted by name and mode in lean/solo output
|
||||
- [ ] ADR Status: Accepted only when full mode AND both gates APPROVED
|
||||
- [ ] Ends with next-step handoff: `/architecture-review` or `/create-control-manifest`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- ADR numbering (auto-incrementing NNN) is not independently fixture-tested;
  the skill reads existing ADR filenames to assign the next number.
- Related ADRs section linking (supersedes / related-to) is tested structurally
  via Case 4, but not all link types are individually verified.
- The TR-registry update (when new TR-IDs are defined in the ADR) is part of the
  write phase and is tested implicitly via Case 1.
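
The numbering behavior described in the first note could look roughly like the sketch below. The filename shape (`adr-007-some-topic.md`) and the function name `next_adr_number` are assumptions for illustration; the spec only says that the next number is derived from existing ADR filenames.

```python
import re
from typing import Iterable

# Assumed filename shape, e.g. "adr-007-save-system.md"; the real pattern may differ.
ADR_FILENAME = re.compile(r"^adr-(\d{3})\b", re.IGNORECASE)


def next_adr_number(filenames: Iterable[str]) -> str:
    """Return the next zero-padded ADR number given existing filenames."""
    used = [int(m.group(1)) for name in filenames if (m := ADR_FILENAME.match(name))]
    # With no existing ADRs, numbering starts at 001.
    return f"{max(used, default=0) + 1:03d}"
```

Taking the maximum rather than counting files means gaps in the sequence never cause a number to be reused.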
|
||||
185
CCGS Skill Testing Framework/skills/authoring/art-bible.md
Normal file
@@ -0,0 +1,185 @@
|
||||
# Skill Test Spec: /art-bible
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/art-bible` is a guided, section-by-section art bible authoring skill. It
produces a comprehensive visual direction document covering: Visual Style overview,
Color Palette, Typography, Character Design Rules, Environment Style, and UI
Visual Language. The skill follows the skeleton-first pattern: creates the file
with all section headers immediately, then fills each section through discussion
and writes each to disk after user approval.
|
||||
|
||||
In `full` review mode, the AD-ART-BIBLE director gate (art director) runs after
the draft is complete and before any section is written. In `lean` and `solo`
modes, AD-ART-BIBLE is skipped and only user approval is required. The verdict
is COMPLETE when all sections are written.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" language per section
|
||||
- [ ] Documents the AD-ART-BIBLE director gate and its mode behavior
|
||||
- [ ] Has a next-step handoff (e.g., `/asset-spec` or `/design-system`)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
| Gate ID      | Trigger condition       | Mode guard                |
|--------------|-------------------------|---------------------------|
| AD-ART-BIBLE | After draft is complete | full only (not lean/solo) |
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Full mode, art bible drafted, AD-ART-BIBLE approves
|
||||
|
||||
**Fixture:**
|
||||
- No existing `design/art-bible.md`
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
- `design/gdd/game-concept.md` exists with visual tone described
|
||||
|
||||
**Input:** `/art-bible`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill creates skeleton `design/art-bible.md` with all section headers
|
||||
2. Skill discusses and drafts each section with user collaboration
|
||||
3. After all sections are drafted, AD-ART-BIBLE gate is invoked (art director review)
|
||||
4. AD-ART-BIBLE returns APPROVED
|
||||
5. Skill asks "May I write section [N] to `design/art-bible.md`?" per section
|
||||
6. All sections written after approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skeleton file is created first (before any section content is written)
|
||||
- [ ] AD-ART-BIBLE gate is invoked in full mode after draft is complete
|
||||
- [ ] Gate approval precedes the "May I write" section asks
|
||||
- [ ] All sections are present in the final file
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 2: AD-ART-BIBLE Returns CONCERNS — Section revised before writing
|
||||
|
||||
**Fixture:**
|
||||
- Art bible draft complete
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
- AD-ART-BIBLE gate returns CONCERNS: "Color palette clashes with the dark
  atmospheric tone described in the game concept"
|
||||
|
||||
**Input:** `/art-bible`
|
||||
|
||||
**Expected behavior:**
|
||||
1. AD-ART-BIBLE gate returns CONCERNS with specific feedback about palette
|
||||
2. Skill surfaces feedback to user: "Art director has concerns about the color palette"
|
||||
3. Skill returns to the Color Palette section for revision
|
||||
4. User and skill revise the palette to align with game concept tone
|
||||
5. AD-ART-BIBLE is not re-invoked (user decides to proceed after revision)
|
||||
6. Revised section is written after "May I write" approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] CONCERNS are shown to user before any section is written
|
||||
- [ ] Skill returns to the affected section for revision (not all sections)
|
||||
- [ ] Revised content (not original) is written to file
|
||||
- [ ] Verdict is COMPLETE after revision and approval
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Lean Mode — AD-ART-BIBLE Skipped, Written With User Approval Only
|
||||
|
||||
**Fixture:**
|
||||
- No existing art bible
|
||||
- `production/session-state/review-mode.txt` contains `lean`
|
||||
|
||||
**Input:** `/art-bible`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `lean`
|
||||
2. Skill drafts all sections with user collaboration
|
||||
3. AD-ART-BIBLE gate is skipped: output notes "[AD-ART-BIBLE] skipped — lean mode"
|
||||
4. Skill asks user for direct approval of each section
|
||||
5. Sections are written after user confirmation; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] AD-ART-BIBLE gate is NOT invoked in lean mode
|
||||
- [ ] Skip is explicitly noted: "[AD-ART-BIBLE] skipped — lean mode"
|
||||
- [ ] User approval is still required per section (gate skip ≠ approval skip)
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Existing Art Bible — Retrofit Mode
|
||||
|
||||
**Fixture:**
|
||||
- `design/art-bible.md` already exists with all sections populated
|
||||
- User wants to update the Character Design Rules section
|
||||
|
||||
**Input:** `/art-bible`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads existing art bible and detects all sections populated
|
||||
2. Skill offers retrofit: "Art bible exists — which section would you like to update?"
|
||||
3. User selects Character Design Rules
|
||||
4. Skill drafts updated content; in full mode, AD-ART-BIBLE is invoked for the
   revised section before writing
|
||||
5. Skill asks "May I write Character Design Rules to `design/art-bible.md`?"
|
||||
6. Only that section is updated; other sections preserved; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Existing art bible is detected and retrofit is offered
|
||||
- [ ] Only the selected section is updated
|
||||
- [ ] In full mode: AD-ART-BIBLE gate runs even for single-section retrofit
|
||||
- [ ] Other sections are preserved
|
||||
- [ ] Verdict is COMPLETE
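
The "only the selected section is updated" assertion can be illustrated with a small section-replacement helper. This is a sketch, assuming sections are level-2 markdown headings (`## Name`); that layout and the name `replace_section` are assumptions of this example, not something the spec states.

```python
def replace_section(document: str, heading: str, new_body: str) -> str:
    """Replace the body under `## <heading>` and leave every other line untouched.

    Assumes sections are level-2 markdown headings; raises ValueError when
    the requested heading is absent.
    """
    lines = document.splitlines()
    target = f"## {heading}"
    if target not in lines:
        raise ValueError(f"section not found: {heading}")
    start = lines.index(target)
    # The section ends at the next level-2 heading, or at the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    updated = lines[: start + 1] + ["", *new_body.strip().splitlines(), ""] + lines[end:]
    return "\n".join(updated).rstrip("\n") + "\n"
```

Because lines outside the target section are copied through unchanged, a fixture can compare every other section byte for byte before and after the retrofit.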
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Solo Mode — AD-ART-BIBLE Skipped, Noted in Output
|
||||
|
||||
**Fixture:**
|
||||
- No existing art bible
|
||||
- `production/session-state/review-mode.txt` contains `solo`
|
||||
|
||||
**Input:** `/art-bible`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `solo`
|
||||
2. Art bible is drafted and written with only user approval
|
||||
3. AD-ART-BIBLE gate is skipped: output notes "[AD-ART-BIBLE] skipped — solo mode"
|
||||
4. No director agents are spawned
|
||||
5. Verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] AD-ART-BIBLE gate is NOT invoked in solo mode
|
||||
- [ ] Skip is explicitly noted with "solo mode" label
|
||||
- [ ] No director agents of any kind are spawned
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Creates skeleton file immediately with all section headers
|
||||
- [ ] Discusses and drafts one section at a time
|
||||
- [ ] AD-ART-BIBLE gate runs in full mode after all sections are drafted
|
||||
- [ ] AD-ART-BIBLE is skipped in lean and solo modes — noted by name
|
||||
- [ ] Asks "May I write section [N]" per section
|
||||
- [ ] Verdict is COMPLETE when all sections are written
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where AD-ART-BIBLE returns REJECT (not just CONCERNS) is not
  separately tested; the skill would block writing and ask the user how to
  proceed (revise or override).
- The Typography section is listed as a required art bible section but its
  specific content requirements are not assertion-tested here.
- The art bible feeds into `/asset-spec`; this relationship is noted in the
  handoff but not tested as part of this skill's spec.
|
||||
@@ -0,0 +1,187 @@
|
||||
# Skill Test Spec: /create-architecture
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/create-architecture` guides the user through section-by-section authoring of a
technical architecture document. It uses a skeleton-first approach: the file is
created with all required section headers before any content is filled. Each
section is discussed, drafted, and written individually after user approval. If an
architecture document already exists, the skill offers retrofit mode to update
specific sections.
|
||||
|
||||
In `full` review mode, TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
(lead-programmer) spawn after the complete draft is finished. In `lean` or `solo`
mode, both gates are skipped. The skill writes to `docs/architecture/architecture.md`.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
|
||||
- [ ] Contains "May I write" collaborative protocol language (per-section approval)
|
||||
- [ ] Has a next-step handoff at the end (`/architecture-review` or `/create-control-manifest`)
|
||||
- [ ] Documents skeleton-first approach
|
||||
- [ ] Documents gate behavior: TD-ARCHITECTURE + LP-FEASIBILITY in full mode; skipped in lean/solo
|
||||
- [ ] Documents retrofit mode for existing architecture documents
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
In `full` mode: TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
(lead-programmer) spawn in parallel after all sections are drafted and before
any final approval write.
|
||||
|
||||
In `lean` mode: both gates are skipped. Output notes:
"TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".
|
||||
|
||||
In `solo` mode: both gates are skipped with equivalent notes.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — New architecture doc, skeleton-first, full mode gates approve
|
||||
|
||||
**Fixture:**
|
||||
- No existing `docs/architecture/architecture.md`
|
||||
- `docs/architecture/` contains Accepted ADRs for reference
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/create-architecture`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill creates skeleton `docs/architecture/architecture.md` with all required section headers
|
||||
2. For each section: drafts content, shows draft, asks "May I write [section]?", writes after approval
|
||||
3. After all sections are drafted: TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel
|
||||
4. Both gates return APPROVED
|
||||
5. Final "May I confirm architecture is complete?" asked
|
||||
6. Session state updated
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skeleton file is created with all section headers before any content is written
|
||||
- [ ] "May I write [section]?" asked per section during authoring
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel (not sequentially)
|
||||
- [ ] Both gates complete before the final completion confirmation
|
||||
- [ ] Verdict is APPROVED when both gates return APPROVED
|
||||
- [ ] Next-step handoff to `/architecture-review` or `/create-control-manifest` is present
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — TD-ARCHITECTURE returns MAJOR REVISION
|
||||
|
||||
**Fixture:**
|
||||
- Architecture doc is fully drafted (all sections)
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
- TD-ARCHITECTURE gate returns MAJOR REVISION: "[specific structural issue]"
|
||||
|
||||
**Input:** `/create-architecture`
|
||||
|
||||
**Expected behavior:**
|
||||
1. All sections are drafted and written
|
||||
2. TD-ARCHITECTURE gate runs and returns MAJOR REVISION with specific feedback
|
||||
3. Skill surfaces the feedback to the user
|
||||
4. Architecture is NOT marked as finalized
|
||||
5. User is asked: revise the flagged sections, or accept the document as a draft
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Architecture is NOT marked finalized when TD-ARCHITECTURE returns MAJOR REVISION
|
||||
- [ ] Gate feedback is shown to the user with specific issue descriptions
|
||||
- [ ] User is given the option to revise specific sections
|
||||
- [ ] Skill does NOT auto-finalize despite MAJOR REVISION feedback
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Lean Mode — Both gates skipped; architecture written with user approval only
|
||||
|
||||
**Fixture:**
|
||||
- No existing architecture doc
|
||||
- `production/session-state/review-mode.txt` contains `lean`
|
||||
|
||||
**Input:** `/create-architecture`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skeleton file is created
|
||||
2. All sections are authored and written per-section with user approval
|
||||
3. After completion: TD-ARCHITECTURE and LP-FEASIBILITY are skipped
|
||||
4. Output notes: "TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode"
|
||||
5. Architecture is considered complete based on user approval alone
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Both gate skip notes appear in output
|
||||
- [ ] Architecture document is written with only user approval in lean mode
|
||||
- [ ] Skill does NOT block completion because gates were skipped
|
||||
- [ ] Next-step handoff is still present
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Retrofit Mode — Existing architecture doc, user updates a section
|
||||
|
||||
**Fixture:**
|
||||
- `docs/architecture/architecture.md` already exists with all sections populated
|
||||
|
||||
**Input:** `/create-architecture`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects existing architecture doc and reads its current content
|
||||
2. Skill offers retrofit mode: "Architecture doc already exists. Which section would you like to update?"
|
||||
3. User selects a section
|
||||
4. Skill authors only that section, asks "May I write [section]?"
|
||||
5. Only the selected section is updated — other sections unchanged
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects and reads the existing architecture doc before offering retrofit
|
||||
- [ ] User is asked which section to update — not asked to rewrite the whole document
|
||||
- [ ] Only the selected section is updated
|
||||
- [ ] Other sections are not modified during a retrofit session
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — Architecture references a Proposed ADR; flagged as risk
|
||||
|
||||
**Fixture:**
|
||||
- Architecture doc is being authored
|
||||
- One section references or depends on an ADR that has `Status: Proposed`
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/create-architecture`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill authors all sections
|
||||
2. During authoring, skill detects a reference to a Proposed ADR
|
||||
3. Skill flags: "Note: [section] references ADR-NNN which is Proposed — this is a risk until the ADR is accepted"
|
||||
4. Risk flag is embedded in the relevant section's content
|
||||
5. TD-ARCHITECTURE and LP-FEASIBILITY still run — they are informed of the Proposed ADR risk
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Proposed ADR reference is detected and flagged during section authoring
|
||||
- [ ] Risk note is embedded in the architecture document section
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY still spawn (the risk does not block the gates)
|
||||
- [ ] Risk flag names the specific ADR number and title
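
The detection step in Case 5 might be approximated as a scan for ADR references followed by a status lookup. This is a sketch only: the `ADR-NNN` three-digit ID format is inferred from the note text in this case, the `adr_status` mapping is a hypothetical stand-in for however the skill reads ADR statuses, and the note wording is a paraphrase of the expected output.

```python
import re
from typing import Dict, List

ADR_REF = re.compile(r"\bADR-\d{3}\b")  # assumes three-digit IDs, as in "ADR-NNN"


def proposed_adr_risks(
    section_name: str, section_text: str, adr_status: Dict[str, str]
) -> List[str]:
    """Return one risk note per Proposed ADR referenced in the section text."""
    notes = []
    # De-duplicate and sort so repeated references yield a single, stable note.
    for adr_id in sorted(set(ADR_REF.findall(section_text))):
        if adr_status.get(adr_id) == "Proposed":
            notes.append(
                f"Note: {section_name} references {adr_id} which is Proposed; "
                "this is a risk until the ADR is accepted"
            )
    return notes
```

The function only reports risks; it returns no signal that would stop the gates from spawning, which matches the assertion that the risk does not block TD-ARCHITECTURE or LP-FEASIBILITY.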
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Skeleton file created with all section headers before any content is written
|
||||
- [ ] "May I write [section]?" asked per section during authoring
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel in full mode
|
||||
- [ ] Skipped gates noted by name and mode in lean/solo output
|
||||
- [ ] Proposed ADR references flagged as risks in the document
|
||||
- [ ] Ends with next-step handoff: `/architecture-review` or `/create-control-manifest`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The required section list for architecture documents is defined in the skill
  body and in the `/architecture-review` skill; it is not re-enumerated here.
- Engine version stamping in the architecture doc (parallel to ADR stamping)
  is part of the authoring workflow and is tested implicitly via Case 1.
- The retrofit mode for updating multiple sections in one session follows the
  same per-section approval pattern; multi-section retrofits are not
  independently tested.
|
||||
192
CCGS Skill Testing Framework/skills/authoring/design-system.md
Normal file
@@ -0,0 +1,192 @@
|
||||
# Skill Test Spec: /design-system
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/design-system` guides the user through section-by-section authoring of a Game
Design Document (GDD) for a single game system. All 8 required sections must be
authored: Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases,
Dependencies, Tuning Knobs, and Acceptance Criteria. The skill uses a
skeleton-first approach (it creates the GDD file with all 8 section headers
before filling any content) and writes each section individually after approval.
|
||||
|
||||
The CD-GDD-ALIGN gate (creative-director) runs in both `full` AND `lean` modes.
It is only skipped in `solo` mode. If an existing GDD file is found, the skill
offers a retrofit mode to update specific sections rather than rewriting the whole
document.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION
|
||||
- [ ] Contains "May I write" collaborative protocol language (per-section approval)
|
||||
- [ ] Has a next-step handoff at the end
|
||||
- [ ] Documents skeleton-first approach (file created with headers before content)
|
||||
- [ ] Documents CD-GDD-ALIGN gate: active in full AND lean mode; skipped in solo only
|
||||
- [ ] Documents retrofit mode for existing GDD files
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
In `full` mode: CD-GDD-ALIGN (creative-director) gate runs after each section is
drafted, before writing. If MAJOR REVISION is returned, the section must be
rewritten before proceeding.
|
||||
|
||||
In `lean` mode: CD-GDD-ALIGN still runs; this gate is NOT skipped in lean mode.
Only solo mode skips it.
|
||||
|
||||
In `solo` mode: CD-GDD-ALIGN is skipped. Output notes:
"CD-GDD-ALIGN skipped — solo mode". Sections are written with only user approval.
|
||||
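
The mode guards described above differ per gate, which is easy to get wrong. A lookup table keeps the rule explicit. This is an illustrative sketch: `GATE_MODES` and `gate_runs` are hypothetical names, and the table covers only the two gates whose mode behavior these specs state directly.

```python
# Modes in which each gate runs, as described in these specs.
GATE_MODES = {
    "CD-GDD-ALIGN": {"full", "lean"},  # skipped only in solo mode
    "AD-ART-BIBLE": {"full"},          # skipped in lean and solo modes
}


def gate_runs(gate_id: str, mode: str) -> bool:
    """Return True when the named gate should be invoked in the given review mode."""
    return mode in GATE_MODES[gate_id]
```

An unknown gate ID raises `KeyError` instead of silently defaulting to "skip", so a typo in a gate name fails loudly.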
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — New GDD, skeleton-first, CD-GDD-ALIGN in lean mode
|
||||
|
||||
**Fixture:**
|
||||
- No existing GDD for the target system in `design/gdd/`
|
||||
- `production/session-state/review-mode.txt` contains `lean`
|
||||
|
||||
**Input:** `/design-system [system-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill creates skeleton file `design/gdd/[system-name].md` with all 8 section headers (empty bodies)
|
||||
2. For each section: discusses with user, drafts content, shows draft
|
||||
3. CD-GDD-ALIGN gate runs on each section draft (lean mode — gate is active)
|
||||
4. Gate returns APPROVED for each section
|
||||
5. "May I write [section]?" asked after gate approval
|
||||
6. Section written to file after user approval
|
||||
7. Process repeats for all 8 sections
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skeleton file is created with all 8 section headers before any content is written
|
||||
- [ ] CD-GDD-ALIGN runs on each section in lean mode (not skipped)
|
||||
- [ ] "May I write" is asked per section (not once for all sections)
|
||||
- [ ] Each section is written individually after gate + user approval
|
||||
- [ ] All 8 sections are present in the final GDD file
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Retrofit Mode — Existing GDD, update specific section
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/[system-name].md` already exists with all 8 sections populated
|
||||
|
||||
**Input:** `/design-system [system-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects existing GDD file and reads its current content
|
||||
2. Skill offers retrofit mode: "GDD already exists. Which section would you like to update?"
|
||||
3. User selects a specific section (e.g., Formulas)
|
||||
4. Skill authors only that section, runs CD-GDD-ALIGN, asks "May I write?"
|
||||
5. Only the selected section is updated — other sections are not modified
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects and reads existing GDD before offering retrofit mode
|
||||
- [ ] User is asked which section to update — not asked to rewrite the whole document
|
||||
- [ ] Only the selected section is rewritten — others remain unchanged
|
||||
- [ ] CD-GDD-ALIGN still runs on the updated section
|
||||
- [ ] "May I write" is asked before updating the section
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Director Gate — CD-GDD-ALIGN returns MAJOR REVISION
|
||||
|
||||
**Fixture:**
|
||||
- New GDD being authored
|
||||
- `production/session-state/review-mode.txt` contains `lean`
|
||||
- CD-GDD-ALIGN gate returns MAJOR REVISION on the Player Fantasy section
|
||||
|
||||
**Input:** `/design-system [system-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Player Fantasy section is drafted
|
||||
2. CD-GDD-ALIGN gate runs and returns MAJOR REVISION with specific feedback
|
||||
3. Skill surfaces the feedback to the user
|
||||
4. Section is NOT written to file while MAJOR REVISION is unresolved
|
||||
5. User rewrites the section in collaboration with the skill
|
||||
6. CD-GDD-ALIGN runs again on the revised section
|
||||
7. If revised section passes, "May I write?" is asked and section is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Section is NOT written when CD-GDD-ALIGN returns MAJOR REVISION
|
||||
- [ ] Gate feedback is shown to the user before requesting revision
|
||||
- [ ] CD-GDD-ALIGN runs again after the section is revised
|
||||
- [ ] Skill does NOT auto-proceed to the next section while MAJOR REVISION is unresolved
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Solo Mode — CD-GDD-ALIGN skipped; sections written with user approval only
|
||||
|
||||
**Fixture:**
|
||||
- New GDD being authored
|
||||
- `production/session-state/review-mode.txt` contains `solo`
|
||||
|
||||
**Input:** `/design-system [system-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skeleton file is created with 8 section headers
|
||||
2. For each section: drafted, shown to user
|
||||
3. CD-GDD-ALIGN is skipped — noted per section: "CD-GDD-ALIGN skipped — solo mode"
|
||||
4. "May I write [section]?" asked after user reviews draft
|
||||
5. Section written after user approval
|
||||
6. No gate review at any stage
|
||||
|
||||
**Assertions:**
|
||||
- [ ] "CD-GDD-ALIGN skipped — solo mode" noted for each section
|
||||
- [ ] Sections are written after user approval alone (no gate required)
|
||||
- [ ] Skill does NOT spawn any CD-GDD-ALIGN gate in solo mode
|
||||
- [ ] Full GDD is written with only user approval in solo mode
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Edge Case — Empty sections not written to file
|
||||
|
||||
**Fixture:**
|
||||
- GDD authoring in progress
|
||||
- User and skill discuss one section but do not produce any approved content
  (e.g., discussion ends without a decision, or user says "skip for now")
|
||||
|
||||
**Input:** `/design-system [system-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Section discussion produces no approved content
|
||||
2. Skill does NOT write an empty or placeholder body to the section
|
||||
3. The section header remains in the skeleton file but the body stays empty
|
||||
4. Skill moves to the next section without writing the empty one
|
||||
5. At the end, incomplete sections are listed and user is reminded to return to them
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Empty or unapproved sections are NOT written to the file
|
||||
- [ ] Skeleton section header remains (preserves structure)
|
||||
- [ ] Skill tracks and lists incomplete sections at the end of the session
|
||||
- [ ] Skill does NOT write "TBD" or placeholder content without user approval
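
The end-of-session tracking in Case 5 can be sketched as a check against the 8 required section names. The section names come from the Skill Summary; the function name `incomplete_sections`, the `approved` mapping, and the assumption that the summary's listing order matches the template order are hypothetical.

```python
from typing import Dict, List

# The 8 required GDD sections, in the order listed in the Skill Summary.
GDD_SECTIONS = [
    "Overview", "Player Fantasy", "Detailed Rules", "Formulas",
    "Edge Cases", "Dependencies", "Tuning Knobs", "Acceptance Criteria",
]


def incomplete_sections(approved: Dict[str, str]) -> List[str]:
    """List required sections that have no approved, non-empty body."""
    # Whitespace-only bodies count as empty, so a blank draft is never "done".
    return [name for name in GDD_SECTIONS if not approved.get(name, "").strip()]
```

The result can be printed as the reminder list at the end of the session, while the skeleton headers for those sections stay in the file untouched.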
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Skeleton file created with all 8 headers before any content is written
|
||||
- [ ] CD-GDD-ALIGN runs in both full AND lean mode (not just full)
|
||||
- [ ] CD-GDD-ALIGN skipped only in solo mode — noted per section
|
||||
- [ ] "May I write [section]?" asked per section (not once for the whole document)
|
||||
- [ ] MAJOR REVISION from CD-GDD-ALIGN blocks section write until resolved
|
||||
- [ ] Only approved, non-empty sections are written to the file
|
||||
- [ ] Ends with next-step handoff: `/review-all-gdds` or `/map-systems next`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The 8 required sections are validated against the project's design document
  standards defined in `CLAUDE.md`; they are not re-enumerated here.
- The skill's internal section-ordering logic (which section to author first) is
  not independently tested; the order follows the standard GDD template.
- Pillar alignment checking within CD-GDD-ALIGN is evaluated holistically by
  the gate agent; specific pillar checks are not fixture-tested here.
|
||||
176
CCGS Skill Testing Framework/skills/authoring/quick-design.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Skill Test Spec: /quick-design
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/quick-design` produces a lightweight design spec for features too small to
warrant a full 8-section GDD. The target scope is a single-system feature that
needs under 4 hours of design time. The quick-design spec uses a streamlined
3-section format: Overview, Rules, and Acceptance Criteria.
|
||||
|
||||
The skill has no director gates; adding gate overhead would defeat the purpose
of a lightweight design tool. The skill asks "May I write" before writing the
design note to `design/quick-notes/[name].md`. If the feature scope is too large
for a quick-design, the skill redirects to `/design-system` instead.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: CREATED, BLOCKED, REDIRECTED
|
||||
- [ ] Contains "May I write" collaborative protocol language (for quick-note file)
|
||||
- [ ] Has a next-step handoff at the end
|
||||
- [ ] Explicitly notes: no director gates (lightweight skill by design)
|
||||
- [ ] Mentions scope check: redirects to `/design-system` if scope exceeds the 4-hour, single-system threshold
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
No director gates: this skill spawns no director gate agents. The lightweight
nature of quick-design means director gate overhead is intentionally absent.
Full GDD review is not needed for sub-4-hour single-system features.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Small UI change produces a 3-section spec
|
||||
|
||||
**Fixture:**
|
||||
- No existing quick-note for the target feature
|
||||
- Feature is clearly scoped: a single UI element change with no cross-system impact
|
||||
|
||||
**Input:** `/quick-design [feature-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill asks scoping questions: what system, what change, what is the acceptance signal
|
||||
2. Skill determines scope is within the sub-4h threshold
|
||||
3. Skill drafts a 3-section spec: Overview, Rules, Acceptance Criteria
|
||||
4. Draft is shown to user
|
||||
5. "May I write `design/quick-notes/[name].md`?" is asked
|
||||
6. File is written after approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Spec contains exactly 3 sections: Overview, Rules, Acceptance Criteria
|
||||
- [ ] Draft is shown to user before "May I write" ask
|
||||
- [ ] "May I write `design/quick-notes/[name].md`?" is asked before writing
|
||||
- [ ] File is written to the correct path: `design/quick-notes/[name].md`
|
||||
- [ ] Verdict is CREATED after successful write
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Scope check fails; redirected to /design-system
|
||||
|
||||
**Fixture:**
|
||||
- Feature described spans multiple systems or would take more than 4 hours of design time
|
||||
(e.g., "redesign the entire combat system" or "new progression mechanic affecting all classes")
|
||||
|
||||
**Input:** `/quick-design [large-feature]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill asks scoping questions
|
||||
2. Skill determines scope exceeds the sub-4h / single-system threshold
|
||||
3. Skill outputs: "This feature is too large for a quick-design. Use `/design-system [name]` for a full GDD."
|
||||
4. Skill does NOT write a quick-note file
|
||||
5. Verdict is REDIRECTED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects the scope excess and stops before drafting
|
||||
- [ ] Message explicitly names `/design-system` as the correct alternative
|
||||
- [ ] No quick-note file is written
|
||||
- [ ] Verdict is REDIRECTED (not CREATED or BLOCKED)
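
The scope check in Cases 1 and 2 can be pictured as a single decision made before any drafting. The sketch below is illustrative only: the spec treats the threshold as a judgment call, so the numeric inputs, the function name `scope_decision`, and the `"PROCEED"` label are assumptions. Only `REDIRECTED` and the redirect message come from the spec.

```python
def scope_decision(feature_name, systems_touched, estimated_hours):
    """Return (outcome, detail) for the scope check that runs before drafting."""
    # Assumption: "sub-4h" means strictly under 4 hours, and exactly one system.
    if systems_touched > 1 or estimated_hours >= 4:
        message = (
            "This feature is too large for a quick-design. "
            f"Use `/design-system {feature_name}` for a full GDD."
        )
        return "REDIRECTED", message
    # In scope: drafting may begin; the file is written only after approval.
    return "PROCEED", f"design/quick-notes/{feature_name}.md"
```

Returning before any draft is produced mirrors the assertion that the skill stops before drafting and writes no quick-note file.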
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Edge Case — File already exists; offered to update
|
||||
|
||||
**Fixture:**
|
||||
- `design/quick-notes/[name].md` already exists from a previous session
|
||||
|
||||
**Input:** `/quick-design [name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects existing quick-note file and reads its current content
|
||||
2. Skill asks: "[name].md already exists. Update it, or create a new version?"
|
||||
3. User selects update
|
||||
4. Skill shows the existing spec and asks which section to revise
|
||||
5. Updated spec is shown, "May I write?" asked, file updated after approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects and reads the existing file before offering to update
|
||||
- [ ] User is offered update or create-new options — not auto-overwritten
|
||||
- [ ] Only the revised section is updated (or the whole spec if user chooses full rewrite)
|
||||
- [ ] "May I write" is asked before overwriting the existing file
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No argument provided
|
||||
|
||||
**Fixture:**
|
||||
- `design/quick-notes/` directory may or may not exist
|
||||
|
||||
**Input:** `/quick-design` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument is provided
|
||||
2. Skill outputs a usage error: "No feature name specified. Usage: /quick-design [feature-name]"
|
||||
3. Skill provides an example: `/quick-design pause-menu-settings`
|
||||
4. No file is created
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill outputs a usage error when no argument is given
|
||||
- [ ] A usage example is shown with the correct format
|
||||
- [ ] No quick-note file is written
|
||||
- [ ] Skill does NOT silently pick a feature name or default to any action
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — No gate spawned; explicitly noted for sub-4h features
|
||||
|
||||
**Fixture:**
|
||||
- Feature is within scope for quick-design
|
||||
- `production/session-state/review-mode.txt` exists with `full`
|
||||
|
||||
**Input:** `/quick-design [feature-name]`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill asks scoping questions and determines scope is within threshold
|
||||
2. Skill does NOT read `production/session-state/review-mode.txt`
|
||||
3. Skill does NOT spawn any director gate agent
|
||||
4. Spec is drafted, "May I write" asked, file written after approval
|
||||
5. Output explicitly notes: "No director gate review — quick-design is for sub-4h features"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
|
||||
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
|
||||
- [ ] Output contains a note explaining why no gate review is needed
|
||||
- [ ] Review mode has no effect on this skill's behavior
|
||||
- [ ] Full GDD review path (`/design-system`) is mentioned as the alternative for larger features
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Scope check runs before drafting (redirects to `/design-system` if scope too large)
|
||||
- [ ] 3-section format used (Overview, Rules, Acceptance Criteria) — NOT the 8-section GDD format
|
||||
- [ ] Draft shown to user before "May I write" ask
|
||||
- [ ] "May I write `design/quick-notes/[name].md`?" asked before writing
|
||||
- [ ] No director gates — no review-mode.txt read
|
||||
- [ ] Ends with next-step handoff (e.g., proceed to implementation or `/dev-story`)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The scope threshold heuristic (sub-4h, single-system) is a judgment call —
|
||||
the skill's internal check is the authoritative definition and is not
|
||||
independently tested by counting hours.
|
||||
- The `design/quick-notes/` directory is created automatically if it does not
|
||||
exist — this filesystem behavior is not independently tested here.
|
||||
- Integration with the story pipeline (can a quick-design generate a story
|
||||
directly?) is out of scope for this spec — quick-designs are standalone.
|
||||
176
CCGS Skill Testing Framework/skills/authoring/ux-design.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Skill Test Spec: /ux-design
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/ux-design` is a guided, section-by-section UX spec authoring skill. It produces
|
||||
user flow diagrams (described textually), interaction state definitions, wireframe
|
||||
descriptions, and accessibility notes for a specified screen or HUD element. The
|
||||
skill follows the skeleton-first pattern: it creates the file with all section
|
||||
headers immediately, then fills each section through discussion and writes each
|
||||
section to disk after user approval.
|
||||
|
||||
The skill has no inline director gates — `/ux-review` is the separate review step.
|
||||
Each section requires a "May I write section [N] to [filepath]?" ask. If a UX spec
|
||||
already exists for the named screen, the skill offers to retrofit individual sections
|
||||
rather than replace. Verdict is COMPLETE when all sections are written.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" language per section
|
||||
- [ ] Has a next-step handoff (e.g., `/ux-review` to validate the completed spec)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/ux-design` has no inline director gates. `/ux-review` is the separate
|
||||
review skill invoked after this skill completes.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — New HUD spec, all sections authored and written
|
||||
|
||||
**Fixture:**
|
||||
- No existing HUD UX spec in `design/ux/`
|
||||
- Engine and rendering preferences configured
|
||||
|
||||
**Input:** `/ux-design hud`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill creates a skeleton file `design/ux/hud.md` with all section headers
|
||||
2. Skill discusses and drafts each section: User Flows, Interaction States
|
||||
(normal/hover/focus/disabled/error), Wireframe Description, Accessibility Notes
|
||||
3. After each section is drafted and user confirms, skill asks "May I write
|
||||
section [N] to `design/ux/hud.md`?"
|
||||
4. Each section is written in sequence after approval
|
||||
5. After all sections are written, verdict is COMPLETE
|
||||
6. Skill suggests running `/ux-review` as the next step
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skeleton file is created first (with empty section bodies)
|
||||
- [ ] "May I write section [N]" is asked per section (not once at the end)
|
||||
- [ ] All required sections are present: User Flows, Interaction States,
|
||||
Wireframe Description, Accessibility Notes
|
||||
- [ ] Handoff to `/ux-review` is at the end
|
||||
- [ ] Verdict is COMPLETE
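
The skeleton-first pattern and the per-section write can be sketched as two small functions. This is a hedged illustration: the title line `# UX Spec: ...`, the `## ` heading level, and the function names are assumptions, and the "May I write section [N]" ask is expected to happen before each call to `write_section`.

```python
SECTIONS = [
    "User Flows",
    "Interaction States",
    "Wireframe Description",
    "Accessibility Notes",
]


def build_skeleton(screen_name):
    """Return a spec with every section header present and an empty body."""
    lines = [f"# UX Spec: {screen_name}", ""]
    for title in SECTIONS:
        lines += [f"## {title}", "", ""]
    return "\n".join(lines)


def write_section(spec_text, title, body):
    """Replace the body of one section, leaving all other sections untouched."""
    out, skipping = [], False
    for line in spec_text.split("\n"):
        if line.startswith("## "):
            # Entering a new section: drop old body lines only for the target.
            skipping = line == f"## {title}"
            out.append(line)
            if skipping:
                out += ["", body.strip(), ""]
            continue
        if not skipping:
            out.append(line)
    return "\n".join(out)
```

Because `write_section` touches only the named section, the same function would also serve the retrofit path, where one section of an existing spec is updated and the rest are preserved.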
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Existing UX Spec — Retrofit: user picks section to update
|
||||
|
||||
**Fixture:**
|
||||
- `design/ux/hud.md` already exists with all sections populated
|
||||
- User wants to update only the Accessibility Notes section
|
||||
|
||||
**Input:** `/ux-design hud`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads existing `design/ux/hud.md` and detects all sections are populated
|
||||
2. Skill reports: "UX spec already exists for HUD — offering to retrofit"
|
||||
3. Skill lists all sections and asks which to update
|
||||
4. User selects Accessibility Notes
|
||||
5. Skill drafts updated accessibility content and asks "May I write section
|
||||
Accessibility Notes to `design/ux/hud.md`?"
|
||||
6. Only that section is updated; other sections are preserved; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Existing spec is detected and retrofit is offered
|
||||
- [ ] User selects which section(s) to update
|
||||
- [ ] Only the selected section is updated — other sections unchanged
|
||||
- [ ] "May I write" is asked for the updated section
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Dependency Gap — Spec references a system with no design doc
|
||||
|
||||
**Fixture:**
|
||||
- User is authoring a UX spec for the inventory screen
|
||||
- `design/gdd/inventory.md` does not exist
|
||||
|
||||
**Input:** `/ux-design inventory-screen`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill begins authoring the inventory screen UX spec
|
||||
2. During the User Flows section, skill attempts to reference inventory system rules
|
||||
3. Skill detects: "No GDD found for inventory system — UX spec has a DEPENDENCY GAP"
|
||||
4. The dependency gap is flagged in the spec (noted inline: "DEPENDENCY GAP: inventory GDD")
|
||||
5. Skill continues authoring with placeholder notes for the missing rules
|
||||
6. Verdict is COMPLETE with advisory note about the dependency gap
|
||||
|
||||
**Assertions:**
|
||||
- [ ] DEPENDENCY GAP label appears in the spec for the missing system doc
|
||||
- [ ] Skill does NOT block on the missing GDD — it continues with placeholders
|
||||
- [ ] Dependency gap is also noted in the skill output (not just in the file)
|
||||
- [ ] Handoff suggests both `/ux-review` and writing the missing GDD
|
||||
|
||||
---
|
||||
|
||||
### Case 4: No Argument Provided — Usage error
|
||||
|
||||
**Fixture:**
|
||||
- No argument provided with the skill invocation
|
||||
|
||||
**Input:** `/ux-design`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no screen name or argument provided
|
||||
2. Skill outputs a usage error: "Screen name required. Usage: `/ux-design [screen-name]`"
|
||||
3. Skill provides examples: `/ux-design hud`, `/ux-design main-menu`, `/ux-design inventory`
|
||||
4. No file is created; no "May I write" is asked
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Usage error is clearly stated
|
||||
- [ ] Example invocations are provided
|
||||
- [ ] No file is created
|
||||
- [ ] Skill does not attempt to proceed without an argument
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; ux-review is the separate review skill
|
||||
|
||||
**Fixture:**
|
||||
- New screen spec with argument provided
|
||||
|
||||
**Input:** `/ux-design settings-menu`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill authors all sections of the settings menu UX spec
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output during authoring
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked during ux-design
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is COMPLETE without any gate check
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Creates skeleton file with all section headers before discussing content
|
||||
- [ ] Discusses and drafts one section at a time
|
||||
- [ ] Asks "May I write section [N]" after each section is approved
|
||||
- [ ] Detects existing spec and offers retrofit path
|
||||
- [ ] Ends with handoff to `/ux-review`
|
||||
- [ ] Verdict is COMPLETE when all sections are written
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Interaction state enumeration (normal/hover/focus/disabled/error) is a core
|
||||
requirement of each spec; the `/ux-review` skill checks for completeness.
|
||||
- Wireframe descriptions are text-only (no images); image references may be
|
||||
added manually by a designer after the fact.
|
||||
- Responsive layout concerns (different screen sizes) are noted as optional
|
||||
content and not assertion-tested here.
|
||||
176
CCGS Skill Testing Framework/skills/authoring/ux-review.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Skill Test Spec: /ux-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/ux-review` validates an existing UX spec or HUD design document against
|
||||
accessibility and interaction standards. It checks for required sections
|
||||
(User Flows, Interaction States, Wireframe Description, Accessibility Notes),
|
||||
completeness of interaction state definitions (normal, hover, focus, disabled, error),
|
||||
accessibility compliance (keyboard navigation, color contrast notes, screen
|
||||
reader considerations), and consistency with the art bible or design system
|
||||
if those documents exist.
|
||||
|
||||
The skill is read-only — it produces no file writes. Verdicts: APPROVED
|
||||
(all checks pass), NEEDS REVISION (fixable issues found), or MAJOR REVISION
|
||||
NEEDED (structural or accessibility failures). No director gates apply —
|
||||
`/ux-review` IS the review gate for UX specs.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
|
||||
- [ ] Does NOT contain "May I write" language (skill is read-only)
|
||||
- [ ] Has a next-step handoff (e.g., back to `/ux-design` for revision, or proceed to implementation)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/ux-review` is itself the review gate for UX specs. No additional director
|
||||
gates are invoked within this skill.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Complete UX spec with all required sections, APPROVED
|
||||
|
||||
**Fixture:**
|
||||
- `design/ux/hud.md` exists with all required sections populated:
|
||||
- User Flows: complete player flow diagrams
|
||||
- Interaction States: normal, hover, focus, disabled, error all defined
|
||||
- Wireframe Description: layout described
|
||||
- Accessibility Notes: keyboard nav, contrast ratios, screen reader notes
|
||||
|
||||
**Input:** `/ux-review hud`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `design/ux/hud.md`
|
||||
2. Skill checks all 4 required sections — all present and non-empty
|
||||
3. Skill checks interaction states — all 5 states defined
|
||||
4. Skill checks accessibility notes — keyboard, contrast, and screen reader covered
|
||||
5. Skill outputs: checklist of all passed checks
|
||||
6. Verdict is APPROVED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 4 required sections are checked
|
||||
- [ ] All 5 interaction states are verified present
|
||||
- [ ] Verdict is APPROVED
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Missing Accessibility Section — NEEDS REVISION
|
||||
|
||||
**Fixture:**
|
||||
- `design/ux/hud.md` exists but the Accessibility Notes section is empty
|
||||
- All other sections are fully populated
|
||||
|
||||
**Input:** `/ux-review hud`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the file and checks all sections
|
||||
2. Accessibility Notes section is empty — check fails
|
||||
3. Skill outputs: "NEEDS REVISION — Accessibility Notes section is empty"
|
||||
4. Skill lists specific items to add: keyboard navigation, color contrast ratios,
|
||||
screen reader labels
|
||||
5. Verdict is NEEDS REVISION
|
||||
6. Handoff suggests returning to `/ux-design hud` to fill in the section
|
||||
|
||||
**Assertions:**
|
||||
- [ ] NEEDS REVISION verdict is returned (not APPROVED or MAJOR REVISION NEEDED)
|
||||
- [ ] Specific missing content items are listed
|
||||
- [ ] Handoff points back to `/ux-design hud` for revision
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Interaction States Incomplete — NEEDS REVISION
|
||||
|
||||
**Fixture:**
|
||||
- `design/ux/settings-menu.md` exists
|
||||
- Interaction States section only defines: normal and hover
|
||||
- Missing: focus, disabled, error states
|
||||
|
||||
**Input:** `/ux-review settings-menu`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the file and checks interaction states
|
||||
2. Only 2 of 5 required states are defined
|
||||
3. Skill reports: "NEEDS REVISION — Interaction states incomplete: missing focus, disabled, error"
|
||||
4. Verdict is NEEDS REVISION with specific missing states named
|
||||
|
||||
**Assertions:**
|
||||
- [ ] NEEDS REVISION verdict returned
|
||||
- [ ] All 3 missing states are named explicitly in the output
|
||||
- [ ] Skill does not return MAJOR REVISION NEEDED for a fixable gap
|
||||
- [ ] Handoff suggests returning to `/ux-design settings-menu`
|
||||
|
||||
---
|
||||
|
||||
### Case 4: File Not Found — Error with remediation
|
||||
|
||||
**Fixture:**
|
||||
- `design/ux/inventory-screen.md` does not exist
|
||||
|
||||
**Input:** `/ux-review inventory-screen`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read `design/ux/inventory-screen.md` — file not found
|
||||
2. Skill outputs: "UX spec not found: design/ux/inventory-screen.md"
|
||||
3. Skill suggests running `/ux-design inventory-screen` to create the spec first
|
||||
4. No review is performed; no verdict is issued
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Error message names the missing file with full path
|
||||
- [ ] `/ux-design inventory-screen` is suggested as the remediation
|
||||
- [ ] No review checklist is produced
|
||||
- [ ] No verdict is issued (error state, not APPROVED/NEEDS REVISION)
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; ux-review is itself the review
|
||||
|
||||
**Fixture:**
|
||||
- Valid UX spec file
|
||||
|
||||
**Input:** `/ux-review hud`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill performs the review and issues a verdict
|
||||
2. No additional director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Checks all 4 required sections (User Flows, Interaction States, Wireframe,
|
||||
Accessibility Notes)
|
||||
- [ ] Checks all 5 interaction states (normal, hover, focus, disabled, error)
|
||||
- [ ] Checks accessibility coverage (keyboard nav, contrast, screen reader)
|
||||
- [ ] Does not write any files
|
||||
- [ ] Issues specific, actionable feedback when verdict is not APPROVED
|
||||
- [ ] Ends with next-step handoff to `/ux-design` for revision or implementation
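
The verdict logic implied by these checks can be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: sections are assumed to be `## ` headings, states are matched as whole words in the Interaction States body, and the accessibility coverage check (keyboard, contrast, screen reader) is left out for brevity.

```python
import re

REQUIRED_SECTIONS = [
    "User Flows",
    "Interaction States",
    "Wireframe Description",
    "Accessibility Notes",
]
REQUIRED_STATES = ["normal", "hover", "focus", "disabled", "error"]


def parse_sections(spec_text):
    """Map each '## ' heading to its body text."""
    sections, current = {}, None
    for line in spec_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def review(spec_text):
    """Return (verdict, issues) for a UX spec."""
    sections = parse_sections(spec_text)
    absent = [s for s in REQUIRED_SECTIONS if s not in sections]
    if absent:
        # A section that is absent entirely is a structural failure.
        return "MAJOR REVISION NEEDED", [f"Section absent: {s}" for s in absent]
    issues = [f"{s} section is empty" for s in REQUIRED_SECTIONS if not sections[s]]
    states_text = sections["Interaction States"].lower()
    missing = [s for s in REQUIRED_STATES if not re.search(rf"\b{s}\b", states_text)]
    if missing and states_text:
        issues.append("Interaction states incomplete: missing " + ", ".join(missing))
    return ("NEEDS REVISION", issues) if issues else ("APPROVED", [])
```

An empty section and a partially filled one both land on NEEDS REVISION, which matches Cases 2 and 3; only a missing section escalates to MAJOR REVISION NEEDED.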
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- MAJOR REVISION NEEDED is triggered when structural sections are entirely
|
||||
absent (not just empty) or when fundamental interaction flows are missing
|
||||
entirely; not tested with a separate fixture here.
|
||||
- Art bible / design system consistency check (color palette alignment) is
|
||||
mentioned as a capability but not separately fixture-tested.
|
||||
- The case where an existing spec was written for a now-renamed screen is
|
||||
not tested; the skill would review the file by path regardless of the name.
|
||||
200
CCGS Skill Testing Framework/skills/gate/gate-check.md
Normal file
@@ -0,0 +1,200 @@
|
||||
# Skill Test Spec: /gate-check
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/gate-check` validates whether the project is ready to advance to the next
|
||||
development phase. It checks for required artifacts, runs quality checks, asks
|
||||
the user about unverifiable items, and produces a PASS/CONCERNS/FAIL verdict.
|
||||
On PASS with user confirmation, it writes the new stage name to
|
||||
`production/stage.txt`. It governs all 6 phase transitions and is the most
|
||||
critical gate-keeping skill in the pipeline.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings (numbered Phase N or ## sections)
|
||||
- [ ] Contains verdict keywords: PASS, CONCERNS, FAIL
|
||||
- [ ] Contains "May I write" collaborative protocol language
|
||||
- [ ] Has a next-step handoff at the end (Follow-Up Actions section)
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All Concept artifacts present, advancing to Systems Design
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/game-concept.md` exists, has content including all required sections
|
||||
- `design/gdd/game-pillars.md` exists (or pillars defined within concept doc)
|
||||
- No systems index yet (which is correct for this stage)
|
||||
|
||||
**Input:** `/gate-check systems-design`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `design/gdd/game-concept.md` and verifies it has content
|
||||
2. Skill checks for game pillars (in concept or separate file)
|
||||
3. Skill checks quality items (core loop described, target audience identified)
|
||||
4. Skill outputs structured checklist with all items marked
|
||||
5. Skill presents PASS/CONCERNS/FAIL verdict
|
||||
6. If PASS: skill asks "May I update `production/stage.txt` to 'Systems Design'?"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill uses Glob or Read to verify `design/gdd/game-concept.md` exists before marking it checked
|
||||
- [ ] Output includes a "Required Artifacts" section with check status per item
|
||||
- [ ] Output includes a "Quality Checks" section with check status per item
|
||||
- [ ] Output includes a "Verdict" line with one of PASS / CONCERNS / FAIL
|
||||
- [ ] Skill asks about unverifiable quality items (e.g., "Has this been reviewed?") rather than assuming PASS
|
||||
- [ ] Skill asks "May I write" before updating `production/stage.txt`
|
||||
- [ ] Skill does NOT write `production/stage.txt` without explicit user confirmation
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Missing required artifacts for Concept → Systems Design
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/game-concept.md` does NOT exist
|
||||
- No game pillars document exists
|
||||
- `design/gdd/` directory is empty or absent
|
||||
|
||||
**Input:** `/gate-check systems-design`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read `design/gdd/game-concept.md` — file not found
|
||||
2. Skill marks required artifact as missing (not present)
|
||||
3. Skill outputs FAIL verdict
|
||||
4. Skill lists blocker: "No game concept document found"
|
||||
5. Skill suggests remediation: run `/brainstorm` to create one
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is FAIL (not PASS or CONCERNS) when required artifacts are missing
|
||||
- [ ] Output explicitly names `design/gdd/game-concept.md` as missing
|
||||
- [ ] Output includes a "Blockers" section with at least 1 item
|
||||
- [ ] Output recommends `/brainstorm` as the remediation action
|
||||
- [ ] Skill does NOT write `production/stage.txt` when verdict is FAIL
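
The required-artifact step of this failure path can be sketched with a plain filesystem check. The function name `check_concept_artifacts` and the wording of the blocker string are assumptions for illustration. The sketch covers only the artifact step, so it returns `None` rather than PASS when nothing is missing, since quality checks and manual items still have to be resolved.

```python
from pathlib import Path


def check_concept_artifacts(project_root):
    """Return (verdict_or_None, blockers) for the required-artifact step only."""
    concept = Path(project_root) / "design" / "gdd" / "game-concept.md"
    blockers = []
    if not concept.is_file() or not concept.read_text(encoding="utf-8").strip():
        blockers.append(
            "No game concept document found (design/gdd/game-concept.md); "
            "run /brainstorm to create one"
        )
    # A missing required artifact is an immediate FAIL; otherwise the verdict
    # stays undecided until quality checks and manual items are resolved.
    return ("FAIL" if blockers else None), blockers
```

The function never touches `production/stage.txt`, in line with the rule that a FAIL verdict must not advance the stage.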
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Auto-detect current stage
|
||||
|
||||
**Fixture:**
|
||||
- `production/stage.txt` contains `Concept`
|
||||
- `design/gdd/game-concept.md` exists with content
|
||||
- No systems index yet
|
||||
|
||||
**Input:** `/gate-check` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `production/stage.txt` to determine current stage
|
||||
2. Skill determines the next gate is Concept → Systems Design
|
||||
3. Skill proceeds with the Systems Design gate checks
|
||||
4. Output clearly states which transition is being validated
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads `production/stage.txt` (or uses project-stage-detect heuristics) to determine current stage
|
||||
- [ ] Output header names both current and target phases (e.g., "Gate Check: Concept → Systems Design")
|
||||
- [ ] Skill does not ask the user which gate to check if current stage is determinable
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — Manual check items flagged correctly
|
||||
|
||||
**Fixture:**
|
||||
- All required artifacts for Concept → Systems Design are present
|
||||
- No playtest or review record exists (can't auto-verify quality checks)
|
||||
|
||||
**Input:** `/gate-check systems-design`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill verifies all artifact files exist
|
||||
2. Skill encounters quality check: "Game concept reviewed (not MAJOR REVISION NEEDED)"
|
||||
3. Since no review record exists, skill marks item as MANUAL CHECK NEEDED
|
||||
4. Skill asks the user: "Has the game concept been reviewed for design quality?"
|
||||
5. Skill waits for user input before finalizing verdict
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Items that cannot be auto-verified are marked `[?] MANUAL CHECK NEEDED` rather than assumed PASS
|
||||
- [ ] Skill uses a question to the user for at least one unverifiable quality item
|
||||
- [ ] Skill does not mark unverifiable items as PASS by default
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — lean vs full vs solo mode
|
||||
|
||||
**Fixture:**
|
||||
- `production/session-state/review-mode.txt` exists (or equivalent state file)
|
||||
- All required artifacts for the target gate are present
|
||||
- `design/gdd/game-concept.md` exists
|
||||
|
||||
**Case 5a — full mode:**
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/gate-check systems-design` (with full mode active)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `full`
|
||||
2. Skill spawns all 4 PHASE-GATE director prompts in parallel:
|
||||
- CD-PHASE-GATE (creative-director)
|
||||
- TD-PHASE-GATE (technical-director)
|
||||
- PR-PHASE-GATE (producer)
|
||||
- AD-PHASE-GATE (art-director)
|
||||
3. If one director returns CONCERNS → overall gate verdict is at minimum CONCERNS
|
||||
4. All 4 verdicts are collected before producing final output
|
||||
|
||||
**Assertions (5a):**
|
||||
- [ ] Skill reads review-mode before deciding which directors to spawn
|
||||
- [ ] All 4 PHASE-GATE director prompts are spawned (not just 1 or 2)
|
||||
- [ ] Directors are spawned in parallel (simultaneous, not sequential)
|
||||
- [ ] A CONCERNS verdict from any one director propagates to overall verdict
|
||||
- [ ] Verdict is NOT auto-PASS if any director returns CONCERNS or REJECT
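
The propagation rule in Case 5a amounts to "the worst verdict wins". A minimal sketch, assuming a director REJECT is as severe as FAIL and that `lean` mode is out of scope because this spec does not describe it; the function name `combine_verdicts` is hypothetical, and parallel spawning of directors is not modelled.

```python
DIRECTOR_GATES = ["CD-PHASE-GATE", "TD-PHASE-GATE", "PR-PHASE-GATE", "AD-PHASE-GATE"]
# Assumption: a director REJECT counts as severely as FAIL.
SEVERITY = {"PASS": 0, "CONCERNS": 1, "FAIL": 2, "REJECT": 2}
LEVELS = ["PASS", "CONCERNS", "FAIL"]


def combine_verdicts(artifact_verdict, director_verdicts, review_mode):
    """Return (overall_verdict, notes) for one gate check."""
    if review_mode == "solo":
        # Solo mode: no directors run; each skipped gate is noted explicitly.
        notes = [f"[{gate}] skipped — Solo mode" for gate in DIRECTOR_GATES]
        return artifact_verdict, notes
    if review_mode != "full":
        raise ValueError(f"review mode not modelled in this sketch: {review_mode}")
    missing = [gate for gate in DIRECTOR_GATES if gate not in director_verdicts]
    if missing:
        # All 4 verdicts must be collected before producing final output.
        raise ValueError("missing director verdicts: " + ", ".join(missing))
    worst = max(SEVERITY[v] for v in [artifact_verdict, *director_verdicts.values()])
    return LEVELS[worst], []
```

Taking the maximum severity means a single CONCERNS from any director lifts the overall verdict to at least CONCERNS, and the result can never be an automatic PASS while any director objects.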
|
||||
|
||||
**Case 5b — solo mode:**
|
||||
- `review-mode.txt` contains `solo`
|
||||
|
||||
**Input:** `/gate-check systems-design` (with solo mode active)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `solo`
|
||||
2. Each director is noted as skipped: "[CD-PHASE-GATE] skipped — Solo mode"
|
||||
3. Gate verdict is derived from artifact/quality checks only
|
||||
4. No director gates spawn
|
||||
|
||||
**Assertions (5b):**
|
||||
- [ ] No director gates are spawned in solo mode
|
||||
- [ ] Each skipped gate is explicitly noted in output: "[GATE-ID] skipped — Solo mode"
|
||||
- [ ] Verdict is based on artifact and quality checks only
|
||||
|
||||
**Note on Case 3:**
|
||||
The Case 3 assertion "Skill does not ask the user which gate to check if current
|
||||
stage is determinable" is correct as written. The skill does, however, use
|
||||
AskUserQuestion to confirm the auto-detected transition before running full
|
||||
checks. That is a confirmation step, not a gate selection, so Case 3 assertions
|
||||
must not treat the confirmation as a failure.
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Uses "May I write" before updating `production/stage.txt`
|
||||
- [ ] Presents the full checklist report before asking for write approval
|
||||
- [ ] Ends with a "Follow-Up Actions" section listing next steps per verdict
|
||||
- [ ] Never advances the stage without explicit user confirmation
|
||||
- [ ] Never auto-creates `production/stage.txt` if it doesn't exist without asking
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The Production → Polish and Polish → Release gates are not covered here
|
||||
because they require complex multi-artifact setups (sprint plans, playtest
|
||||
data, QA sign-off); these are deferred to dedicated follow-up specs.
|
||||
- The "CONCERNS" verdict path (minor gaps, not blocking) is not explicitly
|
||||
tested here; it falls between Case 1 and Case 2 and follows the same pattern.
|
||||
- The Vertical Slice validation block (Pre-Production → Production gate) is not
|
||||
covered because it requires a playable build context that cannot be expressed
|
||||
as a document fixture.
|
||||
@@ -0,0 +1,175 @@
# Skill Test Spec: /create-control-manifest

## Skill Summary

`/create-control-manifest` reads all Accepted ADRs from `docs/architecture/` and
generates a control manifest — a summary document that captures all architectural
constraints, required patterns, and forbidden patterns in one place. The manifest
is the reference document that story authors use when writing story files, ensuring
stories inherit the correct architectural rules without having to read all ADRs
individually.

The skill only includes Accepted ADRs; Proposed ADRs are excluded and noted. It
has no director gates. The skill asks "May I write" before writing
`docs/architecture/control-manifest.md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: CREATED, BLOCKED
- [ ] Contains "May I write" collaborative protocol language (for control-manifest.md)
- [ ] Has a next-step handoff at the end (`/create-epics` or `/create-stories`)
- [ ] Documents that only Accepted ADRs are included (not Proposed)
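
The first three structural checks above could be automated along these lines. This is a minimal sketch, not the actual `/skill-test static` implementation: the function name `static_check`, the assumption that frontmatter sits between the first two `---` lines, and the assumption that a phase heading is any markdown heading containing the word "Phase" are all illustrative.

```python
import re

REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def static_check(skill_text: str, verdicts: list[str]) -> dict[str, bool]:
    """Run three structural checks on the raw text of a skill file (hypothetical helper)."""
    # Assumption: frontmatter is the block between the first two '---' markers.
    parts = skill_text.split("---", 2)
    frontmatter = parts[1] if len(parts) == 3 and parts[0].strip() == "" else ""
    keys = {line.split(":", 1)[0].strip() for line in frontmatter.splitlines() if ":" in line}

    # Assumption: a phase heading is any markdown heading that contains "Phase".
    phase_headings = re.findall(r"^#{1,6}\s.*\bPhase\b.*$", skill_text, flags=re.MULTILINE)

    return {
        "frontmatter": all(field in keys for field in REQUIRED_FIELDS),
        "phases": len(phase_headings) >= 2,
        "verdicts": all(v in skill_text for v in verdicts),
    }
```

Calling `static_check(text, ["CREATED", "BLOCKED"])` returns one boolean per check, so a failing assertion can be reported by name.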

---

## Director Gate Checks

No director gates — this skill spawns no director gate agents. The control
manifest is a mechanical extraction from Accepted ADRs; no creative or technical
review gate is needed.

---

## Test Cases

### Case 1: Happy Path — 4 Accepted ADRs create a correct manifest

**Fixture:**
- `docs/architecture/` contains 4 ADR files, all with `Status: Accepted`
- Each ADR has a "Required Patterns" and/or "Forbidden Patterns" section
- No existing `docs/architecture/control-manifest.md`

**Input:** `/create-control-manifest`

**Expected behavior:**
1. Skill reads all ADR files in `docs/architecture/`
2. Extracts Required Patterns, Forbidden Patterns, and key constraints from each
3. Drafts the manifest with correct section structure
4. Shows the draft manifest to the user
5. Asks "May I write `docs/architecture/control-manifest.md`?"
6. Writes the manifest after approval

**Assertions:**
- [ ] All 4 Accepted ADRs are represented in the manifest
- [ ] Manifest includes distinct sections for Required Patterns and Forbidden Patterns
- [ ] Manifest includes the source ADR number for each constraint
- [ ] "May I write" is asked before writing
- [ ] Skill does NOT write without approval
- [ ] Verdict is CREATED after writing

---

### Case 2: Failure Path — No ADRs found

**Fixture:**
- `docs/architecture/` directory exists but contains no ADR files

**Input:** `/create-control-manifest`

**Expected behavior:**
1. Skill reads `docs/architecture/` and finds no ADR files
2. Skill outputs: "No ADRs found. Run `/architecture-decision` to create ADRs before generating the control manifest."
3. Skill exits without creating any file
4. Verdict is BLOCKED

**Assertions:**
- [ ] Skill outputs a clear error when no ADRs are found
- [ ] No control manifest file is written
- [ ] Skill recommends `/architecture-decision` as the next action
- [ ] Verdict is BLOCKED (not an error crash)

---

### Case 3: Mixed ADR Statuses — Only Accepted ADRs included

**Fixture:**
- `docs/architecture/` contains 3 Accepted ADRs and 2 Proposed ADRs

**Input:** `/create-control-manifest`

**Expected behavior:**
1. Skill reads all ADR files and filters by Status: Accepted
2. Manifest is drafted from the 3 Accepted ADRs only
3. Output notes: "2 Proposed ADRs were excluded: [adr-NNN-name, adr-NNN-name]"
4. User sees which ADRs were excluded before approving the write
5. Asks "May I write `docs/architecture/control-manifest.md`?"

**Assertions:**
- [ ] Only the 3 Accepted ADRs appear in the manifest content
- [ ] Excluded Proposed ADRs are listed by name in the output
- [ ] User sees the exclusion list before approving the write
- [ ] Skill does NOT silently omit Proposed ADRs without noting them
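
The filtering step in Case 3 can be illustrated with a short sketch. It assumes each ADR contains a line of the form `Status: <value>`, as the fixtures suggest; the helper names `group_by_status` and `exclusion_note` and the sample ADR names are hypothetical and not taken from the skill.

```python
import re


def group_by_status(adrs: dict[str, str]) -> dict[str, list[str]]:
    """Group ADR names by the value of their Status line (hypothetical helper)."""
    groups: dict[str, list[str]] = {}
    for name in sorted(adrs):
        # Assumption: each ADR carries a line of the form "Status: <value>".
        match = re.search(r"^Status:\s*(\w+)", adrs[name], flags=re.MULTILINE)
        status = match.group(1) if match else "Unknown"
        groups.setdefault(status, []).append(name)
    return groups


def exclusion_note(proposed: list[str]) -> str:
    """Build the note shown to the user before the write approval."""
    if not proposed:
        return ""
    noun, verb = ("ADR", "was") if len(proposed) == 1 else ("ADRs", "were")
    return f"{len(proposed)} Proposed {noun} {verb} excluded: [{', '.join(proposed)}]"
```

Only `groups["Accepted"]` would feed the manifest draft, while `exclusion_note(groups.get("Proposed", []))` yields the text that must appear before the "May I write" ask.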

---

### Case 4: Edge Case — Manifest already exists

**Fixture:**
- `docs/architecture/control-manifest.md` already exists (version 1, dated last week)
- `docs/architecture/` contains Accepted ADRs (some new since last manifest)

**Input:** `/create-control-manifest`

**Expected behavior:**
1. Skill detects existing manifest and reads its version number / date
2. Skill offers to regenerate: "control-manifest.md already exists (v1, [date]). Regenerate with current ADRs?"
3. If user confirms: skill drafts updated manifest, increments version number
4. Asks "May I write `docs/architecture/control-manifest.md`?" (overwrite)
5. Writes updated manifest after approval

**Assertions:**
- [ ] Skill reads and reports the existing manifest version before offering to regenerate
- [ ] User is offered a regenerate/skip choice — not auto-overwritten
- [ ] Updated manifest has an incremented version number
- [ ] "May I write" is asked before overwriting the existing file

---

### Case 5: Director Gate — No gate spawned; no review-mode.txt read

**Fixture:**
- 4 Accepted ADRs exist
- `production/session-state/review-mode.txt` exists with `full`

**Input:** `/create-control-manifest`

**Expected behavior:**
1. Skill reads ADRs and drafts manifest
2. Skill does NOT read `production/session-state/review-mode.txt`
3. No director gate agents are spawned at any point
4. Skill proceeds directly to "May I write" after drafting
5. Review mode setting has no effect on this skill's behavior

**Assertions:**
- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
- [ ] The manifest is generated from ADRs alone, with no external gate review

---

## Protocol Compliance

- [ ] Reads all ADR files before drafting manifest
- [ ] Only Accepted ADRs included — Proposed ones noted as excluded
- [ ] Manifest draft shown to user before "May I write" ask
- [ ] "May I write `docs/architecture/control-manifest.md`?" asked before writing
- [ ] No director gates — no review-mode.txt read
- [ ] Ends with next-step handoff: `/create-epics` or `/create-stories`

---

## Coverage Notes

- The exact section structure of the generated manifest (constraint tables, pattern
  lists) is defined by the skill body and not re-enumerated in test assertions.
- The `version` field incrementing logic (v1 → v2) is tested via Case 4 but exact
  version numbering format is not fixture-locked.
- ADR parsing (extracting Required/Forbidden Patterns) depends on consistent ADR
  structure — tested implicitly via Case 1's fixture.
190
CCGS Skill Testing Framework/skills/pipeline/create-epics.md
Normal file
@@ -0,0 +1,190 @@
# Skill Test Spec: /create-epics

## Skill Summary

`/create-epics` reads all approved GDDs and translates them into EPIC.md files,
one per system. Epics are organized by layer (Foundation → Core → Feature →
Presentation) and processed in priority order within each layer. Each EPIC.md
includes scope, governing ADRs, GDD requirements, engine risk level, and a
Definition of Done. The skill asks "May I write" before creating each EPIC file.

In `full` review mode, a PR-EPIC gate (producer) runs after drafting epics and
before writing any files. In `lean` or `solo` mode, PR-EPIC is skipped and noted.
Epics are written to `production/epics/[layer]/EPIC-[name].md`.
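
The layer ordering and output path described above can be sketched as follows. The lowercased layer directory, the numeric `priority` field (lower number first), and the function names are assumptions made for illustration; the spec only fixes the layer sequence and the `production/epics/[layer]/EPIC-[name].md` pattern.

```python
LAYER_ORDER = ["Foundation", "Core", "Feature", "Presentation"]


def epic_path(layer: str, name: str) -> str:
    """Build the EPIC file path for one system."""
    # Assumption: the layer directory is the lowercased layer name.
    return f"production/epics/{layer.lower()}/EPIC-{name}.md"


def processing_order(systems: list[dict]) -> list[dict]:
    """Sort by layer first, then by priority within the layer.

    Assumption: a lower priority number means the system is processed earlier.
    """
    return sorted(systems, key=lambda s: (LAYER_ORDER.index(s["layer"]), s["priority"]))
```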

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: CREATED, BLOCKED
- [ ] Contains "May I write" collaborative protocol language (per-epic approval)
- [ ] Has a next-step handoff at the end (`/create-stories`)
- [ ] Documents PR-EPIC gate behavior: runs in full mode; skipped in lean/solo

---

## Director Gate Checks

In `full` mode: PR-EPIC (producer) gate runs after epics are drafted and before
any epic file is written. If PR-EPIC returns CONCERNS, epics are revised before
the "May I write" ask.

In `lean` mode: PR-EPIC is skipped. Output notes: "PR-EPIC skipped — lean mode".

In `solo` mode: PR-EPIC is skipped. Output notes: "PR-EPIC skipped — solo mode".
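
The three review modes above reduce to a small decision table. The sketch below is illustrative only: the function name `gate_decision` is hypothetical, and raising on an unrecognized mode is an assumption, since the spec does not say how the skill treats other values in `review-mode.txt`.

```python
def gate_decision(gate_id: str, review_mode: str) -> tuple[bool, str]:
    """Return (run_gate, output_note) for a gate under a given review mode."""
    mode = review_mode.strip().lower()
    if mode == "full":
        return True, ""
    if mode in ("lean", "solo"):
        # The skip note wording follows the strings quoted in the spec.
        return False, f"{gate_id} skipped — {mode} mode"
    # Assumption: unknown values are treated as an error.
    raise ValueError(f"unknown review mode: {review_mode!r}")
```

The same helper would apply to the other gates named in these specs by passing a different `gate_id`.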

---

## Test Cases

### Case 1: Happy Path — Two approved GDDs create two EPIC files

**Fixture:**
- `design/gdd/systems-index.md` exists with 2 systems listed
- Both systems have approved GDDs in `design/gdd/`
- `docs/architecture/architecture.md` exists with matching modules
- At least one Accepted ADR exists for each system
- `production/session-state/review-mode.txt` contains `lean`

**Input:** `/create-epics`

**Expected behavior:**
1. Skill reads systems index and both GDDs
2. Drafts 2 EPIC definitions (layer, GDD path, ADRs, requirements, engine risk)
3. PR-EPIC gate is skipped (lean mode) — noted in output
4. For each epic: asks "May I write `production/epics/[layer]/EPIC-[name].md`?"
5. After approval: writes both EPIC files
6. Creates or updates `production/epics/index.md`

**Assertions:**
- [ ] Epic summary is shown before any write ask
- [ ] "May I write" is asked per-epic (not once for all epics together)
- [ ] Each EPIC.md contains: layer, GDD path, governing ADRs, requirements table, Definition of Done
- [ ] PR-EPIC skip is noted in output
- [ ] `production/epics/index.md` is updated after writing
- [ ] Skill does NOT write EPIC files without per-epic approval

---

### Case 2: Failure Path — No approved GDDs found

**Fixture:**
- `design/gdd/systems-index.md` exists
- No GDDs in `design/gdd/` have approved status (all are Draft or In Progress)

**Input:** `/create-epics`

**Expected behavior:**
1. Skill reads systems index and attempts to find approved GDDs
2. No approved GDDs found
3. Skill outputs: "No approved GDDs to convert. GDDs must be Approved before creating epics."
4. Skill suggests running `/design-system` and completing GDD approval first
5. Skill exits without creating any EPIC files

**Assertions:**
- [ ] Skill stops cleanly with a clear message when no approved GDDs exist
- [ ] No EPIC files are written
- [ ] Skill recommends the correct next action
- [ ] Verdict is BLOCKED

---

### Case 3: Director Gate — Full mode spawns PR-EPIC before writing

**Fixture:**
- 2 approved GDDs exist
- `production/session-state/review-mode.txt` contains `full`

**Full mode expected behavior:**
1. Skill drafts both epics
2. PR-EPIC gate spawns and reviews the epic drafts
3. If PR-EPIC returns APPROVED: "May I write" ask proceeds normally
4. Epic files are written after approval

**Assertions (full mode):**
- [ ] PR-EPIC gate appears in output as an active gate
- [ ] PR-EPIC runs before any "May I write" ask
- [ ] Epic files are NOT written before PR-EPIC completes

**Fixture (lean mode):**
- Same GDDs
- `production/session-state/review-mode.txt` contains `lean`

**Lean mode expected behavior:**
1. Epics are drafted
2. PR-EPIC is skipped — noted in output
3. "May I write" ask proceeds directly

**Assertions (lean mode):**
- [ ] "PR-EPIC skipped — lean mode" appears in output
- [ ] Skill proceeds to "May I write" without waiting for PR-EPIC

---

### Case 4: Edge Case — Epic already exists for a GDD

**Fixture:**
- `production/epics/[layer]/EPIC-[name].md` already exists for one of the approved GDDs
- The other GDD has no existing EPIC file

**Input:** `/create-epics`

**Expected behavior:**
1. Skill detects the existing EPIC file for the first system
2. Skill offers to update rather than overwrite: "EPIC-[name].md already exists. Update it, or skip?"
3. For the second system (no existing file): proceeds normally with "May I write"

**Assertions:**
- [ ] Skill detects existing EPIC files before writing
- [ ] User is offered "update" or "skip" options — not auto-overwritten
- [ ] The new system's EPIC is created normally without conflict

---

### Case 5: Director Gate — PR-EPIC returns CONCERNS

**Fixture:**
- 2 approved GDDs exist
- `production/session-state/review-mode.txt` contains `full`
- PR-EPIC gate returns CONCERNS (e.g., scope of one epic is too large)

**Input:** `/create-epics`

**Expected behavior:**
1. PR-EPIC gate spawns and returns CONCERNS with specific feedback
2. Skill surfaces the concerns to the user before any write ask
3. User is given options: revise epics, accept concerns and proceed, or stop
4. If user revises: updated epic drafts are shown before the "May I write" ask
5. Skill does NOT write epics while CONCERNS are unaddressed

**Assertions:**
- [ ] CONCERNS from PR-EPIC are shown to the user before writing
- [ ] Skill does NOT auto-write epics when CONCERNS are returned
- [ ] User is given a clear choice to revise, proceed, or stop
- [ ] Revised epic drafts are re-shown after revision before final approval

---

## Protocol Compliance

- [ ] Epic drafts shown to user before any "May I write" ask
- [ ] "May I write" asked per-epic, not once for the entire batch
- [ ] PR-EPIC gate (if active) runs before write asks — not after
- [ ] Skipped gates noted by name and mode in output
- [ ] EPIC.md content sourced only from GDDs, ADRs, and architecture docs — nothing invented
- [ ] Ends with next-step handoff: `/create-stories [epic-slug]` per created epic

---

## Coverage Notes

- Processing of Core, Feature, and Presentation layers follows the same per-epic
  pattern as Foundation — layer-specific ordering is not independently tested.
- Engine risk level assignment (LOW/MEDIUM/HIGH) from governing ADRs is
  validated implicitly via Case 1's fixture structure.
- The `layer: [name]` and `[system-name]` argument modes follow the same approval
  pattern as the default (all systems) mode.
191
CCGS Skill Testing Framework/skills/pipeline/create-stories.md
Normal file
@@ -0,0 +1,191 @@
# Skill Test Spec: /create-stories

## Skill Summary

`/create-stories` breaks a single epic into developer-ready story files. It reads
the EPIC.md, the corresponding GDD, governing ADRs, the control manifest, and the
TR registry. Each story gets structured frontmatter including: Title, Epic, Layer,
Priority, Status, TR-ID, ADR references, Acceptance Criteria, and Definition of
Done. Stories are classified by type (Logic / Integration / Visual/Feel / UI /
Config/Data) which determines the required test evidence path.

In `full` review mode, a QL-STORY-READY check runs per story after creation. In
`lean` or `solo` mode, QL-STORY-READY is skipped. The skill asks "May I write"
before writing each story file. Stories are written to
`production/epics/[layer]/story-[name].md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED, NEEDS WORK
- [ ] Contains "May I write" collaborative protocol language (per-story approval)
- [ ] Has a next-step handoff at the end (`/story-readiness`, `/dev-story`)
- [ ] Documents story Status: Blocked when governing ADR is Proposed
- [ ] Documents QL-STORY-READY gate: active in full mode, skipped in lean/solo

---

## Director Gate Checks

In `full` mode: QL-STORY-READY check runs per story after creation. Stories that
fail the check are noted as NEEDS WORK before the "May I write" ask.

In `lean` mode: QL-STORY-READY is skipped. Output notes:
"QL-STORY-READY skipped — lean mode" per story.

In `solo` mode: QL-STORY-READY is skipped with equivalent notes.

---

## Test Cases

### Case 1: Happy Path — Epic with 3 stories, all ADRs Accepted

**Fixture:**
- `production/epics/[layer]/EPIC-[name].md` exists with 3 GDD requirements
- Corresponding GDD exists with matching acceptance criteria
- All governing ADRs have `Status: Accepted`
- `docs/architecture/control-manifest.md` exists
- `docs/architecture/tr-registry.yaml` has TR-IDs for all 3 requirements
- `production/session-state/review-mode.txt` contains `lean`

**Input:** `/create-stories [epic-name]`

**Expected behavior:**
1. Skill reads EPIC.md, GDD, governing ADRs, control manifest, and TR registry
2. Classifies each requirement into a story type (Logic / Integration / Visual/Feel / UI / Config/Data)
3. Drafts 3 story files with correct frontmatter schema
4. QL-STORY-READY is skipped (lean mode) — noted in output
5. Asks "May I write" before writing each story file
6. Writes all 3 story files after approval

**Assertions:**
- [ ] Each story's frontmatter contains: Title, Epic, Layer, Priority, Status, TR-ID, ADR reference, Acceptance Criteria, DoD
- [ ] Story types are correctly classified (at least one Logic type in fixture)
- [ ] "May I write" is asked per story (not once for the entire batch)
- [ ] QL-STORY-READY skip is noted in output
- [ ] All 3 story files are written with correct naming: `story-[name].md`
- [ ] Skill does NOT start implementation

---

### Case 2: Failure Path — No epic file found

**Fixture:**
- The epic path provided does not exist in `production/epics/`

**Input:** `/create-stories nonexistent-epic`

**Expected behavior:**
1. Skill attempts to read the EPIC.md file
2. File not found
3. Skill outputs a clear error with the path it searched
4. Skill suggests checking `production/epics/` or running `/create-epics` first
5. No story files are created

**Assertions:**
- [ ] Skill outputs a clear error naming the missing file path
- [ ] No story files are written
- [ ] Skill recommends the correct next action (`/create-epics`)
- [ ] Skill does NOT create stories without a valid EPIC.md

---

### Case 3: Blocked Story — ADR is Proposed

**Fixture:**
- EPIC.md exists with 2 requirements
- Requirement 1 is covered by an Accepted ADR
- Requirement 2 is covered by an ADR with `Status: Proposed`

**Input:** `/create-stories [epic-name]`

**Expected behavior:**
1. Skill reads the ADR for Requirement 2 and finds Status: Proposed
2. Story for Requirement 2 is drafted with `Status: Blocked`
3. Blocking note references the specific ADR: "BLOCKED: ADR-NNN is Proposed"
4. Story for Requirement 1 is drafted normally with `Status: Ready`
5. Both stories are shown in the draft — user asked "May I write" for both

**Assertions:**
- [ ] Story 2 has `Status: Blocked` in its frontmatter
- [ ] Blocking note names the specific ADR number and recommends `/architecture-decision`
- [ ] Story 1 has `Status: Ready` — blocked status does not affect non-blocked stories
- [ ] Blocked status is shown in the draft preview before writing
- [ ] Both story files are written (blocked stories are still written — just flagged)
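
The per-story status rule in Case 3 can be written as a small function. This is a sketch under stated assumptions: the name `story_status` is hypothetical, the sentence appended after "BLOCKED: ADR-NNN is Proposed" is illustrative wording, and raising on statuses other than Accepted or Proposed is a choice made here because the spec does not cover them.

```python
def story_status(adr_id: str, adr_status: str) -> tuple[str, str]:
    """Return (status, blocking_note) for a story governed by a single ADR."""
    if adr_status == "Accepted":
        return "Ready", ""
    if adr_status == "Proposed":
        # The note names the ADR and points at the skill that advances it.
        note = f"BLOCKED: {adr_id} is Proposed. Run /architecture-decision to advance it."
        return "Blocked", note
    # Assumption: any other status is outside what the spec describes.
    raise ValueError(f"unhandled ADR status: {adr_status!r}")
```

Because the status is computed per story, a blocked story leaves its siblings at `Ready`, which is what the third assertion checks.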

---

### Case 4: Edge Case — No argument provided

**Fixture:**
- `production/epics/` directory exists with ≥2 epic subdirectories

**Input:** `/create-stories` (no argument)

**Expected behavior:**
1. Skill detects no argument is provided
2. Outputs a usage error: "No epic specified. Usage: /create-stories [epic-name]"
3. Skill lists available epics from `production/epics/`
4. No story files are created

**Assertions:**
- [ ] Skill outputs a usage error when no argument is given
- [ ] Skill lists available epics to help the user choose
- [ ] No story files are written
- [ ] Skill does NOT silently pick an epic without user input

---

### Case 5: Director Gate — Full mode runs QL-STORY-READY; stories failing noted as NEEDS WORK

**Fixture:**
- EPIC.md exists with 2 requirements
- Both governing ADRs are Accepted
- `production/session-state/review-mode.txt` contains `full`
- QL-STORY-READY check finds one story has ambiguous acceptance criteria

**Input:** `/create-stories [epic-name]`

**Expected behavior:**
1. Both stories are drafted
2. QL-STORY-READY check runs for each story
3. Story 1 passes QL-STORY-READY
4. Story 2 fails QL-STORY-READY — noted as NEEDS WORK with specific feedback
5. Both stories are shown to user with pass/fail status before "May I write"
6. User can proceed (story written as-is with NEEDS WORK note) or revise first

**Assertions:**
- [ ] QL-STORY-READY results appear per story in the output
- [ ] Story 2 is flagged as NEEDS WORK with the specific failing criteria
- [ ] Story 1 shows as passing QL-STORY-READY
- [ ] User is given the choice to proceed or revise before writing
- [ ] Skill does NOT auto-block writing of stories that fail QL-STORY-READY without user input

---

## Protocol Compliance

- [ ] All context (EPIC, GDD, ADRs, manifest, TR registry) loaded before drafting stories
- [ ] Story drafts shown in full before any "May I write" ask
- [ ] "May I write" asked per story (not once for the entire batch)
- [ ] Blocked stories flagged before write approval — not discovered after writing
- [ ] TR-IDs reference the registry — requirement text is not embedded inline in story files
- [ ] Control manifest rules quoted per-story from the manifest, not invented
- [ ] Ends with next-step handoff: `/story-readiness` → `/dev-story`

---

## Coverage Notes

- Integration story test evidence (playtest doc alternative) follows the same
  approval pattern as Logic stories — not independently fixture-tested.
- Story ordering (foundational first, UI last) is validated implicitly via
  Case 1's multi-story fixture.
- The story sizing rule (splitting large requirement groups) is not tested here
  — it is addressed in the `/create-stories` skill's internal logic.
205
CCGS Skill Testing Framework/skills/pipeline/dev-story.md
Normal file
@@ -0,0 +1,205 @@
# Skill Test Spec: /dev-story

## Skill Summary

`/dev-story` reads a story file, loads all required context (referenced ADR,
TR-ID from the registry, control manifest, engine preferences), implements the
story, verifies that all acceptance criteria are met, and marks the story
Complete. The skill routes implementation to the correct specialist agent based
on the engine and file type — it does not write source code directly.

In `full` review mode, an LP-CODE-REVIEW gate runs before marking the story
Complete. In `lean` or `solo` mode, LP-CODE-REVIEW is skipped and the story is
marked Complete after the user confirms all criteria are met. The skill asks
"May I write" before updating story status and before writing code files.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED, IN PROGRESS, NEEDS CHANGES
- [ ] Contains "May I write" collaborative protocol language (story status + code files)
- [ ] Has a next-step handoff at the end (`/story-done`)
- [ ] Documents LP-CODE-REVIEW gate: active in full mode, skipped in lean/solo
- [ ] Notes that implementation is delegated to specialist agents (not done directly)

---

## Director Gate Checks

In `full` mode: LP-CODE-REVIEW gate runs after implementation is complete and all
criteria are verified, before marking the story Complete.

In `lean` mode: LP-CODE-REVIEW is skipped. Output notes:
"LP-CODE-REVIEW skipped — lean mode". Story is marked Complete after user confirms.

In `solo` mode: LP-CODE-REVIEW is skipped with equivalent notes.

---

## Test Cases

### Case 1: Happy Path — Story implemented and marked Complete (full mode)

**Fixture:**
- A story file exists at `production/epics/[layer]/story-[name].md` with:
  - `Status: Ready`
  - A TR-ID referencing a registered requirement
  - At least 2 Given-When-Then acceptance criteria
  - A test evidence path
- Referenced ADR has `Status: Accepted`
- `docs/architecture/control-manifest.md` exists
- `.claude/docs/technical-preferences.md` has engine and language configured
- `production/session-state/review-mode.txt` contains `full`

**Input:** `/dev-story production/epics/[layer]/story-[name].md`

**Expected behavior:**
1. Skill reads the story file and all referenced context
2. Skill verifies the ADR is Accepted (no block)
3. Skill routes implementation to the correct specialist agent
4. All acceptance criteria are verified as met
5. LP-CODE-REVIEW gate spawns and returns APPROVED
6. Skill asks "May I update story status to Complete?"
7. Story status is updated to Complete

**Assertions:**
- [ ] Skill reads story before spawning any agent
- [ ] ADR status is checked before implementation begins
- [ ] Implementation is delegated to a specialist agent (not done inline)
- [ ] All acceptance criteria are confirmed before LP-CODE-REVIEW
- [ ] LP-CODE-REVIEW appears in output as a completed gate
- [ ] Story status is updated to Complete only after gate approval and user consent
- [ ] Test file is written as part of implementation (not deferred)

---

### Case 2: Failure Path — Referenced ADR is Proposed

**Fixture:**
- A story file exists with `Status: Ready`
- The story's TR-ID points to a requirement covered by an ADR with `Status: Proposed`

**Input:** `/dev-story production/epics/[layer]/story-[name].md`

**Expected behavior:**
1. Skill reads the story file
2. Skill resolves the TR-ID and reads the governing ADR
3. ADR status is Proposed — skill outputs a BLOCKED message
4. Skill names the specific ADR blocking the story
5. Skill recommends running `/architecture-decision` to advance the ADR
6. Implementation does NOT begin

**Assertions:**
- [ ] Skill does NOT begin implementation with a Proposed ADR
- [ ] BLOCKED message names the specific ADR number and title
- [ ] Skill recommends `/architecture-decision` as the next action
- [ ] Story status remains unchanged (not set to In Progress or Complete)

---

### Case 3: Ambiguous Acceptance Criteria — Skill asks for clarification

**Fixture:**
- A story file exists with `Status: Ready`
- Referenced ADR is Accepted
- One acceptance criterion is ambiguous (not Given-When-Then; uses subjective language like "feels responsive")

**Input:** `/dev-story production/epics/[layer]/story-[name].md`

**Expected behavior:**
1. Skill reads the story and identifies the ambiguous criterion
2. Before routing to the specialist, skill asks the user to clarify the criterion
3. User provides a concrete, testable restatement
4. Skill proceeds with implementation using the clarified criterion
5. Skill does NOT guess at the intended behavior

**Assertions:**
- [ ] Skill surfaces the ambiguous criterion before implementation starts
- [ ] Skill asks for user clarification (not auto-interpretation)
- [ ] Implementation begins only after clarification is provided
- [ ] Clarified criterion is used in the test (not the original vague version)

---

### Case 4: Edge Case — No argument; reads from session state

**Fixture:**
- No argument is provided
- `production/session-state/active.md` references an active story file
- That story file exists with `Status: In Progress`

**Input:** `/dev-story` (no argument)

**Expected behavior:**
1. Skill detects no argument is provided
2. Skill reads `production/session-state/active.md`
3. Skill finds the active story reference
4. Skill confirms with user: "Continuing work on [story title] — is that correct?"
5. After confirmation, skill proceeds with that story

**Assertions:**
- [ ] Skill reads session state when no argument is provided
- [ ] Skill confirms the active story with the user before proceeding
- [ ] Skill does NOT silently assume the active story without confirmation
- [ ] If session state has no active story, skill asks which story to implement
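
The fallback in Case 4 can be sketched as a resolver that prefers the explicit argument and otherwise looks in the session state. How `active.md` actually records the active story is not stated in the spec; the sketch assumes the file mentions the story by its `production/epics/.../story-....md` path, and the name `resolve_story` is hypothetical.

```python
import re
from typing import Optional


def resolve_story(argument: Optional[str], active_md: str) -> tuple[Optional[str], bool]:
    """Return (story_path, needs_confirmation) for a /dev-story invocation."""
    if argument:
        # An explicit argument is used as given; no session-state lookup is needed.
        return argument, False
    # Assumption: active.md references the story by its path under production/epics/.
    match = re.search(r"production/epics/\S+?/story-\S+?\.md", active_md)
    if match:
        # Found via session state: confirm with the user before proceeding.
        return match.group(0), True
    # No active story recorded: the skill should ask which story to implement.
    return None, False
```

A `None` path corresponds to the last assertion above, where the skill has to ask which story to implement.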

---

### Case 5: Director Gate — LP-CODE-REVIEW returns NEEDS CHANGES; lean mode skips gate

**Fixture (full mode):**
- Story is implemented and all criteria appear met
- `production/session-state/review-mode.txt` contains `full`
- LP-CODE-REVIEW gate returns NEEDS CHANGES with specific feedback

**Full mode expected behavior:**
1. LP-CODE-REVIEW gate spawns after implementation
2. Gate returns NEEDS CHANGES with 2 specific issues
3. Story status remains In Progress — NOT marked Complete
4. User is shown the gate feedback and asked how to proceed

**Assertions (full mode):**
- [ ] Story is NOT marked Complete when LP-CODE-REVIEW returns NEEDS CHANGES
- [ ] Gate feedback is shown to the user verbatim
- [ ] Story status stays In Progress until issues are resolved and gate passes

**Fixture (lean mode):**
- Same story, `production/session-state/review-mode.txt` contains `lean`

**Lean mode expected behavior:**
1. Implementation completes
2. LP-CODE-REVIEW gate is skipped — noted in output
3. User is asked to confirm all criteria are met
4. Story is marked Complete after user confirmation

**Assertions (lean mode):**
- [ ] "LP-CODE-REVIEW skipped — lean mode" appears in output
- [ ] Story is marked Complete after user confirms criteria (no gate required)
- [ ] Skill does NOT block on a gate that is skipped
|
||||
|
||||
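The two modes in Case 5 amount to a small decision table, which might be sketched like this in Python. The function names are hypothetical, rejecting unknown modes with `ValueError` is an assumption (the spec does not say what happens for other values), and the skip note reuses the exact wording asserted above.

```python
from typing import Optional, Tuple


def plan_gate(gate_id: str, review_mode_text: str) -> Tuple[bool, Optional[str]]:
    """Return (run_gate, skip_note) for one director gate."""
    mode = review_mode_text.strip().lower()
    if mode == "full":
        return True, None
    if mode == "lean":
        # The note mirrors the string asserted in the lean-mode case.
        return False, f"{gate_id} skipped — {mode} mode"
    # Assumption: any other value is treated as a configuration error.
    raise ValueError(f"unrecognized review mode: {mode!r}")


def may_mark_complete(run_gate: bool, gate_verdict: Optional[str],
                      user_confirmed: bool) -> bool:
    """Full mode needs an APPROVED gate; both modes need the user's approval."""
    if run_gate and gate_verdict != "APPROVED":
        return False  # e.g. NEEDS CHANGES keeps the story In Progress
    return user_confirmed
```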
---

## Protocol Compliance

- [ ] Does NOT write source code directly — delegates to specialist agents
- [ ] Reads all context (story, TR-ID, ADR, manifest, engine prefs) before implementation
- [ ] "May I write" asked before updating story status and before writing code files
- [ ] Skipped gates noted by name and mode in output
- [ ] Updates `production/session-state/active.md` after story completion
- [ ] Ends with next-step handoff: `/story-done`

---

## Coverage Notes

- Engine routing logic (Godot vs Unity vs Unreal) is not tested per engine —
  the routing pattern is consistent; engine selection is a config fact.
- Visual/Feel and UI story types (no automated test required) have different
  evidence requirements and are not covered in these cases.
- Integration story type follows the same pattern as Logic but with a different
  evidence path — not independently fixture-tested.
196
CCGS Skill Testing Framework/skills/pipeline/map-systems.md
Normal file
@@ -0,0 +1,196 @@
# Skill Test Spec: /map-systems

## Skill Summary

`/map-systems` decomposes a game concept into a systems index. It reads the
approved game concept and pillars, enumerates both explicit and implicit systems,
maps dependencies between systems, assigns priority tiers (MVP / Vertical Slice /
Alpha / Full Vision), and organizes systems into a layered design order
(Foundation → Core → Feature → Presentation). The output is written to
`design/systems-index.md` after user approval.

This skill is required between game concept approval and per-system GDD creation
— it is a mandatory gate in the pipeline. In `full` review mode, CD-SYSTEMS
(creative-director) and TD-SYSTEM-BOUNDARY (technical-director) spawn in parallel
after the decomposition is drafted. In `lean` or `solo` mode, both gates are
skipped. The skill writes to `design/systems-index.md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" collaborative protocol language (for systems-index.md)
- [ ] Has a next-step handoff at the end (`/design-system`)
- [ ] Documents gate behavior: CD-SYSTEMS + TD-SYSTEM-BOUNDARY in parallel in full mode

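For illustration only, structural checks of this kind could be approximated as below. How `/skill-test static` actually works is not shown in this spec, so the frontmatter layout (fields between two `---` lines at the top of the file) and the rule that a phase heading is a `##` or `###` heading containing the word "Phase" are both assumptions of this sketch, as is the function name.

```python
import re
from typing import List

REQUIRED_FIELDS = ["name", "description", "argument-hint",
                   "user-invocable", "allowed-tools"]


def static_problems(skill_text: str, verdict_keywords: List[str],
                    min_phases: int = 2,
                    requires_write_ask: bool = True) -> List[str]:
    """Return a list of structural problems; an empty list means pass."""
    problems = []
    # Assumed layout: frontmatter sits between two '---' lines at the top.
    fm = re.match(r"^---\n(.*?)\n---\n", skill_text, re.DOTALL)
    frontmatter = fm.group(1) if fm else ""
    for field in REQUIRED_FIELDS:
        if not re.search(rf"^{re.escape(field)}:", frontmatter, re.MULTILINE):
            problems.append(f"missing frontmatter field: {field}")
    # Assumed convention: phase headings are markdown headings with 'Phase'.
    phases = re.findall(r"^#{2,3} .*Phase", skill_text, re.MULTILINE)
    if len(phases) < min_phases:
        problems.append(f"found {len(phases)} phase headings, need {min_phases}")
    for keyword in verdict_keywords:
        if keyword not in skill_text:
            problems.append(f"missing verdict keyword: {keyword}")
    # Read-only skills pass requires_write_ask=False to skip this check.
    if requires_write_ask and "May I write" not in skill_text:
        problems.append('missing "May I write" protocol language')
    return problems
```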
---

## Director Gate Checks

In `full` mode: CD-SYSTEMS (creative-director) and TD-SYSTEM-BOUNDARY
(technical-director) spawn in parallel after the systems decomposition is drafted
and before `design/systems-index.md` is written.

In `lean` mode: both gates are skipped. Output notes:
"CD-SYSTEMS skipped — lean mode" and "TD-SYSTEM-BOUNDARY skipped — lean mode".

In `solo` mode: both gates are skipped with equivalent notes.

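One way to picture the parallel spawn and the skip notes is the Python sketch below. It is only an analogy: the `spawn` callback stands in for whatever mechanism launches a director agent, a thread pool is used purely to show both gates being submitted before either result is awaited, and the error on unknown modes is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

GATES = ["CD-SYSTEMS", "TD-SYSTEM-BOUNDARY"]


def run_gates(mode: str,
              spawn: Callable[[str], str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (verdicts, notes) for the two map-systems gates."""
    if mode in ("lean", "solo"):
        # No gate spawns, but each skip is noted by gate name and mode.
        return {}, [f"{gate} skipped — {mode} mode" for gate in GATES]
    if mode != "full":
        raise ValueError(f"unrecognized review mode: {mode!r}")
    # Full mode: submit both gates before waiting on either, so they overlap.
    with ThreadPoolExecutor(max_workers=len(GATES)) as pool:
        futures = {gate: pool.submit(spawn, gate) for gate in GATES}
        verdicts = {gate: future.result() for gate, future in futures.items()}
    return verdicts, []


def ready_for_write_ask(verdicts: Dict[str, str]) -> bool:
    """True when every gate that ran returned APPROVED (or none ran)."""
    return all(verdict == "APPROVED" for verdict in verdicts.values())
```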
---

## Test Cases

### Case 1: Happy Path — Game concept exists, 5-8 systems identified

**Fixture:**
- `design/gdd/game-concept.md` exists with Core Mechanics and MVP Definition sections
- `design/gdd/game-pillars.md` exists with ≥1 pillar defined
- No `design/systems-index.md` exists yet
- `production/session-state/review-mode.txt` contains `full`

**Input:** `/map-systems`

**Expected behavior:**
1. Skill reads game-concept.md and game-pillars.md
2. Identifies 5-8 systems (explicit + implicit)
3. Maps dependencies between systems and assigns layers
4. CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel and return APPROVED
5. Asks "May I write `design/systems-index.md`?"
6. Writes systems-index.md after approval
7. Updates `production/session-state/active.md`

**Assertions:**
- [ ] Between 5 and 8 systems are identified (not fewer, not more without explanation)
- [ ] CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel (not sequentially)
- [ ] Both gates complete before the "May I write" ask
- [ ] "May I write `design/systems-index.md`?" is asked before writing
- [ ] systems-index.md is NOT written without approval
- [ ] Session state is updated after writing
- [ ] Verdict is COMPLETE

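Step 3 (mapping dependencies and assigning layers) is at heart a graph problem. The Python sketch below, which assumes Python 3.9+ for the standard-library `graphlib` module, derives a dependency depth for each system and reports a cycle if one exists. The system names in any fixture are hypothetical, and how a numeric depth maps onto Foundation / Core / Feature / Presentation is a design judgment this sketch does not attempt to make.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def dependency_depths(deps: Dict[str, Set[str]]) -> Dict[str, int]:
    """Map each system to its depth: 0 means it depends on nothing."""
    depths: Dict[str, int] = {}
    # static_order() yields dependencies before dependents and raises
    # CycleError if the graph contains a cycle.
    for system in TopologicalSorter(deps).static_order():
        parents = deps.get(system, ())
        depths[system] = 1 + max((depths[p] for p in parents), default=-1)
    return depths


def find_cycle(deps: Dict[str, Set[str]]) -> List[str]:
    """Return the nodes of one dependency cycle, or [] if there is none."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # The second element of args holds the detected cycle.
        return list(err.args[1])
    return []
```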
---

### Case 2: Failure Path — No game concept found

**Fixture:**
- `design/gdd/game-concept.md` does NOT exist
- `design/gdd/` directory may be empty or absent

**Input:** `/map-systems`

**Expected behavior:**
1. Skill attempts to read `design/gdd/game-concept.md`
2. File not found
3. Skill outputs: "No game concept found. Run `/brainstorm` to create one, then return to `/map-systems`."
4. Skill exits without creating systems-index.md

**Assertions:**
- [ ] Skill outputs a clear error naming the missing file path
- [ ] Skill recommends `/brainstorm` as the next action
- [ ] No systems-index.md is created
- [ ] Verdict is BLOCKED

---

### Case 3: Director Gate — CD-SYSTEMS returns CONCERNS (missing core system)

**Fixture:**
- Game concept exists
- `production/session-state/review-mode.txt` contains `full`
- CD-SYSTEMS gate returns CONCERNS: "The [core-system] is implied by the concept but not identified"

**Input:** `/map-systems`

**Expected behavior:**
1. Systems are drafted (5-8 initial systems identified)
2. CD-SYSTEMS gate returns CONCERNS naming the missing core system
3. TD-SYSTEM-BOUNDARY returns APPROVED
4. Skill surfaces CD-SYSTEMS concerns to user
5. User is asked: revise systems list to add the missing system, or proceed as-is
6. If revised: updated systems list shown before "May I write" ask

**Assertions:**
- [ ] CD-SYSTEMS concerns are shown to the user before writing
- [ ] Skill does NOT auto-write systems-index.md while CONCERNS are unresolved
- [ ] User is given the option to revise or proceed
- [ ] Revised systems list is re-shown after revision before final "May I write"

---

### Case 4: Edge Case — systems-index.md already exists

**Fixture:**
- `design/gdd/game-concept.md` exists
- `design/systems-index.md` already exists with N systems

**Input:** `/map-systems`

**Expected behavior:**
1. Skill reads the existing systems-index.md and presents its current state
2. Skill asks: "systems-index.md already exists with [N] systems. Update with new systems, or review and revise priorities?"
3. User chooses an action
4. Skill does NOT silently overwrite the existing index

**Assertions:**
- [ ] Skill detects and reads the existing systems-index.md before proceeding
- [ ] User is offered update/review options — not auto-overwritten
- [ ] Existing system count is presented to the user
- [ ] Skill does NOT proceed with a full re-decomposition without user choosing to do so

---

### Case 5: Director Gate — Lean mode and solo mode both skip gates, noted

**Fixture (lean mode):**
- Game concept exists
- `production/session-state/review-mode.txt` contains `lean`

**Lean mode expected behavior:**
1. Systems are decomposed and drafted
2. Both CD-SYSTEMS and TD-SYSTEM-BOUNDARY are skipped
3. Output notes: "CD-SYSTEMS skipped — lean mode" and "TD-SYSTEM-BOUNDARY skipped — lean mode"
4. "May I write" ask proceeds directly

**Assertions (lean mode):**
- [ ] Both gate skip notes appear in output
- [ ] Skill proceeds to "May I write" without gate approval
- [ ] systems-index.md is written after user approval

**Fixture (solo mode):**
- Same game concept, `production/session-state/review-mode.txt` contains `solo`

**Solo mode expected behavior:**
1. Same decomposition workflow
2. Both gates skipped — noted in output with "solo mode"
3. "May I write" ask proceeds

**Assertions (solo mode):**
- [ ] Both skip notes appear with "solo mode" label
- [ ] Behavior is otherwise identical to lean mode for this skill

---

## Protocol Compliance

- [ ] Reads game-concept.md and game-pillars.md before any decomposition
- [ ] "May I write `design/systems-index.md`?" asked before writing
- [ ] systems-index.md is NOT written without user approval
- [ ] CD-SYSTEMS and TD-SYSTEM-BOUNDARY spawn in parallel in full mode
- [ ] Skipped gates noted by name and mode in lean/solo output
- [ ] Ends with next-step handoff: `/design-system [next-system]`

---

## Coverage Notes

- Circular dependency detection (System A depends on System B which depends on A)
  is part of the dependency mapping phase — not independently fixture-tested here.
- Priority tier assignment (MVP heuristics) is evaluated as part of the Case 1
  collaborative workflow rather than independently.
- The `next` argument mode (handing off the highest-priority undesigned system to
  `/design-system`) is not tested here — it is a post-index-creation convenience.
@@ -0,0 +1,175 @@
# Skill Test Spec: /propagate-design-change

## Skill Summary

`/propagate-design-change` handles GDD revision cascades. When a GDD is updated,
the skill traces all downstream artifacts that reference it: ADRs, TR-registry
entries, stories, and epics. It produces a structured impact report showing what
needs to change and why. The skill does NOT automatically apply changes — it
proposes edits for each affected artifact and asks "May I write" per artifact
before making any modification.

The skill is read-only during analysis and write-gated per artifact during the
update phase. It has no director gates — the analysis itself is mechanical
tracing, not a creative review.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED, NO IMPACT
- [ ] Contains "May I write" collaborative protocol language (per-artifact approval)
- [ ] Has a next-step handoff at the end
- [ ] Documents that changes are proposed, not applied automatically

---

## Director Gate Checks

No director gates — this skill spawns no director gate agents during analysis.
The impact report is a mechanical tracing operation; no creative or technical
director review is required at the analysis stage.

---

## Test Cases

### Case 1: Happy Path — GDD revision affects 2 stories and 1 epic

**Fixture:**
- `design/gdd/[system].md` exists and has been recently revised (git diff shows changes)
- `production/epics/[layer]/EPIC-[system].md` references this GDD
- 2 story files reference TR-IDs from this GDD
- The changed GDD section affects the acceptance criteria of both stories

**Input:** `/propagate-design-change design/gdd/[system].md`

**Expected behavior:**
1. Skill reads the revised GDD and identifies what changed (git diff or content comparison)
2. Skill scans ADRs, TR-registry, epics, and stories for references to this GDD
3. Skill produces an impact report: 1 epic affected, 2 stories affected
4. Skill shows the proposed change for each artifact
5. For each artifact: asks "May I update [filepath]?" separately
6. Applies changes only after per-artifact approval

**Assertions:**
- [ ] Impact report identifies all 3 affected artifacts (1 epic + 2 stories)
- [ ] Each affected artifact's proposed change is shown before asking to write
- [ ] "May I write" is asked per artifact (not once for all artifacts)
- [ ] Skill does NOT apply any changes without per-artifact approval
- [ ] Verdict is COMPLETE after all approved changes are applied

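The scan and the per-artifact approval loop in Case 1 could be sketched as below. This is a simplified Python illustration under stated assumptions: artifacts are passed in as a path-to-text mapping rather than read from disk, references are found by plain substring match (which would also match a longer ID sharing the same prefix), and the function names and the `ask` callback are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def affected_artifacts(gdd_path: str, tr_ids: Iterable[str],
                       artifacts: Dict[str, str]) -> List[str]:
    """Return artifact paths whose text mentions the GDD path or a TR-ID."""
    needles = [gdd_path, *tr_ids]
    return sorted(path for path, text in artifacts.items()
                  if any(needle in text for needle in needles))


def approved_updates(affected: List[str],
                     ask: Callable[[str], bool]) -> List[str]:
    """Ask once per artifact; only approved paths are returned for writing."""
    return [path for path in affected if ask(f"May I update {path}?")]


def impact_verdict(affected: List[str]) -> str:
    """NO IMPACT when nothing references the GDD, otherwise COMPLETE."""
    return "COMPLETE" if affected else "NO IMPACT"
```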
---

### Case 2: No Impact — Changed GDD has no downstream references

**Fixture:**
- `design/gdd/[system].md` exists and has been revised
- No ADRs, stories, or epics reference this GDD's TR-IDs or GDD path

**Input:** `/propagate-design-change design/gdd/[system].md`

**Expected behavior:**
1. Skill reads the revised GDD
2. Skill scans all ADRs, stories, and epics for references
3. No references found
4. Skill outputs: "No downstream impact found for [system].md — no artifacts reference this GDD."
5. No write operations are performed

**Assertions:**
- [ ] Skill outputs the "No downstream impact found" message
- [ ] Verdict is NO IMPACT
- [ ] No "May I write" asks are issued (nothing to update)
- [ ] Skill does NOT error or crash when no references are found

---

### Case 3: In-Progress Story Warning — Referenced story is currently being developed

**Fixture:**
- A story referencing this GDD has `Status: In Progress`
- The developer has already started implementing this story

**Input:** `/propagate-design-change design/gdd/[system].md`

**Expected behavior:**
1. Skill identifies the In Progress story as an affected artifact
2. Skill outputs an elevated warning: "CAUTION: [story-file] is currently In Progress — a developer may be working on this. Coordinate before updating."
3. The warning appears in the impact report before the "May I write" ask for that story
4. User can still approve or skip the update for that story

**Assertions:**
- [ ] In Progress story is flagged with an elevated warning (distinct from regular affected-artifact entries)
- [ ] Warning appears before the "May I write" ask for that story
- [ ] Skill still offers to update the story — the warning does not block the option
- [ ] Other (non-In-Progress) artifacts are not affected by this warning

---

### Case 4: Edge Case — No argument provided

**Fixture:**
- Multiple GDDs exist in `design/gdd/`

**Input:** `/propagate-design-change` (no argument)

**Expected behavior:**
1. Skill detects no argument is provided
2. Skill outputs a usage error: "No GDD specified. Usage: /propagate-design-change design/gdd/[system].md"
3. Skill lists recently modified GDDs as suggestions (git log)
4. No analysis is performed

**Assertions:**
- [ ] Skill outputs a usage error when no argument is given
- [ ] Usage example is shown with the correct path format
- [ ] No impact analysis is performed without a target GDD
- [ ] Skill does NOT silently pick a GDD without user input

---

### Case 5: Director Gate — No gate spawned regardless of review mode

**Fixture:**
- A GDD has been revised with downstream references
- `production/session-state/review-mode.txt` exists with `full`

**Input:** `/propagate-design-change design/gdd/[system].md`

**Expected behavior:**
1. Skill reads the GDD and traces downstream references
2. Skill does NOT read `production/session-state/review-mode.txt`
3. No director gate agents are spawned at any point
4. Impact report is produced and per-artifact approval proceeds normally

**Assertions:**
- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
- [ ] Review mode has no effect on this skill's behavior

---

## Protocol Compliance

- [ ] Reads revised GDD and all potentially affected artifacts before producing impact report
- [ ] Impact report shown in full before any "May I write" ask
- [ ] "May I write" asked per artifact — never for the entire set at once
- [ ] In Progress stories flagged with elevated warning before their approval ask
- [ ] No director gates — no review-mode.txt read
- [ ] Ends with next-step handoff appropriate to verdict (COMPLETE or NO IMPACT)

---

## Coverage Notes

- ADR impact (when a GDD change requires an ADR update or new ADR) follows the
  same per-artifact approval pattern as story/epic updates — not independently
  fixture-tested.
- TR-registry impact (when changed GDD requires new or updated TR-IDs) is part
  of the analysis phase but not independently fixture-tested.
- The git diff comparison method (detecting what changed in the GDD) is a runtime
  concern — fixtures use pre-arranged content differences.
209
CCGS Skill Testing Framework/skills/readiness/story-done.md
Normal file
@@ -0,0 +1,209 @@
# Skill Test Spec: /story-done

## Skill Summary

`/story-done` closes the loop between design and implementation. Run at the
end of implementing a story, it reads the story file and verifies each
acceptance criterion against the implementation. It checks for GDD and ADR
deviations, prompts a code review, updates the story status to `Complete`,
logs any tech debt, and surfaces the next ready story from the sprint. It
produces a COMPLETE / COMPLETE WITH NOTES / BLOCKED verdict and writes to
the story file and optionally to `docs/tech-debt-register.md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥5 phase headings (complex skill warranting `context: fork` if applicable)
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" collaborative protocol language (writes to story file and tech-debt register)
- [ ] Has a next-step handoff (surfaces next story from sprint)

---

## Test Cases

### Case 1: Happy Path — All acceptance criteria met, no deviations

**Fixture:**
- Story file at `production/epics/core/story-light-pickup.md` with:
  - 3 acceptance criteria, all implemented as described
  - `TR-ID: TR-light-001` referencing a GDD requirement
  - `ADR: docs/architecture/adr-003-inventory.md` (Accepted)
  - `Status: In Progress`
- Implementation files listed in story exist in `src/`
- GDD requirement text at TR-light-001 matches how the feature was implemented
- ADR guidance was followed (no deviations)

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the story file and extracts all key fields
2. Skill reads the GDD requirement fresh from `tr-registry.yaml` (not from story's quoted text)
3. Skill reads the referenced ADR to understand implementation constraints
4. Skill evaluates each acceptance criterion (auto where possible, manual prompt where not)
5. Skill checks for GDD requirement deviations
6. Skill checks for ADR guideline deviations
7. Skill prompts user: "Please provide the code review outcome for this story"
8. Skill presents COMPLETE verdict
9. Skill asks "May I update story Status to Complete and add Completion Notes?"
10. If yes: skill updates the story file
11. Skill surfaces the next `Ready for Dev` story from the sprint

**Assertions:**
- [ ] Skill reads `docs/architecture/tr-registry.yaml` for TR-ID requirement text (not just story)
- [ ] Skill reads the referenced ADR file (not just the story reference)
- [ ] Each acceptance criterion is listed with VERIFIED / DEFERRED / FAILED status
- [ ] Skill prompts the user for code review outcome (does not skip this step)
- [ ] Verdict is COMPLETE when all criteria are verified and no deviations exist
- [ ] Skill asks "May I write" before updating the story file
- [ ] Skill does NOT auto-update story status without user confirmation
- [ ] After completion, skill surfaces the next ready story from `production/sprints/`

---

### Case 2: Blocked Path — Acceptance criterion cannot be verified

**Fixture:**
- Story file has an acceptance criterion: "Player sees correct animation on pickup"
- No automated test for this criterion exists
- Manual verification has not been performed
- All other criteria are met

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill processes all acceptance criteria
2. Reaches the animation criterion — cannot auto-verify
3. Skill asks the user: "Acceptance criterion 'Player sees correct animation on
   pickup' cannot be auto-verified. Has this been manually tested?"
4. If user says No: criterion is marked DEFERRED, verdict becomes COMPLETE WITH NOTES
5. Skill records the deferred criterion in completion notes
6. Asks "May I write updated story with deferred criterion noted?"

**Assertions:**
- [ ] Skill asks the user about unverifiable criteria rather than assuming PASS
- [ ] Deferred criteria result in COMPLETE WITH NOTES (not COMPLETE or BLOCKED)
- [ ] The deferred criterion is explicitly named in the completion notes
- [ ] Skill still asks "May I write" before updating the story file

---

### Case 3: Blocked Path — GDD deviation detected

**Fixture:**
- Story TR-ID points to requirement: "Player can carry max 3 light sources"
- Implementation in `src/` uses a variable `MAX_CARRIED_LIGHTS = 5`
- This is a deliberate deviation from the GDD

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the GDD requirement text (max 3)
2. Skill detects discrepancy between requirement and implementation value (5)
3. Skill flags this as a GDD deviation and asks the user to classify it:
   - INTENTIONAL: document the deviation and reason
   - ERROR: implementation must be fixed before story can be marked Complete
   - OUT OF SCOPE: requirement changed and GDD needs updating
4. If INTENTIONAL: skill records deviation in completion notes, verdict is COMPLETE WITH NOTES
5. If ERROR: verdict is BLOCKED until implementation is corrected

**Assertions:**
- [ ] Skill detects the mismatch between GDD requirement and implementation value
- [ ] Skill asks the user to classify the deviation (not auto-assumes either way)
- [ ] INTENTIONAL deviation → COMPLETE WITH NOTES (not BLOCKED)
- [ ] ERROR deviation → BLOCKED verdict until fixed
- [ ] Detected deviations are recorded in completion notes or tech debt register

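Cases 1 through 3 together imply a small verdict rule, sketched below in Python. Only part of it comes from the spec: DEFERRED criteria and INTENTIONAL deviations yield COMPLETE WITH NOTES, and ERROR deviations yield BLOCKED. Treating a FAILED criterion as BLOCKED and an OUT OF SCOPE deviation as COMPLETE WITH NOTES are assumptions of this sketch, since the spec does not state those outcomes, and the function name is hypothetical.

```python
from typing import Iterable


def story_verdict(criteria: Iterable[str], deviations: Iterable[str]) -> str:
    """Combine criterion statuses and deviation classes into one verdict."""
    criteria, deviations = list(criteria), list(deviations)
    # ERROR deviations block completion (Case 3). Treating a FAILED
    # criterion the same way is an assumption of this sketch.
    if "ERROR" in deviations or "FAILED" in criteria:
        return "BLOCKED"
    # DEFERRED criteria (Case 2) and INTENTIONAL deviations (Case 3) are
    # recorded in completion notes. OUT OF SCOPE is grouped here as an
    # assumption; the spec does not state its verdict.
    if "DEFERRED" in criteria or deviations:
        return "COMPLETE WITH NOTES"
    return "COMPLETE"
```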
---

### Case 4: Edge Case — No argument, auto-detect current story

**Fixture:**
- `production/session-state/active.md` contains a reference to
  `production/epics/core/story-oxygen-drain.md` as the active story
- That story file exists with `Status: In Progress`

**Input:** `/story-done` (no argument)

**Expected behavior:**
1. Skill reads `production/session-state/active.md`
2. Skill finds the active story reference
3. Skill reads that story file and proceeds normally
4. Output confirms which story was auto-detected

**Assertions:**
- [ ] Skill reads `production/session-state/active.md` when no argument is given
- [ ] Skill identifies and confirms the auto-detected story before proceeding
- [ ] If no story is found in session state, skill asks the user to provide a path

---

### Case 5: Director Gate — LP-CODE-REVIEW behavior across review modes

**Fixture:**
- Story file at `production/epics/core/story-light-pickup.md`
- All acceptance criteria verified, no GDD deviations
- `production/session-state/review-mode.txt` exists

**Case 5a — full mode:**
- `review-mode.txt` contains `full`

**Input:** `/story-done production/epics/core/story-light-pickup.md` (full mode)

**Expected behavior:**
1. Skill reads review mode — determines `full`
2. After implementation verification, skill invokes LP-CODE-REVIEW gate
3. Lead programmer reviews the implementation
4. If LP verdict is NEEDS CHANGES → story cannot be marked Complete
5. If LP verdict is APPROVED → skill proceeds to mark story Complete

**Assertions (5a):**
- [ ] Skill reads review mode before deciding whether to invoke LP-CODE-REVIEW
- [ ] LP-CODE-REVIEW gate is invoked in full mode after implementation check
- [ ] An LP NEEDS CHANGES verdict prevents story from being marked Complete
- [ ] Gate result is noted in output: "Gate: LP-CODE-REVIEW — [result]"
- [ ] Skill still asks "May I write" before updating story status even if LP approved

**Case 5b — lean or solo mode:**
- `review-mode.txt` contains `lean` or `solo`

**Expected behavior:**
1. Skill reads review mode — determines `lean` or `solo`
2. LP-CODE-REVIEW gate is SKIPPED
3. Output notes the skip: "[LP-CODE-REVIEW] skipped — Lean/Solo mode"
4. Story completion proceeds based on acceptance criteria check only

**Assertions (5b):**
- [ ] LP-CODE-REVIEW gate does NOT spawn in lean or solo mode
- [ ] Skip is explicitly noted in output
- [ ] Skill still requires "May I write" approval before marking story Complete

---

## Protocol Compliance

- [ ] Uses "May I write" before updating the story file
- [ ] Uses "May I write" before adding entries to `docs/tech-debt-register.md`
- [ ] Presents complete findings (criteria check, deviation check) before asking approval
- [ ] Ends by surfacing the next ready story from the sprint plan
- [ ] Does not mark a story Complete if any criteria are in ERROR state
- [ ] Does not skip the code review prompt

---

## Coverage Notes

- The full 8-phase flow of the skill is exercised across Cases 1-3; not all
  edge cases within each phase are covered.
- Tech debt logging (deferred items written to `docs/tech-debt-register.md`)
  is mentioned in Case 2 but is not the primary assertion focus; dedicated
  coverage is deferred.
- The `sprint-status.yaml` update (Phase 7 in the skill) is implied by Case 1
  but is not the primary assertion; it is assumed to follow the same
  "May I write" pattern.
- Stories with multiple TR-IDs or multiple ADRs are not explicitly tested.
195
CCGS Skill Testing Framework/skills/readiness/story-readiness.md
Normal file
@@ -0,0 +1,195 @@
# Skill Test Spec: /story-readiness

## Skill Summary

`/story-readiness` validates that a story file is ready for a developer to
pick up and implement. It checks four dimensions: Design (embedded GDD
requirements), Architecture (ADR references and status), Scope (clear
boundaries and DoD), and Definition of Done (testable criteria). It produces
a READY / NEEDS WORK / BLOCKED verdict. It is a read-only skill and runs
before any developer picks up a story.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings or numbered check sections
- [ ] Contains verdict keywords: READY, NEEDS WORK, BLOCKED
- [ ] Does NOT require "May I write" language (read-only skill)
- [ ] Has a next-step handoff (what to do after verdict)

---

## Test Cases

### Case 1: Happy Path — Fully ready story

**Fixture:**
- Story file exists at `production/epics/core/story-light-pickup.md`
- Story contains:
  - `TR-ID: TR-light-001` (GDD requirement reference)
  - `ADR: docs/architecture/adr-003-inventory.md`
- Referenced ADR exists and has status `Accepted`
- Referenced TR-ID exists in `docs/architecture/tr-registry.yaml`
- Story has `## Acceptance Criteria` with ≥3 testable items
- Story has `## Definition of Done` section
- Story has `Status: Ready for Dev`
- Manifest version in story header matches current `docs/architecture/control-manifest.md`

**Input:** `/story-readiness production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the story file
2. Skill reads the referenced ADR — verifies status is `Accepted`
3. Skill reads `docs/architecture/tr-registry.yaml` — verifies TR-ID exists
4. Skill reads `docs/architecture/control-manifest.md` — verifies manifest version matches
5. Skill evaluates all 4 dimensions (Design, Architecture, Scope, DoD)
6. Skill outputs READY verdict with all checks passing

**Assertions:**
- [ ] Skill reads the referenced ADR file (not just the story)
- [ ] Skill verifies ADR status is `Accepted` (not `Proposed`)
- [ ] Skill reads `tr-registry.yaml` to verify TR-ID exists
- [ ] Output includes check results for all 4 dimensions
- [ ] Verdict is READY when all checks pass
- [ ] Skill does not write any files

---
|
||||
|
||||
### Case 2: Blocked Path — Referenced ADR is Proposed (not Accepted)
|
||||
|
||||
**Fixture:**
|
||||
- Story file exists with `ADR: docs/architecture/adr-005-light-system.md`
|
||||
- `adr-005-light-system.md` exists but has `Status: Proposed`
|
||||
- All other story content is otherwise complete
|
||||
|
||||
**Input:** `/story-readiness production/epics/core/story-light-system.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the story
|
||||
2. Skill reads `adr-005-light-system.md` — finds `Status: Proposed`
|
||||
3. Skill flags this as a BLOCKING issue (cannot implement against unaccepted ADR)
|
||||
4. Skill outputs BLOCKED verdict
|
||||
5. Skill recommends: accept or reject the ADR before picking up the story
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is BLOCKED (not NEEDS WORK or READY) when ADR is Proposed
|
||||
- [ ] Output explicitly names the Proposed ADR as the blocker
|
||||
- [ ] Output recommends resolving ADR status before proceeding
|
||||
- [ ] Skill does not output READY regardless of other checks passing
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Needs Work — Missing Acceptance Criteria
|
||||
|
||||
**Fixture:**
|
||||
- Story file exists but has no `## Acceptance Criteria` section
|
||||
- ADR reference exists and is `Accepted`
|
||||
- TR-ID exists in registry
|
||||
- Manifest version matches
|
||||
|
||||
**Input:** `/story-readiness production/epics/core/story-oxygen-drain.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the story
|
||||
2. Skill finds no Acceptance Criteria section
|
||||
3. Skill flags this as a NEEDS WORK issue (story is incomplete, not blocked)
|
||||
4. Skill outputs NEEDS WORK verdict
|
||||
5. Skill names the missing section and suggests adding measurable criteria
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is NEEDS WORK (not BLOCKED or READY) when Acceptance Criteria section is absent
|
||||
- [ ] Output identifies the missing Acceptance Criteria section specifically
|
||||
- [ ] Output suggests adding testable/measurable criteria
|
||||
- [ ] Skill distinguishes NEEDS WORK (fixable without external dependencies) from BLOCKED (requires outside action)
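
The three-way verdict that Cases 1 to 3 exercise can be sketched as a simple precedence rule: any blocking issue wins, then any fixable issue, otherwise the story is ready. The `Issue` type and `story_verdict` function are hypothetical names introduced for illustration; the skill itself is prompt-driven and may reach its verdict differently.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    description: str
    blocking: bool  # True when resolution needs action outside the story file


def story_verdict(issues: list[Issue]) -> str:
    """Collapse check results into one of READY, NEEDS WORK, BLOCKED."""
    if any(issue.blocking for issue in issues):
        return "BLOCKED"
    if issues:
        return "NEEDS WORK"
    return "READY"


proposed_adr = Issue("Referenced ADR has Status: Proposed", blocking=True)
missing_criteria = Issue("No Acceptance Criteria section", blocking=False)
print(story_verdict([missing_criteria]))                # fixable inside the story
print(story_verdict([missing_criteria, proposed_adr]))  # blocker takes precedence
```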
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — Stale manifest version
|
||||
|
||||
**Fixture:**
|
||||
- Story file has `Manifest Version: 2026-01-15` in its header
|
||||
- `docs/architecture/control-manifest.md` has `Manifest Version: 2026-03-10`
|
||||
- Versions do not match (story was created before manifest was updated)
|
||||
|
||||
**Input:** `/story-readiness production/epics/core/story-mirror-rotation.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the story and extracts manifest version `2026-01-15`
|
||||
2. Skill reads control manifest header and extracts current version `2026-03-10`
|
||||
3. Skill detects version mismatch
|
||||
4. Skill flags this as an ADVISORY issue (not blocking, but worth noting)
|
||||
5. Verdict is NEEDS WORK with manifest staleness noted
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads `docs/architecture/control-manifest.md` to get current version
|
||||
- [ ] Skill compares story's embedded manifest version against current manifest version
|
||||
- [ ] Stale manifest version results in NEEDS WORK (not BLOCKED, not READY)
|
||||
- [ ] Output explains that the story's embedded guidance may be outdated
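
The version comparison in Case 4 could look roughly like the sketch below. It assumes the version appears on its own line as `Manifest Version: YYYY-MM-DD` in both files, as in the fixture; the function names are hypothetical, and treating an unparseable version as "not stale" is an assumption the spec does not address.

```python
import re
from typing import Optional

# Assumed header form, taken from the Case 4 fixture.
VERSION_PATTERN = re.compile(r"^Manifest Version:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def manifest_version(text: str) -> Optional[str]:
    """Extract the manifest version string, or None if no header line is found."""
    match = VERSION_PATTERN.search(text)
    return match.group(1) if match else None


def is_stale(story_text: str, manifest_text: str) -> bool:
    """True when both versions are present and differ (a mismatch, not an ordering check)."""
    story_version = manifest_version(story_text)
    current_version = manifest_version(manifest_text)
    return (
        story_version is not None
        and current_version is not None
        and story_version != current_version
    )


story = "# Story: Mirror Rotation\nManifest Version: 2026-01-15\n"
manifest = "# Control Manifest\nManifest Version: 2026-03-10\n"
print(is_stale(story, manifest))
```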
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — QL-STORY-READY behavior across review modes
|
||||
|
||||
**Fixture:**
|
||||
- Story file exists and is READY (all 4 dimensions pass, ADR Accepted, criteria present)
|
||||
- `production/session-state/review-mode.txt` exists
|
||||
|
||||
**Case 5a — full mode:**
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/story-readiness production/epics/core/story-light-pickup.md` (full mode)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `full`
|
||||
2. After completing its own 4-dimension check, skill invokes QL-STORY-READY gate
|
||||
3. QA lead reviews the story for readiness
|
||||
4. If QA lead verdict is INADEQUATE → story verdict is BLOCKED regardless of 4-dimension result
|
||||
5. If QA lead verdict is ADEQUATE → verdict proceeds normally
|
||||
|
||||
**Assertions (5a):**
|
||||
- [ ] Skill reads review mode before deciding whether to invoke QL-STORY-READY
|
||||
- [ ] QL-STORY-READY gate is invoked in full mode after the 4-dimension check completes
|
||||
- [ ] A QA lead INADEQUATE verdict overrides a READY 4-dimension result → final verdict BLOCKED
|
||||
- [ ] Gate invocation is noted in output: "Gate: QL-STORY-READY — [result]"
|
||||
|
||||
**Case 5b — lean or solo mode:**
|
||||
- `review-mode.txt` contains `lean` or `solo`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `lean` or `solo`
|
||||
2. QL-STORY-READY gate is SKIPPED
|
||||
3. Output notes the skip: "[QL-STORY-READY] skipped — Lean/Solo mode"
|
||||
4. Verdict is based on 4-dimension check only
|
||||
|
||||
**Assertions (5b):**
|
||||
- [ ] QL-STORY-READY gate does NOT spawn in lean or solo mode
|
||||
- [ ] Skip is explicitly noted in output
|
||||
- [ ] Verdict is based on 4-dimension check alone
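
Cases 5a and 5b together describe a small decision table, sketched below. The function names and the returned dictionary shape are hypothetical, and raising on an unrecognized mode is an assumption, since the spec only covers `full`, `lean`, and `solo`.

```python
from typing import Optional


def gate_plan(review_mode_text: str, gate_id: str = "QL-STORY-READY") -> dict:
    """Decide whether the gate runs, given the raw contents of review-mode.txt."""
    mode = review_mode_text.strip().lower()
    if mode == "full":
        return {"gate": gate_id, "action": "invoke", "mode": mode}
    if mode in ("lean", "solo"):
        return {"gate": gate_id, "action": "skip", "mode": mode}
    # The spec does not say what happens for other values; raising is an assumption.
    raise ValueError(f"unrecognized review mode: {mode!r}")


def final_verdict(dimension_verdict: str, qa_verdict: Optional[str]) -> str:
    """Apply the QA lead override; qa_verdict is None when the gate was skipped."""
    if qa_verdict == "INADEQUATE":
        return "BLOCKED"
    return dimension_verdict


print(gate_plan("full\n"))
print(final_verdict("READY", "INADEQUATE"))
```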
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Does NOT use Write or Edit tools (read-only skill)
|
||||
- [ ] Presents complete check results before verdict
|
||||
- [ ] Does not ask for approval (no file writes)
|
||||
- [ ] Ends with recommended next step (fix issues or proceed to implementation)
|
||||
- [ ] Distinguishes three verdict levels clearly (READY vs NEEDS WORK vs BLOCKED)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where the TR-ID is missing from the registry entirely is not explicitly
|
||||
tested here; it follows the same NEEDS WORK pattern as Case 3.
|
||||
- The "no argument" path (skill auto-detecting the current story) is not
|
||||
tested because it depends on `production/session-state/active.md` content,
|
||||
which is hard to fixture reliably.
|
||||
- Stories with multiple ADR references are not tested; behavior is assumed to
|
||||
be additive (all ADRs must be Accepted for READY verdict).
|
||||
@@ -0,0 +1,192 @@
|
||||
# Skill Test Spec: /architecture-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/architecture-review` is an Opus-tier skill that validates a technical architecture
|
||||
document against the project's 8 required architecture sections and checks that it
|
||||
is internally consistent, non-contradictory with existing ADRs, and correctly
|
||||
targeting the pinned engine version. It produces a verdict of APPROVED /
|
||||
NEEDS REVISION / MAJOR REVISION NEEDED.
|
||||
|
||||
In `full` review mode, the skill spawns two director gate agents in parallel:
|
||||
TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY (lead-programmer). In
|
||||
`lean` or `solo` mode, both gates are skipped and noted. The skill is read-only —
|
||||
no files are written.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
|
||||
- [ ] Does NOT require "May I write" language (read-only skill)
|
||||
- [ ] Has a next-step handoff at the end
|
||||
- [ ] Documents gate behavior: TD-ARCHITECTURE + LP-FEASIBILITY in full mode; skipped in lean/solo
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
In `full` mode: TD-ARCHITECTURE (technical-director) and LP-FEASIBILITY
|
||||
(lead-programmer) are spawned in parallel after the skill reads the architecture doc.
|
||||
|
||||
In `lean` mode: both gates are skipped. Output notes:
|
||||
"TD-ARCHITECTURE skipped — lean mode" and "LP-FEASIBILITY skipped — lean mode".
|
||||
|
||||
In `solo` mode: both gates are skipped with equivalent notes.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Complete architecture doc in full mode
|
||||
|
||||
**Fixture:**
|
||||
- `docs/architecture/architecture.md` exists with all 8 required sections populated
|
||||
- All sections reference the correct engine version from `docs/engine-reference/`
|
||||
- No contradictions with existing Accepted ADRs in `docs/architecture/`
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/architecture-review docs/architecture/architecture.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the architecture document
|
||||
2. Skill reads existing ADRs for cross-reference
|
||||
3. Skill reads engine version reference
|
||||
4. TD-ARCHITECTURE and LP-FEASIBILITY gate agents spawn in parallel
|
||||
5. Both gates return APPROVED
|
||||
6. Skill outputs section-by-section completeness check (8/8 sections present)
|
||||
7. Verdict: APPROVED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 8 required sections are checked and reported
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel (not sequentially)
|
||||
- [ ] Verdict is APPROVED when all sections are present and no conflicts exist
|
||||
- [ ] Skill does NOT write any files
|
||||
- [ ] Next-step handoff to `/create-control-manifest` or `/create-epics` is present
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Missing required sections
|
||||
|
||||
**Fixture:**
|
||||
- `docs/architecture/architecture.md` exists but is missing at least 2 required sections
|
||||
(e.g., no data model section, no error handling section)
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/architecture-review docs/architecture/architecture.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the document and identifies missing sections
|
||||
2. Section completeness shows fewer than 8/8 sections present
|
||||
3. Missing sections are listed by name with specific remediation guidance
|
||||
4. Verdict: MAJOR REVISION NEEDED (≥2 missing sections)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) for ≥2 missing sections
|
||||
- [ ] Each missing section is named explicitly in the output
|
||||
- [ ] Remediation guidance is specific (what to add, not just "add missing sections")
|
||||
- [ ] Skill does NOT pass a document missing required sections
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Partial Path — Architecture contradicts an existing ADR
|
||||
|
||||
**Fixture:**
|
||||
- `docs/architecture/architecture.md` exists with all 8 sections present
|
||||
- One Accepted ADR in `docs/architecture/` establishes a constraint that the architecture doc contradicts
|
||||
(e.g., ADR-001 mandates ECS pattern; architecture.md describes a different pattern for the same system)
|
||||
|
||||
**Input:** `/architecture-review docs/architecture/architecture.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the architecture doc and all existing ADRs
|
||||
2. Conflict is detected between the architecture doc and the named ADR
|
||||
3. Conflict entry names: the ADR number/title, the contradicting sections, and impact
|
||||
4. Verdict: NEEDS REVISION (conflict exists but structure is otherwise sound)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is NEEDS REVISION (not MAJOR REVISION NEEDED for a single contradiction)
|
||||
- [ ] The specific ADR number and title are named in the conflict entry
|
||||
- [ ] The contradicting sections in both documents are identified
|
||||
- [ ] Skill does NOT auto-resolve the contradiction
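
The verdict thresholds implied by Cases 1 to 3 can be summarized in a short sketch. Two points are assumptions rather than anything the spec states: a document missing exactly one section is treated as NEEDS REVISION, and any number of ADR conflicts (not just one) also maps to NEEDS REVISION. The function name is hypothetical.

```python
def architecture_verdict(missing_sections: list[str], adr_conflicts: list[str]) -> str:
    """Map structural findings to APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED."""
    if len(missing_sections) >= 2:
        return "MAJOR REVISION NEEDED"  # Case 2 threshold
    if missing_sections or adr_conflicts:
        return "NEEDS REVISION"  # Case 3, plus the assumed single-missing-section case
    return "APPROVED"  # Case 1


print(architecture_verdict([], []))
print(architecture_verdict(["data model", "error handling"], []))
print(architecture_verdict([], ["ADR-001"]))
```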
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — File not found
|
||||
|
||||
**Fixture:**
|
||||
- The path provided does not exist in the project
|
||||
|
||||
**Input:** `/architecture-review docs/architecture/nonexistent.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read the file
|
||||
2. File not found
|
||||
3. Skill outputs a clear error naming the missing file
|
||||
4. Skill suggests checking `docs/architecture/` or running `/create-architecture`
|
||||
5. Skill does NOT produce a verdict
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill outputs a clear error when the file is not found
|
||||
- [ ] No verdict is produced (APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED)
|
||||
- [ ] Skill suggests a corrective action
|
||||
- [ ] Skill does NOT crash or produce a partial report
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — Full mode spawns both gates; solo mode skips both
|
||||
|
||||
**Fixture (full mode):**
|
||||
- `docs/architecture/architecture.md` exists with all 8 sections
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Full mode expected behavior:**
|
||||
1. TD-ARCHITECTURE gate spawns
|
||||
2. LP-FEASIBILITY gate spawns in parallel with TD-ARCHITECTURE
|
||||
3. Both gates complete before verdict is issued
|
||||
|
||||
**Assertions (full mode):**
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY both appear in the output as completed gates
|
||||
- [ ] Both gates spawn in parallel (not one after the other)
|
||||
- [ ] Verdict reflects gate feedback
|
||||
|
||||
**Fixture (solo mode):**
|
||||
- Same architecture doc
|
||||
- `production/session-state/review-mode.txt` contains `solo`
|
||||
|
||||
**Solo mode expected behavior:**
|
||||
1. Skill reads the architecture doc
|
||||
2. Gates are NOT spawned
|
||||
3. Output notes: "TD-ARCHITECTURE skipped — solo mode" and "LP-FEASIBILITY skipped — solo mode"
|
||||
4. Verdict is based on structural checks only
|
||||
|
||||
**Assertions (solo mode):**
|
||||
- [ ] Neither TD-ARCHITECTURE nor LP-FEASIBILITY appears as an active gate
|
||||
- [ ] Both skipped gates are noted in the output
|
||||
- [ ] Verdict is still produced based on the structural check alone
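
To make "spawn in parallel, then wait for both" concrete, here is a minimal sketch that uses threads as a stand-in for gate agents. The real skill spawns agents rather than threads, so `run_gates_in_parallel` and `fake_gate` are purely illustrative. The barrier in the usage example only releases when both callables are running at the same time, so a sequential runner would time out instead of returning.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_gates_in_parallel(gates: dict) -> dict:
    """Run every gate callable concurrently and collect results by gate ID."""
    if not gates:
        return {}
    with ThreadPoolExecutor(max_workers=len(gates)) as pool:
        # Submit everything first so no gate waits for another to finish.
        futures = {gate_id: pool.submit(run) for gate_id, run in gates.items()}
        return {gate_id: future.result() for gate_id, future in futures.items()}


barrier = threading.Barrier(2)


def fake_gate(result: str):
    """Build a stand-in gate that returns a fixed verdict (hypothetical)."""
    def run() -> str:
        barrier.wait(timeout=5)  # only passes if both gates are running at once
        return result
    return run


results = run_gates_in_parallel({
    "TD-ARCHITECTURE": fake_gate("APPROVED"),
    "LP-FEASIBILITY": fake_gate("APPROVED"),
})
print(results)
```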
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Does NOT write any files (read-only skill)
|
||||
- [ ] Presents section completeness check before issuing verdict
|
||||
- [ ] TD-ARCHITECTURE and LP-FEASIBILITY spawn in parallel in full mode
|
||||
- [ ] Skipped gates are noted by name and mode in lean/solo output
|
||||
- [ ] Verdict is one of exactly: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
|
||||
- [ ] Ends with next-step handoff appropriate to verdict
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The 8 required architecture sections are project-specific; tests use the
|
||||
section list defined in the skill body — not re-enumerated here.
|
||||
- Engine version compatibility checking (cross-referencing `docs/engine-reference/`)
|
||||
is part of Case 1's happy path but not independently fixture-tested.
|
||||
- RTM (requirement traceability matrix) mode is a separate concern covered by
|
||||
the `/architecture-review` skill's own `rtm` argument mode, not tested here.
|
||||
170
CCGS Skill Testing Framework/skills/review/design-review.md
Normal file
@@ -0,0 +1,170 @@
|
||||
# Skill Test Spec: /design-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/design-review` reads a game design document (GDD) and evaluates it against
|
||||
the project's 8-section design standard (Overview, Player Fantasy, Detailed
|
||||
Rules, Formulas, Edge Cases, Dependencies, Tuning Knobs, Acceptance Criteria).
|
||||
It checks for internal consistency, implementability, and cross-system
|
||||
conflicts. It produces a verdict of APPROVED, NEEDS REVISION, or MAJOR
|
||||
REVISION NEEDED. It is a read-only skill (no file writes) and runs as a
|
||||
`context: fork` subagent.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings or numbered steps
|
||||
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
|
||||
- [ ] Does NOT require "May I write" language (read-only skill — `allowed-tools` excludes Write/Edit)
|
||||
- [ ] Output format is documented (review template shown in skill body)
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Complete GDD, all 8 sections present
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/light-manipulation.md` exists (use `_fixtures/minimal-game-concept.md`
|
||||
as a stand-in — represents a complete document with all required content)
|
||||
- All 8 required sections are populated with substantive content
|
||||
- Formulas section contains at least one formula with defined variables
|
||||
- Acceptance Criteria section contains at least 3 testable criteria
|
||||
|
||||
**Input:** `/design-review design/gdd/light-manipulation.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the target document in full
|
||||
2. Skill reads CLAUDE.md for project context and standards
|
||||
3. Skill evaluates all 8 required sections (present/absent check)
|
||||
4. Skill checks internal consistency (formulas match described behavior)
|
||||
5. Skill checks implementability (rules are precise enough to code)
|
||||
6. Skill outputs structured review with section-by-section status
|
||||
7. Skill outputs APPROVED verdict
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads the target file before producing any output
|
||||
- [ ] Output includes a "Completeness" section showing X/8 sections present
|
||||
- [ ] Output includes an "Internal Consistency" section
|
||||
- [ ] Output includes an "Implementability" section
|
||||
- [ ] Output ends with a verdict line: APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED
|
||||
- [ ] APPROVED verdict is given when all 8 sections are present and consistent
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Incomplete GDD (4/8 sections)
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/light-manipulation.md` exists using content from
|
||||
`tests/skills/_fixtures/incomplete-gdd.md` (4 of 8 sections populated;
|
||||
Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria are missing)
|
||||
|
||||
**Input:** `/design-review design/gdd/light-manipulation.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the document
|
||||
2. Skill identifies 4 missing sections
|
||||
3. Skill outputs "Completeness: 4/8 sections present"
|
||||
4. Skill lists specifically which 4 sections are missing
|
||||
5. Skill outputs MAJOR REVISION NEEDED verdict (not APPROVED or NEEDS REVISION)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Output shows "4/8" in the completeness section (not a higher number)
|
||||
- [ ] Output explicitly names each missing section (Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria)
|
||||
- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) when ≥3 sections are missing
|
||||
- [ ] Output does not suggest the document is implementation-ready
|
||||
- [ ] Skill does not write any files (read-only enforcement)
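
The completeness line asserted in Case 2 could be produced along these lines. The section names come from the Skill Summary; the assumption that each section is a level-two heading (`## Name`) and the function name `completeness` are hypothetical. This sketch checks heading presence only, so a heading with an empty body would still count as present, which is looser than the "populated" wording in the fixture.

```python
REQUIRED_SECTIONS = [
    "Overview", "Player Fantasy", "Detailed Rules", "Formulas",
    "Edge Cases", "Dependencies", "Tuning Knobs", "Acceptance Criteria",
]


def completeness(gdd_text: str) -> tuple[str, list[str]]:
    """Return the summary line and the list of missing section names."""
    headings = {
        line[3:].strip().lower()
        for line in gdd_text.splitlines()
        if line.startswith("## ")
    }
    missing = [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
    present = len(REQUIRED_SECTIONS) - len(missing)
    return f"Completeness: {present}/{len(REQUIRED_SECTIONS)} sections present", missing


incomplete_gdd = "\n".join([
    "# Light Manipulation",
    "## Overview", "text",
    "## Player Fantasy", "text",
    "## Detailed Rules", "text",
    "## Dependencies", "text",
])
summary, missing = completeness(incomplete_gdd)
print(summary)
print(missing)
```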
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Partial Path — 7/8 sections, minor inconsistency
|
||||
|
||||
**Fixture:**
|
||||
- GDD has all sections except Formulas
|
||||
- The described behavior mentions numeric values but no formulas are defined
|
||||
- Acceptance Criteria exist but are vague ("feels good" rather than measurable)
|
||||
|
||||
**Input:** `/design-review design/gdd/[document].md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill identifies missing Formulas section
|
||||
2. Skill flags vague acceptance criteria as an implementability issue
|
||||
3. Skill outputs NEEDS REVISION verdict (not APPROVED, not MAJOR REVISION NEEDED)
|
||||
4. Skill provides specific remediation notes for each issue
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is NEEDS REVISION (not APPROVED, not MAJOR REVISION NEEDED) for 7/8 with issues
|
||||
- [ ] Output identifies the missing Formulas section specifically
|
||||
- [ ] Output flags the vague acceptance criteria as an implementability gap
|
||||
- [ ] Each flagged issue has a specific, actionable remediation note
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — File not found
|
||||
|
||||
**Fixture:**
|
||||
- The path provided does not exist in the project
|
||||
|
||||
**Input:** `/design-review design/gdd/nonexistent.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read the file
|
||||
2. File not found
|
||||
3. Skill outputs an error message naming the missing file
|
||||
4. Skill suggests checking the path or listing files in `design/gdd/`
|
||||
5. Skill does NOT produce a verdict
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill outputs a clear error when the file is not found
|
||||
- [ ] Skill does NOT output APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED when file is missing
|
||||
- [ ] Skill suggests a corrective action (check path, list available GDDs)
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — no gate spawned regardless of review mode
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/light-manipulation.md` exists with all 8 sections
|
||||
- `production/session-state/review-mode.txt` exists with `full` (the mode in which director gates would otherwise run)
|
||||
|
||||
**Input:** `/design-review design/gdd/light-manipulation.md` (with full review mode active)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the GDD document
|
||||
2. Skill does NOT read `review-mode.txt` — this skill has no director gates
|
||||
3. Skill produces the review output normally
|
||||
4. No director gate agents are spawned at any point
|
||||
5. Verdict is APPROVED (all 8 sections present in fixture)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any director gate agent (CD-, TD-, PR-, AD- prefixed agents)
|
||||
- [ ] Skill does NOT read `review-mode.txt` or equivalent mode file
|
||||
- [ ] The `--review` flag or `full` mode state has NO effect on whether directors spawn
|
||||
- [ ] Output does not contain any "Gate: [GATE-ID]" entries
|
||||
- [ ] Skill IS the review — it does not delegate the review to a director
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Does NOT use Write or Edit tools (read-only skill)
|
||||
- [ ] Presents complete findings before any verdict
|
||||
- [ ] Does not ask for approval before producing output (no writes to approve)
|
||||
- [ ] Ends with recommended next step (e.g., fix issues and re-run, or proceed to `/map-systems`)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Cross-system consistency checking (Case 3 in the skill's own phase list) is
|
||||
not directly tested here because it requires multiple GDD files to compare;
|
||||
this is covered by the `/review-all-gdds` spec instead.
|
||||
- The skill's `context: fork` behavior (running as a subagent) is not tested
|
||||
at the spec level — this is a runtime behavior verified manually.
|
||||
- Performance and edge cases involving very large GDD files are not in scope.
|
||||
178
CCGS Skill Testing Framework/skills/review/review-all-gdds.md
Normal file
@@ -0,0 +1,178 @@
|
||||
# Skill Test Spec: /review-all-gdds
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/review-all-gdds` is an Opus-tier skill that performs a holistic cross-GDD review
|
||||
across all files in `design/gdd/`. It runs two complementary review phases in
|
||||
parallel: Phase 1 checks for consistency (contradictions, formula mismatches,
|
||||
stale references, competing ownership), and Phase 2 checks design theory (dominant
|
||||
strategies, pillar drift, cognitive overload, economic imbalance). Because the two
|
||||
phases are independent, they are spawned simultaneously to save time. The skill
|
||||
produces a CONSISTENT / MINOR ISSUES / MAJOR ISSUES verdict and is read-only — no
|
||||
files are written without explicit user approval.
|
||||
|
||||
The skill is itself the holistic review gate in the pipeline. It is invoked after
|
||||
individual GDDs are complete and before architecture work begins. It does NOT spawn
|
||||
any director gate agents (it IS the director-level review).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥5 phase headings (complex multi-phase skill)
|
||||
- [ ] Contains verdict keywords: CONSISTENT, MINOR ISSUES, MAJOR ISSUES
|
||||
- [ ] Does NOT require "May I write" language (read-only skill)
|
||||
- [ ] Has a next-step handoff at the end
|
||||
- [ ] Documents parallel phase spawning (Phase 1 and Phase 2 are independent)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
No director gates — this skill spawns no director gate agents. It IS the holistic
|
||||
review; delegating to a director gate would create a circular dependency.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Clean GDD set with no conflicts
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` contains ≥3 system GDDs
|
||||
- All GDDs are internally consistent: no formula contradictions, no competing ownership, no stale references
|
||||
- All GDDs align with the pillars defined in `design/gdd/game-pillars.md`
|
||||
|
||||
**Input:** `/review-all-gdds`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDD files in `design/gdd/`
|
||||
2. Phase 1 (consistency scan) and Phase 2 (design theory check) spawn in parallel
|
||||
3. Phase 1 finds no contradictions, no formula mismatches, no ownership conflicts
|
||||
4. Phase 2 finds no pillar drift, no dominant strategies, no cognitive overload
|
||||
5. Skill outputs a structured findings table with 0 blocking issues
|
||||
6. Verdict: CONSISTENT
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Both review phases are spawned in parallel (not sequentially)
|
||||
- [ ] Output includes a findings table (even if empty — shows "No issues found")
|
||||
- [ ] Verdict is CONSISTENT when no conflicts are found
|
||||
- [ ] Skill does NOT write any files without user approval
|
||||
- [ ] Next-step handoff to `/architecture-review` or `/create-architecture` is present
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Failure Path — Conflicting rules between two GDDs
|
||||
|
||||
**Fixture:**
|
||||
- GDD-A defines a floor value (e.g. "minimum [output] is [N]")
|
||||
- GDD-B states a mechanic that bypasses that floor (e.g. "[mechanic] can reduce [output] to 0")
|
||||
- The two GDDs are otherwise complete and valid
|
||||
|
||||
**Input:** `/review-all-gdds`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1 (consistency scan) detects the contradiction between GDD-A and GDD-B
|
||||
2. Conflict is reported with: both filenames, the specific conflicting rules, and severity HIGH
|
||||
3. Verdict: MAJOR ISSUES
|
||||
4. Handoff instructs user to resolve the conflict and re-run before proceeding
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is MAJOR ISSUES (not CONSISTENT or MINOR ISSUES)
|
||||
- [ ] Both GDD filenames are named in the conflict entry
|
||||
- [ ] The specific contradicting rules are quoted or described (not vague "conflict found")
|
||||
- [ ] Issue is classified as severity HIGH (blocking)
|
||||
- [ ] Skill does NOT auto-resolve the conflict
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Partial Path — Single GDD with orphaned dependency reference
|
||||
|
||||
**Fixture:**
|
||||
- GDD-A lists a dependency in its Dependencies section pointing to "system-B"
|
||||
- No GDD for system-B exists in `design/gdd/`
|
||||
- All other GDDs are consistent
|
||||
|
||||
**Input:** `/review-all-gdds`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1 detects the orphaned dependency reference in GDD-A
|
||||
2. Issue is reported as: DEPENDENCY GAP — GDD-A references system-B which has no GDD
|
||||
3. No other conflicts found
|
||||
4. Verdict: MINOR ISSUES (dependency gap is advisory, not blocking by itself)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is MINOR ISSUES (not MAJOR ISSUES for a single orphaned reference)
|
||||
- [ ] The specific GDD filename and the missing dependency name are reported
|
||||
- [ ] Skill suggests running `/design-system system-B` to resolve the gap
|
||||
- [ ] Skill does NOT skip or silently ignore the missing dependency
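
The orphaned-reference check in Case 3 amounts to a set lookup, sketched below. It assumes the Dependencies sections have already been parsed into a mapping from GDD name to referenced system names; that mapping, the function name, and the exact message punctuation are hypothetical.

```python
def dependency_gaps(dependencies: dict[str, list[str]]) -> list[str]:
    """Report every dependency that points at a system with no GDD of its own."""
    known = set(dependencies)
    gaps = []
    for gdd in sorted(dependencies):
        for dep in dependencies[gdd]:
            if dep not in known:
                gaps.append(f"DEPENDENCY GAP: {gdd} references {dep} which has no GDD")
    return gaps


# Keys are the GDDs found in design/gdd/; values are their declared dependencies.
parsed = {
    "GDD-A": ["system-B"],
    "GDD-C": ["GDD-A"],
}
print(dependency_gaps(parsed))
```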
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No GDD files found
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` directory is empty or does not exist
|
||||
- No GDD files are present
|
||||
|
||||
**Input:** `/review-all-gdds`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read files in `design/gdd/`
|
||||
2. No files found — skill outputs an error with guidance
|
||||
3. Skill recommends running `/brainstorm` and `/design-system` before re-running
|
||||
4. Skill does NOT produce a verdict (CONSISTENT / MINOR ISSUES / MAJOR ISSUES)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill outputs a clear error message when no GDDs are found
|
||||
- [ ] No verdict is produced when the directory is empty
|
||||
- [ ] Skill recommends the correct next action (`/brainstorm` or `/design-system`)
|
||||
- [ ] Skill does NOT crash or produce a partial report
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate — No gate spawned regardless of review mode
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` contains ≥2 consistent system GDDs
|
||||
- `production/session-state/review-mode.txt` exists with content `full`
|
||||
|
||||
**Input:** `/review-all-gdds`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDDs and runs the two review phases
|
||||
2. Skill does NOT read `review-mode.txt`
|
||||
3. Skill does NOT spawn any director gate agent (CD-, TD-, PR-, AD- prefixed)
|
||||
4. Skill completes and outputs its verdict normally
|
||||
5. Review mode setting has no effect on this skill's behavior
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate agents are spawned at any point
|
||||
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
|
||||
- [ ] Output does not contain any "Gate: [GATE-ID]" or "skipped" gate entries
|
||||
- [ ] The skill produces a verdict regardless of review mode
|
||||
- [ ] R4 metric: gate count for this skill = 0 in all modes
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Phase 1 (consistency) and Phase 2 (design theory) spawned in parallel — not sequentially
|
||||
- [ ] Does NOT write any files without "May I write" approval
|
||||
- [ ] Findings table shown before any write ask
|
||||
- [ ] Verdict is one of exactly: CONSISTENT, MINOR ISSUES, MAJOR ISSUES
|
||||
- [ ] Ends with appropriate handoff: MAJOR ISSUES → fix and re-run; MINOR ISSUES → may proceed with awareness; CONSISTENT → `/create-architecture`
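
The verdict-to-handoff mapping in the last item can be sketched as a lookup. The handoff wording is taken from the checklist above. The rule that any HIGH-severity finding yields MAJOR ISSUES follows Case 2, while the `ADVISORY` label for non-blocking findings and the function name are assumptions.

```python
HANDOFF = {
    "MAJOR ISSUES": "fix and re-run",
    "MINOR ISSUES": "may proceed with awareness",
    "CONSISTENT": "/create-architecture",
}


def gdd_set_verdict(severities: list[str]) -> tuple[str, str]:
    """Map finding severities to a verdict and its next-step handoff."""
    if "HIGH" in severities:
        verdict = "MAJOR ISSUES"
    elif severities:
        verdict = "MINOR ISSUES"  # findings exist, but none are blocking
    else:
        verdict = "CONSISTENT"
    return verdict, HANDOFF[verdict]


print(gdd_set_verdict([]))
print(gdd_set_verdict(["ADVISORY"]))
print(gdd_set_verdict(["ADVISORY", "HIGH"]))
```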
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Economic balance analysis (source/sink loops) requires cross-GDD resource data — covered
|
||||
structurally by Case 2 (the conflict detection pattern is the same).
|
||||
- The design theory (Phase 2) checks, including dominant strategy detection and
|
||||
cognitive overload, are not individually fixture-tested; they follow the same
|
||||
pattern as consistency checks and are validated via the pillar drift case structure.
|
||||
- The `since-last-review` scoping mode is not tested here — it is a runtime concern.
|
||||
169
CCGS Skill Testing Framework/skills/sprint/changelog.md
Normal file
@@ -0,0 +1,169 @@
|
||||
# Skill Test Spec: /changelog
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/changelog` is a Haiku-tier skill that auto-generates a developer-facing
|
||||
changelog by reading git commit history and closed sprint stories since the
|
||||
last release tag. It organizes entries into features, fixes, and known issues.
|
||||
No director gates are used. The skill asks "May I write to `docs/CHANGELOG.md`?"
|
||||
before persisting. Verdict is always COMPLETE.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" language (skill writes changelog)
|
||||
- [ ] Has a next-step handoff (e.g., run /patch-notes for player-facing version)
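
The frontmatter assertion above could be checked along these lines. This is a minimal sketch, not the actual `/skill-test static` implementation: it assumes the skill file opens with a `---`-delimited block of top-level `key: value` lines, and the function name `missing_frontmatter_fields` is hypothetical.

```python
# Field names taken from the static assertion list in this spec.
REQUIRED_FIELDS = ("name", "description", "argument-hint", "user-invocable", "allowed-tools")


def missing_frontmatter_fields(skill_text: str) -> list[str]:
    """Return the required frontmatter keys that are absent from a skill file."""
    lines = skill_text.splitlines()
    # Assumption: frontmatter is the first block and is fenced by '---' lines.
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)
    present = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Only top-level keys count; indented lines belong to nested values.
        if ":" in line and not line.startswith((" ", "\t")):
            present.add(line.split(":", 1)[0].strip())
    return [field for field in REQUIRED_FIELDS if field not in present]
```

An empty result means the assertion passes; a non-empty list names the fields to report.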
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Changelog generation is a fast compilation task; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Multiple sprints since last release tag
|
||||
|
||||
**Fixture:**
|
||||
- Git history has a tag `v0.3.0` three sprints ago
|
||||
- Since that tag: 12 commits across sprints 006, 007, 008
|
||||
- Sprint story files reference task IDs matching commit messages
|
||||
- `docs/CHANGELOG.md` does not yet exist
|
||||
|
||||
**Input:** `/changelog`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads git log since `v0.3.0` tag
|
||||
2. Skill reads sprint stories to cross-reference task IDs
|
||||
3. Skill compiles entries into Features, Fixes, and Known Issues sections
|
||||
4. Skill presents draft to user
|
||||
5. Skill asks "May I write to `docs/CHANGELOG.md`?"
|
||||
6. User approves; file written; verdict COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Changelog covers commits since the most recent git tag
|
||||
- [ ] Entries are organized into Features / Fixes / Known Issues sections
|
||||
- [ ] Sprint story references are used to enrich commit descriptions
|
||||
- [ ] "May I write" prompt appears before file write
|
||||
- [ ] Verdict is COMPLETE after write
|
||||
|
||||
---
|
||||
|
||||
### Case 2: No Git Tags Found — All commits used, version baseline noted
|
||||
|
||||
**Fixture:**
|
||||
- Git repository has commits but no tags exist
|
||||
- 20 commits in history across 3 sprints
|
||||
|
||||
**Input:** `/changelog`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks for git tags — finds none
|
||||
2. Skill uses all commits in history as the baseline
|
||||
3. Skill notes in the output: "No version tag found — using full commit history; version baseline is unset"
|
||||
4. Skill still compiles organized changelog from available commits
|
||||
5. Skill asks "May I write" and writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not error when no git tags exist
|
||||
- [ ] Output explicitly notes that no version baseline was found
|
||||
- [ ] Full commit history is used as the source
|
||||
- [ ] Changelog is still organized into sections despite missing tag
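
The tag fallback in Cases 1 and 2 can be illustrated as follows. This is a sketch under assumptions: the helper names `git_log_args` and `baseline_note` are hypothetical, and the most recent tag (if any) is assumed to have been looked up beforehand, for example with `git describe --tags --abbrev=0`.

```python
from typing import Optional


def git_log_args(last_tag: Optional[str]) -> list[str]:
    """Build the `git log` command for the changelog window."""
    base = ["git", "log", "--pretty=format:%h %ad %s", "--date=short"]
    if last_tag:
        # Case 1: only commits made after the most recent release tag.
        return base + [f"{last_tag}..HEAD"]
    # Case 2: no tags exist, so the full history is the source.
    return base


def baseline_note(last_tag: Optional[str]) -> str:
    """Return the note the output should carry when no baseline exists."""
    if last_tag:
        return ""
    return "No version tag found; using full commit history; version baseline is unset"
```

Keeping the command construction separate from execution makes both branches easy to assert on without a real repository.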
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Commit Messages Without Task IDs — Grouped by date with note
|
||||
|
||||
**Fixture:**
|
||||
- Git log since last tag has 8 commits
|
||||
- 5 commits have no task ID in the message (e.g., "fix typo", "tweak values")
|
||||
- 3 commits reference task IDs matching sprint stories
|
||||
|
||||
**Input:** `/changelog`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads commits and sprint stories
|
||||
2. 3 commits are matched to sprint stories and placed in appropriate sections
|
||||
3. The 5 commits without task IDs are grouped by date under a "Misc" or "Other Changes" section
|
||||
4. Output notes: "5 commits without task IDs — grouped by date"
|
||||
5. Skill writes changelog on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Commits with task IDs are placed in appropriate sections (Features or Fixes)
|
||||
- [ ] Commits without task IDs are grouped separately with a note
|
||||
- [ ] Output flags the number of commits missing task references
|
||||
- [ ] No commits are silently dropped from the changelog
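
The split that Case 3 describes might look like the sketch below. The task ID format (`TASK-12`) and the `(date, message)` tuple shape are assumptions made for illustration; the spec does not define either.

```python
import re
from collections import defaultdict

# Hypothetical task ID format such as "TASK-12"; the real format is not given in this spec.
TASK_ID = re.compile(r"\b[A-Z]+-\d+\b")


def split_by_task_id(commits):
    """Split (date, message) pairs into story-linked commits and date-grouped leftovers."""
    linked = []
    by_date = defaultdict(list)
    for date, message in commits:
        if TASK_ID.search(message):
            linked.append((date, message))
        else:
            by_date[date].append(message)
    missing = sum(len(messages) for messages in by_date.values())
    # Every commit lands in exactly one bucket, so nothing is silently dropped.
    return linked, dict(by_date), missing
```

The third return value is the count the output is expected to flag.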
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Existing CHANGELOG.md — New section prepended, old entries preserved
|
||||
|
||||
**Fixture:**
|
||||
- `docs/CHANGELOG.md` already exists with sections for `v0.2.0` and `v0.3.0`
|
||||
- New commits exist since `v0.3.0` tag
|
||||
|
||||
**Input:** `/changelog`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects that `docs/CHANGELOG.md` already exists
|
||||
2. Skill compiles new entries for the period since `v0.3.0`
|
||||
3. Skill presents draft with new section prepended above existing content
|
||||
4. Skill asks "May I write to `docs/CHANGELOG.md`?" (confirming prepend strategy)
|
||||
5. User approves; new content is prepended, old entries intact; verdict COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads existing changelog before writing to detect prior content
|
||||
- [ ] New section is prepended to the existing entries (not appended, and not overwriting them)
|
||||
- [ ] Old changelog entries for v0.2.0 and v0.3.0 are preserved in the written file
|
||||
- [ ] "May I write" prompt reflects the prepend operation
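
The prepend behavior asserted in Case 4 can be sketched as a pure string operation. The function name `prepend_section` is hypothetical, and the sketch assumes the new section goes above all existing content, as the case describes.

```python
def prepend_section(existing: str, new_section: str) -> str:
    """Place the new release section above the existing changelog content."""
    # The old text is carried over unchanged apart from leading blank lines.
    return new_section.rstrip("\n") + "\n\n" + existing.lstrip("\n")
```

For example, prepending a `## v0.4.0` block to a file that already holds `## v0.3.0` and `## v0.2.0` leaves both older sections in place below the new one.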
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; read-then-write with approval
|
||||
|
||||
**Fixture:**
|
||||
- Git history has commits since last tag
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/changelog`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill compiles changelog in full mode
|
||||
2. No director gate is invoked (changelog generation is compilation, not a delivery gate)
|
||||
3. Skill runs on Haiku model — fast compilation
|
||||
4. Skill asks user for approval and writes file on confirmation
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked regardless of review mode
|
||||
- [ ] Output does not reference any gate result
|
||||
- [ ] Skill proceeds directly from compilation to "May I write" prompt
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads git log and sprint story files before compiling
|
||||
- [ ] Always asks "May I write" before writing changelog
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is always COMPLETE
|
||||
- [ ] Runs on Haiku model tier (fast, low-cost)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where git is not initialized in the repository is not tested;
|
||||
behavior would depend on git command failure handling.
|
||||
- Merge commits vs. squash commits are not explicitly differentiated in
these tests; this is an implementation detail of the git log parsing phase.
|
||||
- The `/patch-notes` skill should be run after `/changelog` for player-facing
|
||||
output; that handoff is verified in the patch-notes spec.
|
||||
171
CCGS Skill Testing Framework/skills/sprint/milestone-review.md
Normal file
@@ -0,0 +1,171 @@
|
||||
# Skill Test Spec: /milestone-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/milestone-review` generates a comprehensive review of a completed milestone:
|
||||
what shipped, velocity metrics, deferred items, risks surfaced, and retrospective
|
||||
seeds. In full mode the PR-MILESTONE director gate runs after the review is
|
||||
compiled (producer reviews scope delivery). In lean and solo modes the gate is
|
||||
skipped. The skill asks "May I write to `production/milestones/review-milestone-N.md`?"
|
||||
before persisting. Verdicts: MILESTONE COMPLETE or MILESTONE INCOMPLETE; a missing milestone file ends the run as BLOCKED (Case 4).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: MILESTONE COMPLETE, MILESTONE INCOMPLETE
|
||||
- [ ] Contains "May I write" language (skill writes review document)
|
||||
- [ ] Has a next-step handoff (what to do after review is written)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
| Gate ID      | Trigger condition              | Mode guard                |
|--------------|--------------------------------|---------------------------|
| PR-MILESTONE | After review document compiled | full only (not lean/solo) |
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Nearly complete milestone with one deferred story
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-03.md` exists with 8 stories
|
||||
- 7 stories have `Status: Complete`
|
||||
- 1 story has `Status: Deferred` (deferred to milestone-04)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/milestone-review milestone-03`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `milestone-03.md` and all referenced sprint files
|
||||
2. Skill compiles: 7 shipped, 1 deferred; velocity; no blockers
|
||||
3. Skill presents review draft to user
|
||||
4. PR-MILESTONE gate invoked; producer approves
|
||||
5. Skill asks "May I write to `production/milestones/review-milestone-03.md`?"
|
||||
6. User approves; file is written; verdict MILESTONE COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Deferred story is noted in the review with its target milestone
|
||||
- [ ] Verdict is MILESTONE COMPLETE despite the one deferred story
|
||||
- [ ] PR-MILESTONE gate is invoked after draft compilation in full mode
|
||||
- [ ] Skill asks "May I write" before writing review file
|
||||
- [ ] Review document path matches `production/milestones/review-milestone-03.md`
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Blocked Milestone — Multiple blocked stories
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-03.md` exists with 5 stories
|
||||
- 2 stories have `Status: Complete`
|
||||
- 3 stories have `Status: Blocked` (named blockers listed in each story)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/milestone-review milestone-03`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads milestone and sprint files
|
||||
2. Skill finds 3 blocked stories; compiles blocker details
|
||||
3. Verdict is MILESTONE INCOMPLETE
|
||||
4. PR-MILESTONE gate runs; producer notes the unresolved blockers
|
||||
5. Review is written with blocker list on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is MILESTONE INCOMPLETE when any stories are Blocked
|
||||
- [ ] Each blocked story's name and blocker reason is listed in the review
|
||||
- [ ] PR-MILESTONE gate is still invoked in full mode even for INCOMPLETE verdict
|
||||
- [ ] "May I write" prompt still appears before file write
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Full Mode — PR-MILESTONE returns CONCERNS
|
||||
|
||||
**Fixture:**
|
||||
- Milestone-03 has 6 complete stories but 2 were not in the original scope (added mid-sprint)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/milestone-review milestone-03`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill compiles review; notes 2 out-of-scope stories shipped
|
||||
2. PR-MILESTONE gate invoked; producer returns CONCERNS about scope drift
|
||||
3. Skill surfaces the CONCERNS to the user and adds a "scope drift" note to the review
|
||||
4. User approves revised review; file written as MILESTONE COMPLETE with caveat
|
||||
|
||||
**Assertions:**
|
||||
- [ ] CONCERNS from PR-MILESTONE gate are shown to user before write
|
||||
- [ ] Scope drift is explicitly noted in the written review document
|
||||
- [ ] Verdict is MILESTONE COMPLETE (stories shipped) with CONCERNS annotation
|
||||
- [ ] Skill does not suppress gate feedback
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No milestone file found for specified milestone
|
||||
|
||||
**Fixture:**
|
||||
- User calls `/milestone-review milestone-07`
|
||||
- `production/milestones/milestone-07.md` does NOT exist
|
||||
|
||||
**Input:** `/milestone-review milestone-07`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read `production/milestones/milestone-07.md`
|
||||
2. File not found; skill outputs an error message
|
||||
3. Skill suggests checking available milestones in `production/milestones/`
|
||||
4. No gate is invoked; no file is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when milestone file is absent
|
||||
- [ ] Output names the expected file path in the error message
|
||||
- [ ] Output suggests checking `production/milestones/` for valid milestone names
|
||||
- [ ] Verdict is BLOCKED (cannot review a non-existent milestone)
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Lean/Solo Mode — PR-MILESTONE gate skipped
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-03.md` exists with 5 complete stories
|
||||
- `review-mode.txt` contains `solo`
|
||||
|
||||
**Input:** `/milestone-review milestone-03`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `solo`
|
||||
2. Skill compiles review draft
|
||||
3. PR-MILESTONE gate is skipped; output notes "[PR-MILESTONE] skipped — Solo mode"
|
||||
4. Skill asks user for direct approval of the review
|
||||
5. User approves; review file is written; verdict MILESTONE COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] PR-MILESTONE gate is NOT invoked in solo (or lean) mode
|
||||
- [ ] Skip is explicitly noted in skill output
|
||||
- [ ] User direct approval is still required before write
|
||||
- [ ] Verdict is MILESTONE COMPLETE after successful write
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Shows compiled review draft before invoking PR-MILESTONE or asking to write
|
||||
- [ ] Always asks "May I write" before writing review document
|
||||
- [ ] PR-MILESTONE gate only runs in full mode
|
||||
- [ ] Skip message appears in lean and solo output
|
||||
- [ ] Verdict is MILESTONE COMPLETE or MILESTONE INCOMPLETE, stated clearly
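
The verdict and mode-guard rules exercised by the cases above can be summarized in a short sketch. It encodes only what this spec states: any `Blocked` story gives MILESTONE INCOMPLETE, deferred stories do not, and the gate runs in full mode only. The function names are hypothetical, and statuses other than Complete, Deferred, and Blocked are not covered by the spec, so they are not handled specially here.

```python
def milestone_verdict(statuses: list[str]) -> str:
    """Derive the verdict from the story statuses of one milestone."""
    # Zero stories follows the INCOMPLETE pattern (see Coverage Notes).
    if not statuses or any(status == "Blocked" for status in statuses):
        return "MILESTONE INCOMPLETE"
    return "MILESTONE COMPLETE"


def runs_pr_milestone(review_mode: str) -> bool:
    """PR-MILESTONE is invoked in full mode only; lean and solo skip it."""
    return review_mode.strip().lower() == "full"
```

Case 1 (7 Complete, 1 Deferred) maps to MILESTONE COMPLETE, and Case 2 (3 Blocked) maps to MILESTONE INCOMPLETE.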
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where the milestone has zero stories is not tested; it follows the
|
||||
MILESTONE INCOMPLETE pattern with a note suggesting the milestone may not
|
||||
have been planned.
|
||||
- Velocity calculation specifics (story points vs. story count) are not
|
||||
verified here; they are implementation details of the review compilation phase.
|
||||
170
CCGS Skill Testing Framework/skills/sprint/patch-notes.md
Normal file
@@ -0,0 +1,170 @@
|
||||
# Skill Test Spec: /patch-notes
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/patch-notes` is a Haiku-tier skill that generates player-facing patch notes
|
||||
from existing changelog content, stripping internal task IDs and technical
|
||||
jargon in favor of plain language. It filters entries to only those relevant
|
||||
to players (visible features and bug fixes; internal refactors are excluded).
|
||||
No director gates are used. The skill asks "May I write to
|
||||
`docs/patch-notes-vX.X.md`?" before persisting. Verdict is COMPLETE whenever patch notes are generated; a missing changelog yields BLOCKED instead (Case 2).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" language (skill writes patch notes file)
|
||||
- [ ] Has a next-step handoff (e.g., share with community manager)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Patch notes generation is a fast compilation task; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Changelog filtered to player-facing entries
|
||||
|
||||
**Fixture:**
|
||||
- `docs/CHANGELOG.md` exists with 5 entries:
  - "Add dual-wield melee system" (Features — player-facing)
  - "Fix crash on level transition" (Fixes — player-facing)
  - "Add enemy patrol AI" (Features — player-facing)
  - "Refactor input handler to use event bus" (Fixes — internal only)
  - "Update dependency: Godot 4.6" (internal only)
|
||||
- Version is `v0.4.0`
|
||||
|
||||
**Input:** `/patch-notes v0.4.0`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `docs/CHANGELOG.md`
|
||||
2. Skill filters to 3 player-facing entries; excludes 2 internal entries
|
||||
3. Skill rewrites entries in plain language (no task IDs, no tech jargon)
|
||||
4. Skill presents draft to user
|
||||
5. Skill asks "May I write to `docs/patch-notes-v0.4.0.md`?"
|
||||
6. User approves; file written; verdict COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Only 3 entries appear in the patch notes (2 internal entries excluded)
|
||||
- [ ] Entries are written in plain language without internal task IDs
|
||||
- [ ] File path matches `docs/patch-notes-v0.4.0.md`
|
||||
- [ ] "May I write" prompt appears before file write
|
||||
- [ ] Verdict is COMPLETE after write
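
The filtering step in Case 1 could be sketched like this. The `Entry` shape, the `player_facing` flag, and the task ID pattern are assumptions for illustration; how the real skill decides what is player-facing is not specified here, and the plain-language rewrite is reduced to stripping IDs.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    section: str          # e.g. "Features" or "Fixes"
    player_facing: bool   # hypothetical flag; the skill infers this from context


# Hypothetical task ID prefix such as "[TASK-41] " or "TASK-41: ".
TASK_ID_PREFIX = re.compile(r"\[?[A-Z]+-\d+\]?:?\s*")


def player_facing_notes(entries: list[Entry]) -> list[str]:
    """Keep player-facing entries and strip internal task IDs from their text."""
    notes = []
    for entry in entries:
        if not entry.player_facing:
            continue  # internal refactors and dependency bumps are excluded
        notes.append(TASK_ID_PREFIX.sub("", entry.text).strip())
    return notes
```

With the five fixture entries from Case 1, this keeps the three player-facing ones and drops the refactor and the dependency update.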
|
||||
|
||||
---
|
||||
|
||||
### Case 2: No Changelog Found — Directed to run /changelog first
|
||||
|
||||
**Fixture:**
|
||||
- `docs/CHANGELOG.md` does NOT exist
|
||||
|
||||
**Input:** `/patch-notes v0.4.0`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read `docs/CHANGELOG.md` — not found
|
||||
2. Skill outputs: "No changelog found — run /changelog first to generate one"
|
||||
3. No patch notes are generated; no file is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when changelog is absent
|
||||
- [ ] Output explicitly directs user to run `/changelog`
|
||||
- [ ] No "May I write" prompt appears (nothing to write)
|
||||
- [ ] Verdict is BLOCKED (dependency not met)
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Tone Guidance from Design Folder — Incorporated into output
|
||||
|
||||
**Fixture:**
|
||||
- `docs/CHANGELOG.md` exists with player-facing entries
|
||||
- `design/community/tone-guide.md` exists with guidance: "upbeat, encouraging tone; avoid passive voice"
|
||||
|
||||
**Input:** `/patch-notes v0.4.0`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads changelog
|
||||
2. Skill detects tone guide at `design/community/tone-guide.md`
|
||||
3. Skill applies tone guidance when rewriting entries in plain language
|
||||
4. Patch notes use upbeat, active-voice phrasing
|
||||
5. Skill presents draft, asks to write, writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks `design/` for a community or tone guidance file
|
||||
- [ ] Tone guide content influences phrasing of patch note entries
|
||||
- [ ] Output reflects active voice and upbeat tone where applicable
|
||||
- [ ] Skill notes that tone guidance was applied
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Patch Note Template Exists — Used instead of generated structure
|
||||
|
||||
**Fixture:**
|
||||
- `.claude/docs/templates/patch-notes-template.md` exists with a structured header format
|
||||
- `docs/CHANGELOG.md` exists with player-facing entries
|
||||
|
||||
**Input:** `/patch-notes v0.4.0`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads changelog and detects template exists
|
||||
2. Skill populates the template with player-facing entries
|
||||
3. Template header/footer structure is preserved in the output
|
||||
4. Skill asks "May I write" and writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks for a patch notes template before generating from scratch
|
||||
- [ ] Template structure is used when found (not overridden by default format)
|
||||
- [ ] Player-facing entries are inserted into the correct template section
|
||||
- [ ] Output note confirms template was used
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; community-manager is separate
|
||||
|
||||
**Fixture:**
|
||||
- `docs/CHANGELOG.md` exists with player-facing entries
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/patch-notes v0.4.0`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill compiles patch notes in full mode
|
||||
2. No director gate is invoked (community review is a separate, manual step)
|
||||
3. Skill runs on Haiku model — fast compilation
|
||||
4. Skill notes in output: "Consider sharing draft with community manager before publishing"
|
||||
5. Skill asks user for approval and writes on confirmation
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked regardless of review mode
|
||||
- [ ] Output suggests (but does not require) community manager review
|
||||
- [ ] Skill proceeds directly from compilation to "May I write" prompt
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads `docs/CHANGELOG.md` before generating patch notes
|
||||
- [ ] Filters entries to player-facing items only
|
||||
- [ ] Rewrites entries in plain language without internal IDs
|
||||
- [ ] Always asks "May I write" before writing patch notes file
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Runs on Haiku model tier (fast, low-cost)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where all changelog entries are internal (zero player-facing items)
|
||||
is not tested; behavior is an empty patch notes draft with a warning.
|
||||
- Version number parsing from the changelog header is an implementation detail
|
||||
not verified here.
|
||||
- The community manager consultation noted in Case 5 is advisory; a separate
|
||||
skill or manual review handles that step.
|
||||
169
CCGS Skill Testing Framework/skills/sprint/retrospective.md
Normal file
@@ -0,0 +1,169 @@
|
||||
# Skill Test Spec: /retrospective
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/retrospective` generates a structured sprint or milestone retrospective
|
||||
covering three categories: what went well, what didn't, and action items.
|
||||
It reads sprint files and session logs to compile observations, then produces
|
||||
a retrospective document. No director gates are used — retrospectives are
|
||||
team self-reflection artifacts. The skill asks "May I write to
|
||||
`production/retrospectives/retro-sprint-NNN.md`?" before persisting.
|
||||
Verdict is always COMPLETE (retrospective is structured output, not a pass/fail
|
||||
assessment).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" language (skill writes retrospective document)
|
||||
- [ ] Has a next-step handoff (what to do after retrospective is written)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Retrospectives are team self-reflection documents; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Sprint with mixed outcomes
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-005.md` exists with 6 stories (4 Complete, 1 Blocked, 1 Deferred)
|
||||
- `production/session-logs/` contains log entries for the sprint period
|
||||
- No prior retrospective exists for sprint-005
|
||||
|
||||
**Input:** `/retrospective sprint-005`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint-005 and session logs
|
||||
2. Skill compiles three retrospective categories: went well (4 stories shipped),
|
||||
didn't (1 blocked, 1 deferred), and action items (address blocker root cause)
|
||||
3. Skill presents retrospective draft to user
|
||||
4. Skill asks "May I write to `production/retrospectives/retro-sprint-005.md`?"
|
||||
5. User approves; file is written; verdict COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Retrospective contains all three categories (went well / didn't / actions)
|
||||
- [ ] Blocked and deferred stories appear in the "what didn't" section
|
||||
- [ ] At least one action item is generated from the blocked story
|
||||
- [ ] Skill asks "May I write" before writing file
|
||||
- [ ] Verdict is COMPLETE after successful write
|
||||
|
||||
---
|
||||
|
||||
### Case 2: No Sprint Data — Manual input fallback
|
||||
|
||||
**Fixture:**
|
||||
- User calls `/retrospective sprint-009`
|
||||
- `production/sprints/sprint-009.md` does NOT exist
|
||||
- No session logs reference sprint-009
|
||||
|
||||
**Input:** `/retrospective sprint-009`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read sprint-009 — not found
|
||||
2. Skill informs user that no sprint data was found for sprint-009
|
||||
3. Skill prompts user to provide retrospective input manually (went well, didn't, actions)
|
||||
4. User provides input; skill formats it into the retrospective structure
|
||||
5. Skill asks "May I write" and writes the document on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash or produce an empty document when sprint file is absent
|
||||
- [ ] User is prompted to provide manual input
|
||||
- [ ] Manual input is formatted into the three-category structure
|
||||
- [ ] "May I write" prompt still appears before file write
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Prior Retrospective Exists — Offer to append or replace
|
||||
|
||||
**Fixture:**
|
||||
- `production/retrospectives/retro-sprint-005.md` already exists with content
|
||||
- User re-runs `/retrospective sprint-005` after changes
|
||||
|
||||
**Input:** `/retrospective sprint-005`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects that `retro-sprint-005.md` already exists
|
||||
2. Skill presents user with choice: append new observations or replace existing file
|
||||
3. User selects "replace"; skill compiles fresh retrospective
|
||||
4. Skill asks "May I write to `production/retrospectives/retro-sprint-005.md`?" (confirming overwrite)
|
||||
5. File is overwritten; verdict COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks for existing retrospective file before compiling
|
||||
- [ ] User is offered an append-or-replace choice; the existing file is not silently overwritten
|
||||
- [ ] "May I write" prompt reflects the overwrite scenario
|
||||
- [ ] Verdict is COMPLETE after write regardless of append vs. replace
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — Unresolved action items from previous retrospective
|
||||
|
||||
**Fixture:**
|
||||
- `production/retrospectives/retro-sprint-004.md` exists with 2 action items marked `[ ]` (not done)
|
||||
- User runs `/retrospective sprint-005`
|
||||
|
||||
**Input:** `/retrospective sprint-005`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the most recent prior retrospective (retro-sprint-004)
|
||||
2. Skill detects 2 unchecked action items from sprint-004
|
||||
3. Skill includes a "Carry-over from Sprint 004" section in the new retrospective
|
||||
4. The unresolved items are listed with a note that they were not followed up
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads the most recent prior retrospective to check for open action items
|
||||
- [ ] Unresolved action items appear in the new retrospective under a carry-over section
|
||||
- [ ] Carry-over items are distinct from newly generated action items
|
||||
- [ ] Output notes that these items were not followed up in the previous sprint
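
The carry-over detection in Case 4 amounts to finding unchecked task-list items in the prior retrospective. A minimal sketch, assuming action items are written as Markdown task-list lines (`- [ ] ...`) as the fixture describes; the function names and the exact wording of the follow-up note are hypothetical.

```python
import re

# Matches an unchecked Markdown task-list item and captures its text.
UNCHECKED = re.compile(r"^\s*[-*]\s+\[ \]\s+(.*\S)\s*$")


def open_action_items(retro_markdown: str) -> list[str]:
    """Return the text of every unchecked action item in a retrospective."""
    items = []
    for line in retro_markdown.splitlines():
        match = UNCHECKED.match(line)
        if match:
            items.append(match.group(1))
    return items


def carry_over_section(previous_sprint: str, items: list[str]) -> str:
    """Render the carry-over section, kept separate from new action items."""
    lines = [f"## Carry-over from Sprint {previous_sprint}", ""]
    lines += [f"- [ ] {item} (not followed up in Sprint {previous_sprint})" for item in items]
    return "\n".join(lines)
```

Checked items (`- [x]`) are ignored, so only unresolved work is carried forward.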
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate invoked in any mode
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-005.md` exists with complete stories
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/retrospective sprint-005`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill compiles retrospective in full mode
|
||||
2. No director gate is invoked (retrospectives are team self-reflection, not delivery gates)
|
||||
3. Skill asks user for approval and writes file on confirmation
|
||||
4. Verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked regardless of review mode
|
||||
- [ ] Output does not contain any gate invocation or gate result notation
|
||||
- [ ] Skill proceeds directly from compilation to "May I write" prompt
|
||||
- [ ] Review mode file content is irrelevant to this skill's behavior
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Always shows retrospective draft before asking to write
|
||||
- [ ] Always asks "May I write" before writing retrospective file
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is always COMPLETE (not a pass/fail skill)
|
||||
- [ ] Checks prior retrospective for unresolved action items
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Milestone retrospectives (as opposed to sprint retrospectives) follow the
|
||||
same pattern but read milestone files instead of sprint files; not
|
||||
separately tested here.
|
||||
- The case where session logs are empty is similar to Case 2 (no data);
|
||||
the skill falls back to manual input in both situations.
|
||||
177
CCGS Skill Testing Framework/skills/sprint/sprint-plan.md
Normal file
@@ -0,0 +1,177 @@
|
||||
# Skill Test Spec: /sprint-plan
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/sprint-plan` reads the current milestone file and backlog stories, then
|
||||
generates a new numbered sprint with stories prioritized by implementation layer
|
||||
and priority score. In full mode the PR-SPRINT director gate runs after the
|
||||
sprint draft is compiled (producer reviews the plan). In lean and solo modes
|
||||
the gate is skipped. The skill asks "May I write to `production/sprints/sprint-NNN.md`?"
|
||||
before persisting. Verdicts: COMPLETE (sprint generated and written) or
|
||||
BLOCKED (cannot proceed due to missing data or gate failure).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains "May I write" language (skill writes sprint file)
|
||||
- [ ] Has a next-step handoff (what to do after sprint is written)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
| Gate ID   | Trigger condition        | Mode guard                |
|-----------|--------------------------|---------------------------|
| PR-SPRINT | After sprint draft built | full only (not lean/solo) |
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Backlog with stories generates sprint
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-02.md` exists with capacity `10 story points`
|
||||
- Backlog contains 5 unstarted stories across 2 epics, mixed priorities
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
- Next sprint number is `003` (sprints 001 and 002 already exist)
|
||||
|
||||
**Input:** `/sprint-plan`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads current milestone to obtain capacity and goals
|
||||
2. Skill reads all unstarted stories from backlog; sorts by layer + priority
|
||||
3. Skill drafts sprint-003 with stories fitting within capacity
|
||||
4. Skill presents draft to user before invoking gate
|
||||
5. Skill invokes PR-SPRINT gate (full mode); producer approves
|
||||
6. Skill asks "May I write to `production/sprints/sprint-003.md`?"
|
||||
7. User approves; file is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Stories are sorted by implementation layer before priority
|
||||
- [ ] Sprint draft is shown before any write or gate invocation
|
||||
- [ ] PR-SPRINT gate is invoked in full mode after draft is ready
|
||||
- [ ] Skill asks "May I write" before writing the sprint file
|
||||
- [ ] Written file path matches `production/sprints/sprint-003.md`
|
||||
- [ ] Verdict is COMPLETE after successful write
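
As an illustration of what Case 1 exercises, the sketch below sorts by implementation layer first and priority second, fills the sprint greedily up to capacity, and derives the next sprint number from the existing files. The `Story` fields, the numeric layer and priority encodings, and the greedy fill are assumptions made for the example; the spec does not prescribe the skill's actual selection algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    title: str
    layer: int      # assumption: lower number = more foundational layer
    priority: int   # assumption: higher number = more important
    points: int


def draft_sprint(backlog: list[Story], capacity: int) -> list[Story]:
    """Order by layer, then by descending priority, and keep stories that still fit."""
    ordered = sorted(backlog, key=lambda s: (s.layer, -s.priority))
    selected: list[Story] = []
    used = 0
    for story in ordered:
        if used + story.points <= capacity:
            selected.append(story)
            used += story.points
    return selected


def next_sprint_id(existing: list[str]) -> str:
    """Given names such as 'sprint-002.md', return the next zero-padded sprint id."""
    numbers = [int(name.removeprefix("sprint-").removesuffix(".md")) for name in existing]
    return f"sprint-{max(numbers, default=0) + 1:03d}"
```

With sprints 001 and 002 on disk, `next_sprint_id` yields `sprint-003`, matching the path asserted in this case.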
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Blocked Path — Backlog is empty
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-02.md` exists
|
||||
- No unstarted stories exist in any epic backlog
|
||||
|
||||
**Input:** `/sprint-plan`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads backlog — finds no unstarted stories
|
||||
2. Skill outputs "No unstarted stories in backlog"
|
||||
3. Skill suggests running `/create-stories` to populate the backlog
|
||||
4. No gate is invoked; no file is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is BLOCKED
|
||||
- [ ] Output contains "No unstarted stories" or equivalent message
|
||||
- [ ] Output recommends `/create-stories`
|
||||
- [ ] PR-SPRINT gate is NOT invoked
|
||||
- [ ] No write tool is called
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Gate returns CONCERNS — Sprint overloaded, revised before write
|
||||
|
||||
**Fixture:**
|
||||
- Backlog has 8 stories totalling 16 points; milestone capacity is 10 points
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/sprint-plan`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill drafts sprint with all 8 stories (over capacity)
|
||||
2. PR-SPRINT gate runs; producer returns CONCERNS: sprint is overloaded
|
||||
3. Skill presents concern to user and asks which stories to defer
|
||||
4. User selects 3 stories to defer; sprint is revised to 5 stories / 10 points
|
||||
5. Skill asks "May I write" with revised sprint; writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] CONCERNS from the PR-SPRINT gate are surfaced to the user before any write
|
||||
- [ ] Skill allows sprint to be revised after gate feedback
|
||||
- [ ] Revised sprint (not original) is written to file
|
||||
- [ ] Verdict is COMPLETE after revision and write
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Lean Mode — PR-SPRINT gate skipped
|
||||
|
||||
**Fixture:**
|
||||
- Backlog has 4 stories; milestone capacity is 8 points
|
||||
- `review-mode.txt` contains `lean`
|
||||
|
||||
**Input:** `/sprint-plan`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads review mode — determines `lean`
|
||||
2. Skill drafts sprint and presents it to user
|
||||
3. PR-SPRINT gate is skipped; output notes "[PR-SPRINT] skipped — Lean mode"
|
||||
4. Skill asks user for direct approval of the sprint
|
||||
5. User approves; sprint file is written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] PR-SPRINT gate is NOT invoked in lean mode
|
||||
- [ ] Skip is explicitly noted in output
|
||||
- [ ] User approval is still required before write (gate skip ≠ approval skip)
|
||||
- [ ] Verdict is COMPLETE after write
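
The mode guard tested in Cases 1 and 4 reduces to a small decision function. The sketch below assumes the contents of `review-mode.txt` have already been read into a string; the function name and the returned reason text are hypothetical, and the exact skip message shown to the user is left to the skill.

```python
def pr_sprint_gate_decision(review_mode: str) -> tuple[bool, str]:
    """Return (run_gate, reason) for the PR-SPRINT gate given the review mode."""
    mode = review_mode.strip().lower()
    if mode == "full":
        return True, "full mode"
    if mode in ("lean", "solo"):
        # Gate is skipped, but user approval before the write is still required.
        return False, f"{mode} mode"
    raise ValueError(f"unknown review mode: {review_mode!r}")
```

Skipping the gate only removes the producer review; the "May I write" prompt is a separate step and is not affected by this decision.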
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Edge Case — Previous sprint still has open stories
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-002.md` exists with 2 stories still `Status: In Progress`
|
||||
- Backlog has 5 new unstarted stories
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/sprint-plan`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint-002 and detects 2 open (in-progress) stories
|
||||
2. Skill flags: "Sprint 002 has 2 open stories — confirm carry-over before planning sprint 003"
|
||||
3. Skill presents user with choice: carry stories over, defer them, or cancel
|
||||
4. User confirms carry-over; carried stories are prepended to new sprint with `[CARRY]` tag
|
||||
5. Sprint draft is built; PR-SPRINT gate runs; sprint is written on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks the most recent sprint file for open stories
|
||||
- [ ] User is asked to confirm carry-over before sprint planning continues
|
||||
- [ ] Carried stories appear in the new sprint draft with a distinguishing label
|
||||
- [ ] Skill does not silently ignore open stories from the previous sprint
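
A minimal sketch of the carry-over handling in Case 5, assuming stories have already been parsed into dictionaries with `title` and `status` keys (the sprint file format itself is not specified here). Treating every status other than `Complete` as open is also an assumption; the case only mentions `In Progress`.

```python
def open_stories(stories: list[dict]) -> list[dict]:
    """Return stories from the previous sprint that are not yet Complete."""
    return [story for story in stories if story["status"] != "Complete"]


def prepend_carried(carried: list[dict], planned_titles: list[str]) -> list[str]:
    """Put carried stories first in the new sprint, each labelled with a [CARRY] tag."""
    return [f"[CARRY] {story['title']}" for story in carried] + list(planned_titles)
```

In the skill itself, `prepend_carried` would only run after the user has confirmed the carry-over.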
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Shows draft sprint before invoking PR-SPRINT gate or asking to write
|
||||
- [ ] Always asks "May I write" before writing sprint file
|
||||
- [ ] PR-SPRINT gate only runs in full mode
|
||||
- [ ] Skip message appears in lean and solo mode output
|
||||
- [ ] Verdict is clearly stated at the end of the skill output
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where no milestone file exists is not explicitly tested; behavior
|
||||
follows the BLOCKED pattern with a suggestion to run `/gate-check` for
|
||||
milestone progression.
|
||||
- Solo mode behavior is equivalent to lean (gate skipped, user approval
|
||||
required) and is not separately tested.
|
||||
- Parallel story selection algorithms are not tested here; those are unit
|
||||
concerns for the sprint-plan subagent.
|
||||
167
CCGS Skill Testing Framework/skills/sprint/sprint-status.md
Normal file
@@ -0,0 +1,167 @@
|
||||
# Skill Test Spec: /sprint-status
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/sprint-status` is a Haiku-tier read-only skill that reads the current active
|
||||
sprint file and the session state to produce a concise sprint health summary.
|
||||
It reports story counts by status (Complete / In Progress / Blocked / Not Started)
|
||||
and emits one of three sprint-health verdicts: ON TRACK, AT RISK, or BLOCKED.
|
||||
It never writes files and does not invoke any director gates. It is designed for
|
||||
fast, low-cost status checks during a session.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings or numbered check sections
|
||||
- [ ] Contains verdict keywords: ON TRACK, AT RISK, BLOCKED
|
||||
- [ ] Does NOT require "May I write" language (read-only skill)
|
||||
- [ ] Has a next-step handoff (what to do based on the verdict)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/sprint-status` is a read-only reporting skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Mixed sprint, AT RISK with named blocker
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` exists (active sprint, linked in `active.md`)
|
||||
- Sprint contains 6 stories:
|
||||
- 3 with `Status: Complete`
|
||||
- 2 with `Status: In Progress`
|
||||
- 1 with `Status: Blocked` (blocker: "Waiting on physics ADR acceptance")
|
||||
- Sprint end date is 2 days away
|
||||
|
||||
**Input:** `/sprint-status`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `production/session-state/active.md` to find active sprint reference
|
||||
2. Skill reads `production/sprints/sprint-004.md`
|
||||
3. Skill counts stories by status: 3 Complete, 2 In Progress, 1 Blocked
|
||||
4. Skill detects a Blocked story and the approaching deadline
|
||||
5. Skill outputs AT RISK verdict with the blocker named explicitly
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Output includes story count breakdown by status
|
||||
- [ ] Output names the specific blocked story and its blocker reason
|
||||
- [ ] Verdict is AT RISK (not BLOCKED, not ON TRACK) when any story is Blocked
|
||||
- [ ] Skill does not write any files
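
The count-and-verdict step in Case 1 can be sketched as below. The status names come from the Skill Summary; the rule "any Blocked story yields AT RISK" comes from this case's assertions. The cases in this spec do not define when the BLOCKED verdict applies, so the sketch deliberately leaves that branch out rather than guess at it.

```python
from collections import Counter

STATUSES = ("Complete", "In Progress", "Blocked", "Not Started")


def count_by_status(statuses: list[str]) -> dict[str, int]:
    """Return a count for every known status, including zero counts."""
    counts = Counter(statuses)
    return {status: counts.get(status, 0) for status in STATUSES}


def health_verdict(counts: dict[str, int], stale_count: int = 0) -> str:
    """AT RISK if any story is Blocked or flagged stale, otherwise ON TRACK."""
    if counts["Blocked"] > 0 or stale_count > 0:
        return "AT RISK"
    return "ON TRACK"
```

For the fixture in this case, the counts are 3 Complete, 2 In Progress, 1 Blocked, and 0 Not Started, which gives AT RISK.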
|
||||
|
||||
---
|
||||
|
||||
### Case 2: All Stories Complete — Sprint COMPLETE verdict
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` exists
|
||||
- All 5 stories have `Status: Complete`
|
||||
|
||||
**Input:** `/sprint-status`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint file — all stories are Complete
|
||||
2. Skill outputs ON TRACK verdict or SPRINT COMPLETE label
|
||||
3. Skill suggests running `/milestone-review` or `/sprint-plan` as next steps
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is ON TRACK or SPRINT COMPLETE when all stories are Complete
|
||||
- [ ] Output notes that the sprint is fully done
|
||||
- [ ] Next-step suggestion references `/milestone-review` or `/sprint-plan`
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Active Sprint File — Guidance to run /sprint-plan
|
||||
|
||||
**Fixture:**
|
||||
- `production/session-state/active.md` does not reference an active sprint
|
||||
- `production/sprints/` directory is empty or absent
|
||||
|
||||
**Input:** `/sprint-status`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `active.md` — finds no active sprint reference
|
||||
2. Skill checks `production/sprints/` — finds no files
|
||||
3. Skill outputs an informational message: no active sprint detected
|
||||
4. Skill suggests running `/sprint-plan` to create one
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not error or crash when no sprint file exists
|
||||
- [ ] Output clearly states no active sprint was found
|
||||
- [ ] Output recommends `/sprint-plan` as the next action
|
||||
- [ ] No verdict keyword is emitted (no sprint to assess)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — Stale In Progress Story (flagged)
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` exists
|
||||
- One story has `Status: In Progress` with a note in `active.md`:
|
||||
`Last updated: 2026-03-30` (more than 2 days before today's session date)
|
||||
- No stories are Blocked
|
||||
|
||||
**Input:** `/sprint-status`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint file and session state
|
||||
2. Skill detects the story has been In Progress for >2 days without update
|
||||
3. Skill flags the story as "stale" in the output
|
||||
4. Verdict is AT RISK (stale in-progress stories indicate a hidden blocker)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill compares story "last updated" metadata against session date
|
||||
- [ ] Stale In Progress story is flagged by name in the output
|
||||
- [ ] Verdict is AT RISK, not ON TRACK, when a stale story is detected
|
||||
- [ ] Output does not conflate "stale" with "Blocked" — the label is distinct
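
The staleness rule in Case 4 is a date comparison. The sketch assumes the `Last updated:` value is an ISO date string and uses the 2-day threshold stated in the fixture; the function name is hypothetical.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=2)


def is_stale(last_updated: str, session_date: date) -> bool:
    """True when an In Progress story was last updated more than 2 days before the session."""
    return session_date - date.fromisoformat(last_updated) > STALE_AFTER
```

A story last updated on 2026-03-30 is stale for a session on 2026-04-02 (3 days) but not for one on 2026-04-01 (exactly 2 days).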
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — Read-only; no gate invocation
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` exists with 4 stories (2 Complete, 2 In Progress)
|
||||
- `production/session-state/review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/sprint-status`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint and produces status summary
|
||||
2. Skill does NOT invoke any director gate regardless of review mode
|
||||
3. Output is a plain status report with ON TRACK, AT RISK, or BLOCKED verdict
|
||||
4. Skill does not prompt for user approval or ask to write any file
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Output does not contain any "May I write" prompt
|
||||
- [ ] Skill completes and returns a verdict without user interaction
|
||||
- [ ] Review mode file is ignored (or confirmed irrelevant) by this skill
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Does NOT use Write or Edit tools (read-only skill)
|
||||
- [ ] Presents story count breakdown before emitting verdict
|
||||
- [ ] Does not ask for approval
|
||||
- [ ] Ends with a recommended next step based on verdict
|
||||
- [ ] Runs on Haiku model tier (fast, low-cost)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where multiple sprints are active simultaneously is not tested;
|
||||
the skill reads whichever sprint `active.md` references.
|
||||
- Partial sprint completion percentages are not explicitly verified; the
|
||||
count-by-status output implies them.
|
||||
- The `solo` review-mode variant is not separately tested; gate
|
||||
behavior in Case 5 applies to all modes equally.
|
||||
210
CCGS Skill Testing Framework/skills/team/team-audio.md
Normal file
@@ -0,0 +1,210 @@
|
||||
# Skill Test Spec: /team-audio
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the audio team through a four-step pipeline: audio direction
|
||||
(audio-director) → sound design + accessibility review in parallel (sound-designer
|
||||
+ accessibility-specialist) → technical implementation + engine validation in
|
||||
parallel (technical-artist + primary engine specialist) → code integration
|
||||
(gameplay-programmer). Reads relevant GDDs, the sound bible (if present), and
|
||||
existing audio asset lists before spawning agents. Compiles all outputs into an
|
||||
audio design document saved to `design/gdd/audio-[feature].md`. Uses
|
||||
`AskUserQuestion` at each step transition. Verdict is COMPLETE when the audio
|
||||
design document is produced. Skips the engine specialist spawn gracefully when no
|
||||
engine is configured.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 step/phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains "File Write Protocol" section
|
||||
- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
|
||||
- [ ] Sub-agents enforce "May I write to [path]?" before any write
|
||||
- [ ] Has a next-step handoff at the end (references `/dev-story`, `/asset-audit`)
|
||||
- [ ] Error Recovery Protocol section is present
|
||||
- [ ] `AskUserQuestion` is used at step transitions before proceeding
|
||||
- [ ] Step 2 explicitly spawns sound-designer and accessibility-specialist in parallel
|
||||
- [ ] Step 3 explicitly spawns technical-artist and engine specialist in parallel (when engine is configured)
|
||||
- [ ] Skill reads `design/gdd/sound-bible.md` during context gathering if it exists
|
||||
- [ ] Output document is saved to `design/gdd/audio-[feature].md`
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All steps complete, audio design document saved
|
||||
|
||||
**Fixture:**
|
||||
- GDD for the target feature exists at `design/gdd/combat.md`
|
||||
- Sound bible exists at `design/gdd/sound-bible.md`
|
||||
- Existing audio assets are listed in `assets/audio/`
|
||||
- Engine is configured in `.claude/docs/technical-preferences.md`
|
||||
- No accessibility gaps exist in the planned audio event list
|
||||
|
||||
**Input:** `/team-audio combat`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Context gathering: orchestrator reads `design/gdd/combat.md`, `design/gdd/sound-bible.md`, and `assets/audio/` asset list before spawning any agent
|
||||
2. Step 1: audio-director is spawned; defines sonic identity, emotional tone, adaptive music direction, mix targets, and adaptive audio rules for combat
|
||||
3. `AskUserQuestion` presents audio direction; user approves before Step 2 begins
|
||||
4. Step 2: sound-designer and accessibility-specialist are spawned in parallel; sound-designer produces SFX specifications, audio event list with trigger conditions, and mixing groups; accessibility-specialist identifies critical gameplay audio events and specifies visual fallback and subtitle requirements
|
||||
5. `AskUserQuestion` presents SFX spec and accessibility requirements; user approves before Step 3 begins
|
||||
6. Step 3: technical-artist and primary engine specialist are spawned in parallel; technical-artist designs bus structure, middleware integration, memory budgets, and streaming strategy; engine specialist validates that the integration approach is idiomatic for the configured engine
|
||||
7. `AskUserQuestion` presents technical plan; user approves before Step 4 begins
|
||||
8. Step 4: gameplay-programmer is spawned; wires up audio events to gameplay triggers, implements adaptive music, sets up occlusion zones, writes unit tests for audio event triggers
|
||||
9. Orchestrator compiles all outputs into a single audio design document
|
||||
10. A sub-agent asks "May I write the audio design document to `design/gdd/audio-combat.md`?" before writing
|
||||
11. Summary output lists: audio event count, estimated asset count, implementation tasks, and any open questions
|
||||
12. Verdict: COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Sound bible is read during context gathering (before Step 1) when it exists
|
||||
- [ ] audio-director is spawned before sound-designer or accessibility-specialist
|
||||
- [ ] `AskUserQuestion` appears after Step 1 output and before Step 2 launch
|
||||
- [ ] sound-designer and accessibility-specialist Task calls are issued simultaneously in Step 2
|
||||
- [ ] technical-artist and engine specialist Task calls are issued simultaneously in Step 3
|
||||
- [ ] gameplay-programmer is not launched until Step 3 `AskUserQuestion` is approved
|
||||
- [ ] Audio design document is written to `design/gdd/audio-combat.md` (not another path)
|
||||
- [ ] Summary includes audio event count and estimated asset count
|
||||
- [ ] No files are written by the orchestrator directly
|
||||
- [ ] Verdict is COMPLETE after document delivery
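
The output path asserted above follows the `design/gdd/audio-[feature].md` pattern. The helper below is a sketch of how the path might be derived from the argument. Only the `combat` mapping is confirmed by this case; lowercasing and hyphenating multi-word features such as `main menu` is an assumption, and `audio_doc_path` is a hypothetical name.

```python
import re


def audio_doc_path(feature: str) -> str:
    """Build the audio design document path from the feature argument."""
    # Assumption: non-alphanumeric runs collapse to a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", feature.lower()).strip("-")
    if not slug:
        raise ValueError("feature argument is required")
    return f"design/gdd/audio-{slug}.md"
```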
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Accessibility Gap — Critical gameplay audio event has no visual fallback
|
||||
|
||||
**Fixture:**
|
||||
- GDD for the target feature exists
|
||||
- Step 1 is complete and Step 2 is in progress
|
||||
- sound-designer's audio event list includes "EnemyNearbyAlert" — a spatial audio cue that warns the player an enemy is approaching from off-screen
|
||||
- accessibility-specialist reviews the event list and finds "EnemyNearbyAlert" has no visual fallback (no on-screen indicator, no subtitle, no controller rumble specified)
|
||||
|
||||
**Input:** `/team-audio stealth` (Step 2 scenario)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Steps 1–2 proceed; accessibility-specialist and sound-designer are spawned in parallel
|
||||
2. accessibility-specialist returns its review with a BLOCKING concern: "`EnemyNearbyAlert` is a critical gameplay audio event (warns player of off-screen threat) with no visual fallback — hearing-impaired players cannot detect this threat. This is a BLOCKING accessibility gap."
|
||||
3. Orchestrator surfaces the concern immediately in conversation before presenting `AskUserQuestion`
|
||||
4. `AskUserQuestion` presents the accessibility concern as a BLOCKING issue with options:
|
||||
- Add a visual indicator for EnemyNearbyAlert (e.g., directional arrow on HUD) and continue
|
||||
- Add controller haptic feedback as the fallback and continue
|
||||
- Stop here and resolve all accessibility gaps before proceeding to Step 3
|
||||
5. Step 3 (technical-artist + engine specialist) is not launched until the user resolves or explicitly accepts the gap
|
||||
6. The accessibility gap is included in the final audio design document under "Open Accessibility Issues" if unresolved
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Accessibility gap is labeled BLOCKING (not advisory) in the report
|
||||
- [ ] The specific event name ("EnemyNearbyAlert") and the nature of the gap are stated
|
||||
- [ ] `AskUserQuestion` surfaces the gap before Step 3 is launched
|
||||
- [ ] At least one resolution option is offered (add visual fallback, add haptic fallback)
|
||||
- [ ] Step 3 is not launched while the gap is unresolved without explicit user authorization
|
||||
- [ ] If the gap is carried forward unresolved, it is documented in the audio design doc as an open issue
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Usage guidance shown, no design doc inference
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-audio` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument is provided
|
||||
2. Outputs usage guidance: e.g., "Usage: `/team-audio [feature or area]` — specify the feature or area to design audio for (e.g., `combat`, `main menu`, `forest biome`, `boss encounter`)"
|
||||
3. Skill exits without spawning any agents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any agents when no argument is provided
|
||||
- [ ] Usage message includes the correct invocation format with argument examples
|
||||
- [ ] Skill does NOT attempt to infer a feature from existing design docs without user direction
|
||||
- [ ] No `AskUserQuestion` is used — output is direct guidance
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Missing Sound Bible — Skill notes the gap and proceeds without it
|
||||
|
||||
**Fixture:**
|
||||
- GDD for the target feature exists at `design/gdd/main-menu.md`
|
||||
- `design/gdd/sound-bible.md` does NOT exist
|
||||
- Engine is configured; other context files are present
|
||||
|
||||
**Input:** `/team-audio main menu`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Context gathering: orchestrator reads `design/gdd/main-menu.md` and checks for `design/gdd/sound-bible.md`
|
||||
2. Sound bible is not found; orchestrator notes the gap in conversation: "Note: `design/gdd/sound-bible.md` not found — audio direction will proceed without a project-wide sonic identity reference. Consider creating a sound bible if this is an ongoing project."
|
||||
3. Pipeline proceeds normally through all four steps without the sound bible as input
|
||||
4. audio-director in Step 1 is informed that no sound bible exists and must establish sonic identity from the feature GDD alone
|
||||
5. The missing sound bible is mentioned in the final summary as a recommended next step
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Orchestrator checks for the sound bible during context gathering (before Step 1)
|
||||
- [ ] Missing sound bible is noted explicitly in conversation — not silently ignored
|
||||
- [ ] Pipeline does NOT halt due to the missing sound bible
|
||||
- [ ] audio-director is notified that no sound bible exists in its prompt context
|
||||
- [ ] Summary or Next Steps section recommends creating a sound bible
|
||||
- [ ] Verdict is still COMPLETE if all other steps succeed
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Engine Not Configured — Engine specialist step skipped gracefully
|
||||
|
||||
**Fixture:**
|
||||
- Engine is NOT configured in `.claude/docs/technical-preferences.md` (shows `[TO BE CONFIGURED]`)
|
||||
- GDD for the target feature exists
|
||||
- Sound bible may or may not exist
|
||||
|
||||
**Input:** `/team-audio boss encounter`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Context gathering: orchestrator reads `.claude/docs/technical-preferences.md` and detects no engine is configured
|
||||
2. Steps 1–2 proceed normally (audio-director, sound-designer, accessibility-specialist)
|
||||
3. Step 3: technical-artist is spawned normally; engine specialist spawn is SKIPPED
|
||||
4. Orchestrator notes in conversation: "Engine specialist not spawned — no engine configured in technical-preferences.md. Engine integration validation will be deferred until an engine is selected."
|
||||
5. Step 4: gameplay-programmer proceeds with a note that engine-specific audio integration patterns could not be validated
|
||||
6. The engine specialist gap is included in the audio design document under "Deferred Validation"
|
||||
7. Verdict: COMPLETE (skip is graceful, not a blocker)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Engine specialist is NOT spawned when no engine is configured
|
||||
- [ ] Skill does NOT error out due to the missing engine configuration
|
||||
- [ ] The skip is explicitly noted in conversation — not silently omitted
|
||||
- [ ] technical-artist is still spawned in Step 3 (skip applies only to the engine specialist)
|
||||
- [ ] gameplay-programmer proceeds in Step 4 with the deferred validation noted
|
||||
- [ ] Deferred engine validation is recorded in the audio design document
|
||||
- [ ] Verdict is COMPLETE (engine not configured is a known graceful case)
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Context gathering (GDDs, sound bible, asset list) runs before any agent is spawned
|
||||
- [ ] `AskUserQuestion` is used after every step output before the next step launches
|
||||
- [ ] Parallel spawning: Step 2 (sound-designer + accessibility-specialist) and Step 3 (technical-artist + engine specialist) issue all Task calls before waiting for results
|
||||
- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
|
||||
- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
|
||||
- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
|
||||
- [ ] A partial report is always produced when some agents complete and others block
|
||||
- [ ] Audio design document path follows the pattern `design/gdd/audio-[feature].md`
|
||||
- [ ] Verdict is exactly COMPLETE or BLOCKED — no other verdict values used
|
||||
- [ ] Next Steps handoff references `/dev-story` and `/asset-audit`
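
The ordering rules in this checklist (sequential steps, parallel agents within a step, approval between steps, engine specialist skipped when unconfigured) can be modelled with `asyncio`. This is an analogy only: real agents are spawned through Task calls, not coroutines, and `run_agent`, `run_pipeline`, and the `engine-specialist` label are hypothetical names used for the sketch.

```python
import asyncio
from typing import Callable

PIPELINE = [
    ["audio-director"],
    ["sound-designer", "accessibility-specialist"],
    ["technical-artist", "engine-specialist"],
    ["gameplay-programmer"],
]


async def run_agent(name: str, log: list[str]) -> None:
    """Stand-in for a spawned agent; records when it starts and finishes."""
    log.append(f"start {name}")
    await asyncio.sleep(0)  # yield so sibling agents in the same step can start
    log.append(f"done {name}")


async def run_pipeline(approve: Callable[[list[str]], bool], engine_configured: bool = True) -> list[str]:
    """Run steps in order; issue all agents of a step before waiting for any result."""
    log: list[str] = []
    for index, step in enumerate(PIPELINE):
        agents = [a for a in step if engine_configured or a != "engine-specialist"]
        await asyncio.gather(*(run_agent(agent, log) for agent in agents))
        # Ask for approval at each step transition (not after the final step).
        if index < len(PIPELINE) - 1 and not approve(step):
            break
    return log
```

In the resulting log, both Step 2 agents record a start before either records completion, which is the property the parallel-spawning assertion looks for.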
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
|
||||
Recovery Protocol are not separately tested — they follow the same `AskUserQuestion`
|
||||
+ partial-report pattern validated in Cases 2 and 5.
|
||||
- Step 4 (gameplay-programmer) happy-path behavior is validated implicitly by Case 1.
|
||||
Failure modes for this step follow the standard Error Recovery Protocol.
|
||||
- The accessibility-specialist's subtitle and caption requirements (beyond visual fallbacks)
|
||||
are validated implicitly by Case 1. Case 2 focuses on the more severe case where a
|
||||
critical gameplay event has no fallback at all.
|
||||
- Engine specialist validation logic (idiomatic integration, version-specific changes) is
|
||||
tested only for the configured and unconfigured states. The specific content of the
|
||||
engine specialist's output is out of scope for this behavioral spec.
|
||||
180
CCGS Skill Testing Framework/skills/team/team-combat.md
Normal file
@@ -0,0 +1,180 @@
|
||||
# Skill Test Spec: /team-combat
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the full combat team pipeline end-to-end for a single combat feature.
|
||||
Coordinates game-designer, gameplay-programmer, ai-programmer, technical-artist,
|
||||
sound-designer, the primary engine specialist, and qa-tester through six structured
|
||||
phases: Design → Architecture (with engine specialist validation) → Implementation
|
||||
(parallel) → Integration → Validation → Sign-off. Uses `AskUserQuestion` at each
|
||||
phase transition. Delegates all file writes to sub-agents. Produces a summary report
|
||||
with verdict COMPLETE / NEEDS WORK / BLOCKED and handoffs to `/code-review`,
|
||||
`/balance-check`, and `/team-polish`.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings (Phase 1 through Phase 6 are all present)
|
||||
- [ ] Contains verdict keywords: COMPLETE, NEEDS WORK, BLOCKED
|
||||
- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents, orchestrator does not write files directly
|
||||
- [ ] Has a next-step handoff at the end (references `/code-review`, `/balance-check`, `/team-polish`)
|
||||
- [ ] Error Recovery Protocol section is present with all four recovery steps
|
||||
- [ ] Uses `AskUserQuestion` at phase transitions for user approval before proceeding
|
||||
- [ ] Phase 3 is explicitly marked as parallel (gameplay-programmer, ai-programmer, technical-artist, sound-designer)
|
||||
- [ ] Phase 2 includes spawning the primary engine specialist (read from `.claude/docs/technical-preferences.md`)
|
||||
- [ ] Team Composition lists all seven roles (game-designer, gameplay-programmer, ai-programmer, technical-artist, sound-designer, engine specialist, qa-tester)
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All agents succeed, full pipeline runs to completion
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/game-concept.md` exists and is populated
|
||||
- Engine is configured in `.claude/docs/technical-preferences.md` (Engine Specialists section filled)
|
||||
- No existing GDD for the requested combat feature
|
||||
|
||||
**Input:** `/team-combat parry and riposte system`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1 — game-designer spawned; produces `design/gdd/parry-riposte.md` covering all 8 required sections (overview, player fantasy, rules, formulas, edge cases, dependencies, tuning knobs, acceptance criteria); asks user to approve design doc
|
||||
2. Phase 2 — gameplay-programmer + ai-programmer spawned; produce architecture sketch with class structure, interfaces, and file list; then primary engine specialist is spawned to validate idioms; engine specialist output incorporated; `AskUserQuestion` presented with architecture options before Phase 3 begins
|
||||
3. Phase 3 — gameplay-programmer, ai-programmer, technical-artist, sound-designer spawned in parallel; all four return outputs before Phase 4 begins
|
||||
4. Phase 4 — integration wires together all Phase 3 outputs; tuning knobs verified as data-driven; `AskUserQuestion` confirms integration before Phase 5
|
||||
5. Phase 5 — qa-tester spawned; writes test cases from acceptance criteria; verifies edge cases; performance impact checked against budget
|
||||
6. Phase 6 — summary report produced: design COMPLETE, all team members COMPLETE, test cases listed, verdict: COMPLETE
|
||||
7. Next steps listed: `/code-review`, `/balance-check`, `/team-polish`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] `AskUserQuestion` called at each phase gate (at minimum before Phase 3 and before Phase 5)
|
||||
- [ ] Phase 3 agents launched simultaneously — no sequential dependency between gameplay-programmer, ai-programmer, technical-artist, sound-designer
|
||||
- [ ] Engine specialist runs in Phase 2 before Phase 3 begins (output incorporated into architecture)
|
||||
- [ ] All file writes delegated to sub-agents (orchestrator never calls Write/Edit directly)
|
||||
- [ ] Verdict COMPLETE present in final report
|
||||
- [ ] Next steps include `/code-review`, `/balance-check`, `/team-polish`
|
||||
- [ ] Design doc covers all 8 required GDD sections
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Blocked Agent — One subagent returns BLOCKED mid-pipeline
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/parry-riposte.md` exists (Phase 1 already complete)
|
||||
- ai-programmer agent returns BLOCKED because the AI system architecture ADR has not been Accepted (ADR status is Proposed)
|
||||
|
||||
**Input:** `/team-combat parry and riposte system`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1 — design doc found; game-designer confirms it is valid; phase approved
|
||||
2. Phase 2 — gameplay-programmer completes architecture sketch; ai-programmer returns BLOCKED: "ADR for AI behavior system is Proposed — cannot implement until ADR is Accepted"
|
||||
3. Error Recovery Protocol triggered: "ai-programmer: BLOCKED — AI behavior ADR is Proposed"
|
||||
4. `AskUserQuestion` presented with options: (a) Skip ai-programmer and note the gap; (b) Retry with narrower scope; (c) Stop here and run `/architecture-decision` first
|
||||
5. If user chooses (a): Phase 3 proceeds with gameplay-programmer, technical-artist, sound-designer only; ai-programmer gap noted in partial report
|
||||
6. Final report produced: partial implementation documented, ai-programmer section marked BLOCKED, overall verdict: BLOCKED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] BLOCKED status is surfaced to the user before any dependent phase continues
|
||||
- [ ] `AskUserQuestion` offers at minimum three options: skip / retry / stop
|
||||
- [ ] Partial report produced — completed agents' work is not discarded
|
||||
- [ ] Overall verdict is BLOCKED (not COMPLETE) when any agent is unresolved
|
||||
- [ ] Blocked reason references the ADR and suggests `/architecture-decision`
|
||||
- [ ] Orchestrator does not silently proceed past the blocked dependency
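
The verdict and partial-report rules asserted above can be sketched in Python. This is only an illustration under stated assumptions, not the skill's actual implementation: `AgentResult`, `overall_verdict`, and `partial_report` are hypothetical names introduced here, and the status strings simply mirror the verdict keywords used in this spec.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AgentResult:
    # Hypothetical record of one sub-agent's outcome.
    agent: str        # e.g. "ai-programmer"
    status: str       # "COMPLETE" or "BLOCKED"
    reason: str = ""  # filled in when the agent is blocked


def overall_verdict(results: list[AgentResult]) -> str:
    """Return BLOCKED if any agent is unresolved, otherwise COMPLETE."""
    if any(r.status == "BLOCKED" for r in results):
        return "BLOCKED"
    return "COMPLETE"


def partial_report(results: list[AgentResult]) -> list[str]:
    """Keep one line per agent; blocked agents are marked, never dropped."""
    lines = []
    for r in results:
        suffix = f" ({r.reason})" if r.reason else ""
        lines.append(f"{r.agent}: {r.status}{suffix}")
    return lines
```

With one completed agent and one blocked agent, the sketch keeps both report lines and yields an overall BLOCKED verdict, which is the behavior the assertions describe.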
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Clear usage guidance shown
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-combat` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument provided
|
||||
2. Outputs usage message explaining the required argument (combat feature description)
|
||||
3. Shows the usage format `/team-combat [combat feature description]` and at least one concrete invocation, e.g. `/team-combat parry and riposte system`
|
||||
4. Skill exits without spawning any subagents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any subagents when no argument is given
|
||||
- [ ] Usage message includes the argument-hint format from frontmatter
|
||||
- [ ] Usage message includes at least one example of a valid invocation
|
||||
- [ ] No file reads beyond what is needed to detect the missing argument
|
||||
- [ ] Verdict is NOT shown (pipeline never runs)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Parallel Phase Validation — Phase 3 agents run simultaneously
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/parry-riposte.md` exists and is complete
|
||||
- Architecture sketch has been approved
|
||||
- Engine specialist has validated architecture
|
||||
|
||||
**Input:** `/team-combat parry and riposte system` (resuming from Phase 2 complete)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 3 begins after architecture approval
|
||||
2. All four Task calls — gameplay-programmer, ai-programmer, technical-artist, sound-designer — are issued before any result is awaited
|
||||
3. Skill waits for all four agents to complete before proceeding to Phase 4
|
||||
4. If any single agent completes early, skill does not begin Phase 4 until all four have returned
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Four Task calls issued in a single batch (no sequential waiting between them)
|
||||
- [ ] Phase 4 does not begin until all four Phase 3 agents have returned results
|
||||
- [ ] Skill does not pass one Phase 3 agent's output as input to another Phase 3 agent (they are independent)
|
||||
- [ ] All four Phase 3 agent results referenced in the Phase 4 integration step
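
The "single batch, then wait for all" behavior can be illustrated with Python's `asyncio`. This is a rough sketch under stated assumptions: `run_agent` is a hypothetical stand-in for a Task call, and the `log` list exists only so the ordering can be observed. It does not describe how the skill itself issues Task calls.

```python
import asyncio

PHASE_3_AGENTS = [
    "gameplay-programmer",
    "ai-programmer",
    "technical-artist",
    "sound-designer",
]


async def run_agent(name: str, log: list[str]) -> str:
    """Hypothetical stand-in for a Task call; records start and finish."""
    log.append(f"start:{name}")
    await asyncio.sleep(0)  # yield control so the other agents can start
    log.append(f"done:{name}")
    return f"{name} output"


async def run_phase_3(log: list[str]) -> dict[str, str]:
    # All four calls are issued together; no agent receives another's output.
    outputs = await asyncio.gather(*(run_agent(n, log) for n in PHASE_3_AGENTS))
    log.append("phase-4")  # reached only after every agent has returned
    return dict(zip(PHASE_3_AGENTS, outputs))
```

In this sketch every agent is started before any of them finishes, and the Phase 4 marker is appended only after `asyncio.gather` has collected all four results.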
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Architecture Phase Engine Routing — Engine specialist receives correct context
|
||||
|
||||
**Fixture:**
|
||||
- `.claude/docs/technical-preferences.md` has Engine Specialists section populated (e.g., Primary: godot-specialist)
|
||||
- Architecture sketch produced by gameplay-programmer is available
|
||||
- Engine version pinned in `docs/engine-reference/godot/VERSION.md`
|
||||
|
||||
**Input:** `/team-combat parry and riposte system`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 2 — gameplay-programmer produces architecture sketch
|
||||
2. Skill reads `.claude/docs/technical-preferences.md` Engine Specialists section to identify the primary engine specialist agent type
|
||||
3. Engine specialist is spawned with: the architecture sketch, the GDD path, the engine version from `VERSION.md`, and explicit instructions to check for deprecated APIs
|
||||
4. Engine specialist output (idiom notes, deprecated API warnings, native system recommendations) is returned to orchestrator
|
||||
5. Orchestrator incorporates engine notes into the architecture before presenting Phase 2 results to user
|
||||
6. `AskUserQuestion` includes engine specialist's notes alongside the architecture sketch
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Engine specialist agent type is read from `.claude/docs/technical-preferences.md` — not hardcoded
|
||||
- [ ] Engine specialist prompt includes the architecture sketch and GDD path
|
||||
- [ ] Engine specialist checks for deprecated APIs against the pinned engine version
|
||||
- [ ] Engine specialist output is incorporated before Phase 3 begins (not skipped or appended separately)
|
||||
- [ ] If no engine is configured, engine specialist step is skipped and a note is added to the report
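
One way to picture "read from `technical-preferences.md`, not hardcoded" is a small lookup like the one below. It is a sketch only: the function name `primary_engine_specialist` is hypothetical, and the assumed file layout (a heading containing "Engine Specialists" followed by a `Primary: <agent>` line) is inferred from the fixture's example rather than from a documented format.

```python
import re
from typing import Optional


def primary_engine_specialist(preferences_text: str) -> Optional[str]:
    """Return the agent named on the 'Primary:' line of the Engine Specialists section.

    Assumes a markdown heading containing 'Engine Specialists' followed by a line
    such as '- Primary: godot-specialist'. Returns None when nothing is configured.
    """
    in_section = False
    for line in preferences_text.splitlines():
        if line.lstrip().startswith("#"):
            # Entering or leaving a section resets the flag.
            in_section = "engine specialists" in line.lower()
            continue
        if in_section:
            match = re.match(r"^\W*primary\W*:\s*`?([\w-]+)", line, re.IGNORECASE)
            if match:
                return match.group(1)
    return None
```

A `None` result corresponds to the last assertion above: the engine specialist step is skipped and the report notes that no engine is configured.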
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] `AskUserQuestion` used at each phase transition — user approves before pipeline advances
|
||||
- [ ] All file writes delegated to sub-agents via Task — orchestrator does not call Write or Edit directly
|
||||
- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
|
||||
- [ ] Phase 3 agents launched in parallel per skill spec
|
||||
- [ ] Partial report always produced even when agents are BLOCKED
|
||||
- [ ] Verdict is one of COMPLETE / NEEDS WORK / BLOCKED
|
||||
- [ ] Next steps present at end of output: `/code-review`, `/balance-check`, `/team-polish`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The NEEDS WORK verdict path (qa-tester finds failures in Phase 5) is not separately tested
|
||||
here; it follows the same error recovery and partial report protocol as Case 2.
|
||||
- "Retry with narrower scope" error recovery option is listed in assertions but its full
|
||||
recursive behavior (splitting via `/create-stories`) is covered by the `/create-stories` spec.
|
||||
- Phase 4 integration logic (wiring gameplay, AI, VFX, audio) is validated implicitly by
|
||||
the Happy Path case; a dedicated integration test would require fixture code files.
|
||||
- Engine specialist unavailable (no engine configured) is partially covered in Case 5
|
||||
assertions — a dedicated fixture for unconfigured engine state would strengthen coverage.
|
||||
209
CCGS Skill Testing Framework/skills/team/team-level.md
Normal file
@@ -0,0 +1,209 @@
|
||||
# Skill Test Spec: /team-level
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the full level design team for a single level or area. Coordinates
|
||||
narrative-director, world-builder, level-designer, systems-designer, art-director,
|
||||
accessibility-specialist, and qa-tester through five sequential steps with one
|
||||
parallel phase (Step 4). Compiles all team outputs into a single level design
|
||||
document saved to `design/levels/[level-name].md`. Uses `AskUserQuestion` at each
|
||||
step transition. Delegates all file writes to sub-agents. Produces a summary report
|
||||
with verdict COMPLETE / BLOCKED and handoffs to `/design-review`, `/dev-story`,
|
||||
`/qa-plan`.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase/step headings (Step 1 through Step 5 are all present)
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents, orchestrator does not write files directly
|
||||
- [ ] Has a next-step handoff at the end (references `/design-review`, `/dev-story`, `/qa-plan`)
|
||||
- [ ] Error Recovery Protocol section is present with all four recovery steps
|
||||
- [ ] Uses `AskUserQuestion` at step transitions for user approval before proceeding
|
||||
- [ ] Step 4 is explicitly marked as parallel (art-director and accessibility-specialist run simultaneously)
|
||||
- [ ] Context gathering reads: `design/gdd/game-concept.md`, `design/gdd/game-pillars.md`, `design/levels/`, `design/narrative/`, and relevant world-building docs
|
||||
- [ ] Team Composition lists all seven roles (narrative-director, world-builder, level-designer, systems-designer, art-director, accessibility-specialist, qa-tester)
|
||||
- [ ] accessibility-specialist output includes severity ratings (BLOCKING / RECOMMENDED / NICE TO HAVE)
|
||||
- [ ] Final level design document saved to `design/levels/[level-name].md`
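
The first static assertion (required frontmatter fields) could be checked along these lines. This is a simplified sketch, not the `/skill-test static` implementation: `missing_frontmatter_fields` is a hypothetical helper, and it assumes the frontmatter is a leading `---` block with one top-level `key: value` pair per line.

```python
REQUIRED_FIELDS = [
    "name",
    "description",
    "argument-hint",
    "user-invocable",
    "allowed-tools",
]


def missing_frontmatter_fields(skill_text: str) -> list[str]:
    """Return the required fields that are absent from the leading '---' block."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)  # no frontmatter at all
    present = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Only top-level keys count; indented lines are nested values.
        if ":" in line and not line.startswith((" ", "\t")):
            present.add(line.split(":", 1)[0].strip())
    return [field for field in REQUIRED_FIELDS if field not in present]
```

An empty return value means the assertion passes; any listed field is a structural failure.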
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All team members produce outputs, document compiled and saved
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/game-concept.md` exists and is populated
|
||||
- `design/gdd/game-pillars.md` exists
|
||||
- `design/levels/` directory exists (may contain other level docs)
|
||||
- `design/narrative/` directory exists with relevant narrative docs
|
||||
|
||||
**Input:** `/team-level forest dungeon`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Context gathering — orchestrator reads game-concept.md, game-pillars.md, existing level docs in `design/levels/`, narrative docs in `design/narrative/`, and world-building docs for the forest region
|
||||
2. Step 1 — narrative-director spawned: defines narrative purpose, key characters, dialogue triggers, emotional arc; world-builder spawned: provides lore context, environmental storytelling opportunities, world rules; `AskUserQuestion` confirms Step 1 outputs before Step 2
|
||||
3. Step 2 — level-designer spawned: designs spatial layout (critical path, optional paths, secrets), pacing curve, encounters, puzzles, entry/exit points and connections to adjacent areas; `AskUserQuestion` confirms layout before Step 3
|
||||
4. Step 3 — systems-designer spawned: specifies enemy compositions, loot tables, difficulty balance, area-specific mechanics, resource distribution; `AskUserQuestion` confirms systems before Step 4
|
||||
5. Step 4 — art-director and accessibility-specialist spawned in parallel; art-director: visual theme, color palette, lighting, asset list, VFX needs; accessibility-specialist: navigation clarity, colorblind safety, cognitive load check — each concern rated BLOCKING / RECOMMENDED / NICE TO HAVE; `AskUserQuestion` presents both outputs before Step 5
|
||||
6. Step 5 — qa-tester spawned: test cases for critical path, boundary/edge cases (sequence breaks, softlocks), playtest checklist, acceptance criteria
|
||||
7. Orchestrator compiles all team outputs into level design document format; the writing sub-agent asks "May I write to `design/levels/forest-dungeon.md`?"; file saved after approval
|
||||
8. Summary report: area overview, encounter count, estimated asset list, narrative beats, cross-team dependencies, verdict: COMPLETE
|
||||
9. Next steps listed: `/design-review design/levels/forest-dungeon.md`, `/dev-story`, `/qa-plan`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All five sources read during context gathering before any agent is spawned
|
||||
- [ ] narrative-director and world-builder both spawned in Step 1 (may be sequential or parallel — both must complete before Step 2)
|
||||
- [ ] `AskUserQuestion` called at each step gate (minimum: after Step 1, Step 2, Step 3, Step 4)
|
||||
- [ ] Step 4 agents (art-director, accessibility-specialist) launched simultaneously
|
||||
- [ ] All file writes delegated to sub-agents — orchestrator does not write directly
|
||||
- [ ] Level doc saved to `design/levels/forest-dungeon.md` (slugified from argument)
|
||||
- [ ] Verdict COMPLETE in final summary report
|
||||
- [ ] Next steps include `/design-review`, `/dev-story`, `/qa-plan`
|
||||
- [ ] Summary report includes: area overview, encounter count, estimated asset list, narrative beats
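
The argument-to-filename mapping asserted above (`forest dungeon` to `forest-dungeon.md`) might look like the sketch below. The helper name `level_doc_path` is hypothetical, and the handling of punctuation and repeated separators is an assumption, since this spec only pins down the simple two-word case.

```python
import re


def level_doc_path(argument: str) -> str:
    """Map a level-name argument to its document path under design/levels/."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", argument.lower()).strip("-")
    return f"design/levels/{slug}.md"
```

For example, `level_doc_path("forest dungeon")` gives `design/levels/forest-dungeon.md`, matching the path used in this case.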
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Blocked Agent (world-builder) — Partial report produced with gap noted
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/game-concept.md` exists
|
||||
- World-building docs for the forest region do NOT exist
|
||||
- world-builder agent returns BLOCKED: "No world-building docs found for the forest region — cannot provide lore context"
|
||||
|
||||
**Input:** `/team-level forest dungeon`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Context gathering completes; missing world-building docs noted
|
||||
2. Step 1 — narrative-director completes successfully; world-builder spawned and returns BLOCKED
|
||||
3. Error Recovery Protocol triggered: "world-builder: BLOCKED — no world-building docs for forest region"
|
||||
4. `AskUserQuestion` presented with options:
|
||||
- (a) Skip world-builder and note the lore gap in the level doc
|
||||
- (b) Retry with narrower scope (world-builder focuses only on what can be inferred from game-concept.md)
|
||||
- (c) Stop here and create world-building docs first
|
||||
5. If user chooses (a): pipeline continues with Steps 2–5 using narrative-director context only; level doc compiled with a clearly marked gap section: "World-building context: NOT PROVIDED — see open dependency"
|
||||
6. Final report produced: partial outputs documented, world-builder section marked BLOCKED, overall verdict: BLOCKED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] BLOCKED status is surfaced immediately when world-builder fails; Step 2 does not begin without user input
|
||||
- [ ] `AskUserQuestion` offers at minimum three options (skip / retry / stop)
|
||||
- [ ] Partial report produced — narrative-director's completed work is not discarded
|
||||
- [ ] Level doc (if compiled) contains an explicit gap notation for the missing world-building context
|
||||
- [ ] Overall verdict is BLOCKED (not COMPLETE) when world-builder remains unresolved
|
||||
- [ ] Skill does NOT silently fabricate lore content to fill the gap
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Usage guidance shown
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-level` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument provided
|
||||
2. Outputs usage message explaining the required argument (level name or area to design)
|
||||
3. Provides example invocations: `/team-level tutorial`, `/team-level forest dungeon`, `/team-level final boss arena`
|
||||
4. Skill exits without reading any project files or spawning any subagents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any subagents when no argument is given
|
||||
- [ ] Usage message includes the argument-hint format from frontmatter
|
||||
- [ ] At least one example of a valid invocation is shown
|
||||
- [ ] No GDD or level files read before failing
|
||||
- [ ] Verdict is NOT shown (pipeline never starts)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Accessibility Review Gate — Blocking concern surfaces before sign-off
|
||||
|
||||
**Fixture:**
|
||||
- Steps 1–3 complete successfully
|
||||
- `design/accessibility-requirements.md` committed tier: Enhanced
|
||||
- accessibility-specialist (Step 4, parallel) flags a BLOCKING concern: the critical path through the forest dungeon requires players to distinguish between two environmental hazards (toxic pools vs. shallow water) using color alone — no shape, icon, or audio cue differentiates them
|
||||
|
||||
**Input:** `/team-level forest dungeon`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Steps 1–3 complete; Step 4 parallel phase begins
|
||||
2. accessibility-specialist returns: BLOCKING concern — "Critical path hazard distinction relies on color only (toxic pools vs. shallow water). Shape, icon, or audio cue required per Enhanced accessibility tier."
|
||||
3. art-director returns Step 4 output (complete)
|
||||
4. Skill presents both Step 4 results via `AskUserQuestion` — BLOCKING concern highlighted prominently
|
||||
5. `AskUserQuestion` offers:
|
||||
- (a) Return to level-designer + art-director to redesign hazard visual/audio language before Step 5
|
||||
- (b) Document as a known accessibility gap and proceed to Step 5 with the concern logged
|
||||
6. Skill does NOT silently proceed past the BLOCKING concern
|
||||
7. If user chooses (a): level-designer and art-director revision spawned; re-run Step 4 accessibility check
|
||||
8. Final report includes BLOCKING concern and its resolution status regardless of user choice
|
||||
|
||||
**Assertions:**
|
||||
- [ ] BLOCKING accessibility concern is not treated as advisory — it is surfaced as a blocker
|
||||
- [ ] `AskUserQuestion` presents the specific concern text (not just "accessibility issue found")
|
||||
- [ ] Step 5 (qa-tester) does NOT begin without user acknowledging the BLOCKING concern
|
||||
- [ ] Revision path offered: level-designer + art-director can be sent back before proceeding
|
||||
- [ ] Final report includes the accessibility concern and its resolution status
|
||||
- [ ] art-director's completed output is NOT discarded when accessibility-specialist blocks
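
The gating rule in this case can be expressed as a small predicate. The sketch below is illustrative only: `Concern`, `blocking_concerns`, and `may_start_step_5` are hypothetical names, and the severity strings are the three ratings this spec lists.

```python
from __future__ import annotations

from dataclasses import dataclass

SEVERITIES = ("BLOCKING", "RECOMMENDED", "NICE TO HAVE")


@dataclass
class Concern:
    severity: str  # one of SEVERITIES
    text: str      # the specific concern text shown to the user


def blocking_concerns(concerns: list[Concern]) -> list[Concern]:
    """Return only the concerns rated BLOCKING."""
    return [c for c in concerns if c.severity == "BLOCKING"]


def may_start_step_5(concerns: list[Concern], acknowledged: bool) -> bool:
    """Step 5 may start if nothing is BLOCKING, or the user has acknowledged it."""
    return not blocking_concerns(concerns) or acknowledged
```

RECOMMENDED and NICE TO HAVE concerns never stop the pipeline in this sketch; only a BLOCKING concern without user acknowledgment does.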
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Circular Level Reference — Adjacent area dependency flagged
|
||||
|
||||
**Fixture:**
|
||||
- Steps 1–3 in progress
|
||||
- level-designer (Step 2) produces a layout that specifies entry/exit points connecting to "the crystal caves" (an adjacent area)
|
||||
- `design/levels/crystal-caves.md` does NOT exist — the crystal caves area has not been designed yet
|
||||
|
||||
**Input:** `/team-level forest dungeon`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Step 2 — level-designer produces layout including: "West exit connects to crystal-caves entry point A"
|
||||
2. Orchestrator (or level-designer subagent) checks `design/levels/` for `crystal-caves.md`; file not found
|
||||
3. Dependency gap surfaced: "Level references crystal-caves as an adjacent area but `design/levels/crystal-caves.md` does not exist"
|
||||
4. `AskUserQuestion` presented with options:
|
||||
- (a) Proceed with a placeholder reference — note the dependency in the level doc as UNRESOLVED
|
||||
- (b) Pause and run `/team-level crystal caves` first to establish that area
|
||||
5. Skill does NOT invent crystal caves content to satisfy the reference
|
||||
6. If user chooses (a): level doc compiled with the west exit marked "→ crystal-caves (UNRESOLVED — area not yet designed)"; flagged in the open dependencies section of the summary report
|
||||
7. Final report includes open cross-level dependencies section
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects the missing adjacent area by checking `design/levels/` — does not assume it will be created later
|
||||
- [ ] Skill does NOT fabricate crystal caves content (lore, layout, connections) to resolve the reference
|
||||
- [ ] `AskUserQuestion` offers a "design crystal caves first" option referencing `/team-level`
|
||||
- [ ] If user proceeds with placeholder, level doc explicitly marks the west exit as UNRESOLVED
|
||||
- [ ] Summary report includes an open cross-level dependencies section listing unresolved references
|
||||
- [ ] Circular or forward references do not cause the skill to loop or crash
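
Detecting the missing adjacent area amounts to a file-existence check against `design/levels/`. A minimal sketch, assuming adjacent areas are already available as slugs such as `crystal-caves`; the helper names `unresolved_adjacent_areas` and `exit_label` are hypothetical, and the label wording is shortened from the spec's example.

```python
from __future__ import annotations

from pathlib import Path


def unresolved_adjacent_areas(levels_dir: Path, adjacent_slugs: list[str]) -> list[str]:
    """Return the slugs that have no level doc in levels_dir."""
    return [
        slug for slug in adjacent_slugs
        if not (levels_dir / f"{slug}.md").is_file()
    ]


def exit_label(slug: str, unresolved: list[str]) -> str:
    """Label an exit, marking it UNRESOLVED when the target area has no doc."""
    if slug in unresolved:
        return f"→ {slug} (UNRESOLVED)"
    return f"→ {slug}"
```

Because the check is a single pass over the referenced slugs and never follows references recursively, a forward or circular reference cannot make it loop.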
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] `AskUserQuestion` used at each step transition — user approves before pipeline advances
|
||||
- [ ] All file writes delegated to sub-agents via Task — orchestrator does not call Write or Edit directly
|
||||
- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
|
||||
- [ ] Step 4 agents (art-director, accessibility-specialist) launched in parallel per skill spec
|
||||
- [ ] Partial report always produced even when agents are BLOCKED
|
||||
- [ ] Accessibility BLOCKING concerns surface before sign-off and require explicit user acknowledgment
|
||||
- [ ] Verdict is one of COMPLETE / BLOCKED
|
||||
- [ ] Next steps present at end: `/design-review`, `/dev-story`, `/qa-plan`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- narrative-director and world-builder in Step 1 may be sequential or parallel — the skill spec
|
||||
spawns both but does not mandate simultaneous launch; coverage of parallel Step 1 would require
|
||||
an explicit timing assertion fixture.
|
||||
- Case 2 offers the "Retry with narrower scope" option, but the retry path itself is not
exercised in depth; its full path is analogous to the blocked-agent pattern in this spec
and in the other team-* specs.
|
||||
- systems-designer (Step 3) block scenarios are not separately tested; the same Error Recovery
|
||||
Protocol applies and the pattern is validated by Case 2.
|
||||
- Step 4 parallel ordering (art-director completing before or after accessibility-specialist)
|
||||
does not affect outcomes — both must return before Step 5 regardless of order.
|
||||
- The level doc slug convention (argument → filename) is implicitly tested by Case 1
|
||||
(`forest dungeon` → `forest-dungeon.md`); multi-word slugification edge cases (special
|
||||
characters, very long names) are not covered.
|
||||
178
CCGS Skill Testing Framework/skills/team/team-live-ops.md
Normal file
@@ -0,0 +1,178 @@
|
||||
# Skill Test Spec: /team-live-ops
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the live-ops team through a 7-phase planning pipeline to produce a
|
||||
season or event plan. Coordinates live-ops-designer, economy-designer,
|
||||
analytics-engineer, community-manager, narrative-director, and writer. Phases 3
|
||||
and 4 (economy design and analytics) run simultaneously. Ends with a consolidated
|
||||
season plan requiring user approval before handoff to production.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains "May I write" language in the File Write Protocol section (delegated to sub-agents)
|
||||
- [ ] Has a File Write Protocol section stating that the orchestrator does not write files directly
|
||||
- [ ] Has a next-step handoff at the end referencing `/design-review`, `/sprint-plan`, and `/team-release`
|
||||
- [ ] Uses `AskUserQuestion` at phase transitions to capture user approval before proceeding
|
||||
- [ ] States explicitly that Phases 3 and 4 can run simultaneously (parallel spawning)
|
||||
- [ ] Error recovery section present (or implied through BLOCKED handling)
|
||||
- [ ] Output documents section specifies paths under `design/live-ops/seasons/`
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All 7 phases complete, season plan produced
|
||||
|
||||
**Fixture:**
|
||||
- `design/live-ops/economy-rules.md` exists with current economy configuration
|
||||
- `design/live-ops/ethics-policy.md` exists with the project ethics policy
|
||||
- Game concept document exists at its standard path
|
||||
- No existing season documents for the new season name being planned
|
||||
|
||||
**Input:** `/team-live-ops "Season 2: The Frozen Wastes"`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: Spawns `live-ops-designer` via Task; receives season brief with scope, content list, and retention mechanic; presents to user
|
||||
2. AskUserQuestion: user approves Phase 1 output before Phase 2 begins
|
||||
3. Phase 2: Spawns `narrative-director` via Task; reads the Phase 1 season brief; produces narrative framing document (theme, story hook, lore connections); presents to user
|
||||
4. Phase 3 and 4 (parallel): Spawns `economy-designer` and `analytics-engineer` simultaneously via two Task calls before waiting for either result; economy-designer reads `design/live-ops/economy-rules.md`
|
||||
5. Phase 5: Spawns `narrative-director` and `writer` in parallel to produce in-game narrative text and player-facing copy; both read Phase 2 narrative framing doc
|
||||
6. Phase 6: Spawns `community-manager` via Task; reads season brief, economy design, and narrative framing; produces communication calendar with draft copy
|
||||
7. Phase 7: Collects all phase outputs; presents consolidated season plan summary including economy health check, analytics readiness, ethics review, and open questions
|
||||
8. AskUserQuestion: user approves the full season plan
|
||||
9. Sub-agents ask "May I write to [path]?" for each output document (`design/live-ops/seasons/S2_The_Frozen_Wastes.md`, `...analytics.md`, and `...comms.md`) before writing
|
||||
10. Verdict: COMPLETE — season plan produced and handed off for production
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 7 phases execute in order; Phase 3 and 4 are issued as parallel Task calls
|
||||
- [ ] Phase 7 consolidated summary includes all six sections (season brief, narrative framing, economy design, analytics plan, content inventory, communication calendar)
|
||||
- [ ] Ethics review section in Phase 7 explicitly references `design/live-ops/ethics-policy.md`
|
||||
- [ ] Three output documents written to `design/live-ops/seasons/` with correct naming convention
|
||||
- [ ] File writes are delegated to sub-agents — orchestrator does not write directly
|
||||
- [ ] Verdict: COMPLETE appears in final output
|
||||
- [ ] Next steps reference `/design-review`, `/sprint-plan`, and `/team-release`
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Ethics Violation Found — Reward element violates ethics policy
|
||||
|
||||
**Fixture:**
|
||||
- All standard live-ops fixtures present (economy-rules.md, ethics-policy.md)
|
||||
- `design/live-ops/ethics-policy.md` explicitly prohibits loot boxes targeting players under 18
|
||||
- economy-designer (Phase 3) proposes a "Mystery Chest" mechanic with randomized premium rewards and no pity timer
|
||||
|
||||
**Input:** `/team-live-ops "Season 3: Shadow Tournament"`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phases 1–4 proceed normally; economy-designer proposes Mystery Chest mechanic
|
||||
2. Phase 7: Orchestrator reviews Phase 3 output against ethics policy; identifies Mystery Chest as a violation of the "no non-transparent random premium rewards" rule in the ethics policy
|
||||
3. Ethics review section of the Phase 7 summary flags the violation explicitly: "ETHICS FLAG: Mystery Chest mechanic in Phase 3 economy design violates [policy rule]. Approval is blocked until this is resolved."
|
||||
4. AskUserQuestion presented with resolution options before season plan approval is offered
|
||||
5. Skill does NOT issue a COMPLETE verdict or write output documents until the ethics violation is resolved or explicitly waived by the user
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Phase 7 ethics review section explicitly names the violating element and the policy rule it breaks
|
||||
- [ ] Skill does not auto-approve the season plan when an ethics violation is present
|
||||
- [ ] AskUserQuestion is used to surface the violation and offer resolution options (revise economy design, override with documented rationale, cancel)
|
||||
- [ ] Output documents are NOT written while the violation is unresolved
|
||||
- [ ] If user chooses to revise: skill re-spawns economy-designer to produce a corrected design before returning to Phase 7 review
|
||||
- [ ] Verdict: COMPLETE is only issued after the ethics flag is cleared
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Usage guidance shown
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-live-ops` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: No argument detected
|
||||
2. Outputs: "Usage: `/team-live-ops [season name or event description]` — Provide the name or description of the season or live event to plan."
|
||||
3. Skill exits immediately without spawning any subagents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT guess a season name or fabricate a scope
|
||||
- [ ] Error message includes the correct usage format with the argument-hint
|
||||
- [ ] No Task calls are issued before the argument check fails
|
||||
- [ ] No files are read or written
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Parallel Phase Validation — Phases 3 and 4 run simultaneously
|
||||
|
||||
**Fixture:**
|
||||
- All standard live-ops fixtures present
|
||||
- Phase 1 (season brief) and Phase 2 (narrative framing) already approved
|
||||
- Phase 3 (economy-designer) and Phase 4 (analytics-engineer) inputs are independent of each other
|
||||
|
||||
**Input:** `/team-live-ops "Season 1: The First Thaw"` (observed at Phase 3/4 transition)
|
||||
|
||||
**Expected behavior:**
|
||||
1. After Phase 2 is approved by the user, the orchestrator issues both Task calls (economy-designer and analytics-engineer) before awaiting either result
|
||||
2. Both agents receive the season brief as context; analytics-engineer does NOT wait for economy-designer output to begin
|
||||
3. Economy-designer output and analytics-engineer output are collected together before Phase 5 begins
|
||||
4. If one of the two parallel agents blocks, the other continues; a partial result is reported
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Both Task calls for Phase 3 and Phase 4 are issued before either result is awaited — they are not sequential
|
||||
- [ ] Analytics-engineer prompt does NOT include economy-designer output as a required input (the inputs are independent)
|
||||
- [ ] If economy-designer blocks but analytics-engineer succeeds, analytics output is preserved and the block is surfaced via AskUserQuestion
|
||||
- [ ] Phase 5 does not begin until BOTH Phase 3 and Phase 4 results are collected
|
||||
- [ ] Skill documentation explicitly states "Phases 3 and 4 can run simultaneously"
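
The "one agent blocks, the other's output is preserved" behavior can be sketched with `asyncio.gather(..., return_exceptions=True)`. Everything here is illustrative: `AgentBlocked`, the two agent coroutines, and `run_phases_3_and_4` are hypothetical stand-ins for Task calls, and the block reason is made up for the example.

```python
import asyncio


class AgentBlocked(Exception):
    """Hypothetical signal that a sub-agent returned BLOCKED."""


async def economy_designer(brief: str) -> str:
    # Simulated block; the reason text is an example only.
    raise AgentBlocked("economy-rules.md not found")


async def analytics_engineer(brief: str) -> str:
    return f"analytics plan for {brief}"


async def run_phases_3_and_4(brief: str):
    # Both agents receive only the season brief; neither takes the other's output.
    results = await asyncio.gather(
        economy_designer(brief),
        analytics_engineer(brief),
        return_exceptions=True,  # a blocked agent must not discard the other's result
    )
    names = ("economy-designer", "analytics-engineer")
    completed = {n: r for n, r in zip(names, results) if not isinstance(r, Exception)}
    blocked = {n: str(r) for n, r in zip(names, results) if isinstance(r, Exception)}
    return completed, blocked
```

The `blocked` mapping is what would be surfaced to the user, while `completed` carries the preserved analytics output into the partial report.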
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Missing Ethics Policy — `design/live-ops/ethics-policy.md` does not exist
|
||||
|
||||
**Fixture:**
|
||||
- `design/live-ops/economy-rules.md` exists
|
||||
- `design/live-ops/ethics-policy.md` does NOT exist
|
||||
- All other fixtures are present
|
||||
|
||||
**Input:** `/team-live-ops "Season 4: Desert Heat"`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phases 1–4 proceed; economy-designer and analytics-engineer are given the ethics policy path, but the file does not exist
|
||||
2. Phase 7: Orchestrator attempts to run ethics review; detects that `design/live-ops/ethics-policy.md` is missing
|
||||
3. Phase 7 summary includes a gap flag: "ETHICS REVIEW SKIPPED: `design/live-ops/ethics-policy.md` not found. Economy design was not reviewed against an ethics policy. Recommend creating one before production begins."
|
||||
4. Skill still completes the season plan and reaches COMPLETE verdict, but the gap is prominently flagged in the output and in the season design document
|
||||
5. Next steps include a recommendation to create the ethics policy document
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT error out when the ethics policy file is missing
|
||||
- [ ] Skill does NOT fabricate ethics policy rules in the absence of the file
|
||||
- [ ] Phase 7 summary explicitly notes that ethics review was skipped and why
|
||||
- [ ] Verdict: COMPLETE is still reachable despite the missing file
|
||||
- [ ] Gap flag appears in the season design output document (not just in conversation)
|
||||
- [ ] Next steps recommend creating `design/live-ops/ethics-policy.md`
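
Taken together with Case 2, the Phase 7 ethics review has three outcomes: skipped (no policy file), flagged (violation found), or passed. A hedged sketch of that decision follows; `ethics_review` and its return shape are hypothetical, and `policy_text=None` is used here to represent a missing `design/live-ops/ethics-policy.md`.

```python
from __future__ import annotations

from typing import Optional

ETHICS_POLICY_PATH = "design/live-ops/ethics-policy.md"


def ethics_review(policy_text: Optional[str], violations: list[str]) -> dict:
    """Summarize the Phase 7 ethics review.

    policy_text is None when the policy file does not exist. No rules are
    invented in that case; the review is reported as skipped.
    """
    if policy_text is None:
        return {
            "status": "SKIPPED",
            "blocks_approval": False,  # COMPLETE stays reachable
            "note": f"ETHICS REVIEW SKIPPED: `{ETHICS_POLICY_PATH}` not found.",
        }
    if violations:
        return {
            "status": "FLAGGED",
            "blocks_approval": True,  # approval waits until resolved or waived
            "note": "ETHICS FLAG: " + "; ".join(violations),
        }
    return {"status": "PASSED", "blocks_approval": False, "note": "Ethics review passed."}
```

The `note` string is what would be carried into both the conversation output and the season design document, so the gap stays visible after the session ends.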
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] `AskUserQuestion` used at every phase transition — user approves before the next phase begins
|
||||
- [ ] Phases 3 and 4 are always spawned in parallel, not sequentially
|
||||
- [ ] File Write Protocol: orchestrator never calls Write/Edit directly — all writes are delegated to sub-agents
|
||||
- [ ] Each output document gets its own "May I write to [path]?" ask from the relevant sub-agent
|
||||
- [ ] Ethics review in Phase 7 always references the ethics policy file path explicitly
|
||||
- [ ] Error recovery: any BLOCKED agent is surfaced immediately with AskUserQuestion options (skip / retry / stop)
|
||||
- [ ] Partial reports are produced if any phase blocks — work is never discarded
|
||||
- [ ] Verdict: COMPLETE only after user approves the consolidated season plan; BLOCKED if any unresolved ethics violation exists
|
||||
- [ ] Next steps always include `/design-review`, `/sprint-plan`, and `/team-release`
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Phase 5 parallel spawning (narrative-director + writer) follows the same pattern as Phases 3/4 but is not separately tested here — it uses the same parallel Task protocol validated in Case 4.
|
||||
- The "economy-rules.md absent" edge case is not separately tested — it would surface as a BLOCKED result from economy-designer and follow the standard error recovery path tested implicitly in Case 4.
|
||||
- The full content writing pipeline (Phase 5 output validation) is validated implicitly by the Case 1 happy path consolidated summary check.
|
||||
- Community manager communication calendar format (pre-launch, launch day, mid-season, final week) is validated implicitly by Case 1; no separate edge case is needed.
|
||||
209
CCGS Skill Testing Framework/skills/team/team-narrative.md
Normal file
@@ -0,0 +1,209 @@
|
||||
# Skill Test Spec: /team-narrative
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the narrative team through a five-phase pipeline: narrative direction
(narrative-director) → world foundation + dialogue drafting (world-builder and writer
in parallel) → level narrative integration (level-designer) → consistency review
(narrative-director) → polish + localization compliance (writer, localization-lead,
and world-builder in parallel). Uses `AskUserQuestion` at each phase transition to
present proposals as selectable options. Produces a narrative summary report and
delivers narrative documents via sub-agents that each enforce the "May I write?"
protocol. Verdict is COMPLETE when all phases succeed, or BLOCKED when a dependency
is unresolved.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains "File Write Protocol" section
|
||||
- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
|
||||
- [ ] Sub-agents enforce "May I write to [path]?" before any write
|
||||
- [ ] Has a next-step handoff at the end (references `/design-review`, `/localize extract`, `/dev-story`)
|
||||
- [ ] Error Recovery Protocol section is present
|
||||
- [ ] `AskUserQuestion` is used at phase transitions before proceeding
|
||||
- [ ] Phase 2 explicitly spawns world-builder and writer in parallel
|
||||
- [ ] Phase 5 explicitly spawns writer, localization-lead, and world-builder in parallel
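
For orientation, a few of the structural checks above could be approximated like this. It is a rough sketch under stated assumptions, not how `/skill-test static` actually works: `static_check` is a hypothetical helper, frontmatter is assumed to be a leading block fenced by `---` lines, and phase headings are assumed to look like `## Phase 1: ...`.

```python
import re

REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def static_check(skill_text: str, verdicts: list) -> dict:
    """Check frontmatter fields, phase heading count, and verdict keywords."""
    # Assumption: frontmatter is the first block, delimited by '---' lines.
    match = re.match(r"---\n(.*?)\n---\n", skill_text, re.DOTALL)
    frontmatter = match.group(1) if match else ""
    keys = {
        line.split(":", 1)[0].strip()
        for line in frontmatter.splitlines()
        if ":" in line
    }
    # Assumption: phase headings are level-2 or level-3 'Phase N' headings.
    phase_headings = re.findall(r"^#{2,3}\s+Phase\s+\d+", skill_text, re.MULTILINE)
    return {
        "missing_fields": [f for f in REQUIRED_FIELDS if f not in keys],
        "phase_headings": len(phase_headings),
        "missing_verdicts": [v for v in verdicts if v not in skill_text],
    }
```

A spec passes these three checks when `missing_fields` and `missing_verdicts` are empty and `phase_headings` is at least 2.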
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All five phases complete, narrative doc delivered
|
||||
|
||||
**Fixture:**
|
||||
- A game concept and GDD exist for the target feature (e.g., `design/gdd/faction-intro.md`)
|
||||
- Character voice profiles exist (e.g., `design/narrative/characters/`)
|
||||
- Existing lore entries exist for cross-reference (e.g., `design/narrative/lore/`)
|
||||
- No lore contradictions exist between existing entries and the new content
|
||||
|
||||
**Input:** `/team-narrative faction introduction cutscene for the Ironveil faction`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: narrative-director is spawned; outputs a narrative brief defining the story beat, characters involved, emotional tone, and lore dependencies
|
||||
2. `AskUserQuestion` presents the narrative brief; user approves before Phase 2 begins
|
||||
3. Phase 2: world-builder and writer are spawned in parallel; world-builder produces lore entries for the Ironveil faction; writer drafts dialogue lines using character voice profiles
|
||||
4. `AskUserQuestion` presents world foundation and dialogue drafts; user approves before Phase 3 begins
|
||||
5. Phase 3: level-designer is spawned; produces environmental storytelling layout, trigger placement, and pacing plan
|
||||
6. `AskUserQuestion` presents level narrative plan; user approves before Phase 4 begins
|
||||
7. Phase 4: narrative-director reviews all dialogue against voice profiles, verifies lore consistency, confirms pacing; approves or flags issues
|
||||
8. `AskUserQuestion` presents review results; user approves before Phase 5 begins
|
||||
9. Phase 5: writer, localization-lead, and world-builder are spawned in parallel; writer performs final self-review; localization-lead validates i18n compliance; world-builder finalizes canon levels
|
||||
10. Final summary report is presented; sub-agent asks "May I write the narrative document to [path]?" before writing
|
||||
11. Verdict: COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] narrative-director is spawned in Phase 1 before any other agents
|
||||
- [ ] `AskUserQuestion` appears after Phase 1 output and before Phase 2 launch
|
||||
- [ ] world-builder and writer Task calls are issued simultaneously in Phase 2 (not sequentially)
|
||||
- [ ] level-designer is not launched until Phase 2 `AskUserQuestion` is approved
|
||||
- [ ] narrative-director is re-spawned in Phase 4 for consistency review
|
||||
- [ ] Phase 5 spawns all three agents (writer, localization-lead, world-builder) simultaneously
|
||||
- [ ] Summary report includes: narrative brief status, lore entries created/updated, dialogue lines written, level narrative integration points, consistency review results
|
||||
- [ ] No files are written by the orchestrator directly
|
||||
- [ ] Verdict is COMPLETE after delivery
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Lore Contradiction Found — world-builder finds conflict before writer proceeds
|
||||
|
||||
**Fixture:**
|
||||
- Existing lore entry at `design/narrative/lore/ironveil-history.md` states the Ironveil faction was founded 200 years ago
|
||||
- The new narrative brief (from Phase 1) states the Ironveil were founded 50 years ago
|
||||
- The writer has been spawned in parallel with the world-builder in Phase 2
|
||||
|
||||
**Input:** `/team-narrative ironveil faction introduction cutscene`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phases 1–2 begin normally
|
||||
2. Phase 2 world-builder detects a factual contradiction between the narrative brief and existing lore: founding date conflict
|
||||
3. world-builder returns BLOCKED with reason: "Lore contradiction found — founding date conflicts with `design/narrative/lore/ironveil-history.md`"
|
||||
4. Orchestrator surfaces the contradiction immediately: "world-builder: BLOCKED — Lore contradiction: founding date in narrative brief (50 years ago) conflicts with existing canon (200 years ago in `ironveil-history.md`)"
|
||||
5. Orchestrator assesses dependency: the writer's dialogue depends on canon lore — the writer's draft cannot be finalized without resolving the contradiction
|
||||
6. `AskUserQuestion` presents options:
   - Revise the narrative brief to match existing canon (200 years ago)
   - Update the existing lore entry to reflect the new canon (50 years ago)
   - Stop here and resolve the contradiction in the lore docs first
|
||||
7. Writer output is preserved but flagged as pending canon resolution — work is not discarded
|
||||
8. Orchestrator does NOT proceed to Phase 3 until the contradiction is resolved or user explicitly chooses to skip
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Contradiction is surfaced before Phase 3 begins
|
||||
- [ ] Orchestrator does not silently resolve the contradiction by picking one version
|
||||
- [ ] `AskUserQuestion` presents at least 3 options including "stop and resolve first"
|
||||
- [ ] Writer's draft output is preserved in the partial report, not discarded
|
||||
- [ ] Phase 3 (level-designer) is not launched until the user resolves the contradiction
|
||||
- [ ] Verdict is BLOCKED (not COMPLETE) if the user stops to resolve the contradiction
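
The contradiction handling in Case 2 amounts to comparing facts and refusing to pick a winner. A minimal sketch, assuming lore and brief facts have already been extracted into plain dictionaries (the extraction step and the key name `ironveil_founded_years_ago` are hypothetical):

```python
def find_contradictions(brief: dict, canon: dict) -> list:
    """List (fact, brief_value, canon_value) where the brief disagrees with canon."""
    conflicts = []
    for fact, canon_value in canon.items():
        if fact in brief and brief[fact] != canon_value:
            conflicts.append((fact, brief[fact], canon_value))
    return conflicts


def phase2_status(brief: dict, canon: dict) -> tuple:
    """Return BLOCKED with the conflicts, or OK; never resolve a conflict silently."""
    conflicts = find_contradictions(brief, canon)
    return ("BLOCKED", conflicts) if conflicts else ("OK", [])
```

With the fixture values, `phase2_status({"ironveil_founded_years_ago": 50}, {"ironveil_founded_years_ago": 200})` reports BLOCKED and carries both numbers, which is what the orchestrator needs in order to present the three options to the user.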
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Usage guidance shown
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-narrative` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument is provided
|
||||
2. Outputs usage guidance: e.g., "Usage: `/team-narrative [narrative content description]` — describe the story content, scene, or narrative area to work on (e.g., `boss encounter cutscene`, `faction intro dialogue`, `tutorial narrative`)"
|
||||
3. Skill exits without spawning any agents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any agents when no argument is provided
|
||||
- [ ] Usage message includes the correct invocation format with an argument example
|
||||
- [ ] Skill does NOT attempt to guess or infer a narrative topic from project files
|
||||
- [ ] No `AskUserQuestion` is used — output is direct guidance
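
The no-argument guard is simple enough to sketch directly. `handle_invocation` is a hypothetical name, and the usage string is shortened from the example in the expected behavior above:

```python
USAGE = "Usage: `/team-narrative [narrative content description]`"


def handle_invocation(argument: str) -> tuple:
    """Return (message, agents_to_spawn) for the raw argument string."""
    topic = argument.strip()
    if not topic:
        # Direct guidance only: no agents, no AskUserQuestion, no guessing.
        return USAGE, []
    return f"Starting Phase 1 for: {topic}", ["narrative-director"]
```

Whitespace-only input is treated the same as no input here; that is a design assumption of the sketch, not something the spec states.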
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Localization Compliance — localization-lead flags a non-translatable string
|
||||
|
||||
**Fixture:**
|
||||
- Phases 1–4 complete successfully
|
||||
- Phase 5 begins; writer and world-builder complete without issues
|
||||
- localization-lead finds a dialogue line that uses a hardcoded formatted date string (e.g., `"On March 12th, Year 3"`) that cannot survive locale-specific translation without a locale-aware formatter
|
||||
|
||||
**Input:** `/team-narrative ironveil faction introduction cutscene` (Phase 5 scenario)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 5 spawns writer, localization-lead, and world-builder in parallel
|
||||
2. localization-lead completes its review and flags: "String key `dialogue.ironveil.intro.003` contains a hardcoded date format (`March 12th, Year 3`) that will not localize correctly — requires a locale-aware date placeholder"
|
||||
3. Orchestrator surfaces the localization blocker in the summary report
|
||||
4. The localization issue is labeled as BLOCKING in the final report (not advisory)
|
||||
5. `AskUserQuestion` presents options:
   - Fix the string now (writer revises the line)
   - Note the gap and deliver the narrative doc with the issue flagged
   - Stop and resolve before finalizing
|
||||
6. If the user chooses to proceed with the issue flagged, verdict is COMPLETE with noted localization debt; if user stops, verdict is BLOCKED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] localization-lead is spawned in Phase 5 simultaneously with writer and world-builder
|
||||
- [ ] Hardcoded date format is identified as a localization blocker (not silently passed)
|
||||
- [ ] The specific string key and reason are included in the issue report
|
||||
- [ ] `AskUserQuestion` offers the option to fix now vs. flag and proceed
|
||||
- [ ] Verdict notes the localization debt if the user proceeds without fixing
|
||||
- [ ] Skill does NOT automatically rewrite the offending line without user approval
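
One way a hardcoded date like the one in this fixture might be detected is a pattern scan over the string table. The sketch below is illustrative only: the pattern covers English month names with an optional ordinal suffix, `find_hardcoded_dates` is a hypothetical helper, and a real localization review would need broader rules.

```python
import re

# Illustrative pattern: English month name, then a day with optional ordinal suffix.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
HARDCODED_DATE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}}(?:st|nd|rd|th)?\b")


def find_hardcoded_dates(strings: dict) -> list:
    """Return (string_key, matched_text) for each entry containing a literal date."""
    issues = []
    for key, text in strings.items():
        match = HARDCODED_DATE.search(text)
        if match:
            issues.append((key, match.group(0)))
    return issues
```

The function only reports; it does not rewrite the offending line, in keeping with the last assertion above.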
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Writer Blocked — Missing character voice profiles
|
||||
|
||||
**Fixture:**
|
||||
- Phase 1 narrative-director produces a narrative brief referencing two characters: Commander Varek and Advisor Selene
|
||||
- No character voice profiles exist in `design/narrative/characters/` for either character
|
||||
- Phase 2 begins; world-builder proceeds normally
|
||||
|
||||
**Input:** `/team-narrative ironveil surrender negotiation scene`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1 completes; narrative brief lists Commander Varek and Advisor Selene as characters
|
||||
2. Phase 2: writer is spawned in parallel with world-builder
|
||||
3. writer returns BLOCKED: "Cannot produce dialogue — no voice profiles found for Commander Varek or Advisor Selene in `design/narrative/characters/`. Voice profiles required to match character tone and speech patterns."
|
||||
4. Orchestrator surfaces the blocker immediately: "writer: BLOCKED — Missing prerequisite: character voice profiles for Commander Varek and Advisor Selene"
|
||||
5. world-builder output is preserved; partial report is produced with lore entries
|
||||
6. `AskUserQuestion` presents options:
   - Create voice profiles first (redirects to the narrative-director or design workflow)
   - Provide minimal voice direction inline and retry the writer with that context
   - Stop here and create voice profiles before proceeding
|
||||
7. Orchestrator does NOT proceed to Phase 3 (level-designer) without writer output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Writer block is surfaced before Phase 3 begins
|
||||
- [ ] world-builder's completed lore output is preserved in the partial report
|
||||
- [ ] Missing prerequisite (voice profiles) is named specifically (character names and expected file path)
|
||||
- [ ] `AskUserQuestion` offers at least one option to resolve the missing prerequisite
|
||||
- [ ] Orchestrator does not fabricate voice profiles or invent character voices
|
||||
- [ ] Phase 3 is not launched while writer is BLOCKED without explicit user authorization
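
The prerequisite check behind the writer's BLOCKED result could look roughly like this. The file-naming rule (`Commander Varek` mapping to `commander-varek.md`) is an assumption made for the example; the spec only names the directory.

```python
from pathlib import Path


def slug(name: str) -> str:
    # Hypothetical naming rule: "Commander Varek" -> "commander-varek.md"
    return name.lower().replace(" ", "-") + ".md"


def missing_profiles(characters: list, profile_dir: str) -> list:
    """Return the characters that have no voice profile file in profile_dir."""
    directory = Path(profile_dir)
    return [c for c in characters if not (directory / slug(c)).is_file()]


def writer_status(characters: list, profile_dir: str) -> str:
    """Name every missing character and the expected path; invent nothing."""
    missing = missing_profiles(characters, profile_dir)
    if missing:
        return (
            "BLOCKED: no voice profiles found for "
            + ", ".join(missing)
            + f" in `{profile_dir}`"
        )
    return "OK"
```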
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] `AskUserQuestion` is used after every phase output before the next phase launches
|
||||
- [ ] Parallel spawning: Phase 2 (world-builder + writer) and Phase 5 (writer + localization-lead + world-builder) issue all Task calls before waiting for results
|
||||
- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
|
||||
- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
|
||||
- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
|
||||
- [ ] A partial report is always produced when some agents complete and others block
|
||||
- [ ] Verdict is exactly COMPLETE or BLOCKED — no other verdict values used
|
||||
- [ ] Next Steps handoff references `/design-review`, `/localize extract`, and `/dev-story`
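
The "issue all Task calls before waiting for results" requirement can be illustrated with `asyncio`. This is only an analogy for the spawn order: `run_agent` is a hypothetical stand-in for a Task call, and the real orchestrator does not necessarily use `asyncio`.

```python
import asyncio


async def run_agent(name: str, log: list) -> str:
    # Hypothetical stand-in for a Task call to a sub-agent.
    log.append(f"start:{name}")
    await asyncio.sleep(0)  # yield control so the other agents can start
    log.append(f"done:{name}")
    return f"{name}: ok"


async def run_phase(agents: list, log: list) -> list:
    # Issue every call first, then wait for all results together.
    return await asyncio.gather(*(run_agent(a, log) for a in agents))


log = []
results = asyncio.run(run_phase(["world-builder", "writer"], log))
```

In the log, both `start:` entries appear before any `done:` entry, which is the observable difference between parallel and sequential spawning that the assertions above look for.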
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Phase 3 (level-designer) and Phase 4 (narrative-director review) happy-path behavior are
  validated implicitly by Case 1. Separate edge cases are not needed for these phases as
  their failure modes follow the standard Error Recovery Protocol.
- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
  Recovery Protocol are not separately tested; they follow the same
  `AskUserQuestion` + partial-report pattern validated in Cases 2 and 5.
- Localization concerns that are advisory (e.g., German/Finnish +30% expansion warnings)
  vs. blocking (hardcoded formats) are distinguished in Case 4; advisory-only scenarios
  follow the same pattern but do not change the verdict.
- The writer's "all lines under 120 characters" and "string keys not raw strings" checks
  in Phase 5 are covered implicitly by Case 4's localization compliance scenario.
|
||||
218
CCGS Skill Testing Framework/skills/team/team-polish.md
Normal file
@@ -0,0 +1,218 @@
|
||||
# Skill Test Spec: /team-polish
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the polish team through a six-phase pipeline: performance assessment
(performance-analyst) → optimization (performance-analyst, optionally with
engine-programmer when engine-level root causes are found) → visual polish
(technical-artist, parallel with Phase 2) → audio polish (sound-designer, parallel
with Phase 2) → hardening (qa-tester) → sign-off (orchestrator collects all results
and issues READY FOR RELEASE or NEEDS MORE WORK). Uses `AskUserQuestion` at each
phase transition. Engine-programmer is spawned conditionally only when Phase 1
identifies engine-level root causes. Verdict is READY FOR RELEASE or NEEDS MORE WORK.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: READY FOR RELEASE, NEEDS MORE WORK
|
||||
- [ ] Contains "File Write Protocol" section
|
||||
- [ ] File writes are delegated to sub-agents — orchestrator does not write files directly
|
||||
- [ ] Sub-agents enforce "May I write to [path]?" before any write
|
||||
- [ ] Has a next-step handoff at the end (references `/release-checklist`, `/sprint-plan update`, `/gate-check`)
|
||||
- [ ] Error Recovery Protocol section is present
|
||||
- [ ] `AskUserQuestion` is used at phase transitions before proceeding
|
||||
- [ ] Phase 3 (visual polish) and Phase 4 (audio polish) are explicitly run in parallel with Phase 2
|
||||
- [ ] engine-programmer is conditionally spawned in Phase 2 only when Phase 1 identifies engine-level root causes
|
||||
- [ ] Phase 6 sign-off compares metrics against budgets before issuing verdict
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Full pipeline completes, READY FOR RELEASE verdict
|
||||
|
||||
**Fixture:**
|
||||
- Feature exists and is functionally complete (e.g., `combat` system)
|
||||
- Performance budgets are defined in technical-preferences.md (e.g., target 60fps, 16ms frame budget)
|
||||
- No frame budget violations exist before polishing begins
|
||||
- No audio events are missing; VFX assets are complete
|
||||
- No regressions are introduced by polish changes
|
||||
|
||||
**Input:** `/team-polish combat`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: performance-analyst is spawned; profiles the combat system, measures frame budget, checks memory usage; output: performance report showing all metrics within budget, no violations
|
||||
2. `AskUserQuestion` presents performance report; user approves before Phases 2, 3, and 4 begin
|
||||
3. Phase 2: performance-analyst applies minor optimizations (e.g., draw call batching); no engine-programmer needed (no engine-level root causes identified)
|
||||
4. Phases 3 and 4 are launched in parallel alongside Phase 2:
   - Phase 3: technical-artist reviews VFX for quality, optimizes particle systems, adds screen shake and visual juice
   - Phase 4: sound-designer reviews audio events for completeness, checks mix levels, adds ambient audio layers
|
||||
5. All three parallel phases complete; `AskUserQuestion` presents results; user approves before Phase 5 begins
|
||||
6. Phase 5: qa-tester runs edge case tests, soak tests, stress tests, and regression tests; all pass
|
||||
7. `AskUserQuestion` presents test results; user approves before Phase 6
|
||||
8. Phase 6: orchestrator collects all results; compares before/after performance metrics against budgets; all metrics pass
|
||||
9. Sub-agent asks "May I write the polish report to `production/qa/evidence/polish-combat-[date].md`?" before writing
|
||||
10. Verdict: READY FOR RELEASE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] performance-analyst is spawned first in Phase 1 before any other agents
|
||||
- [ ] `AskUserQuestion` appears after Phase 1 output and before Phases 2/3/4 launch
|
||||
- [ ] Phases 3 and 4 Task calls are issued at the same time as Phase 2 (not after Phase 2 completes)
|
||||
- [ ] engine-programmer is NOT spawned when Phase 1 finds no engine-level root causes
|
||||
- [ ] qa-tester (Phase 5) is not launched until the parallel phases complete and user approves
|
||||
- [ ] Phase 6 verdict is based on comparison of metrics against defined budgets
|
||||
- [ ] Summary report includes: before/after performance metrics, visual polish changes, audio polish changes, test results
|
||||
- [ ] No files are written by the orchestrator directly
|
||||
- [ ] Verdict is READY FOR RELEASE
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Performance Blocker — Frame budget violation cannot be fully resolved
|
||||
|
||||
**Fixture:**
|
||||
- Feature being polished: `particle-storm` VFX system
|
||||
- Phase 1 identifies a frame budget violation: particle-storm costs 12ms on target hardware (budget is 6ms for this system)
|
||||
- Phase 2 performance-analyst applies optimizations reducing cost to 9ms — still over the 6ms budget
|
||||
- Phase 2 cannot fully resolve the violation without a fundamental design change
|
||||
|
||||
**Input:** `/team-polish particle-storm`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: performance-analyst identifies the 12ms frame cost vs. 6ms budget; reports "FRAME BUDGET VIOLATION: particle-storm costs 12ms, budget is 6ms"
|
||||
2. `AskUserQuestion` presents the violation; user chooses to proceed with optimization attempt
|
||||
3. Phase 2: performance-analyst applies optimizations; achieves 9ms — reduced but still over budget; reports "Optimization reduced cost to 9ms (was 12ms) — 3ms over budget. No further gains achievable without design changes."
|
||||
4. Phases 3 and 4 run in parallel with Phase 2 (visual and audio polish)
|
||||
5. Phase 5: qa-tester runs regression and edge case tests; all pass
|
||||
6. Phase 6: orchestrator collects results; frame budget violation (9ms vs 6ms budget) remains unresolved
|
||||
7. Verdict: NEEDS MORE WORK
|
||||
8. Report lists the specific unresolved issue: "particle-storm frame cost (9ms) exceeds budget (6ms) by 3ms — requires design scope reduction or budget renegotiation"
|
||||
9. Next Steps: schedule the remaining issue in `/sprint-plan update`; re-run `/team-polish` after fix
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Frame budget violation is flagged in Phase 1 with specific numbers (actual vs. budget)
|
||||
- [ ] Phase 2 reports the post-optimization metric explicitly (9ms achieved, 3ms still over)
|
||||
- [ ] Verdict is NEEDS MORE WORK (not READY FOR RELEASE) when a budget violation remains
|
||||
- [ ] The specific unresolved issue is listed by name with the remaining gap quantified
|
||||
- [ ] Next Steps references `/sprint-plan update` for scheduling the remaining fix
|
||||
- [ ] Phases 3 and 4 still run (polish work is not abandoned due to a Phase 2 partial resolution)
|
||||
- [ ] Phase 5 qa-tester still runs (regression testing is independent of the performance outcome)
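
The arithmetic in Case 2 (12ms before, 9ms after, 6ms budget, 3ms still over) and the resulting verdict can be written out as a small sketch. `Metric` and `sign_off` are hypothetical names; only the numbers and the verdict strings come from the case above.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    before_ms: float
    after_ms: float
    budget_ms: float

    @property
    def over_budget_ms(self) -> float:
        # How far the post-optimization cost still exceeds the budget.
        return max(0.0, self.after_ms - self.budget_ms)


def sign_off(metrics: list) -> tuple:
    """Compare each metric against its budget and list every remaining gap."""
    issues = [
        f"{m.name} frame cost ({m.after_ms:g}ms) exceeds budget "
        f"({m.budget_ms:g}ms) by {m.over_budget_ms:g}ms"
        for m in metrics
        if m.over_budget_ms > 0
    ]
    verdict = "NEEDS MORE WORK" if issues else "READY FOR RELEASE"
    return verdict, issues
```

For the particle-storm fixture, `sign_off([Metric("particle-storm", 12, 9, 6)])` yields NEEDS MORE WORK with the gap quantified as 3ms. The improvement from 12ms to 9ms does not change the verdict because only the post-optimization cost is compared against the budget.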
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Argument — Usage guidance shown
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/team-polish` (no argument)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects no argument is provided
|
||||
2. Outputs usage guidance: e.g., "Usage: `/team-polish [feature or area]` — specify the feature or area to polish (e.g., `combat`, `main menu`, `inventory system`, `level-1`)"
|
||||
3. Skill exits without spawning any agents
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT spawn any agents when no argument is provided
|
||||
- [ ] Usage message includes the correct invocation format with argument examples
|
||||
- [ ] Skill does NOT attempt to guess a feature from project files
|
||||
- [ ] No `AskUserQuestion` is used — output is direct guidance
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Engine-Level Bottleneck — engine-programmer spawned conditionally in Phase 2
|
||||
|
||||
**Fixture:**
|
||||
- Feature being polished: `open-world` environment streaming
|
||||
- Phase 1 identifies a performance bottleneck with a root cause in the rendering pipeline: "draw call overhead is caused by the engine's scene tree traversal in the spatial indexer — this is an engine-level issue, not a game code issue"
|
||||
- Performance budgets are defined; the rendering overhead exceeds target frame budget
|
||||
|
||||
**Input:** `/team-polish open-world`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phase 1: performance-analyst profiles the environment; identifies frame budget violation; root cause analysis points to engine-level rendering pipeline (spatial indexer traversal overhead)
|
||||
2. Phase 1 output explicitly classifies the root cause as engine-level
|
||||
3. `AskUserQuestion` presents the performance report including the engine-level root cause; user approves before Phase 2
|
||||
4. Phase 2: performance-analyst is spawned for game-code-level optimizations AND engine-programmer is spawned in parallel for the engine-level rendering fix
|
||||
5. Phases 3 and 4 also run in parallel with Phase 2 (visual and audio polish)
|
||||
6. engine-programmer addresses the spatial indexer traversal; provides profiler validation showing the fix reduces overhead
|
||||
7. Phase 5: qa-tester runs regression tests including tests for the engine-level fix
|
||||
8. Phase 6: orchestrator collects all results; if metrics are now within budget, verdict is READY FOR RELEASE; if not, NEEDS MORE WORK
|
||||
|
||||
**Assertions:**
|
||||
- [ ] engine-programmer is NOT spawned in Phase 2 unless Phase 1 explicitly identifies an engine-level root cause
|
||||
- [ ] engine-programmer is spawned in Phase 2 when Phase 1 identifies an engine-level root cause
|
||||
- [ ] engine-programmer and performance-analyst Task calls in Phase 2 are issued simultaneously (not sequentially)
|
||||
- [ ] Phases 3 and 4 also run in parallel with Phase 2 (not deferred until Phase 2 completes)
|
||||
- [ ] engine-programmer's output includes profiler validation of the fix
|
||||
- [ ] qa-tester in Phase 5 runs regression tests that cover the engine-level change
|
||||
- [ ] Verdict correctly reflects whether all metrics including the engine fix now meet budgets
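
The conditional spawn rule reduces to a single check on Phase 1's classification. A minimal sketch, assuming Phase 1 returns root causes as dictionaries with a `level` key (that shape is hypothetical):

```python
def phase2_agents(root_causes: list) -> list:
    """root_causes: dicts with a 'level' key, e.g. 'engine' or 'game'."""
    agents = ["performance-analyst"]
    # engine-programmer joins only on an explicit engine-level classification.
    if any(cause.get("level") == "engine" for cause in root_causes):
        agents.append("engine-programmer")
    return agents


def parallel_batch(root_causes: list) -> list:
    # Phases 3 and 4 launch together with Phase 2, not after it completes.
    return phase2_agents(root_causes) + ["technical-artist", "sound-designer"]
```

An unclassified or game-level root cause leaves engine-programmer out, which covers both directions of the first two assertions above.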
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Regression Found — Polish change broke an existing feature
|
||||
|
||||
**Fixture:**
|
||||
- Feature being polished: `inventory-ui`
|
||||
- Phases 1–4 complete successfully; performance and polish changes are applied
|
||||
- Phase 5: qa-tester runs regression tests and finds that a shader optimization applied in Phase 3 broke the item highlight glow effect on hover — an existing feature that was working before the polish pass
|
||||
|
||||
**Input:** `/team-polish inventory-ui` (Phase 5 scenario)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Phases 1–4 complete; polish changes include a shader optimization from technical-artist
|
||||
2. Phase 5: qa-tester runs regression tests and detects "Item highlight glow on hover no longer renders — regression introduced by shader optimization in Phase 3"
|
||||
3. qa-tester returns test results with the regression noted
|
||||
4. Orchestrator surfaces the regression immediately: "qa-tester: REGRESSION FOUND — `item-highlight-hover` glow broken by Phase 3 shader optimization"
|
||||
5. Sub-agent prepares a bug report and asks "May I write the bug report to `production/qa/evidence/bug-polish-inventory-ui-[date].md`?" before writing
|
||||
6. Bug report is written after approval; it includes: the broken behavior, the polish change that caused it, reproduction steps, and severity
|
||||
7. `AskUserQuestion` presents the regression with options:
   - Revert the shader optimization and find an alternative approach
   - Fix the shader optimization to preserve the glow effect
   - Accept the regression and schedule a fix in the next sprint
|
||||
8. Verdict: NEEDS MORE WORK, whichever option the user chooses, unless the fix is applied within the current session and qa-tester re-runs to confirm it
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Regression is surfaced before Phase 6 sign-off
|
||||
- [ ] The specific broken behavior and the responsible change are both named in the report
|
||||
- [ ] Sub-agent asks "May I write the bug report to [path]?" before filing
|
||||
- [ ] Bug report includes: broken behavior, causal change, reproduction steps, severity
|
||||
- [ ] `AskUserQuestion` offers options including revert, fix in place, and schedule later
|
||||
- [ ] Verdict is NEEDS MORE WORK when a regression is present and unresolved
|
||||
- [ ] Verdict may become READY FOR RELEASE only if the regression is fixed within the current polish session and qa-tester re-runs to confirm
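
The last two assertions describe a rule that is easy to get backwards, so here is a sketch of the regression side of the verdict only (budget checks are out of scope for this example). The dictionary keys `fixed` and `retest_passed` are hypothetical.

```python
def polish_verdict(regressions: list) -> str:
    """regressions: dicts with 'id', 'fixed', and 'retest_passed' flags."""
    unresolved = [
        r["id"]
        for r in regressions
        # A fix counts only once qa-tester has re-run and confirmed it.
        if not (r.get("fixed") and r.get("retest_passed"))
    ]
    return "NEEDS MORE WORK" if unresolved else "READY FOR RELEASE"
```

A regression that was fixed but not re-tested still yields NEEDS MORE WORK under this sketch, matching the requirement that qa-tester re-runs to confirm.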
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Phase 1 (assessment) must complete before any other phase begins
|
||||
- [ ] `AskUserQuestion` is used after every phase output before the next phase launches
|
||||
- [ ] Phases 3 and 4 are always launched in parallel with Phase 2 (not deferred)
|
||||
- [ ] engine-programmer is only spawned when Phase 1 explicitly identifies engine-level root causes
|
||||
- [ ] No files are written by the orchestrator directly — all writes are delegated to sub-agents
|
||||
- [ ] Each sub-agent enforces the "May I write to [path]?" protocol before any write
|
||||
- [ ] BLOCKED status from any agent is surfaced immediately — not silently skipped
|
||||
- [ ] A partial report is always produced when some agents complete and others block
|
||||
- [ ] Verdict is exactly READY FOR RELEASE or NEEDS MORE WORK — no other verdict values used
|
||||
- [ ] NEEDS MORE WORK verdict always lists specific remaining issues with severity
|
||||
- [ ] Next Steps handoff references `/release-checklist` (on success) and `/sprint-plan update` + `/gate-check` (on failure)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The tools-programmer optional agent (for content pipeline tool verification) is not
  separately tested; it follows the same conditional spawn pattern as engine-programmer
  and is invoked only when content authoring tools are involved in the polished area.
- The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error
  Recovery Protocol are not separately tested; they follow the same
  `AskUserQuestion` + partial-report pattern validated in Cases 2 and 5.
- Phase 6 sign-off logic (collecting and comparing all metrics) is validated implicitly
  by Cases 1 and 2. The distinction between READY FOR RELEASE and NEEDS MORE WORK is
  exercised in both directions across these cases.
- Soak testing and stress testing (Phase 5) are validated implicitly by Case 1's
  qa-tester output. Case 5 focuses on the regression detection aspect of Phase 5.
- The "minimum spec hardware" test path in Phase 5 is not separately tested; it follows
  the same qa-tester delegation pattern when the hardware is available.
|
||||
204
CCGS Skill Testing Framework/skills/team/team-qa.md
Normal file
@@ -0,0 +1,204 @@
|
||||
# Skill Test Spec: /team-qa
|
||||
|
||||
## Skill Summary
|
||||
|
||||
Orchestrates the QA team through a 7-phase structured testing cycle. Coordinates
qa-lead (strategy, test plan, sign-off report) and qa-tester (test case writing,
bug report writing). Covers scope detection, story classification, QA plan
generation, smoke check gate, test case writing, manual QA execution with bug
filing, and a final sign-off report with an APPROVED / APPROVED WITH CONDITIONS /
NOT APPROVED verdict. Parallel qa-tester spawning is used in Phase 5 for
independent stories.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
|
||||
- [ ] Contains verdict keywords for sign-off report: APPROVED, APPROVED WITH CONDITIONS, NOT APPROVED
|
||||
- [ ] Contains "May I write" language for both the QA plan and the sign-off report
|
||||
- [ ] Has an Error Recovery Protocol section
|
||||
- [ ] Uses `AskUserQuestion` at phase transitions to capture user approval before proceeding
|
||||
- [ ] Phase 4 (smoke check) is a hard gate: FAIL stops the cycle
|
||||
- [ ] Bug reports are written to `production/qa/bugs/` with `BUG-[NNN]-[short-slug].md` naming
|
||||
- [ ] Next-step guidance differs by verdict (APPROVED / APPROVED WITH CONDITIONS / NOT APPROVED)
|
||||
- [ ] Independent qa-tester tasks in Phase 5 are spawned in parallel
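
A few of the structural assertions above can be checked mechanically. The following is a minimal sketch, not the implementation of `/skill-test static`: the function name `check_static`, the assumption that frontmatter sits between the first two `---` lines, and the assumption that phase headings start with the word "Phase" are all hypothetical.

```python
import re

# Field names are taken from the first static assertion above.
REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def check_static(skill_text: str, verdict_keywords: list[str]) -> dict[str, bool]:
    """Return a pass/fail map for a few structural assertions (hypothetical helper)."""
    # Assumption: frontmatter is the block between the first two '---' lines.
    match = re.match(r"^---\n(.*?)\n---\n", skill_text, flags=re.DOTALL)
    frontmatter = match.group(1) if match else ""
    keys = {line.split(":", 1)[0].strip() for line in frontmatter.splitlines() if ":" in line}
    # Assumption: phase headings look like '## Phase 1: ...' at any heading depth.
    phase_headings = re.findall(r"^#{1,6}\s+Phase\s+\w+", skill_text, flags=re.MULTILINE)
    return {
        "frontmatter_fields": all(field in keys for field in REQUIRED_FIELDS),
        "two_phase_headings": len(phase_headings) >= 2,
        "verdict_keywords": all(word in skill_text for word in verdict_keywords),
        "may_i_write": "May I write" in skill_text,
    }
```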

---

## Test Cases

### Case 1: Happy Path — All stories pass manual QA, APPROVED verdict

**Fixture:**
- `production/sprints/sprint-03/` exists with 4 story files
- Stories are a mix of types: 1 Logic, 1 Integration, 2 Visual/Feel
- All stories have acceptance criteria populated
- `tests/smoke/` contains a smoke test list; all items are verifiable
- No existing bugs in `production/qa/bugs/`

**Input:** `/team-qa sprint-03`

**Expected behavior:**
1. Phase 1: Reads all story files in `production/sprints/sprint-03/`; reads `production/stage.txt`; reports "Found 4 stories. Current stage: [stage]. Ready to begin QA strategy?"
2. Phase 2: Spawns `qa-lead` via Task; produces strategy table classifying all 4 stories; no blockers flagged; presents to user; AskUserQuestion: user selects "Looks good — proceed to test plan"
3. Phase 3: Produces QA plan document; asks "May I write the QA plan to `production/qa/qa-plan-sprint-03-[date].md`?"; writes after approval
4. Phase 4: Spawns `qa-lead` via Task; reviews `tests/smoke/`; returns PASS; reports "Smoke check passed. Proceeding to test case writing."
5. Phase 5: Spawns `qa-tester` via Task for each Visual/Feel and Integration story (3 stories: 2 Visual/Feel, 1 Integration); run in parallel; presents test cases grouped by story; AskUserQuestion per group; user approves
6. Phase 6: Walks through each approved story; user marks all as PASS; result summary: "Stories PASS: 4, FAIL: 0, BLOCKED: 0"
7. Phase 7: Spawns `qa-lead` via Task to produce sign-off report; report shows all stories PASS; no bugs filed; Verdict: APPROVED; asks "May I write this QA sign-off report to `production/qa/qa-signoff-sprint-03-[date].md`?"; writes after approval
8. Verdict: COMPLETE — QA cycle finished

**Assertions:**
- [ ] Phase 1 correctly counts and reports 4 stories with current stage
- [ ] Strategy table in Phase 2 classifies all 4 stories with correct types
- [ ] QA plan written only after "May I write?" approval
- [ ] Smoke check PASS allows pipeline to continue without user intervention
- [ ] Phase 5 qa-tester tasks for independent stories are issued in parallel
- [ ] Sign-off report includes Test Coverage Summary table and Verdict: APPROVED
- [ ] Sign-off report written only after "May I write?" approval
- [ ] Verdict: COMPLETE appears in final output
- [ ] Next step: "Run `/gate-check` to validate advancement."

---

### Case 2: Smoke Check Fail — QA cycle stops at Phase 4

**Fixture:**
- `production/sprints/sprint-04/` exists with 3 story files
- `tests/smoke/` exists with 5 smoke test items; 2 items cannot be verified (e.g., build is unstable, core navigation broken)

**Input:** `/team-qa sprint-04`

**Expected behavior:**
1. Phases 1–3 complete normally; QA plan is written
2. Phase 4: Spawns `qa-lead` via Task; smoke check returns FAIL; two specific failures are identified
3. Skill reports: "Smoke check failed. QA cannot begin until these issues are resolved: [list of 2 failures]. Fix them and re-run `/smoke-check`, or re-run `/team-qa` once resolved."
4. Skill stops immediately after Phase 4 — no Phase 5, 6, or 7 is executed
5. No sign-off report is produced; no "May I write?" for a sign-off is issued

**Assertions:**
- [ ] Smoke check FAIL causes the pipeline to halt at Phase 4 — Phases 5, 6, 7 are NOT executed
- [ ] Failure list is shown to the user explicitly (not summarized vaguely)
- [ ] Skill recommends `/smoke-check` and `/team-qa` re-run as remediation steps
- [ ] No QA sign-off report is written or offered
- [ ] Skill does NOT produce a COMPLETE verdict
- [ ] Any QA plan already written in Phase 3 is preserved (not deleted)

---

### Case 3: Bug Found — Visual/Feel story fails manual QA, bug report filed

**Fixture:**
- `production/sprints/sprint-05/` exists with 2 story files: 1 Logic (passes automated tests), 1 Visual/Feel
- `tests/smoke/` smoke check passes
- The Visual/Feel story's animation timing is visibly wrong (acceptance criterion not met)
- `production/qa/bugs/` directory exists (empty or with existing bugs)

**Input:** `/team-qa sprint-05`

**Expected behavior:**
1. Phases 1–5 complete normally; test cases are written for the Visual/Feel story
2. Phase 6: User marks Visual/Feel story as FAIL; AskUserQuestion collects failure description: "Animation plays at 2x speed — jitter visible on every loop"
3. Phase 6: Spawns `qa-tester` via Task to write a formal bug report; bug report written to `production/qa/bugs/BUG-001-animation-speed-jitter.md` (or next increment if bugs exist); report includes severity field
4. Result summary: "Stories PASS: 1, FAIL: 1 — bugs filed: BUG-001"
5. Phase 7: Spawns `qa-lead` to produce sign-off report; Bugs Found table lists BUG-001 with severity and status Open; Verdict: NOT APPROVED (S1/S2 bug open, or FAIL without documented workaround)
6. Sign-off report write is offered; writes after approval
7. Next step: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."

**Assertions:**
- [ ] FAIL result in Phase 6 triggers AskUserQuestion to collect the failure description before the bug report is written
- [ ] `qa-tester` is spawned via Task to write the bug report — orchestrator does not write it directly
- [ ] Bug report follows naming convention: `BUG-[NNN]-[short-slug].md` in `production/qa/bugs/`
- [ ] Bug report NNN is incremented correctly from existing bugs in the directory
- [ ] Phase 7 sign-off report Bugs Found table includes the bug ID, story name, severity, and status
- [ ] Verdict in sign-off report is NOT APPROVED
- [ ] Next step explicitly mentions re-running `/team-qa`
- [ ] Verdict: COMPLETE is still issued by the orchestrator (the QA cycle finished — the verdict is NOT APPROVED, but the skill completed its pipeline)
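
The `BUG-[NNN]-[short-slug].md` numbering asserted above could be verified with a helper along these lines. This is a sketch under stated assumptions: `next_bug_filename` is a hypothetical name, and "incremented correctly" is read as "one more than the highest existing number", which the spec does not state explicitly.

```python
import re

# Matches the naming convention from the spec: BUG-[NNN]-[short-slug].md
BUG_NAME = re.compile(r"^BUG-(\d{3,})-[a-z0-9-]+\.md$")


def next_bug_filename(existing: list[str], slug: str) -> str:
    """Return the next bug report filename given the names already in production/qa/bugs/."""
    # Files that do not follow the convention are ignored rather than treated as errors.
    numbers = [int(m.group(1)) for name in existing if (m := BUG_NAME.match(name))]
    # Assumption: numbering continues from the highest existing number; gaps are not reused.
    next_number = max(numbers, default=0) + 1
    return f"BUG-{next_number:03d}-{slug}.md"
```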

---

### Case 4: No Argument — Skill infers active sprint or asks user

**Fixture (variant A — state files present):**
- `production/session-state/active.md` exists and contains a reference to `sprint-06`
- `production/sprint-status.yaml` exists and identifies `sprint-06` as active

**Fixture (variant B — state files absent):**
- `production/session-state/active.md` does NOT exist
- `production/sprint-status.yaml` does NOT exist

**Input:** `/team-qa` (no argument)

**Expected behavior (variant A):**
1. Phase 1: No argument provided; reads `production/session-state/active.md`; reads `production/sprint-status.yaml`
2. Detects `sprint-06` as the active sprint from both sources
3. Proceeds as if `/team-qa sprint-06` was the input; reports "No sprint argument provided — inferred sprint-06 from session state. Found [N] stories."

**Expected behavior (variant B):**
1. Phase 1: No argument provided; attempts to read `production/session-state/active.md` — file missing; attempts to read `production/sprint-status.yaml` — file missing
2. Cannot infer sprint; uses AskUserQuestion: "Which sprint or feature should QA cover?" with options to type a sprint identifier or cancel

**Assertions:**
- [ ] Skill does NOT default to a hardcoded sprint name when no argument is provided
- [ ] Skill reads both `production/session-state/active.md` AND `production/sprint-status.yaml` before asking the user (variant A)
- [ ] When both state files are absent, skill uses AskUserQuestion rather than guessing (variant B)
- [ ] Inferred sprint is reported to the user before proceeding (variant A transparency)
- [ ] Skill does NOT error out when state files are missing — it falls back to asking (variant B)
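
The inference-then-fallback behavior in variants A and B can be sketched as a function that returns `None` whenever it cannot infer a sprint, leaving the caller to ask the user. Everything here is illustrative: `infer_active_sprint` is a hypothetical name, the `sprint-NN` pattern is assumed from the fixture, and returning `None` when the two files disagree is an assumption the spec does not cover.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: sprint identifiers look like 'sprint-06', as in the fixture.
SPRINT_ID = re.compile(r"sprint-\d+")


def infer_active_sprint(root: Path) -> Optional[str]:
    """Return a sprint id found in the state files, or None so the caller can ask the user."""
    candidates = [
        root / "production" / "session-state" / "active.md",
        root / "production" / "sprint-status.yaml",
    ]
    found = []
    for path in candidates:
        if path.is_file():  # a missing state file is not an error (variant B)
            match = SPRINT_ID.search(path.read_text(encoding="utf-8"))
            if match:
                found.append(match.group(0))
    # Assumption: infer only when every file that names a sprint agrees.
    if found and len(set(found)) == 1:
        return found[0]
    return None
```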

---

### Case 5: Mixed Results — Some PASS, one FAIL with S1 bug, one BLOCKED

**Fixture:**
- `production/sprints/sprint-07/` exists with 4 story files
- Smoke check passes
- Story A (Logic): automated test passes — PASS
- Story B (UI): manual QA — PASS WITH NOTES (minor text overflow)
- Story C (Visual/Feel): manual QA — FAIL; tester identifies S1 crash on ability activation
- Story D (Integration): cannot test — BLOCKED (dependency system not yet implemented)

**Input:** `/team-qa sprint-07`

**Expected behavior:**
1. Phases 1–5 proceed; Phase 5 test cases cover stories B, C, D
2. Phase 6: Story A is recorded as PASS implicitly (automated test, no manual QA); user marks Story B: PASS WITH NOTES; Story C: FAIL; Story D: BLOCKED
3. After Story C FAIL: qa-tester spawned to write bug report `BUG-001-crash-ability-activation.md` with S1 severity
4. Result summary presented: "Stories PASS: 1, PASS WITH NOTES: 1, FAIL: 1 — bugs filed: BUG-001 (S1), BLOCKED: 1"
5. Phase 7: qa-lead produces sign-off report covering all 4 stories; BUG-001 listed as S1/Open; Story D listed as BLOCKED; Verdict: NOT APPROVED
6. Sign-off report written after "May I write?" approval
7. Next step: "Resolve S1/S2 bugs and re-run `/team-qa` or targeted manual QA before advancing."

**Assertions:**
- [ ] All 4 stories appear in the Phase 7 sign-off report Test Coverage Summary table — none are silently omitted
- [ ] Story D (BLOCKED) is listed in the report with a BLOCKED status, not silently dropped
- [ ] S1 bug causes Verdict: NOT APPROVED regardless of the other stories passing
- [ ] PASS WITH NOTES stories do not downgrade to FAIL — they are tracked separately
- [ ] BUG-001 severity is listed as S1 in the Bugs Found table
- [ ] Partial results are preserved — the sign-off report is still produced even with failures and blocks
- [ ] Verdict: COMPLETE is issued by the orchestrator (pipeline completed); sign-off verdict is NOT APPROVED

---

## Protocol Compliance

- [ ] `AskUserQuestion` used at Phase 2 (strategy review), Phase 5 (test case approval per group), and Phase 6 (per-story manual QA result)
- [ ] Phase 4 smoke check is a hard gate: FAIL halts the pipeline at Phase 4 with no exceptions
- [ ] "May I write?" asked separately for QA plan (Phase 3) and sign-off report (Phase 7)
- [ ] Bug reports are always written by `qa-tester` via Task — orchestrator does not write directly
- [ ] Phase 5 qa-tester tasks for independent stories are issued in parallel where possible
- [ ] Error recovery: any BLOCKED agent is surfaced immediately with AskUserQuestion options
- [ ] Partial report always produced — no work is discarded because one story failed or blocked
- [ ] Sign-off verdict rules are strictly applied: any S1/S2 bug open = NOT APPROVED; no exceptions
- [ ] Orchestrator-level Verdict: COMPLETE is distinct from the sign-off report's APPROVED/NOT APPROVED verdict
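
The sign-off verdict rules referenced in this spec (S1/S2 open or an unresolved FAIL gives NOT APPROVED; S3/S4 bugs or PASS WITH NOTES give APPROVED WITH CONDITIONS) can be written as a small decision function. This is only a sketch of the rules as stated here: `signoff_verdict` and its parameters are hypothetical, and how a BLOCKED story on its own affects the verdict is not specified, so it is deliberately left out.

```python
def signoff_verdict(open_bug_severities, unresolved_fail=False, has_notes=False):
    """Map QA results to a sign-off verdict (hypothetical helper).

    open_bug_severities: severities of open bugs, e.g. ["S1", "S3"].
    unresolved_fail: True if any story is FAIL without a documented workaround.
    has_notes: True if any story is PASS WITH NOTES.
    """
    blocking = {"S1", "S2"}
    # Any open S1/S2 bug, or a FAIL with no workaround, is NOT APPROVED with no exceptions.
    if unresolved_fail or any(sev in blocking for sev in open_bug_severities):
        return "NOT APPROVED"
    # Remaining open bugs are S3/S4; those and PASS WITH NOTES only add conditions.
    if open_bug_severities or has_notes:
        return "APPROVED WITH CONDITIONS"
    return "APPROVED"
```

This verdict is separate from the orchestrator-level COMPLETE / BLOCKED verdict, which describes whether the pipeline itself finished.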

---

## Coverage Notes

- The "APPROVED WITH CONDITIONS" verdict path (S3/S4 bugs, PASS WITH NOTES) is covered implicitly by Case 5's PASS WITH NOTES story (Story B) — if no S1/S2 bugs existed, that case would produce APPROVED WITH CONDITIONS. A dedicated case is not required as the verdict logic is table-driven.
- The `feature: [system-name]` argument form is not separately tested — it follows the same Phase 1 logic as the sprint form, using glob instead of directory read. The no-argument inference path (Case 4) provides sufficient coverage of the detection logic.
- Logic stories with passing automated tests do not need manual QA — this is validated implicitly by Case 5 (Story A) where the Logic story receives no manual QA phase.
- Parallel qa-tester spawning in Phase 5 is validated implicitly by Case 1 (multiple Visual/Feel stories issued simultaneously); no dedicated parallelism case is required beyond the Static Assertions check.
215
CCGS Skill Testing Framework/skills/team/team-release.md
Normal file
@@ -0,0 +1,215 @@
# Skill Test Spec: /team-release

## Skill Summary

Orchestrates the release team through a 7-phase pipeline from release candidate to
deployment and post-release monitoring. Coordinates release-manager, qa-lead,
devops-engineer, producer, security-engineer (optional, required for online/
multiplayer), network-programmer (optional, required for multiplayer),
analytics-engineer, and community-manager. Phase 3 agents run in parallel. Ends
with a go/no-go decision; deployment (Phase 6) is skipped if the producer calls
NO-GO. Closes with a post-release monitoring plan.

---

## Static Assertions (Structural)

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" language in the File Write Protocol section (delegated to sub-agents)
- [ ] Has a File Write Protocol section stating that the orchestrator does not write files directly
- [ ] Has an Error Recovery Protocol section with four recovery steps (surface / assess / offer options / partial report)
- [ ] Has a next-step handoff referencing post-release monitoring, `/retrospective`, and `production/stage.txt`
- [ ] Uses `AskUserQuestion` at phase transitions requiring user approval before proceeding
- [ ] Phase 3 agents (qa-lead, devops-engineer, and optionally security-engineer, network-programmer) are explicitly stated to run in parallel
- [ ] Phase 6 (Deployment) is conditional on a GO decision from Phase 5
- [ ] security-engineer is described as conditional on online features / player data — not always spawned

---

## Test Cases

### Case 1: Happy Path (Single-Player) — All phases complete, version deployed

**Fixture:**
- `production/stage.txt` exists and contains a Production-or-later stage
- Milestone acceptance criteria are all met (producer can confirm)
- No online features, no multiplayer, no player data collection
- All CI builds are clean on the current branch
- No open S1/S2 bugs
- `production/sprints/` contains the completed sprint stories for this milestone

**Input:** `/team-release v1.0.0`

**Expected behavior:**
1. Phase 1: Spawns `producer` via Task; confirms all milestone acceptance criteria met; identifies any deferred scope; produces release authorization; presents to user; AskUserQuestion: user approves before Phase 2
2. Phase 2: Spawns `release-manager` via Task; cuts release branch from agreed commit; bumps version numbers; invokes `/release-checklist`; freezes branch; output: branch name and checklist; AskUserQuestion: user approves before Phase 3
3. Phase 3 (parallel): Issues Task calls simultaneously for `qa-lead` (regression suite, critical path sign-off) and `devops-engineer` (build artifacts, CI verification); security-engineer is NOT spawned (no online features); network-programmer is NOT spawned (no multiplayer); both complete successfully
4. Phase 4: Verifies localization strings all translated; `analytics-engineer` verifies telemetry fires correctly on the release build; performance benchmarks pass; sign-off produced
5. Phase 5: Spawns `producer` via Task; collects sign-offs from qa-lead, release-manager, devops-engineer; no open blocking issues; producer declares GO; AskUserQuestion: user sees GO decision and confirms deployment
6. Phase 6: Spawns `release-manager` + `devops-engineer` (parallel); tags release in version control; invokes `/changelog`; deploys to staging; smoke test passes; deploys to production; simultaneously spawns `community-manager` to finalize patch notes via `/patch-notes v1.0.0` and prepare launch announcement
7. Phase 7: release-manager generates release report; producer updates milestone tracking; qa-lead begins monitoring for regressions; community-manager publishes communication; analytics-engineer confirms live dashboards healthy
8. Verdict: COMPLETE — release executed and deployed

**Assertions:**
- [ ] Phase 3 qa-lead and devops-engineer Task calls are issued simultaneously, not sequentially
- [ ] security-engineer is NOT spawned when the game has no online features, multiplayer, or player data
- [ ] Phase 5 producer collects sign-offs from all required parties before declaring GO
- [ ] Phase 6 deployment only begins after GO decision is confirmed by the user
- [ ] `/changelog` is invoked by release-manager in Phase 6 (not written directly)
- [ ] `/patch-notes v1.0.0` is invoked by community-manager in Phase 6
- [ ] Phase 7 monitoring plan includes a 48-hour post-release monitoring commitment
- [ ] Next steps recommend updating `production/stage.txt` to `Live` after successful deployment
- [ ] Verdict: COMPLETE appears in the final output

---

### Case 2: Go/No-Go: NO-GO — S1 bug found in Phase 3, deployment skipped

**Fixture:**
- Release candidate branch exists for v0.9.0
- qa-lead discovers a previously unreported S1 crash in the main menu during Phase 3 regression testing
- devops-engineer build is clean and artifacts are ready
- producer is aware of the S1 bug

**Input:** `/team-release v0.9.0`

**Expected behavior:**
1. Phases 1–2 complete normally; release candidate is cut
2. Phase 3 (parallel): devops-engineer returns clean build sign-off; qa-lead returns with an S1 bug identified and regression suite failing; qa-lead declares quality gate: NOT PASSED
3. Orchestrator surfaces the qa-lead result immediately: "QA-LEAD: S1 bug found — [crash description]. Quality gate: NOT PASSED."
4. Phase 4 proceeds cautiously or is paused (AskUserQuestion: continue to Phase 4 or skip to Phase 5 for go/no-go?)
5. Phase 5: Spawns `producer` via Task; producer receives qa-lead's NOT PASSED verdict; no S1 sign-off available; producer declares NO-GO with rationale: "S1 bug [ID] is open and unresolved. Releasing is not safe."
6. AskUserQuestion: user is presented with the NO-GO decision and the S1 bug details; options: fix the bug and re-run, defer the release, or override (with documented rationale)
7. Phase 6 (Deployment) is SKIPPED entirely — no branch tagging, no deploy to staging, no deploy to production
8. community-manager is NOT spawned in Phase 6 (no deployment to announce)
9. Skill ends with a partial report summarizing what was completed (Phases 1–5) and what was skipped (Phase 6) and why
10. Verdict: BLOCKED — release not deployed

**Assertions:**
- [ ] qa-lead S1 bug finding is surfaced to the user immediately after Phase 3 completes — not suppressed until Phase 5
- [ ] producer's NO-GO decision explicitly references the S1 bug and the quality gate result
- [ ] Phase 6 Deployment is completely skipped when producer declares NO-GO
- [ ] community-manager is NOT spawned for patch notes or launch announcement on NO-GO
- [ ] The partial report clearly states which phases completed and which were skipped, with reasons
- [ ] Verdict: BLOCKED (not COMPLETE) when deployment is skipped due to NO-GO
- [ ] AskUserQuestion offers the user resolution options (fix and re-run / defer / override with rationale)
- [ ] Override path (if chosen) requires user to provide a documented rationale before proceeding to Phase 6
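
The Phase 6 gating described in this case can be sketched as two small functions. The names `may_enter_phase_6` and `orchestrator_verdict` are hypothetical, and the sketch only encodes what the assertions above state: a GO needs user confirmation, a NO-GO can be passed only with a documented override rationale, and the verdict is COMPLETE only when deployment actually happened.

```python
from typing import Optional


def may_enter_phase_6(decision: str, user_confirmed: bool,
                      override_rationale: Optional[str] = None) -> bool:
    """Decide whether deployment (Phase 6) may start (hypothetical helper)."""
    if decision == "GO":
        # A GO still needs explicit user confirmation before deploying.
        return user_confirmed
    if decision == "NO-GO":
        # Override is the only way past a NO-GO, and it needs a non-empty rationale.
        return bool(override_rationale and override_rationale.strip())
    raise ValueError(f"unknown decision: {decision!r}")


def orchestrator_verdict(deployed: bool) -> str:
    """COMPLETE only when deployment completed; otherwise BLOCKED."""
    return "COMPLETE" if deployed else "BLOCKED"
```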

---

### Case 3: Security Audit for Online Game — security-engineer is spawned in Phase 3

**Fixture:**
- Game has multiplayer features and stores player account data
- Release candidate exists for v2.1.0
- qa-lead and devops-engineer both return clean sign-offs
- security-engineer audit is required per team composition rules

**Input:** `/team-release v2.1.0`

**Expected behavior:**
1. Phases 1–2 complete normally
2. Phase 3 (parallel): Orchestrator detects that the game has online/multiplayer features and player data; issues Task calls simultaneously for `qa-lead`, `devops-engineer`, AND `security-engineer`; also spawns `network-programmer` for netcode stability sign-off
3. security-engineer conducts pre-release security audit: reviews authentication flows, anti-cheat presence, data privacy compliance; returns sign-off
4. network-programmer verifies lag compensation, reconnect handling, and bandwidth under load; returns sign-off
5. All four Phase 3 agents complete; their results are collected before Phase 4 begins
6. Phase 5: producer collects sign-offs from all four Phase 3 agents (qa-lead, devops-engineer, security-engineer, network-programmer) before making the go/no-go call
7. Remaining phases proceed normally to COMPLETE

**Assertions:**
- [ ] security-engineer IS spawned in Phase 3 when the game has online features, multiplayer, or player data — this is not skipped
- [ ] network-programmer IS spawned in Phase 3 when the game has multiplayer
- [ ] All four Phase 3 Task calls (qa-lead, devops-engineer, security-engineer, network-programmer) are issued simultaneously
- [ ] security-engineer audit covers authentication, anti-cheat, and data privacy compliance
- [ ] Phase 5 producer sign-off collection includes security-engineer (four parties, not two)
- [ ] Phase 6 deployment does not begin until security-engineer has signed off
- [ ] Skill does NOT treat security-engineer as optional for a game with player data
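
The conditional Phase 3 team composition exercised by Cases 1 and 3 reduces to a small selection rule. The sketch below assumes three boolean feature flags; the function name `phase3_agents` and the flag names are hypothetical, and how the skill actually detects these features is not described in this spec.

```python
def phase3_agents(online: bool, multiplayer: bool, player_data: bool) -> list[str]:
    """Return the agents to spawn in parallel for Phase 3 (hypothetical helper)."""
    # qa-lead and devops-engineer are always part of Phase 3.
    agents = ["qa-lead", "devops-engineer"]
    # security-engineer is required for online features, multiplayer, or player data.
    if online or multiplayer or player_data:
        agents.append("security-engineer")
    # network-programmer is required only for multiplayer.
    if multiplayer:
        agents.append("network-programmer")
    return agents
```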

---

### Case 4: Localization Miss — Untranslated strings block the ship

**Fixture:**
- Release candidate exists for v1.2.0
- Phase 3 (qa-lead, devops-engineer) complete with clean sign-offs
- Phase 4: localization verification detects 47 untranslated strings in the French locale (a supported language in the game's localization scope)
- localization-lead is available as a delegatable agent

**Input:** `/team-release v1.2.0`

**Expected behavior:**
1. Phases 1–3 complete with clean sign-offs
2. Phase 4: Localization verification step detects untranslated strings; identifies 47 strings in French locale; localization-lead (if available) is spawned to assess the severity
3. Orchestrator surfaces: "LOCALIZATION MISS: 47 untranslated strings found in French locale. Localization sign-off is required before shipping."
4. AskUserQuestion: options presented — (a) Fix translations and re-run Phase 4, (b) Remove French locale from this release, (c) Ship as-is with a known issues note
5. If user selects (a): Phase 4 is re-run after translations are provided; skill waits for localization sign-off
6. Phase 5 go/no-go does NOT proceed while localization sign-off is outstanding
7. Ship is blocked (Phase 6 not entered) until localization issue is resolved or explicitly waived

**Assertions:**
- [ ] Localization verification in Phase 4 detects untranslated strings and counts them (not just "some strings missing")
- [ ] Untranslated strings for a supported locale block the pipeline before Phase 5
- [ ] AskUserQuestion is used to offer the user resolution choices — the skill does not auto-waive
- [ ] Phase 5 go/no-go is NOT called while localization sign-off is pending
- [ ] If user chooses to re-run Phase 4: the skill does not require restarting from Phase 1
- [ ] If user explicitly waives (ships as-is): the waiver is documented in the release report (Phase 7) as a known issue
- [ ] Skill does NOT fabricate translated strings to unblock itself
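
The "detects and counts" assertion can be illustrated with a simple comparison of string tables. This is a sketch under loud assumptions: the spec does not say how localization data is stored, so plain key-to-string dictionaries are used here, `untranslated_keys` is a hypothetical name, and a string counts as untranslated only when it is missing or blank.

```python
def untranslated_keys(source: dict[str, str], locale: dict[str, str]) -> list[str]:
    """List the source keys that have no usable translation in the locale table."""
    # Assumption: 'untranslated' means the key is absent or its value is empty/whitespace.
    return sorted(key for key in source if not locale.get(key, "").strip())


# Hypothetical data, only to show the shape of the check.
source_table = {"menu.start": "Start", "menu.quit": "Quit", "hud.hp": "HP"}
french_table = {"menu.start": "Démarrer", "menu.quit": ""}
missing = untranslated_keys(source_table, french_table)
report = f"LOCALIZATION MISS: {len(missing)} untranslated strings found in French locale."
```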

---

### Case 5: No Argument — Skill infers version or asks

**Fixture (variant A — milestone data present):**
- `production/milestones/` exists with a milestone file; most recent milestone is "v1.1.0 — Gold"
- `production/session-state/active.md` references a version or milestone

**Fixture (variant B — no discoverable version):**
- `production/milestones/` does not exist
- `production/session-state/active.md` does not reference a version
- No git tags are present from which to infer a version

**Input:** `/team-release` (no argument)

**Expected behavior (variant A):**
1. Phase 1: No argument provided; reads `production/session-state/active.md`; reads most recent milestone file in `production/milestones/`
2. Infers v1.1.0 as the target version; reports "No version argument provided — inferred v1.1.0 from milestone data. Proceeding."
3. Confirms with AskUserQuestion before beginning Phase 1 proper: "Releasing v1.1.0. Is this correct?"
4. Proceeds as if `/team-release v1.1.0` was the input

**Expected behavior (variant B):**
1. Phase 1: No argument provided; reads available state files — no version discoverable
2. Uses AskUserQuestion: "What version number should be released? (e.g., v1.0.0)"
3. Waits for user input before proceeding

**Assertions:**
- [ ] Skill does NOT default to a hardcoded version string when no argument is provided
- [ ] Skill reads `production/session-state/active.md` and milestone files before asking (variant A)
- [ ] Inferred version is confirmed with the user via AskUserQuestion before proceeding (variant A)
- [ ] When no version is discoverable, AskUserQuestion is used — skill does not guess (variant B)
- [ ] Skill does NOT error out when milestone files are absent — it falls back to asking (variant B)
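
Extracting the version from a milestone title such as "v1.1.0 — Gold" can be sketched with a regular expression. The helper name `infer_version` and the `vMAJOR.MINOR.PATCH` pattern are assumptions based on the examples in this case; returning `None` is the signal to fall back to asking the user.

```python
import re
from typing import Optional

# Assumption: versions look like v1.1.0, matching the examples in this spec.
VERSION = re.compile(r"\bv\d+\.\d+\.\d+\b")


def infer_version(milestone_title: str) -> Optional[str]:
    """Return the first version string in the title, or None if there is none."""
    match = VERSION.search(milestone_title)
    return match.group(0) if match else None
```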

---

## Protocol Compliance

- [ ] `AskUserQuestion` used at each phase transition gate (post-Phase 1, post-Phase 2, post-Phase 3/4 if issues, post-Phase 5 go/no-go)
- [ ] Phase 3 agents are always issued as parallel Task calls — qa-lead and devops-engineer are never sequential
- [ ] security-engineer is conditionally spawned based on game features — never silently skipped when features are present
- [ ] File Write Protocol: orchestrator never calls Write/Edit directly — all writes are delegated to sub-agents or sub-skills
- [ ] Phase 6 Deployment is strictly conditional on a GO verdict from Phase 5 — never auto-triggered
- [ ] Error recovery: any BLOCKED agent is surfaced immediately before continuing to dependent phases
- [ ] Partial reports are always produced if any phase fails or the pipeline is halted (Case 2)
- [ ] Verdict: COMPLETE only when deployment completes; BLOCKED when the go/no-go decision is NO-GO or a hard blocker is unresolved
- [ ] Next steps always include 48-hour post-release monitoring, `/retrospective` recommendation, and `production/stage.txt` update to `Live`

---

## Coverage Notes

- Phase 7 post-release actions (release report, milestone tracking, community publishing, dashboard monitoring) are validated implicitly by Case 1. No separate edge case is required as Phase 7 is non-gated and does not have a blocking failure mode.
- The "devops-engineer build fails" path is not separately tested — it would surface as a BLOCKED result in Phase 3 and follow the standard error recovery protocol (surface → assess → AskUserQuestion options). This is validated structurally by the Static Assertions error recovery check.
- The parallel Phase 4 path (localization + performance + analytics simultaneously with Phase 3) is a documented option in the skill ("can run in parallel with Phase 3 if resources available"). Case 4 tests Phase 4 as a sequential gate; the parallel variant is left to the skill's implementation judgment.
- The `network-programmer` sign-off path for multiplayer is validated as part of Case 3 rather than a separate case, as it follows the same parallel-spawn pattern as security-engineer.
- The "override NO-GO with documented rationale" path in Case 2 is referenced but not exhaustively tested — it is an escape hatch that the skill must support, and its existence is validated by the AskUserQuestion options assertion in Case 2.
201
CCGS Skill Testing Framework/skills/team/team-ui.md
Normal file
@@ -0,0 +1,201 @@
# Skill Test Spec: /team-ui

## Skill Summary

Orchestrates the UI team through the full UX pipeline for a single UI feature.
Coordinates ux-designer, ui-programmer, art-director, the engine UI specialist,
and accessibility-specialist through five structured phases: Context Gathering +
UX Spec (Phase 1a/1b) → UX Review Gate (Phase 1c) → Visual Design (Phase 2) →
Implementation (Phase 3) → Review in parallel (Phase 4) → Polish (Phase 5).
Uses `AskUserQuestion` at each phase transition. Delegates all file writes to
sub-agents and sub-skills (`/ux-design`, `ui-programmer`). Produces a summary report
with verdict COMPLETE / BLOCKED and handoffs to `/ux-review`, `/code-review`,
`/team-polish`.

---

## Static Assertions (Structural)

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings (Phase 1a through Phase 5 are all present)
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" or "File Write Protocol" — writes delegated to sub-agents and sub-skills, orchestrator does not write files directly
- [ ] Has a next-step handoff at the end (references `/ux-review`, `/code-review`, `/team-polish`)
- [ ] Error Recovery Protocol section is present with all four recovery steps
- [ ] Uses `AskUserQuestion` at phase transitions for user approval before proceeding
- [ ] Phase 4 is explicitly marked as parallel (ux-designer, art-director, accessibility-specialist)
- [ ] UX Review Gate (Phase 1c) is defined as a blocking gate — skill must not proceed to Phase 2 without APPROVED verdict
- [ ] Team Composition lists all five roles (ux-designer, ui-programmer, art-director, engine UI specialist, accessibility-specialist)
- [ ] References the interaction pattern library (`design/ux/interaction-patterns.md`) — ui-programmer must use existing patterns
- [ ] Phase 1a reads `design/accessibility-requirements.md` before design begins
|
||||
|
||||
---
## Test Cases

### Case 1: Happy Path — Full pipeline from UX spec through polish succeeds

**Fixture:**
- `design/gdd/game-concept.md` exists with platform targets and intended audience
- `design/player-journey.md` exists
- `design/ux/interaction-patterns.md` exists with relevant patterns
- `design/accessibility-requirements.md` exists with committed tier (e.g., Enhanced)
- Engine UI specialist configured in `.claude/docs/technical-preferences.md`

**Input:** `/team-ui inventory screen`

**Expected behavior:**
1. Phase 1a — orchestrator reads game-concept.md, player-journey.md, relevant GDD UI sections, interaction-patterns.md, accessibility-requirements.md; summarizes a brief for the ux-designer
2. Phase 1b — `/ux-design inventory-screen` invoked (or ux-designer spawned directly); produces `design/ux/inventory-screen.md` using `ux-spec.md` template; `AskUserQuestion` confirms spec before review
3. Phase 1c — `/ux-review design/ux/inventory-screen.md` invoked; returns APPROVED; gate passed, proceed to Phase 2
4. Phase 2 — art-director spawned; reviews full UX spec (not only wireframes); applies visual treatment; verifies color contrast; produces visual design spec with asset manifest; `AskUserQuestion` confirms before Phase 3
5. Phase 3 — engine UI specialist spawned first (read from technical-preferences.md); produces implementation notes for ui-programmer; ui-programmer spawned with UX spec + visual spec + engine notes; implementation produced; interaction-patterns.md updated if new patterns introduced
6. Phase 4 — ux-designer, art-director, accessibility-specialist spawned in parallel; all three return results before Phase 5
7. Phase 5 — review feedback addressed; animations verified skippable; UI sounds confirmed through audio event system; interaction-patterns.md final check; verdict: COMPLETE
8. Summary report: UX spec APPROVED, visual design COMPLETE, implementation COMPLETE, accessibility COMPLIANT, all input methods supported, pattern library updated, verdict: COMPLETE

**Assertions:**
- [ ] Phase 1a reads all five sources before briefing ux-designer
- [ ] UX Review Gate checked before Phase 2 — Phase 2 does NOT begin until APPROVED
- [ ] Art-director in Phase 2 reviews full spec, not just wireframe images
- [ ] Engine UI specialist spawned before ui-programmer in Phase 3
- [ ] Phase 4 agents launched simultaneously (ux-designer, art-director, accessibility-specialist)
- [ ] All file writes delegated to sub-agents and sub-skills
- [ ] Verdict COMPLETE in final summary report
- [ ] Next steps include `/ux-review`, `/code-review`, `/team-polish`

---
### Case 2: UX Review Gate — Spec fails review; skill halts before implementation

**Fixture:**
- `design/ux/inventory-screen.md` produced by Phase 1b
- `/ux-review` returns verdict NEEDS REVISION with specific concerns flagged (e.g., gamepad navigation flow incomplete, contrast ratio below minimum)

**Input:** `/team-ui inventory screen`

**Expected behavior:**
1. Phase 1a + 1b complete — UX spec produced
2. Phase 1c — `/ux-review design/ux/inventory-screen.md` returns NEEDS REVISION
3. Skill does NOT advance to Phase 2
4. `AskUserQuestion` presented with the specific flagged concerns and options:
   - (a) Return to ux-designer to address the issues and re-review
   - (b) Accept the risk and proceed to Phase 2 anyway (conscious decision)
5. If user chooses (a): ux-designer revises spec, `/ux-review` re-run; loop continues until APPROVED or user overrides
6. If user chooses (b): skill proceeds with an explicit NEEDS REVISION note in the final report
7. Skill does NOT silently proceed past the gate

**Assertions:**
- [ ] Phase 2 does NOT begin while UX review verdict is NEEDS REVISION
- [ ] `AskUserQuestion` presents the specific flagged concerns before offering options
- [ ] User must make a conscious choice to override — skill does not assume override
- [ ] If user accepts risk, NEEDS REVISION concern is documented in the final report
- [ ] Revision-and-re-review loop is offered (not just a one-shot failure)
- [ ] Skill does NOT discard the produced UX spec on review failure
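
The blocking-gate behavior in Case 2 can be sketched as a loop. This is a hypothetical illustration, not the skill's real control flow: `review`, `revise`, and `ask_user` are stand-in callables for `/ux-review`, the ux-designer, and `AskUserQuestion`, and the `max_rounds` cap is an added assumption so the sketch cannot loop forever.

```python
def run_ux_gate(review, revise, ask_user, max_rounds=10):
    """Return (proceed, overridden) for a blocking review gate.

    review() -> (verdict, concerns); revise(concerns) reworks the spec;
    ask_user(concerns) -> "revise" or "override". All three are hypothetical
    stand-ins, not real tool signatures.
    """
    for _ in range(max_rounds):
        verdict, concerns = review()
        if verdict == "APPROVED":
            return True, False
        # The gate never assumes an override: the user is always asked,
        # and is shown the specific concerns that were flagged.
        choice = ask_user(concerns)
        if choice == "override":
            # Caller is expected to record NEEDS REVISION in the final report.
            return True, True
        revise(concerns)
    # Assumed behavior: give up (do not proceed) once the cap is reached.
    return False, False
```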

---

### Case 3: No Argument — Usage guidance shown

**Fixture:**
- Any project state

**Input:** `/team-ui` (no argument)

**Expected behavior:**
1. Skill detects no argument provided
2. Outputs usage message explaining the required argument (UI feature description)
3. Provides an example invocation: `/team-ui [UI feature description]`
4. Skill exits without spawning any subagents or reading any project files

**Assertions:**
- [ ] Skill does NOT spawn any subagents when no argument is given
- [ ] Usage message includes the argument-hint format from frontmatter
- [ ] At least one example of a valid invocation is shown
- [ ] No UX spec files or GDDs read before failing
- [ ] Verdict is NOT shown (pipeline never starts)

---
### Case 4: Accessibility Parallel Review — Phase 4 runs three streams simultaneously

**Fixture:**
- `design/ux/inventory-screen.md` exists (APPROVED)
- Visual design spec complete
- Implementation complete
- `design/accessibility-requirements.md` committed tier: Enhanced

**Input:** `/team-ui inventory screen` (resuming from Phase 3 complete)

**Expected behavior:**
1. Phase 4 begins after implementation is confirmed complete
2. Three Task calls issued simultaneously: ux-designer, art-director, accessibility-specialist
3. Each stream operates independently:
   - ux-designer: verifies implementation matches wireframes, tests keyboard-only and gamepad-only navigation, checks accessibility features function
   - art-director: verifies visual consistency with art bible at minimum and maximum supported resolutions
   - accessibility-specialist: audits against the Enhanced accessibility tier in `design/accessibility-requirements.md`; any violation flagged as a blocker
4. Skill waits for all three results before proceeding to Phase 5
5. `AskUserQuestion` presents all three review results before Phase 5 begins

**Assertions:**
- [ ] All three Task calls issued before any result is awaited (parallel, not sequential)
- [ ] Phase 5 does NOT begin until all three Phase 4 agents have returned
- [ ] Accessibility-specialist explicitly reads `design/accessibility-requirements.md` for the committed tier
- [ ] Accessibility violations flagged as BLOCKING (not merely advisory)
- [ ] `AskUserQuestion` shows all three review streams' results together before Phase 5 approval
- [ ] No Phase 4 agent's output is used as input for another Phase 4 agent
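
One way to picture the "issued before any result is awaited" assertion is with Python's `asyncio`. This is a sketch under assumptions: the reviewers are modeled as hypothetical coroutine functions keyed by role name, and nothing here reflects how Task calls are actually issued by the skill.

```python
import asyncio


async def run_phase4(reviewers):
    """Start every reviewer before awaiting any result, then wait for all.

    `reviewers` maps a role name to a zero-argument coroutine function
    (a hypothetical stand-in for spawning a review agent).
    """
    # create_task schedules each coroutine right away, so no reviewer
    # waits for another one to finish before it starts.
    tasks = {name: asyncio.create_task(fn()) for name, fn in reviewers.items()}
    # gather returns only once all reviewers are done; results come back in
    # the same order as the tasks were passed in.
    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks.keys(), results))
```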

---

### Case 5: Missing Interaction Pattern Library — Skill notes the gap rather than inventing patterns

**Fixture:**
- `design/ux/interaction-patterns.md` does NOT exist
- All other required files present

**Input:** `/team-ui settings menu`

**Expected behavior:**
1. Phase 1a — orchestrator attempts to read `design/ux/interaction-patterns.md`; file not found
2. Skill surfaces the gap: "interaction-patterns.md does not exist — no existing patterns to reuse"
3. `AskUserQuestion` presented with options:
   - (a) Run `/ux-design patterns` first to establish the pattern library, then continue
   - (b) Proceed without the pattern library — ux-designer will document new patterns as they are created
4. Skill does NOT invent or assume patterns from other sources
5. If user chooses (b): ui-programmer is explicitly instructed to treat all patterns created as new and to add each to a new `design/ux/interaction-patterns.md` at completion
6. Final report notes that interaction-patterns.md was created (or is still absent if user skipped)

**Assertions:**
- [ ] Skill does NOT silently ignore the missing pattern library
- [ ] Skill does NOT invent patterns by guessing from the feature name or GDD alone
- [ ] `AskUserQuestion` offers a "create pattern library first" option (referencing `/ux-design patterns`)
- [ ] If user proceeds without the library, ui-programmer is told to treat all patterns as new
- [ ] Final report documents pattern library status (created / absent / updated)
- [ ] Skill does NOT fail entirely — the gap is noted and user is given a choice

---
## Protocol Compliance

- [ ] `AskUserQuestion` used at each phase transition — user approves before pipeline advances
- [ ] UX Review Gate (Phase 1c) is blocking — Phase 2 cannot begin without APPROVED or explicit user override
- [ ] All file writes delegated to sub-agents and sub-skills — orchestrator does not call Write or Edit directly
- [ ] Phase 4 agents launched in parallel per skill spec
- [ ] Error Recovery Protocol followed: surface → assess → offer options → partial report
- [ ] Partial report always produced even when agents are BLOCKED
- [ ] Verdict is one of COMPLETE / BLOCKED
- [ ] Next steps present at end: `/ux-review`, `/code-review`, `/team-polish`

---
## Coverage Notes

- The HUD-specific path (`/ux-design hud` + `hud-design.md` template + visual budget check in Phase 5)
  is not separately tested here; it shares the same phase structure but uses different templates.
- The "Update in place" path for interaction-patterns.md (new pattern added during implementation)
  is exercised implicitly in Case 1 Step 5 — a dedicated fixture with a known new pattern would
  strengthen coverage.
- Engine UI specialist unavailable (no engine configured) — skill spec states "skip if no engine
  configured"; this path is asserted in Case 1 but not given a dedicated fixture.
- The NEEDS REVISION acceptance-risk override (Case 2 option b) requires the override to be
  explicitly documented in the report; this is asserted but not further tested for downstream effects.
214
CCGS Skill Testing Framework/skills/utility/adopt.md
Normal file
@@ -0,0 +1,214 @@
# Skill Test Spec: /adopt

## Skill Summary

`/adopt` audits an existing project's artifacts — GDDs, ADRs, stories, infrastructure
files, and `technical-preferences.md` — for format compliance with the template's
skill pipeline. It classifies every gap by severity (BLOCKING / HIGH / MEDIUM / LOW),
composes a numbered, ordered migration plan, and writes it to `docs/adoption-plan-[date].md`
after explicit user approval via `AskUserQuestion`.

This skill is distinct from `/project-stage-detect` (which checks what exists).
`/adopt` checks whether what exists will actually work with the template's skills.

No director gates apply. The skill does NOT invoke any director agents.

---
## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains severity tier keywords: BLOCKING, HIGH, MEDIUM, LOW
- [ ] Contains "May I write" or `AskUserQuestion` language before writing the adoption plan
- [ ] Has a next-step handoff at the end (e.g., offering to fix the highest-priority gap immediately)
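
The first three structural checks above lend themselves to a small script. The following is only a sketch of what such a check might look like, not the actual `/skill-test static` implementation: the function name `check_static`, the assumption that frontmatter is a leading block fenced by `---` lines, and the rule that a phase heading is any Markdown heading starting with "Phase" are all illustrative assumptions. It targets Python 3.9 or later.

```python
import re

REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def check_static(skill_text: str, keywords: list[str]) -> dict[str, bool]:
    """Run hypothetical structural checks on the raw text of a skill file."""
    # Assumption: frontmatter is a leading block delimited by '---' lines.
    match = re.match(r"\A---\n(.*?)\n---\n", skill_text, re.DOTALL)
    frontmatter = match.group(1) if match else ""
    # Collect the key part of every "key: value" line in the frontmatter.
    keys = {
        line.split(":", 1)[0].strip()
        for line in frontmatter.splitlines()
        if ":" in line
    }
    # Assumption: a phase heading is any heading whose title starts with "Phase".
    phase_headings = re.findall(r"^#{1,6}\s+Phase\b", skill_text, re.MULTILINE)
    return {
        "frontmatter": all(field in keys for field in REQUIRED_FIELDS),
        "phases": len(phase_headings) >= 2,
        "keywords": all(keyword in skill_text for keyword in keywords),
    }
```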

---

## Director Gate Checks

None. `/adopt` is a brownfield audit utility. No director gates apply.

---
## Test Cases

### Case 1: Happy Path — All GDDs compliant, no gaps, COMPLIANT

**Fixture:**
- `design/gdd/` contains 3 GDD files; each has all 8 required sections with content
- `docs/architecture/adr-0001.md` exists with `## Status`, `## Engine Compatibility`,
  and all other required sections
- `production/stage.txt` exists
- `docs/architecture/tr-registry.yaml` and `docs/architecture/control-manifest.md` exist
- Engine configured in `technical-preferences.md`

**Input:** `/adopt`

**Expected behavior:**
1. Skill emits "Scanning project artifacts..." then reads all artifacts silently
2. Reports detected phase, GDD count, ADR count, story count
3. Phase 2 audit: all 3 GDDs have all 8 sections, Status field present and valid
4. ADR audit: all required sections present
5. Infrastructure audit: all critical files exist
6. Phase 3: zero BLOCKING, zero HIGH, zero MEDIUM, zero LOW gaps
7. Summary reports: "No blocking gaps — this project is template-compatible"
8. Uses `AskUserQuestion` to ask about writing the plan; user selects write
9. Adoption plan is written to `docs/adoption-plan-[date].md`
10. Phase 7 offers next action: no blocking gaps, offers options for next steps

**Assertions:**
- [ ] Skill reads silently before presenting any output
- [ ] "Scanning project artifacts..." appears before the silent read phase
- [ ] Gap counts show 0 BLOCKING, 0 HIGH, 0 MEDIUM (or only LOW)
- [ ] `AskUserQuestion` is used before writing the adoption plan
- [ ] Adoption plan file is written to `docs/adoption-plan-[date].md`
- [ ] Phase 7 offers a specific next action (not just a list)

---
### Case 2: Non-Compliant Documents — GDDs missing sections, NEEDS MIGRATION

**Fixture:**
- `design/gdd/` contains 2 GDD files:
  - `combat.md` — missing `## Acceptance Criteria` and `## Formulas` sections
  - `movement.md` — all 8 sections present
- One ADR (`adr-0001.md`) is missing `## Status` section
- `docs/architecture/tr-registry.yaml` does not exist

**Input:** `/adopt`

**Expected behavior:**
1. Skill scans all artifacts
2. Phase 2 audit finds:
   - `combat.md`: 2 missing sections (Acceptance Criteria, Formulas)
   - `adr-0001.md`: missing `## Status` — BLOCKING impact
   - `tr-registry.yaml`: missing — HIGH impact
3. Phase 3 classifies:
   - BLOCKING: `adr-0001.md` missing `## Status` (story-readiness silently passes)
   - HIGH: `tr-registry.yaml` missing; `combat.md` missing Acceptance Criteria (can't generate stories)
   - MEDIUM: `combat.md` missing Formulas
4. Phase 4 builds ordered migration plan:
   - Step 1 (BLOCKING): Add `## Status` to `adr-0001.md` — command: `/architecture-decision retrofit`
   - Step 2 (HIGH): Run `/architecture-review` to bootstrap tr-registry.yaml
   - Step 3 (HIGH): Add Acceptance Criteria to `combat.md` — command: `/design-system retrofit`
   - Step 4 (MEDIUM): Add Formulas to `combat.md`
5. Gap Preview shows BLOCKING items as bullets (actual file names), HIGH/MEDIUM as counts
6. `AskUserQuestion` asks to write the plan; writes after approval
7. Phase 7 offers to fix the highest-priority gap (ADR Status) immediately

**Assertions:**
- [ ] BLOCKING gaps are listed as explicit file-name bullets in the Gap Preview
- [ ] HIGH and MEDIUM shown as counts in Gap Preview
- [ ] Migration plan items are in BLOCKING-first order
- [ ] Each plan item includes the fix command or manual steps
- [ ] `AskUserQuestion` is used before writing
- [ ] Phase 7 offers to immediately retrofit the first BLOCKING item
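
The ordering and Gap Preview rules in Case 2 can be illustrated with a short sketch. The names `order_plan` and `gap_preview`, the `(severity, description)` tuple shape, and the exact preview wording are assumptions made for this example; the real skill may represent and print gaps differently.

```python
from collections import Counter

# Lower rank sorts first, so BLOCKING items lead the migration plan.
SEVERITY_ORDER = {"BLOCKING": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def order_plan(gaps):
    """Sort (severity, description) pairs BLOCKING-first.

    sorted() is stable, so gaps of equal severity keep their input order.
    """
    return sorted(gaps, key=lambda gap: SEVERITY_ORDER[gap[0]])


def gap_preview(gaps):
    """List BLOCKING gaps individually; summarize the other tiers as counts."""
    lines = [f"- BLOCKING: {desc}" for sev, desc in gaps if sev == "BLOCKING"]
    counts = Counter(sev for sev, _ in gaps if sev != "BLOCKING")
    for sev in ("HIGH", "MEDIUM", "LOW"):
        if counts[sev]:
            lines.append(f"- {sev}: {counts[sev]}")
    return lines
```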

---

### Case 3: Mixed State — Some docs compliant, some not, partial report

**Fixture:**
- 4 GDD files: 2 fully compliant, 2 with gaps (one missing Tuning Knobs, one missing Edge Cases)
- ADRs: 3 files — 2 compliant, 1 missing `## ADR Dependencies`
- Stories: 5 files — 3 have TR-ID references, 2 do not
- Infrastructure: all critical files present; `technical-preferences.md` fully configured

**Input:** `/adopt`

**Expected behavior:**
1. Skill audits all artifact types
2. Audit summary shows totals: "4 GDDs (2 fully compliant, 2 with gaps); 3 ADRs
   (2 fully compliant, 1 with gaps); 5 stories (3 with TR-IDs, 2 without)"
3. Gap classification:
   - No BLOCKING gaps
   - HIGH: 1 ADR missing `## ADR Dependencies`
   - MEDIUM: 2 GDDs with missing sections; 2 stories missing TR-IDs
   - LOW: none
4. Migration plan lists HIGH gap first, then MEDIUM gaps in order
5. Note included: "Existing stories continue to work — do not regenerate stories
   that are in progress or done"
6. `AskUserQuestion` to write plan; writes after approval

**Assertions:**
- [ ] Per-artifact compliance tallies are shown (N compliant, M with gaps)
- [ ] Existing story compatibility note is included in the plan
- [ ] No BLOCKING gaps results in no BLOCKING section in migration plan
- [ ] HIGH gap precedes MEDIUM gaps in plan ordering
- [ ] `AskUserQuestion` is used before writing

---
### Case 4: No Artifacts Found — Fresh project, guidance to run /start

**Fixture:**
- Repository has no files in `design/gdd/`, `docs/architecture/`, `production/epics/`
- `production/stage.txt` does not exist
- `src/` directory does not exist or has fewer than 10 files
- No game-concept.md, no systems-index.md

**Input:** `/adopt`

**Expected behavior:**
1. Phase 1 existence check finds no artifacts
2. Skill infers "Fresh" — no brownfield work to migrate
3. Uses `AskUserQuestion`:
   - "This looks like a fresh project — no existing artifacts found. `/adopt` is for
     projects with work to migrate. What would you like to do?"
   - Options: "Run `/start`", "My artifacts are in a non-standard location", "Cancel"
4. Skill stops — does not proceed to audit regardless of user selection

**Assertions:**
- [ ] `AskUserQuestion` is used (not a plain text message) when no artifacts are found
- [ ] `/start` is presented as a named option
- [ ] Skill stops after the question — no audit phases run
- [ ] No adoption plan file is written

---
### Case 5: Director Gate Check — No gate; adopt is a utility audit skill

**Fixture:**
- Project with a mix of compliant and non-compliant GDDs

**Input:** `/adopt`

**Expected behavior:**
1. Skill completes full audit and produces migration plan
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked during the skill run

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches plan-writing or cancellation without any gate verdict

---
## Protocol Compliance

- [ ] Emits "Scanning project artifacts..." before silent read phase
- [ ] Reads all artifacts silently before presenting any results
- [ ] Shows Adoption Audit Summary and Gap Preview before asking to write
- [ ] Uses `AskUserQuestion` before writing the adoption plan file
- [ ] Adoption plan written to `docs/adoption-plan-[date].md` — not to any other path
- [ ] Migration plan items ordered: BLOCKING first, HIGH second, MEDIUM third, LOW last
- [ ] Phase 7 always offers a single specific next action (not a generic list)
- [ ] Never regenerates existing artifacts — only fills gaps in what exists
- [ ] Does not invoke director gates at any point

---
## Coverage Notes

- The `gdds`, `adrs`, `stories`, and `infra` argument modes narrow the audit scope;
  each follows the same pattern as the full audit but limited to that artifact type.
  Not separately fixture-tested here.
- The systems-index.md parenthetical status value check (BLOCKING) is a special case
  that triggers an immediate fix offer before writing the plan; not separately tested.
- The review-mode.txt prompt (Phase 6b) runs after plan writing if `production/review-mode.txt`
  does not exist; not separately tested here.
179
CCGS Skill Testing Framework/skills/utility/asset-spec.md
Normal file
@@ -0,0 +1,179 @@
# Skill Test Spec: /asset-spec

## Skill Summary

`/asset-spec` generates per-asset visual specification documents from design
requirements. It reads the relevant GDD, art bible, and design system to produce
a structured asset spec sheet that defines: dimensions, animation states (if
applicable), color palette reference, style notes, technical constraints
(format, file size budget), and deliverable checklist.

Spec sheets are written to `assets/specs/[asset-name]-spec.md` after a "May I write"
ask. If a spec already exists, the skill offers to update it. When multiple assets
are requested in a single invocation, a "May I write" ask is made per asset. No
director gates apply. The verdict is COMPLETE when all requested specs are written.

---
## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language (per asset)
- [ ] Has a next-step handoff (e.g., assign to an artist, or `/asset-audit` later)

---

## Director Gate Checks

None. `/asset-spec` is a design documentation utility. Technical artists may
review specs separately but this is not a gate within this skill.

---
## Test Cases

### Case 1: Happy Path — Enemy sprite spec with full GDD and art bible

**Fixture:**
- `design/gdd/enemies.md` exists with enemy variants defined
- `design/art-bible.md` exists with color palette and style notes
- No existing asset spec for "goblin-enemy"

**Input:** `/asset-spec goblin-enemy`

**Expected behavior:**
1. Skill reads enemies GDD and art bible
2. Skill generates a spec for the goblin enemy sprite:
   - Dimensions: inferred from engine defaults or explicitly from GDD
   - Animation states: idle, walk, attack, hurt, death
   - Color palette reference: links to art-bible palette section
   - Style notes: from art bible character design rules
   - Technical constraints: format (PNG), size budget
   - Deliverable checklist
3. Skill asks "May I write to `assets/specs/goblin-enemy-spec.md`?"
4. File written on approval; verdict is COMPLETE

**Assertions:**
- [ ] All 6 spec components are present (dimensions, animations, palette, style, tech, checklist)
- [ ] Color palette reference links to art bible (not duplicated)
- [ ] Animation states are drawn from GDD (not invented)
- [ ] "May I write" is asked with the correct path
- [ ] Verdict is COMPLETE

---
### Case 2: No Art Bible Found — Spec with Placeholder Style Notes, Dependency Flagged

**Fixture:**
- `design/gdd/player.md` exists
- `design/art-bible.md` does NOT exist

**Input:** `/asset-spec player-sprite`

**Expected behavior:**
1. Skill reads player GDD but cannot find the art bible
2. Skill generates spec with placeholder style notes: "DEPENDENCY GAP: art bible
   not found — style notes are placeholders"
3. Color palette section uses: "TBD — see art bible when created"
4. Skill asks "May I write to `assets/specs/player-sprite-spec.md`?"
5. File written with placeholders and dependency flag; verdict is COMPLETE with advisory

**Assertions:**
- [ ] DEPENDENCY GAP is flagged for the missing art bible
- [ ] Spec is still generated (not blocked)
- [ ] Style notes contain placeholder markers, not invented styles
- [ ] Verdict is COMPLETE with advisory note

---
### Case 3: Asset Spec Already Exists — Offers to Update

**Fixture:**
- `assets/specs/goblin-enemy-spec.md` already exists
- GDD has been updated since the spec was written (new attack animation added)

**Input:** `/asset-spec goblin-enemy`

**Expected behavior:**
1. Skill detects existing spec file
2. Skill reports: "Asset spec already exists for goblin-enemy — checking for updates"
3. Skill diffs GDD against existing spec and identifies: new "charge-attack" animation
   state added in GDD but not in spec
4. Skill presents the diff: "1 new animation state found — offering to update spec"
5. Skill asks "May I update `assets/specs/goblin-enemy-spec.md`?" (not overwrite)
6. Spec is updated; verdict is COMPLETE

**Assertions:**
- [ ] Existing spec is detected and "update" path is offered
- [ ] Diff between GDD and existing spec is shown
- [ ] "May I update" language is used (not "May I write")
- [ ] Existing spec content is preserved; only the diff is applied
- [ ] Verdict is COMPLETE
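
The update path in Case 3 amounts to a difference that leaves the existing spec intact. A minimal sketch, assuming animation states are available as plain lists of names; `missing_states` and `updated_states` are hypothetical helpers, not part of the skill:

```python
def missing_states(gdd_states, spec_states):
    """Return GDD animation states absent from the spec, in GDD order."""
    known = set(spec_states)
    return [state for state in gdd_states if state not in known]


def updated_states(gdd_states, spec_states):
    """Keep every existing spec entry and append only the missing states."""
    return list(spec_states) + missing_states(gdd_states, spec_states)
```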

---

### Case 4: Multiple Assets Requested — May-I-Write Per Asset

**Fixture:**
- GDD and art bible exist
- User requests specs for 3 assets: goblin-enemy, orc-enemy, treasure-chest

**Input:** `/asset-spec goblin-enemy orc-enemy treasure-chest`

**Expected behavior:**
1. Skill generates all 3 specs in sequence
2. For each asset, skill shows the draft and asks "May I write to
   `assets/specs/[name]-spec.md`?" individually
3. User can approve all 3 or skip individual assets
4. All approved specs are written; verdict is COMPLETE

**Assertions:**
- [ ] "May I write" is asked 3 times (once per asset), not once for all
- [ ] User can decline one asset without blocking the others
- [ ] A spec file is written for each approved asset (all 3 when none is declined)
- [ ] Verdict is COMPLETE when all approved specs are written
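
The per-asset confirmation in Case 4 can be sketched as follows. The helper names and the `ask` callback (standing in for the "May I write" prompt) are assumptions for illustration; only the path pattern `assets/specs/[asset-name]-spec.md` comes from the spec.

```python
def spec_path(asset_name):
    """Build the spec path for one asset, following the documented pattern."""
    return f"assets/specs/{asset_name}-spec.md"


def confirm_writes(asset_names, ask):
    """Ask once per asset and return the paths the user approved.

    `ask` is a hypothetical callback that takes the question text and
    returns True (approve) or False (skip this asset).
    """
    approved = []
    for name in asset_names:
        path = spec_path(name)
        # One question per asset: declining one does not block the others.
        if ask(f"May I write to `{path}`?"):
            approved.append(path)
    return approved
```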

---

### Case 5: Director Gate Check — No gate; asset-spec is a design utility

**Fixture:**
- GDD and art bible exist

**Input:** `/asset-spec goblin-enemy`

**Expected behavior:**
1. Skill generates and writes the asset spec
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check

---
## Protocol Compliance

- [ ] Reads GDD, art bible, and design system before generating spec
- [ ] Includes all 6 spec components (dimensions, animations, palette, style, tech, checklist)
- [ ] Flags missing dependencies (art bible, GDD) with DEPENDENCY GAP notes
- [ ] Asks "May I write" (or "May I update") per asset
- [ ] Handles multiple assets with individual write confirmations
- [ ] Verdict is COMPLETE when all approved specs are written

---

## Coverage Notes

- Audio asset specs (sound effects, music) follow the same structure with
  different fields (duration, sample rate, looping) and are not separately tested.
- UI asset specs (icons, button states) follow the same flow with interaction
  state requirements aligned to the UX spec.
- The case where GDD is also missing (neither GDD nor art bible exists) is not
  separately tested; spec would be generated with both dependency gaps flagged.
189
CCGS Skill Testing Framework/skills/utility/brainstorm.md
Normal file
@@ -0,0 +1,189 @@
# Skill Test Spec: /brainstorm

## Skill Summary

`/brainstorm` facilitates guided game concept ideation. It presents 2-4 concept
options with pros/cons, lets the user choose and refine a concept, and produces
a structured `design/gdd/game-concept.md` document. The skill is collaborative —
it asks questions before proposing options and iterates until the user approves
a concept direction.

In `full` review mode, four director gates spawn in parallel after the concept
is drafted: CD-PILLARS (creative-director), AD-CONCEPT-VISUAL (art-director),
TD-FEASIBILITY (technical-director), and PR-SCOPE (producer). In `lean` mode,
all 4 inline gates are skipped (lean mode only runs PHASE-GATEs, and brainstorm
has none). In `solo` mode, all gates are skipped. The skill asks "May I write"
before writing `design/gdd/game-concept.md`.

---
## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: APPROVED, REJECTED, CONCERNS
- [ ] Contains "May I write" collaborative protocol language (for game-concept.md)
- [ ] Has a next-step handoff at the end (`/map-systems`)
- [ ] Documents 4 director gates in full mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
- [ ] Documents that all 4 gates are skipped in lean and solo modes

---
## Director Gate Checks

In `full` mode: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, and PR-SCOPE
spawn in parallel after the concept draft is approved by the user.

In `lean` mode: all 4 inline gates are skipped (brainstorm has no PHASE-GATEs,
so lean mode skips everything). Output notes all 4 as: "[GATE-ID] skipped — lean mode".

In `solo` mode: all 4 gates are skipped. Output notes all 4 as: "[GATE-ID] skipped — solo mode".
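
The mode rules above can be summarized in a small lookup. This is an illustrative sketch only: `plan_gates` is a hypothetical helper, and raising on an unknown mode is an assumption, since the spec does not say how other values are handled. The skip-note text mirrors the wording quoted above.

```python
GATES = ["CD-PILLARS", "AD-CONCEPT-VISUAL", "TD-FEASIBILITY", "PR-SCOPE"]


def plan_gates(review_mode):
    """Return (gates_to_spawn, skip_notes) for a given review mode."""
    if review_mode == "full":
        # All four gates spawn; nothing is skipped.
        return list(GATES), []
    if review_mode in ("lean", "solo"):
        # No gates spawn; each one gets an explicit skip note in the output.
        return [], [f"{gate} skipped — {review_mode} mode" for gate in GATES]
    # Assumption: unknown modes are treated as an error in this sketch.
    raise ValueError(f"unknown review mode: {review_mode!r}")
```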
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Full mode, 3 concepts, user picks one, all 4 directors approve

**Fixture:**
- No existing `design/gdd/game-concept.md`
- `production/session-state/review-mode.txt` contains `full`

**Input:** `/brainstorm`

**Expected behavior:**
1. Skill asks the user questions about genre, scope, and target feeling
2. Skill presents 3 concept options with pros/cons each
3. User selects one concept
4. Skill elaborates the chosen concept into a structured draft
5. All 4 director gates spawn in parallel: CD-PILLARS, AD-CONCEPT-VISUAL, TD-FEASIBILITY, PR-SCOPE
6. All 4 return APPROVED
7. Skill asks "May I write `design/gdd/game-concept.md`?"
8. Concept written after approval

**Assertions:**
- [ ] Exactly 3 concept options are presented (not 1, not 5+)
- [ ] All 4 director gates spawn in parallel (not sequentially)
- [ ] All 4 gates complete before the "May I write" ask
- [ ] "May I write `design/gdd/game-concept.md`?" is asked before writing
- [ ] Concept file is NOT written without user approval
- [ ] Next-step handoff to `/map-systems` is present

---

### Case 2: Failure Path — CD-PILLARS returns REJECT

**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- CD-PILLARS gate returns REJECT: "The concept has no identifiable creative pillar"

**Input:** `/brainstorm`

**Expected behavior:**
1. CD-PILLARS gate returns REJECT with specific feedback
2. Skill surfaces the rejection to the user
3. Concept is NOT written to file
4. User is asked: rethink the concept direction, or override the rejection
5. If rethinking: skill returns to the concept options phase

**Assertions:**
- [ ] Concept is NOT written when CD-PILLARS returns REJECT
- [ ] Rejection feedback is shown to the user verbatim
- [ ] User is given the option to rethink or override
- [ ] Skill returns to concept ideation phase if user chooses to rethink

---

### Case 3: Lean Mode — All 4 gates skipped; concept written after user confirms

**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `lean`

**Input:** `/brainstorm`

**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept is elaborated into a structured draft
3. All 4 director gates are skipped — each noted: "[GATE-ID] skipped — lean mode"
4. Skill asks user to confirm the concept is ready to write
5. "May I write `design/gdd/game-concept.md`?" asked after confirmation
6. Concept written after approval

**Assertions:**
- [ ] All 4 gate skip notes appear: "CD-PILLARS skipped — lean mode", "AD-CONCEPT-VISUAL skipped — lean mode", "TD-FEASIBILITY skipped — lean mode", "PR-SCOPE skipped — lean mode"
- [ ] Concept is written after user confirmation only (no director approval needed in lean)
- [ ] "May I write" is still asked before writing

---

### Case 4: Solo Mode — All gates skipped; concept written with only user approval

**Fixture:**
- No existing game concept
- `production/session-state/review-mode.txt` contains `solo`

**Input:** `/brainstorm`

**Expected behavior:**
1. Concept options are presented and user selects one
2. Concept draft is shown to user
3. All 4 director gates are skipped — each noted with "solo mode"
4. "May I write `design/gdd/game-concept.md`?" asked
5. Concept written after user approval

**Assertions:**
- [ ] All 4 skip notes appear with "solo mode" label
- [ ] No director agents are spawned
- [ ] Concept is written with only user approval
- [ ] Behavior is otherwise equivalent to lean mode for this skill
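
Cases 1, 3, and 4 together describe how the review mode decides which gates run and which skip notes appear. The sketch below captures that decision under stated assumptions: `plan_gates` is a hypothetical helper name, and the real skill is driven by its prompt text rather than Python. The skip-note format is taken from the spec (`[GATE-ID] skipped — lean mode`).

```python
BRAINSTORM_GATES = ["CD-PILLARS", "AD-CONCEPT-VISUAL", "TD-FEASIBILITY", "PR-SCOPE"]


def plan_gates(review_mode):
    """Return (gates_to_spawn, skip_notes) for the contents of review-mode.txt."""
    mode = review_mode.strip().lower()
    if mode == "full":
        # Full mode: all four gates run, nothing is skipped.
        return list(BRAINSTORM_GATES), []
    if mode in ("lean", "solo"):
        # Lean and solo behave the same for this skill: every gate is skipped by name.
        notes = [f"{gate} skipped — {mode} mode" for gate in BRAINSTORM_GATES]
        return [], notes
    raise ValueError(f"unknown review mode: {review_mode!r}")
```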

---

### Case 5: Director Gate — PR-SCOPE returns CONCERNS (scope too large)

**Fixture:**
- Concept draft is complete
- `production/session-state/review-mode.txt` contains `full`
- PR-SCOPE gate returns CONCERNS: "The concept scope would require 18+ months for a solo developer"

**Input:** `/brainstorm`

**Expected behavior:**
1. PR-SCOPE gate returns CONCERNS with specific scope feedback
2. Skill surfaces the scope concerns to the user
3. Scope concerns are documented in the concept draft before writing
4. User is asked: reduce scope, accept concerns and document them, or rethink
5. If concerns are accepted: concept is written with a "Scope Risk" note embedded

**Assertions:**
- [ ] PR-SCOPE concerns are shown to the user before the "May I write" ask
- [ ] Skill does NOT write concept without surfacing scope concerns
- [ ] If user accepts: scope concerns are documented in the concept file
- [ ] Skill does NOT auto-reject a concept due to PR-SCOPE CONCERNS (user decides)

---

## Protocol Compliance

- [ ] Presents 2-4 concept options with pros/cons before user commits
- [ ] User confirms concept direction before director gates are invoked
- [ ] All 4 director gates spawn in parallel in full mode
- [ ] All 4 gates skipped in lean AND solo mode — each noted by name
- [ ] "May I write `design/gdd/game-concept.md`?" asked before writing
- [ ] Ends with next-step handoff: `/map-systems`

---

## Coverage Notes

- AD-CONCEPT-VISUAL gate (art director feasibility) is grouped with the other
  3 gates in the parallel spawn — not independently fixture-tested.
- The iterative concept refinement loop (user rejects all options, skill
  generates new ones) is not fixture-tested — it follows the same pattern as
  the option selection phase.
- The game-concept.md document structure (required sections) is defined in the
  skill body and not re-enumerated in test assertions.

174
CCGS Skill Testing Framework/skills/utility/bug-report.md
Normal file
@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-report

## Skill Summary

`/bug-report` creates a structured bug report document from a user description.
It produces a report with the following required fields: Title, Repro Steps,
Expected Behavior, Actual Behavior, Severity (CRITICAL/HIGH/MEDIUM/LOW), Affected
System(s), and Build/Version. If the user's initial description is missing any
required field, the skill asks follow-up questions to fill the gaps before
producing the draft.

The skill checks for possible duplicate reports (by comparing to existing files
in `production/bugs/`) and offers to link rather than create a new report. Each
report is written to `production/bugs/bug-[date]-[slug].md` after a "May I write"
ask. No director gates are used — bug reporting is an operational utility.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-triage` to reprioritize, `/hotfix` for critical)

---

## Director Gate Checks

None. `/bug-report` is an operational documentation skill. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — User describes a crash, full report produced

**Fixture:**
- `production/bugs/` directory exists and is empty
- No similar existing reports

**Input:** `/bug-report` (user describes: "Game crashes when player enters the boss arena")

**Expected behavior:**
1. Skill extracts: Title = "Game crashes when entering boss arena"
2. Skill recognizes crash reports as CRITICAL severity
3. Skill confirms repro steps, expected (no crash), actual (crash), affected system
   (arena/boss), and build version with the user
4. Skill drafts the full structured report
5. Skill asks "May I write to `production/bugs/bug-2026-04-06-game-crashes-boss-arena.md`?"
6. File is written on approval; verdict is COMPLETE

**Assertions:**
- [ ] All 7 required fields are present in the report
- [ ] Severity is CRITICAL for a crash report
- [ ] Filename follows the `bug-[date]-[slug].md` convention
- [ ] "May I write" is asked with the full file path
- [ ] Verdict is COMPLETE
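
The filename assertion above can be checked mechanically. The following is a minimal sketch, assuming `[date]` means an ISO `YYYY-MM-DD` date (as in the example path) and `[slug]` means lowercase words joined by hyphens. The spec itself leaves slug generation as an implementation detail, so the pattern and the helper name `parse_bug_filename` are illustrative assumptions.

```python
import re
from datetime import date

# Assumed shape: bug-YYYY-MM-DD-lowercase-hyphenated-slug.md
BUG_FILENAME = re.compile(r"^bug-(\d{4}-\d{2}-\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def parse_bug_filename(filename):
    """Return (date, slug) if the filename follows the convention, else None."""
    match = BUG_FILENAME.match(filename)
    if match is None:
        return None
    try:
        reported = date.fromisoformat(match.group(1))
    except ValueError:
        # The digits matched the pattern but are not a real calendar date.
        return None
    return reported, match.group(2)
```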

---

### Case 2: Minimal Input — Skill asks follow-up questions for missing fields

**Fixture:**
- User provides: "Sometimes the audio cuts out"
- No existing reports

**Input:** `/bug-report`

**Expected behavior:**
1. Skill identifies missing required fields: repro steps, expected vs. actual,
   severity, affected system, build
2. Skill asks targeted follow-up questions for each missing field (one at a time
   or in a structured prompt)
3. User provides answers
4. Skill compiles complete report from answers
5. Skill asks "May I write?" and writes on approval

**Assertions:**
- [ ] At least 3 follow-up questions are asked to fill missing fields
- [ ] Each required field is filled before the report is finalized
- [ ] Report is not written until all required fields are present
- [ ] Verdict is COMPLETE after all fields are filled and file is written
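
The "all required fields present" gate in this case amounts to a completeness check over the seven fields named in the Skill Summary. A hedged sketch follows: the dict representation of a draft report and the helper name `missing_fields` are assumptions, not part of the skill.

```python
REQUIRED_FIELDS = [
    "Title", "Repro Steps", "Expected Behavior", "Actual Behavior",
    "Severity", "Affected System(s)", "Build/Version",
]
SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}


def missing_fields(report):
    """List required fields that are absent, blank, or invalid in a draft report dict."""
    missing = [field for field in REQUIRED_FIELDS if not str(report.get(field, "")).strip()]
    severity = str(report.get("Severity", "")).strip()
    if severity and severity not in SEVERITIES:
        # Present but not one of the four allowed values, so it still needs a follow-up.
        missing.append("Severity")
    return missing
```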

---

### Case 3: Possible Duplicate — Offers to link rather than create new

**Fixture:**
- `production/bugs/bug-2026-03-20-audio-cut-out.md` already exists with
  similar title and MEDIUM severity

**Input:** `/bug-report` (user describes: "Audio randomly stops working")

**Expected behavior:**
1. Skill scans existing reports and finds the similar audio bug
2. Skill reports: "A similar bug report exists: bug-2026-03-20-audio-cut-out.md"
3. Skill presents options: link as duplicate (add note to existing), create new anyway
4. If user chooses link: skill adds a cross-reference note to the existing file
   (asks "May I update the existing report?")
5. If user chooses create new: normal report creation proceeds

**Assertions:**
- [ ] Existing similar report is surfaced before creating a new one
- [ ] User is given the choice (not forced to link or create)
- [ ] If linking: "May I update" is asked before modifying the existing file
- [ ] Verdict is COMPLETE in either path

---

### Case 4: Multi-System Bug — Report created with multiple system tags

**Fixture:**
- No existing reports

**Input:** `/bug-report` (user describes: "After finishing a level, the save system
freezes and the UI doesn't show the completion screen")

**Expected behavior:**
1. Skill identifies 2 affected systems from the description: Save System and UI
2. Report is drafted with both systems listed under Affected System(s)
3. Severity is assessed (likely HIGH — data loss risk from save freeze)
4. Skill asks "May I write" with the appropriate filename
5. Report is written with both systems tagged; verdict is COMPLETE

**Assertions:**
- [ ] Both affected systems are listed in the report
- [ ] Single report is created (not one per system)
- [ ] Severity reflects the most impactful component (save freeze → HIGH or CRITICAL)
- [ ] Verdict is COMPLETE

---

### Case 5: Director Gate Check — No gate; bug reporting is operational

**Fixture:**
- Any bug description provided

**Input:** `/bug-report`

**Expected behavior:**
1. Skill creates and writes the bug report
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Collects all 7 required fields before drafting the report
- [ ] Asks follow-up questions for any missing required fields
- [ ] Checks for similar existing reports before creating a new one
- [ ] Asks "May I write to `production/bugs/bug-[date]-[slug].md`?" before writing
- [ ] Verdict is COMPLETE when the report file is written

---

## Coverage Notes

- The case where the user provides a severity that seems too low for the
  described impact (e.g., LOW for a crash) is not tested; the skill may suggest
  a higher severity but ultimately respects user input.
- Build/version field is required but may be "unknown" if the user doesn't know —
  this is accepted as a valid value and not tested separately.
- Report slug generation (sanitizing the title into a filename) is an
  implementation detail not assertion-tested here.

174
CCGS Skill Testing Framework/skills/utility/bug-triage.md
Normal file
@@ -0,0 +1,174 @@
# Skill Test Spec: /bug-triage

## Skill Summary

`/bug-triage` reads all open bug reports in `production/bugs/` and produces a
prioritized triage table sorted by severity (CRITICAL → HIGH → MEDIUM → LOW).
It runs on the Haiku model (read-only, formatting/sorting task) and produces no
file writes — the triage output is conversational. The skill flags bugs missing
reproduction steps and identifies possible duplicates by comparing titles and
affected systems.

The verdict is always TRIAGED — the skill is advisory and informational. No
director gates apply. The output is intended to help a producer or QA lead
prioritize which bugs to address next.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: TRIAGED
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (e.g., `/bug-report` to create new reports, `/hotfix` for critical bugs)

---

## Director Gate Checks

None. `/bug-triage` is a read-only advisory skill. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — 5 bugs of varying severity, sorted table produced

**Fixture:**
- `production/bugs/` contains 5 bug report files:
  - bug-2026-03-10-audio-crash.md (CRITICAL)
  - bug-2026-03-12-score-overflow.md (HIGH)
  - bug-2026-03-14-ui-overlap.md (MEDIUM)
  - bug-2026-03-15-typo-tutorial.md (LOW)
  - bug-2026-03-16-vfx-flicker.md (HIGH)

**Input:** `/bug-triage`

**Expected behavior:**
1. Skill reads all 5 bug report files
2. Skill extracts severity, title, system, and repro status from each
3. Skill produces a triage table sorted: CRITICAL first, then HIGH, MEDIUM, LOW
4. Within the same severity, bugs are ordered by date (oldest first)
5. Verdict is TRIAGED

**Assertions:**
- [ ] Triage table has exactly 5 rows
- [ ] CRITICAL bug appears before both HIGH bugs
- [ ] HIGH bugs appear before MEDIUM and LOW bugs
- [ ] Verdict is TRIAGED
- [ ] No files are written
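
The ordering rule in this case (severity first, then oldest date first) can be expressed as a two-part sort key. This is an illustrative sketch only: `triage_order` is a hypothetical helper, and it assumes the date can be read from the `bug-YYYY-MM-DD-slug.md` filename rather than from the report body. Unrecognized severities sort last, in line with the `UNKNOWN SEVERITY` behavior described under Coverage Notes.

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def triage_order(bugs):
    """Sort (filename, severity) pairs by severity, then by the date in the filename."""
    def key(bug):
        filename, severity = bug
        # Characters 4..13 of 'bug-YYYY-MM-DD-slug.md' hold the ISO date,
        # which sorts chronologically as a plain string.
        return (SEVERITY_RANK.get(severity, len(SEVERITY_RANK)), filename[4:14])
    return sorted(bugs, key=key)
```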

---

### Case 2: No Bug Reports Found — Guidance to run /bug-report

**Fixture:**
- `production/bugs/` directory is empty or does not exist

**Input:** `/bug-triage`

**Expected behavior:**
1. Skill scans `production/bugs/` and finds no reports
2. Skill outputs: "No open bug reports found in production/bugs/"
3. Skill suggests running `/bug-report` to create a bug report
4. No triage table is produced

**Assertions:**
- [ ] Output explicitly states no bugs were found
- [ ] `/bug-report` is suggested as the next step
- [ ] Skill does not error out — it handles empty directory gracefully
- [ ] Verdict is TRIAGED (with "no bugs found" context)

---

### Case 3: Bug Missing Reproduction Steps — Flagged as NEEDS REPRO INFO

**Fixture:**
- `production/bugs/` contains 3 bug reports; one has an empty "Repro Steps" section

**Input:** `/bug-triage`

**Expected behavior:**
1. Skill reads all 3 reports
2. Skill detects the report with no repro steps
3. That bug appears in the triage table with a `NEEDS REPRO INFO` tag
4. Other bugs are triaged normally
5. Verdict is TRIAGED

**Assertions:**
- [ ] `NEEDS REPRO INFO` tag appears next to the bug missing repro steps
- [ ] The flagged bug is still included in the table (not excluded)
- [ ] Other bugs are unaffected
- [ ] Verdict is TRIAGED

---

### Case 4: Possible Duplicate Bugs — Flagged in triage output

**Fixture:**
- `production/bugs/` contains 2 bug reports with similar titles:
  - bug-2026-03-18-player-fall-through-floor.md
  - bug-2026-03-20-player-clips-through-floor.md
- Both affect the "Physics" system with identical severity

**Input:** `/bug-triage`

**Expected behavior:**
1. Skill reads both reports and detects similar title + same system + same severity
2. Both bugs are included in the triage table
3. Each is tagged with `POSSIBLE DUPLICATE` and cross-references the other report
4. No bugs are merged or deleted — flagging is advisory
5. Verdict is TRIAGED

**Assertions:**
- [ ] Both bugs appear in the table (not merged)
- [ ] Both are tagged `POSSIBLE DUPLICATE`
- [ ] Each cross-references the other (by filename or title)
- [ ] Verdict is TRIAGED
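
One way to picture the "similar title + same system" heuristic is a word-overlap score. The sketch below is not the skill's actual matching logic, which the spec says lives in the skill body. It swaps in a Jaccard similarity over title words taken from the filename slug, with an arbitrary 0.5 threshold, and the helper names are hypothetical.

```python
import re
from itertools import combinations


def title_tokens(filename):
    """Extract slug words from a bug-YYYY-MM-DD-slug.md filename."""
    slug = re.sub(r"^bug-\d{4}-\d{2}-\d{2}-|\.md$", "", filename)
    return set(slug.split("-"))


def possible_duplicates(bugs, threshold=0.5):
    """Return filename pairs that share a system and have similar titles.

    `bugs` maps filename -> affected system. Similarity is the Jaccard overlap
    of title words; the 0.5 threshold is an illustrative choice only.
    """
    pairs = []
    for a, b in combinations(sorted(bugs), 2):
        if bugs[a] != bugs[b]:
            continue  # different systems are never flagged
        words_a, words_b = title_tokens(a), title_tokens(b)
        if len(words_a & words_b) / len(words_a | words_b) >= threshold:
            pairs.append((a, b))
    return pairs
```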

---

### Case 5: Director Gate Check — No gate; triage is advisory

**Fixture:**
- `production/bugs/` contains any number of reports

**Input:** `/bug-triage`

**Expected behavior:**
1. Skill produces the triage table
2. No director agents are spawned
3. No gate IDs appear in output
4. No write tool is called

**Assertions:**
- [ ] No director gate is invoked
- [ ] No write tool is called
- [ ] No gate skip messages appear
- [ ] Verdict is TRIAGED without any gate check

---

## Protocol Compliance

- [ ] Reads all files in `production/bugs/` before generating the table
- [ ] Sorts by severity (CRITICAL → HIGH → MEDIUM → LOW)
- [ ] Flags bugs missing repro steps
- [ ] Flags possible duplicates by title/system similarity
- [ ] Does not write any files
- [ ] Verdict is TRIAGED in all cases (even empty)

---

## Coverage Notes

- The case where a bug report is malformed (missing severity field entirely)
  is not fixture-tested; skill would flag it as `UNKNOWN SEVERITY` and sort it
  last in the table.
- Status transitions (marking bugs as resolved) are outside this skill's scope —
  bug-triage is read-only.
- The duplicate detection heuristic (title similarity + same system) is
  approximate; exact matching logic is defined in the skill body.

175
CCGS Skill Testing Framework/skills/utility/day-one-patch.md
Normal file
@@ -0,0 +1,175 @@
# Skill Test Spec: /day-one-patch

## Skill Summary

`/day-one-patch` prepares a day-one patch plan for issues that are known at
launch but deferred from the v1.0 release. It reads open bug reports in
`production/bugs/`, deferred acceptance criteria from story files (stories
marked `Status: Done` but with noted deferred ACs), and produces a prioritized
patch plan with estimated fix timelines per issue.

The patch plan is written to `production/releases/day-one-patch.md` after a
"May I write" ask. If a P0 (critical post-ship) issue is discovered, the skill
triggers guidance to run `/hotfix` before the patch. No director gates apply.
The verdict is always COMPLETE.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the plan
- [ ] Has a next-step handoff (e.g., `/hotfix` for P0 issues, `/release-checklist` for follow-up)

---

## Director Gate Checks

None. `/day-one-patch` is a release planning utility. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — 3 Known Issues, Patch Plan With Fix Estimates

**Fixture:**
- `production/bugs/` contains 3 open bugs with severities: 1 MEDIUM, 2 LOW
- No deferred ACs in sprint stories
- All bugs have repro steps and system identifications

**Input:** `/day-one-patch`

**Expected behavior:**
1. Skill reads all 3 open bugs
2. Skill assigns fix effort estimates: MEDIUM bug = 1-2 days, LOW bugs = 4 hours each
3. Skill produces a patch plan prioritizing MEDIUM bug first
4. Plan includes: priority order, estimated timeline, responsible system, fix description
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE

**Assertions:**
- [ ] All 3 bugs appear in the plan
- [ ] Bugs are prioritized by severity (MEDIUM before LOW)
- [ ] Fix estimates are provided per issue
- [ ] "May I write" is asked before writing
- [ ] Verdict is COMPLETE

---

### Case 2: Critical Issue Discovered Post-Ship — P0, Triggers /hotfix Guidance

**Fixture:**
- A CRITICAL severity bug is found in `production/bugs/` after the v1.0 release
- The bug causes data loss for all save files

**Input:** `/day-one-patch`

**Expected behavior:**
1. Skill reads bugs and identifies the CRITICAL severity issue
2. Skill escalates: "P0 ISSUE DETECTED — data loss bug requires immediate hotfix
   before patch planning can proceed"
3. Skill does NOT include the P0 issue in the patch plan timeline
4. Skill explicitly directs: "Run `/hotfix` to resolve this issue first"
5. After P0 guidance is issued: plan for remaining lower-severity bugs is still
   generated and written; verdict is COMPLETE

**Assertions:**
- [ ] P0 escalation message appears prominently before the patch plan
- [ ] `/hotfix` is explicitly directed for the P0 issue
- [ ] P0 issue is NOT scheduled in the patch plan timeline (it needs immediate action)
- [ ] Non-P0 issues are still planned; verdict is COMPLETE
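
The split described in this case, where CRITICAL bugs are escalated to `/hotfix` and everything else is scheduled by severity, can be sketched as a simple partition. The helper name `split_patch_plan` and the `(title, severity)` tuple shape are assumptions made for illustration, not the skill's internal representation.

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def split_patch_plan(bugs):
    """Separate P0 escalations from the day-one patch schedule.

    `bugs` is a list of (title, severity) pairs. CRITICAL bugs are treated as P0
    and kept out of the schedule; the rest are ordered by severity.
    """
    escalations = [title for title, severity in bugs if severity == "CRITICAL"]
    scheduled = sorted(
        (bug for bug in bugs if bug[1] != "CRITICAL"),
        key=lambda bug: SEVERITY_RANK.get(bug[1], len(SEVERITY_RANK)),
    )
    return escalations, [title for title, _ in scheduled]
```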

---

### Case 3: Deferred AC From Story-Done — Pulled Into Patch Plan Automatically

**Fixture:**
- `production/sprints/sprint-008.md` has a story with `Status: Done` and a note:
  "DEFERRED AC: Gamepad vibration on damage — deferred to post-launch patch"
- No open bugs for the same system

**Input:** `/day-one-patch`

**Expected behavior:**
1. Skill reads sprint stories and detects the deferred AC note
2. Deferred AC is automatically included in the patch plan as a work item
3. Plan entry: "Deferred from sprint-008: Gamepad vibration on damage"
4. Fix estimate is assigned; patch plan written after "May I write" approval
5. Verdict is COMPLETE

**Assertions:**
- [ ] Deferred ACs from story files are automatically pulled into the plan
- [ ] Deferred items are labeled by their source story (sprint-008)
- [ ] Deferred AC gets a fix estimate like bug entries
- [ ] Verdict is COMPLETE

---

### Case 4: No Known Issues — Empty Plan With Template Note

**Fixture:**
- `production/bugs/` is empty
- No stories have deferred ACs

**Input:** `/day-one-patch`

**Expected behavior:**
1. Skill reads bugs — none found
2. Skill reads story deferred ACs — none found
3. Skill produces an empty patch plan with a note: "No known issues at launch"
4. Template structure is preserved (headers intact) for future use
5. Skill asks "May I write to `production/releases/day-one-patch.md`?"
6. File written; verdict is COMPLETE

**Assertions:**
- [ ] "No known issues at launch" note appears in the written file
- [ ] Template headers are present in the empty plan
- [ ] Skill does NOT error out when there are no issues to plan
- [ ] Verdict is COMPLETE

---

### Case 5: Director Gate Check — No gate; day-one-patch is a planning utility

**Fixture:**
- Known issues present in production/bugs/

**Input:** `/day-one-patch`

**Expected behavior:**
1. Skill generates and writes the patch plan
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Reads open bugs from `production/bugs/` before generating the plan
- [ ] Scans story files for deferred AC notes
- [ ] Escalates CRITICAL (P0) bugs with explicit `/hotfix` guidance
- [ ] Produces an empty plan with note when no issues exist (not an error)
- [ ] Asks "May I write to `production/releases/day-one-patch.md`?" before writing
- [ ] Verdict is COMPLETE in all paths

---

## Coverage Notes

- The case where multiple CRITICAL bugs exist is handled the same as Case 2;
  all P0 issues are escalated together.
- Timeline estimation for the patch (e.g., "patch available in 3 days")
  requires manual QA and build time estimates; the skill uses rough estimates
  based on severity, not actual team velocity.
- The player-facing patch notes document (`/patch-notes`) comes from a separate
  skill invoked after the patch plan is executed.

172
CCGS Skill Testing Framework/skills/utility/help.md
Normal file
@@ -0,0 +1,172 @@
# Skill Test Spec: /help

## Skill Summary

`/help` analyzes what has been done and what comes next in the project workflow.
It runs on the Haiku model (read-only, formatting task) and reads `production/stage.txt`,
the active sprint file, and recent session state to produce a concise situational
guidance summary. The skill optionally accepts a context query (e.g., `/help testing`)
to surface relevant skills for a specific topic.

The output is always informational — no files are written and no director gates
are invoked. The verdict is always HELP COMPLETE. The skill serves as a workflow
navigator, suggesting 2-3 next skills based on the current project state.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: HELP COMPLETE
- [ ] Does NOT contain "May I write" language (skill is read-only)
- [ ] Has a next-step handoff (suggests 2-3 relevant skills based on state)

---

## Director Gate Checks

None. `/help` is a read-only navigation skill. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Production stage with active sprint

**Fixture:**
- `production/stage.txt` contains `Production`
- `production/sprints/sprint-004.md` exists with in-progress stories
- `production/session-state/active.md` has a recent checkpoint

**Input:** `/help`

**Expected behavior:**
1. Skill reads stage.txt and active sprint
2. Skill identifies current sprint number and in-progress story count
3. Skill outputs: current stage, sprint summary, and 3 suggested next skills
   (e.g., `/sprint-status`, `/dev-story`, `/story-done`)
4. Suggestions are ranked by relevance to current sprint state
5. Verdict is HELP COMPLETE

**Assertions:**
- [ ] Current stage is shown (Production)
- [ ] Active sprint number and story count are mentioned
- [ ] Only 2-3 next-skill suggestions are given (not a list of all skills)
- [ ] Suggestions are appropriate for Production stage
- [ ] Verdict is HELP COMPLETE
- [ ] No files are written

---

### Case 2: Concept Stage — Shows concept-to-systems-design workflow path

**Fixture:**
- `production/stage.txt` contains `Concept`
- No sprint files, no GDD files
- `technical-preferences.md` is configured (engine selected)

**Input:** `/help`

**Expected behavior:**
1. Skill reads stage.txt — detects Concept stage
2. Skill outputs the Concept-stage workflow: brainstorm → map-systems → design-system
3. Suggested skills are: `/brainstorm`, `/map-systems` (if concept exists)
4. Current progress is noted: "Engine configured, concept not yet created"

**Assertions:**
- [ ] Stage is identified as Concept
- [ ] Workflow path shows the expected sequence for this stage
- [ ] Suggestions do not include Production-stage skills (e.g., `/dev-story`)
- [ ] Verdict is HELP COMPLETE

---

### Case 3: No stage.txt — Shows full workflow overview
|
||||
|
||||
**Fixture:**
|
||||
- No `production/stage.txt`
|
||||
- No sprint files
|
||||
- `technical-preferences.md` has placeholders
|
||||
|
||||
**Input:** `/help`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill cannot determine stage from stage.txt
|
||||
2. Skill runs project-stage-detect logic to infer stage from artifacts
|
||||
3. If stage cannot be inferred: outputs the full workflow overview from
|
||||
Concept through Release as a reference map
|
||||
4. Primary suggestion is `/start` to begin configuration
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when stage.txt is absent
|
||||
- [ ] Full workflow overview is shown when stage cannot be determined
|
||||
- [ ] `/start` or `/project-stage-detect` is a top suggestion
|
||||
- [ ] Verdict is HELP COMPLETE
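
The stage lookup with a fallback, as exercised by Cases 1 to 3, can be sketched as follows. This is only an illustration under stated assumptions: `read_stage`, `suggest_skills`, and the `SUGGESTIONS` table are hypothetical names, and the sketch skips the artifact-based inference that `/project-stage-detect` would perform before falling back.

```python
from pathlib import Path
from typing import List, Optional

# Hypothetical lookup table; only the skill names come from the cases above.
SUGGESTIONS = {
    "Production": ["/sprint-status", "/dev-story", "/story-done"],
    "Concept": ["/brainstorm", "/map-systems"],
}
# Used when no stage can be determined (Case 3).
FALLBACK = ["/start", "/project-stage-detect"]


def read_stage(project_root: Path) -> Optional[str]:
    """Return the stage named in production/stage.txt, or None if it is absent or empty."""
    stage_file = project_root / "production" / "stage.txt"
    if not stage_file.is_file():
        return None
    text = stage_file.read_text(encoding="utf-8").strip()
    return text or None


def suggest_skills(project_root: Path) -> List[str]:
    """Return at most three suggestions for the detected stage."""
    stage = read_stage(project_root)
    return SUGGESTIONS.get(stage, FALLBACK)[:3]
```

Returning `None` instead of raising keeps the missing-file case on the normal path, which matches the assertion that the skill does not crash when `stage.txt` is absent.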
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Context Query — User asks for help with testing
|
||||
|
||||
**Fixture:**
|
||||
- `production/stage.txt` contains `Production`
|
||||
- Active sprint has a story with `Status: In Review`
|
||||
|
||||
**Input:** `/help testing`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads context query: "testing"
|
||||
2. Skill surfaces skills relevant to testing: `/qa-plan`, `/smoke-check`,
|
||||
`/regression-suite`, `/test-setup`, `/test-evidence-review`
|
||||
3. Output is focused on testing workflow, not general sprint navigation
|
||||
4. Currently in-review story is highlighted as a testing candidate
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Context query is acknowledged in output ("Help topic: testing")
|
||||
- [ ] At least 3 testing-relevant skills are listed
|
||||
- [ ] General sprint skills (e.g., `/sprint-plan`) are not the primary suggestions
|
||||
- [ ] Verdict is HELP COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; help is read-only navigation
|
||||
|
||||
**Fixture:**
|
||||
- Any project state
|
||||
|
||||
**Input:** `/help`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill produces workflow guidance summary
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
4. No write tool is called
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No write tool is called
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is HELP COMPLETE without any gate check
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads stage, sprint, and session state before generating suggestions
|
||||
- [ ] Suggestions are specific to the current project state (not generic)
|
||||
- [ ] Context query (if provided) narrows the suggestion set
|
||||
- [ ] Does not write any files
|
||||
- [ ] Verdict is HELP COMPLETE in all cases
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where the active sprint is complete (all stories Done) is not
|
||||
separately tested; the skill would suggest `/sprint-plan` for the next sprint.
|
||||
- The `/help` skill does not validate whether suggested skills are available —
|
||||
it assumes standard skill catalog availability.
|
||||
- Stage detection fallback (when stage.txt is absent) delegates to the same
|
||||
logic as `/project-stage-detect` and is not re-tested here in detail.
|
||||
173
CCGS Skill Testing Framework/skills/utility/hotfix.md
Normal file
@@ -0,0 +1,173 @@
|
||||
# Skill Test Spec: /hotfix
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/hotfix` manages an emergency fix workflow: it creates a hotfix branch from
|
||||
main, applies a targeted fix to the identified file(s), runs `/smoke-check` to
|
||||
validate the fix doesn't introduce regressions, and prompts the user to confirm
|
||||
merge back to main. Each code change requires a "May I write to [filepath]?" ask.
|
||||
Git operations (branch creation, merge) are presented as Bash commands for user
|
||||
confirmation before execution.
|
||||
|
||||
The skill is time-sensitive — director review is optional post-hoc, not a
|
||||
blocking gate. Verdicts: HOTFIX COMPLETE (fix applied, smoke check passed, merged)
|
||||
or HOTFIX BLOCKED (fix introduced regression or user declined).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: HOTFIX COMPLETE, HOTFIX BLOCKED
|
||||
- [ ] Contains "May I write" language for code changes
|
||||
- [ ] Has a next-step handoff (e.g., `/bug-report` to document the issue, or version bump)
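
The frontmatter assertion above lends itself to a mechanical check. The sketch below is not the `/skill-test static` implementation; it assumes the skill file opens with a `---`-delimited block of top-level `key: value` lines, and `missing_frontmatter_fields` is a hypothetical helper name.

```python
from typing import List

# Field names taken from the static assertion above.
REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def missing_frontmatter_fields(skill_text: str) -> List[str]:
    """Return the required fields that are absent from the leading frontmatter block.

    Assumes simple top-level `key: value` lines; nested YAML is not parsed.
    """
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)  # no frontmatter block at all
    found = set()
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        if ":" in line and not line.startswith((" ", "\t")):
            found.add(line.split(":", 1)[0].strip())
    return [field for field in REQUIRED_FIELDS if field not in found]
```

An empty return value means the assertion passes; a non-empty list names exactly which fields to add.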
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Hotfixes are time-critical. Director review may follow separately as a
|
||||
post-hoc step. No gate is invoked within this skill.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Critical crash bug fixed, smoke check passes
|
||||
|
||||
**Fixture:**
|
||||
- `main` branch is clean
|
||||
- Bug is identified in `src/gameplay/arena.gd` (crash on boss arena entry)
|
||||
- Repro steps are provided by user
|
||||
|
||||
**Input:** `/hotfix` (user describes the crash and affected file)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill proposes creating a hotfix branch: `hotfix/boss-arena-crash`
|
||||
2. User confirms; Bash command for branch creation is shown and confirmed
|
||||
3. Skill identifies the fix location in `arena.gd` and drafts the change
|
||||
4. Skill asks "May I write to `src/gameplay/arena.gd`?" and applies fix on approval
|
||||
5. Skill runs `/smoke-check` — PASS
|
||||
6. Skill presents the merge command and asks user to confirm merge to `main`
|
||||
7. User confirms; merge executes; verdict is HOTFIX COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Hotfix branch is created before any code changes
|
||||
- [ ] "May I write" is asked before modifying any source file
|
||||
- [ ] `/smoke-check` runs after the fix is applied
|
||||
- [ ] Merge requires explicit user confirmation (not automatic)
|
||||
- [ ] Verdict is HOTFIX COMPLETE after successful merge
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Smoke Check Fails — HOTFIX BLOCKED
|
||||
|
||||
**Fixture:**
|
||||
- Fix has been applied to `src/gameplay/arena.gd`
|
||||
- `/smoke-check` returns FAIL: "Player health clamping regression detected"
|
||||
|
||||
**Input:** `/hotfix`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill applies the fix and runs `/smoke-check`
|
||||
2. Smoke check returns FAIL with specific regression identified
|
||||
3. Skill reports: "HOTFIX BLOCKED — smoke check failed: [regression detail]"
|
||||
4. Skill presents options: attempt revised fix, revert changes, or merge with
|
||||
known regression (user acknowledges risk)
|
||||
5. No automatic merge occurs when smoke check fails
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is HOTFIX BLOCKED
|
||||
- [ ] Smoke check failure is shown verbatim to user
|
||||
- [ ] Merge is NOT performed automatically when smoke check fails
|
||||
- [ ] User is given explicit options for how to proceed
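
The verdict rule implied by Cases 1 and 2 can be written as a small function. This is a sketch with a hypothetical name (`hotfix_verdict`); it deliberately does not model the "merge with known regression" option, since the spec does not say which verdict that path yields.

```python
def hotfix_verdict(smoke_check_passed: bool, merge_confirmed: bool) -> str:
    """Map the outcomes named in the Skill Summary onto its two verdicts."""
    if smoke_check_passed and merge_confirmed:
        return "HOTFIX COMPLETE"
    # A failed smoke check or a declined merge both block the hotfix.
    return "HOTFIX BLOCKED"
```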
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Fix to Already-Released Build — Version tag noted, patch bump prompted
|
||||
|
||||
**Fixture:**
|
||||
- Latest git tag is `v1.2.0`
|
||||
- Hotfix targets a bug in the v1.2.0 release
|
||||
|
||||
**Input:** `/hotfix`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects that the current HEAD is a tagged release (v1.2.0)
|
||||
2. Skill notes: "Hotfix targeting tagged release v1.2.0"
|
||||
3. After smoke check passes, skill prompts: "Should version be bumped to v1.2.1?"
|
||||
4. If user confirms version bump: skill asks "May I write to VERSION or equivalent?"
|
||||
5. After version update and merge: verdict is HOTFIX COMPLETE with version noted
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Version tag context is detected and surfaced to user
|
||||
- [ ] Patch version bump is suggested (not required) once the smoke check passes
|
||||
- [ ] Version bump requires its own "May I write" confirmation
|
||||
- [ ] Verdict is HOTFIX COMPLETE
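
The patch bump prompted in this case (`v1.2.0` to `v1.2.1`) amounts to incrementing the last numeric component of the tag. A minimal sketch, assuming tags always follow the `vMAJOR.MINOR.PATCH` shape used in the fixture; `bump_patch` is a hypothetical helper, not something the skill is known to contain.

```python
import re

# Matches tags shaped like the fixture's `v1.2.0`; other tag schemes are an open question.
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def bump_patch(tag: str) -> str:
    """Return the tag with its patch component incremented by one."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"v{major}.{minor}.{patch + 1}"
```

Parsing the components as integers avoids the string-comparison trap where `v1.2.9` would otherwise roll over incorrectly.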
|
||||
|
||||
---
|
||||
|
||||
### Case 4: No Repro Steps — Skill Asks Before Applying Fix
|
||||
|
||||
**Fixture:**
|
||||
- User invokes `/hotfix` with a vague description: "something is broken on level 3"
|
||||
- No repro steps provided
|
||||
|
||||
**Input:** `/hotfix` (vague description)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects insufficient information to identify the fix location
|
||||
2. Skill asks: "Please provide reproduction steps and the affected file or system"
|
||||
3. Skill does NOT create a branch or modify any file until repro steps are provided
|
||||
4. After user provides repro steps: normal hotfix flow begins
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No branch is created without repro steps
|
||||
- [ ] No code changes are made without a clearly identified fix location
|
||||
- [ ] Repro step request is specific (not a generic "please provide more info")
|
||||
- [ ] Normal hotfix flow resumes after user provides repro steps
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; hotfixes are time-critical
|
||||
|
||||
**Fixture:**
|
||||
- Critical bug with repro steps identified
|
||||
|
||||
**Input:** `/hotfix`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill completes the hotfix workflow
|
||||
2. No director agents are spawned during execution
|
||||
3. No gate IDs appear in output
|
||||
4. Post-hoc director review (if needed) is a manual follow-up, not invoked here
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Creates hotfix branch before making any code changes
|
||||
- [ ] Asks "May I write" before modifying any source files
|
||||
- [ ] Runs `/smoke-check` after applying the fix
|
||||
- [ ] Requires explicit user confirmation before merging
|
||||
- [ ] HOTFIX BLOCKED when smoke check fails — no automatic merge
|
||||
- [ ] Verdict is HOTFIX COMPLETE or HOTFIX BLOCKED
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where multiple files need to be modified for one fix follows the same
|
||||
"May I write" per-file pattern and is not separately tested.
|
||||
- The post-hotfix steps (create bug report, update changelog) are suggested in
|
||||
the handoff but not tested as part of this skill's execution.
|
||||
- Conflict resolution during the merge (if main has diverged) is not tested;
|
||||
the skill would surface the conflict and ask the user to resolve it manually.
|
||||
180
CCGS Skill Testing Framework/skills/utility/launch-checklist.md
Normal file
@@ -0,0 +1,180 @@
|
||||
# Skill Test Spec: /launch-checklist
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/launch-checklist` generates and evaluates a complete launch readiness checklist
|
||||
covering: legal compliance (EULA, privacy policy, ESRB/PEGI ratings), platform
|
||||
certification status, store page completeness (screenshots, description, metadata),
|
||||
build validation (version tag, reproducible build), analytics and crash reporting
|
||||
configuration, and first-run experience verification.
|
||||
|
||||
The skill produces a checklist report written to `production/launch/launch-checklist-[date].md`
|
||||
after a "May I write" ask. If a previous launch checklist exists, it compares the
|
||||
new results against the old to highlight newly resolved and newly blocked items. No
|
||||
director gates apply — `/team-release` orchestrates the full release pipeline. Verdicts:
|
||||
LAUNCH READY, LAUNCH BLOCKED, or CONCERNS.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: LAUNCH READY, LAUNCH BLOCKED, CONCERNS
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing the checklist
|
||||
- [ ] Has a next-step handoff (e.g., `/team-release` or `/day-one-patch`)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/launch-checklist` is a readiness audit utility. The full release pipeline
|
||||
is managed by `/team-release`.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All Checklist Items Verified, LAUNCH READY
|
||||
|
||||
**Fixture:**
|
||||
- Legal docs present: EULA, privacy policy in `production/legal/`
|
||||
- Platform certification: marked as submitted and approved in production notes
|
||||
- Store page assets: screenshots, description, metadata all present in `production/store/`
|
||||
- Build: version tag `v1.0.0` exists, reproducible build confirmed
|
||||
- Crash reporting: configured in `technical-preferences.md`
|
||||
|
||||
**Input:** `/launch-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks all checklist categories
|
||||
2. All items pass their verification checks
|
||||
3. Skill produces checklist report with all items marked PASS
|
||||
4. Skill asks "May I write to `production/launch/launch-checklist-2026-04-06.md`?"
|
||||
5. Report written on approval; verdict is LAUNCH READY
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All checklist categories are checked (legal, platform, store, build, analytics, UX)
|
||||
- [ ] All items appear in the report with PASS markers
|
||||
- [ ] Verdict is LAUNCH READY
|
||||
- [ ] "May I write" is asked with the correct dated filename
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Platform Certification Not Submitted — LAUNCH BLOCKED
|
||||
|
||||
**Fixture:**
|
||||
- All other checklist items pass
|
||||
- Platform certification section: "not submitted" (no submission record found)
|
||||
|
||||
**Input:** `/launch-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks all items
|
||||
2. Platform certification check fails: no submission record
|
||||
3. Skill reports: "LAUNCH BLOCKED — Platform certification not submitted"
|
||||
4. Specific platform(s) missing certification are named
|
||||
5. Verdict is LAUNCH BLOCKED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is LAUNCH BLOCKED (not CONCERNS)
|
||||
- [ ] Platform certification is identified as the blocking item
|
||||
- [ ] Missing platform names are specified
|
||||
- [ ] All other passing items are still shown in the report
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Manual Check Required — CONCERNS Verdict
|
||||
|
||||
**Fixture:**
|
||||
- All critical checklist items pass
|
||||
- First-run experience item: "MANUAL CHECK NEEDED — human must play the first 5
|
||||
minutes and verify tutorial completion flow"
|
||||
- Store screenshots item: "MANUAL CHECK NEEDED — art team must verify screenshot
|
||||
quality matches current build"
|
||||
|
||||
**Input:** `/launch-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks all items
|
||||
2. Two items are flagged as requiring human verification
|
||||
3. Skill reports: "CONCERNS — 2 items require manual verification before launch"
|
||||
4. Both items are listed with instructions for what to manually verify
|
||||
5. Verdict is CONCERNS (not LAUNCH BLOCKED, since these are advisory)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is CONCERNS (not LAUNCH READY or LAUNCH BLOCKED)
|
||||
- [ ] Both manual check items are listed with verification instructions
|
||||
- [ ] Skill does not auto-block on MANUAL CHECK items
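
Cases 1 to 3 together imply a precedence among the three verdicts: any blocked item wins, then any manual check, otherwise the launch is ready. The sketch below illustrates that ordering; the function name and the dictionary shape are assumptions, while the status labels are the ones used in the cases.

```python
from typing import Dict


def launch_verdict(items: Dict[str, str]) -> str:
    """Aggregate per-item statuses into one verdict.

    `items` maps a checklist item name to "PASS", "BLOCKED", or
    "MANUAL CHECK NEEDED" (the labels used in the cases above).
    """
    statuses = set(items.values())
    if "BLOCKED" in statuses:
        return "LAUNCH BLOCKED"  # hard failures take precedence
    if "MANUAL CHECK NEEDED" in statuses:
        return "CONCERNS"  # advisory items never auto-block
    return "LAUNCH READY"
```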
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Previous Checklist Exists — Delta Comparison
|
||||
|
||||
**Fixture:**
|
||||
- `production/launch/launch-checklist-2026-03-25.md` exists with previous results:
|
||||
- 2 items were BLOCKED (platform cert, crash reporting)
|
||||
- 1 item had a MANUAL CHECK
|
||||
- New checklist: platform cert is now PASS, crash reporting is now PASS,
|
||||
manual check still open; 1 new item flagged (EULA last updated date)
|
||||
|
||||
**Input:** `/launch-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds the previous checklist and loads it for comparison
|
||||
2. Skill produces the new checklist and compares:
|
||||
- Newly resolved: "Platform cert — was BLOCKED, now PASS"
|
||||
- Newly resolved: "Crash reporting — was BLOCKED, now PASS"
|
||||
- Still open: manual check (unchanged)
|
||||
- New issue: EULA last updated date (not in previous checklist)
|
||||
3. Delta is shown prominently in the report
|
||||
4. Verdict is CONCERNS (manual check + new EULA question)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Delta section shows newly resolved items
|
||||
- [ ] Delta section shows new issues (not present in previous checklist)
|
||||
- [ ] Still-open items from the previous checklist are noted as persistent
|
||||
- [ ] Verdict reflects the current state (not the previous state)
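
The delta comparison in this case can be sketched as a pass over the current results with a lookup into the previous ones. The names and the three bucket labels are hypothetical; treating an item that previously passed but now fails as "new" is also an assumption, since the case only covers items absent from the previous checklist.

```python
from typing import Dict, List


def checklist_delta(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify current items as resolved, persistent, or new relative to the previous run."""
    delta: Dict[str, List[str]] = {"resolved": [], "persistent": [], "new": []}
    for item, status in current.items():
        before = previous.get(item)  # None when the item did not exist last time
        if status == "PASS":
            if before is not None and before != "PASS":
                delta["resolved"].append(item)
        elif before is None or before == "PASS":
            delta["new"].append(item)  # assumption: regressions count as new issues
        else:
            delta["persistent"].append(item)
    return delta
```

Run against the fixture, this puts platform cert and crash reporting under `resolved`, the open manual check under `persistent`, and the EULA date question under `new`.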
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; launch-checklist is an audit utility
|
||||
|
||||
**Fixture:**
|
||||
- All checklist dependencies present
|
||||
|
||||
**Input:** `/launch-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill runs the full checklist and writes the report
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Checks all required categories (legal, platform, store, build, analytics, UX)
|
||||
- [ ] LAUNCH BLOCKED for hard failures (incomplete certifications, missing legal docs)
|
||||
- [ ] CONCERNS for advisory items requiring manual verification
|
||||
- [ ] Compares against previous checklist when one exists
|
||||
- [ ] Asks "May I write" before creating the checklist report
|
||||
- [ ] Verdict is LAUNCH READY, LAUNCH BLOCKED, or CONCERNS
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Region-specific compliance (GDPR data handling, COPPA for under-13 audiences)
|
||||
is checked but the specific requirements are not enumerated in test assertions.
|
||||
- The store page completeness check (screenshots, description) relies on the
|
||||
presence of files in `production/store/`; it cannot verify visual quality.
|
||||
- Build reproducibility check validates the presence of a version tag and build
|
||||
configuration but does not execute the build process.
|
||||
176
CCGS Skill Testing Framework/skills/utility/localize.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Skill Test Spec: /localize
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/localize` manages the full localization pipeline: it extracts all player-facing
|
||||
strings from source files, manages translation files in `assets/localization/`,
|
||||
and validates completeness across all locale files. For new languages, it creates
|
||||
a locale file skeleton with all current strings as keys and empty values. For
|
||||
existing locale files, it produces a diff showing additions, removals, and
|
||||
changed keys.
|
||||
|
||||
Translation files are written to `assets/localization/[locale-code].csv` (or
|
||||
engine-appropriate format) after a "May I write" ask. No director gates apply.
|
||||
Verdicts: LOCALIZATION COMPLETE (all locales are complete) or GAPS FOUND (at
|
||||
least one locale is missing string keys).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: LOCALIZATION COMPLETE, GAPS FOUND
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing locale files
|
||||
- [ ] Has a next-step handoff (e.g., send locale skeletons to translators)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/localize` is a pipeline utility. No director gates apply. Localization
|
||||
lead agent may review separately but is not invoked within this skill.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: New Language — String Extraction and Locale Skeleton Created
|
||||
|
||||
**Fixture:**
|
||||
- Source code in `src/` contains player-facing strings (UI text, tutorial messages)
|
||||
- Existing locale: `assets/localization/en.csv`
|
||||
- No French locale exists
|
||||
|
||||
**Input:** `/localize fr`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill extracts all player-facing strings from source files
|
||||
2. Skill matches the extracted strings against `en.csv`, which serves as the reference locale
|
||||
3. Skill generates `fr.csv` skeleton with all string keys and empty values
|
||||
4. Skill asks "May I write to `assets/localization/fr.csv`?"
|
||||
5. File is written on approval; verdict is GAPS FOUND (file created, but all values are empty)
|
||||
6. Skill notes: "fr.csv created — send to translator to fill values"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All string keys from `en.csv` are present in `fr.csv`
|
||||
- [ ] All values in `fr.csv` are empty (not copied from English)
|
||||
- [ ] "May I write" is asked before creating the file
|
||||
- [ ] Verdict is GAPS FOUND (file is created but untranslated)
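
Skeleton generation as described here copies every key from the reference locale and leaves each value empty. A minimal sketch, assuming a two-column CSV with a header row (the spec does not fix the column layout); `build_locale_skeleton` is a hypothetical name.

```python
import csv
import io


def build_locale_skeleton(reference_csv: str) -> str:
    """Return CSV text with the reference file's keys and empty values.

    Assumes the first non-blank row is a header and the first column holds the key.
    """
    rows = [row for row in csv.reader(io.StringIO(reference_csv)) if row]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for index, row in enumerate(rows):
        if index == 0:
            writer.writerow(row)  # keep the header row unchanged
        else:
            writer.writerow([row[0], ""])  # key kept, value never copied from English
    return out.getvalue()
```

Going through the `csv` module rather than splitting on commas keeps quoted English values that contain commas from leaking into the key column.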
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Existing Locale Diff — Additions, Removals, and Changes Listed
|
||||
|
||||
**Fixture:**
|
||||
- `assets/localization/fr.csv` exists with 20 string keys translated
|
||||
- Source code has changed: 3 new strings added, 1 string removed, 2 strings
|
||||
with changed English source text
|
||||
|
||||
**Input:** `/localize fr`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill extracts current strings from source
|
||||
2. Skill diffs against existing `fr.csv`
|
||||
3. Skill produces diff report:
|
||||
- 3 new keys (need translation — listed as empty in fr.csv)
|
||||
- 1 removed key (marked as obsolete — suggest removal)
|
||||
- 2 changed keys (English source changed — French may need update, flagged)
|
||||
4. Skill asks "May I update `assets/localization/fr.csv`?"
|
||||
5. File updated with new empty keys added, obsolete keys marked; verdict is GAPS FOUND
|
||||
|
||||
**Assertions:**
|
||||
- [ ] New keys appear as empty in the updated file (not auto-translated)
|
||||
- [ ] Removed keys are flagged as obsolete (not silently deleted)
|
||||
- [ ] Changed source strings are flagged for translator review
|
||||
- [ ] Verdict is GAPS FOUND (new empty keys exist)
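
The three-way diff in this case (new, obsolete, changed) can be sketched with plain dictionary comparisons. Detecting a changed English source string needs the previous source text as well as the current one; where the skill gets that snapshot from is not stated, so `old_source` below is an assumption, as are the function and bucket names.

```python
from typing import Dict, List


def diff_locale(
    old_source: Dict[str, str],
    new_source: Dict[str, str],
    locale: Dict[str, str],
) -> Dict[str, List[str]]:
    """Compare current source strings against an existing locale file.

    Each argument maps string key to text. `old_source` is the English text
    the locale was last translated from (an assumed input).
    """
    return {
        # In the source now, but not yet in the locale file: needs translation.
        "new": sorted(k for k in new_source if k not in locale),
        # Still in the locale file, but gone from the source: flag, do not delete.
        "obsolete": sorted(k for k in locale if k not in new_source),
        # English text changed since translation: flag for translator review.
        "changed": sorted(
            k for k in new_source
            if k in locale and k in old_source and old_source[k] != new_source[k]
        ),
    }
```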
|
||||
|
||||
---
|
||||
|
||||
### Case 3: String Missing in One Locale — GAPS FOUND With Missing Key List
|
||||
|
||||
**Fixture:**
|
||||
- 3 locale files exist: `en.csv`, `fr.csv`, `de.csv`
|
||||
- `de.csv` is missing 4 keys that exist in both `en.csv` and `fr.csv`
|
||||
|
||||
**Input:** `/localize`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all 3 locale files and cross-references keys
|
||||
2. `de.csv` is missing 4 keys
|
||||
3. Skill produces GAPS FOUND report listing the 4 missing keys by locale:
|
||||
"de.csv missing: [key1], [key2], [key3], [key4]"
|
||||
4. Skill offers to add the missing keys as empty values to `de.csv`
|
||||
5. After approval: file updated; verdict remains GAPS FOUND (values still empty)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Missing keys are listed explicitly (not just a count)
|
||||
- [ ] Missing keys are attributed to the specific locale file
|
||||
- [ ] Verdict is GAPS FOUND (not LOCALIZATION COMPLETE)
|
||||
- [ ] Missing keys are added as empty (not auto-translated from English)
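
Cross-referencing keys across locale files reduces to a set difference against the union of all keys. The sketch below assumes each locale has already been parsed into a set of keys; `missing_keys_by_locale` is a hypothetical name.

```python
from typing import Dict, List, Set


def missing_keys_by_locale(locales: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per locale file, the keys that exist elsewhere but not in that file."""
    all_keys: Set[str] = set().union(*locales.values())
    report: Dict[str, List[str]] = {}
    for name, keys in locales.items():
        missing = sorted(all_keys - keys)
        if missing:  # locales with no gaps are left out of the report
            report[name] = missing
    return report
```

Because the report is keyed by file name and lists key names, it satisfies both assertions above: gaps are explicit, and each is attributed to a specific locale.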
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Translation File Has Syntax Error — Error With Line Reference
|
||||
|
||||
**Fixture:**
|
||||
- `assets/localization/fr.csv` has a malformed line at line 47
|
||||
(missing quote closure)
|
||||
|
||||
**Input:** `/localize fr`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `fr.csv` and encounters a parse error at line 47
|
||||
2. Skill outputs: "Parse error in fr.csv at line 47: [error detail]"
|
||||
3. Skill cannot diff or validate the file until the error is fixed
|
||||
4. Skill does NOT attempt to overwrite or auto-fix the malformed file
|
||||
5. Skill suggests fixing the file manually and re-running `/localize`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Error message includes line number (line 47)
|
||||
- [ ] Error detail describes the nature of the parse error
|
||||
- [ ] Skill does NOT overwrite or modify the malformed file
|
||||
- [ ] Manual fix + re-run is suggested as remediation
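
One way to attach a line number to a missing-quote error is a per-line parity check on double quotes. This is a heuristic sketch, not the skill's parser: it assumes every record sits on a single line (no multi-line quoted fields), and `first_unbalanced_quote_line` is a hypothetical name.

```python
from typing import Optional


def first_unbalanced_quote_line(csv_text: str) -> Optional[int]:
    """Return the 1-based number of the first line with an odd count of double quotes.

    Escaped quotes ("") contribute two characters, so they do not affect parity.
    Returns None when every line is balanced.
    """
    for line_number, line in enumerate(csv_text.splitlines(), start=1):
        if line.count('"') % 2 != 0:
            return line_number
    return None
```

The check only reads the text, which fits the requirement that the malformed file is never overwritten or auto-fixed.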
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; localization is a pipeline utility
|
||||
|
||||
**Fixture:**
|
||||
- Source code with player-facing strings
|
||||
|
||||
**Input:** `/localize fr`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill extracts strings and manages locale files
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is LOCALIZATION COMPLETE or GAPS FOUND — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Extracts strings from source before operating on locale files
|
||||
- [ ] Creates new locale files with all keys as empty values (not auto-translated)
|
||||
- [ ] Diffs existing locale files against current source strings
|
||||
- [ ] Flags missing keys by locale and by key name
|
||||
- [ ] Asks "May I write" before creating or updating any locale file
|
||||
- [ ] Verdict is LOCALIZATION COMPLETE (all locales fully translated) or GAPS FOUND
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- LOCALIZATION COMPLETE is only achievable when all locale files have all keys
|
||||
with non-empty values; new-language skeleton creation always results in GAPS FOUND.
|
||||
- Engine-specific locale formats (Godot `.translation`, Unity `.po` files) are
|
||||
handled by the skill body; `.csv` is used as the canonical format in tests.
|
||||
- The case where source strings change at a very high rate (continuous integration
|
||||
of new UI text) is not tested; the same diff logic is expected to cover it.
|
||||
179
CCGS Skill Testing Framework/skills/utility/onboard.md
Normal file
@@ -0,0 +1,179 @@
|
||||
# Skill Test Spec: /onboard
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/onboard` generates a contextual project onboarding summary tailored for a new
|
||||
team member. It reads CLAUDE.md, `technical-preferences.md`, the active sprint
|
||||
file, recent git commits, and `production/stage.txt` to produce a structured
|
||||
orientation document. The skill runs on the Haiku model (read-only, formatting
|
||||
task) and produces no file writes — all output is conversational.
|
||||
|
||||
The skill optionally accepts a role argument (e.g., `/onboard artist`) to tailor
|
||||
the summary to a specific discipline. When the project is in an early stage or
|
||||
unconfigured, the output adapts to reflect what little is known. The verdict is
|
||||
always ONBOARDING COMPLETE — the skill is purely informational.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: ONBOARDING COMPLETE
|
||||
- [ ] Does NOT contain "May I write" language (skill is read-only)
|
||||
- [ ] Has a next-step handoff suggesting a relevant follow-on skill
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/onboard` is a read-only orientation skill. No director gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Configured project in Production stage with active sprint
|
||||
|
||||
**Fixture:**
|
||||
- `production/stage.txt` contains `Production`
|
||||
- `technical-preferences.md` has engine, language, and specialists populated
|
||||
- `production/sprints/sprint-005.md` exists with stories in progress
|
||||
- Git log contains 5 recent commits
|
||||
|
||||
**Input:** `/onboard`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads stage.txt, technical-preferences.md, active sprint, and git log
|
||||
2. Skill produces an onboarding summary with sections: Project Overview, Tech Stack,
|
||||
Current Stage, Active Sprint Summary, Recent Activity
|
||||
3. Summary is formatted for readability (headers, bullet points)
|
||||
4. Next-step suggestions are appropriate for Production stage (e.g., `/sprint-status`,
|
||||
`/dev-story`)
|
||||
5. Verdict ONBOARDING COMPLETE is stated
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Output includes current stage name from stage.txt
|
||||
- [ ] Output includes engine and language from technical-preferences.md
|
||||
- [ ] Active sprint stories are summarized (not just the sprint file name)
|
||||
- [ ] Recent commit context is present
|
||||
- [ ] Verdict is ONBOARDING COMPLETE
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Fresh Project — No engine, no sprint, suggests /start
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` contains only placeholders (`[TO BE CONFIGURED]`)
|
||||
- No `production/stage.txt`
|
||||
- No sprint files
|
||||
- No CLAUDE.md overrides beyond defaults
|
||||
|
||||
**Input:** `/onboard`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all config files and detects unconfigured state
|
||||
2. Skill produces a minimal summary: "This project has not been configured yet"
|
||||
3. Output explains the onboarding workflow: `/start` → `/setup-engine` → `/brainstorm`
|
||||
4. Skill suggests running `/start` as the immediate next step
|
||||
5. Verdict is ONBOARDING COMPLETE (informational, not a failure)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Output explicitly mentions the project is not yet configured
|
||||
- [ ] `/start` is recommended as the next step
|
||||
- [ ] Skill does NOT error out — it gracefully handles an empty project state
|
||||
- [ ] Verdict is still ONBOARDING COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No CLAUDE.md Found — Error with remediation
|
||||
|
||||
**Fixture:**
|
||||
- `CLAUDE.md` file does not exist (deleted or never created)
|
||||
- All other files may or may not exist
|
||||
|
||||
**Input:** `/onboard`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read CLAUDE.md and fails
|
||||
2. Skill outputs an error: "CLAUDE.md not found — cannot generate onboarding summary"
|
||||
3. Skill provides remediation: "Run `/start` to initialize the project configuration"
|
||||
4. No partial summary is generated
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Error message clearly identifies the missing file as CLAUDE.md
|
||||
- [ ] Remediation step (`/start`) is explicitly named
|
||||
- [ ] Skill does NOT produce a partial output when the root config is missing
|
||||
- [ ] Verdict is ONBOARDING COMPLETE (with error context, not a crash)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Role-Specific Onboarding — User specifies "artist" role
|
||||
|
||||
**Fixture:**
|
||||
- Fully configured project in Production stage
|
||||
- `art-bible.md` exists in `design/`
|
||||
- Active sprint has visual story types (animation, VFX)
|
||||
|
||||
**Input:** `/onboard artist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all standard files plus any art-relevant docs (art bible, asset specs)
|
||||
2. Summary is tailored to the artist role: art bible overview, asset pipeline,
|
||||
current visual stories in the active sprint
|
||||
3. Technical architecture details (code structure, ADRs) are de-emphasized
|
||||
4. Specialist agents for art/audio are highlighted in the summary
|
||||
5. Verdict is ONBOARDING COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Role argument is acknowledged in the output ("Onboarding for: Artist")
|
||||
- [ ] Art bible summary is included if the file exists
|
||||
- [ ] Current visual stories from the active sprint are shown
|
||||
- [ ] Technical implementation details are not the primary focus
|
||||
- [ ] Verdict is ONBOARDING COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; onboard is read-only orientation
|
||||
|
||||
**Fixture:**
|
||||
- Any configured project state
|
||||
|
||||
**Input:** `/onboard`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill completes the full onboarding summary
|
||||
2. No director agents are spawned at any point
|
||||
3. No gate IDs appear in the output
|
||||
4. No "May I write" prompts appear
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No write tool is called
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is ONBOARDING COMPLETE without any gate check
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads all source files before generating output (no hallucinated project state)
|
||||
- [ ] Adapts output to project stage (Production ≠ Concept)
|
||||
- [ ] Respects role argument when provided
|
||||
- [ ] Does not write any files
|
||||
- [ ] Ends with ONBOARDING COMPLETE verdict in all paths
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where `technical-preferences.md` is missing entirely (as opposed to
|
||||
having placeholders) is not separately tested; behavior follows the graceful
|
||||
error pattern of Case 3.
|
||||
- Git history reading is assumed available; offline/no-git scenarios are not
|
||||
tested here.
|
||||
- Discipline roles beyond "artist" (e.g., programmer, designer, producer) follow
|
||||
the same tailoring pattern as Case 4 and are not separately tested.
|
||||
178
CCGS Skill Testing Framework/skills/utility/playtest-report.md
Normal file
@@ -0,0 +1,178 @@
|
||||
# Skill Test Spec: /playtest-report
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/playtest-report` generates a structured playtest report from session notes or
|
||||
user input. The report is organized into four sections: Feel/Accessibility,
|
||||
Bugs Observed, Design Feedback, and Next Steps. When multiple testers participated,
|
||||
the skill aggregates feedback and distinguishes majority opinions from minority
|
||||
ones. The skill links to existing bug reports when a reported bug matches a file
|
||||
in `production/bugs/`.
|
||||
|
||||
Reports are written to `production/qa/playtest-[date].md` after a "May I write"
|
||||
ask. No director gates apply here — the CD-PLAYTEST director gate (if needed) is
|
||||
a separate invocation. The verdict is COMPLETE when the report is written.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing the report
|
||||
- [ ] Has a next-step handoff (e.g., `/bug-report` for new issues found, `/design-review` for feedback)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/playtest-report` is a documentation utility. The CD-PLAYTEST gate is a
|
||||
separate invocation and not part of this skill.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — User provides playtest notes, structured report produced
|
||||
|
||||
**Fixture:**
|
||||
- User provides typed playtest notes from a single session
|
||||
- Notes cover: game feel, one bug (framerate drop), and a design concern
|
||||
(tutorial too long)
|
||||
- `production/bugs/` exists but is empty (bug not yet reported)
|
||||
|
||||
**Input:** `/playtest-report` (user pastes session notes)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the provided notes and structures them into the 4-section template
|
||||
2. Feel/Accessibility: extracts feel observations
|
||||
3. Bugs: notes the framerate drop with available repro details
|
||||
4. Design Feedback: notes the tutorial length concern
|
||||
5. Next Steps: suggests `/bug-report` for the framerate issue and `/design-review`
|
||||
for the tutorial feedback
|
||||
6. Skill asks "May I write to `production/qa/playtest-2026-04-06.md`?"
|
||||
7. Report is written on approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 4 sections are present in the report
|
||||
- [ ] Bug is listed in the Bugs section (not the Design Feedback section)
|
||||
- [ ] Next Steps are appropriate (bug report for the framerate drop, design review for the tutorial feedback)
|
||||
- [ ] "May I write" is asked before writing
|
||||
- [ ] Verdict is COMPLETE
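
The four-section structure and the dated output path from this case can be sketched as follows. The helper names `report_path` and `render_report`, the `# Playtest Report` title, and the "(nothing recorded)" filler are assumptions made for illustration; the section names and the `production/qa/playtest-[date].md` pattern come from the spec.

```python
from datetime import date
from typing import Dict, List

# Section names as listed in the Skill Summary.
SECTIONS = ("Feel/Accessibility", "Bugs Observed", "Design Feedback", "Next Steps")


def report_path(session_date: date) -> str:
    """Build the report path using an ISO date, as in the Case 1 example."""
    return f"production/qa/playtest-{session_date.isoformat()}.md"


def render_report(notes: Dict[str, List[str]]) -> str:
    """Render all four sections, even when a section has no entries."""
    lines = ["# Playtest Report", ""]  # title is a hypothetical choice
    for section in SECTIONS:
        lines.append(f"## {section}")
        entries = notes.get(section) or ["(nothing recorded)"]
        lines.extend(f"- {entry}" for entry in entries)
        lines.append("")
    return "\n".join(lines)
```

Rendering every section unconditionally keeps the "all 4 sections are present" assertion easy to check, at the cost of some filler in sparse reports.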
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Empty Input — Guided prompting through each section
|
||||
|
||||
**Fixture:**
|
||||
- No notes provided by user at invocation
|
||||
|
||||
**Input:** `/playtest-report`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects empty input
|
||||
2. Skill prompts through each section:
|
||||
a. "Describe the overall feel and any accessibility observations"
|
||||
b. "Were any bugs observed? Describe them"
|
||||
c. "What design feedback did testers provide?"
|
||||
3. User answers each prompt
|
||||
4. Skill compiles report from answers and asks "May I write"
|
||||
5. Report written on approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] At least 3 guiding questions are asked (one per main section)
|
||||
- [ ] Report is not created until all sections have input (or user explicitly skips one)
|
||||
- [ ] Verdict is COMPLETE after file is written
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Multiple Testers — Aggregated feedback with majority/minority notes
|
||||
|
||||
**Fixture:**
|
||||
- User provides notes from 3 testers
|
||||
- 2/3 testers found the controls "intuitive"
|
||||
- 1/3 tester found the UI font too small
|
||||
- All 3 noted the same bug (player stuck on ledge)
|
||||
|
||||
**Input:** `/playtest-report` (3-tester session)
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill identifies 3 distinct tester perspectives in the input
|
||||
2. Control intuitiveness → noted as "Majority (2/3): controls intuitive"
|
||||
3. Font size → noted as "Minority (1/3): UI font size concern"
|
||||
4. Stuck-on-ledge bug → noted as "All testers: player stuck on ledge (confirmed)"
|
||||
5. Skill generates aggregated report with majority/minority labels
|
||||
6. Report written after "May I write" approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Majority opinion (2/3) is labeled as majority
|
||||
- [ ] Minority opinion (1/3) is labeled as minority
|
||||
- [ ] Unanimously reported bug is noted as confirmed by all testers
|
||||
- [ ] Verdict is COMPLETE
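
The majority/minority labeling expected in this case might look like the sketch below. The function name `label_opinion` is hypothetical, and treating an exact tie as a minority is an assumption the spec does not address.

```python
def label_opinion(count: int, total: int) -> str:
    """Label how many of the testers shared an observation."""
    if total <= 0 or not 0 < count <= total:
        raise ValueError("count must be between 1 and total")
    if count == total:
        return "All testers"
    share = f"({count}/{total})"
    # Strictly more than half counts as a majority; a tie is treated as
    # a minority here (assumption, not stated in the spec).
    if count * 2 > total:
        return f"Majority {share}"
    return f"Minority {share}"
```

With the fixture above, `label_opinion(2, 3)` covers the controls feedback, `label_opinion(1, 3)` the font-size concern, and `label_opinion(3, 3)` the stuck-on-ledge bug.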
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Bug Matches Existing Report — Links to existing file
|
||||
|
||||
**Fixture:**
|
||||
- `production/bugs/bug-2026-03-30-player-stuck-ledge.md` exists
|
||||
- User's playtest notes describe "player gets stuck on ledges near walls"
|
||||
|
||||
**Input:** `/playtest-report`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill structures the report and identifies the stuck-on-ledge bug
|
||||
2. Skill scans `production/bugs/` and finds `bug-2026-03-30-player-stuck-ledge.md`
|
||||
3. In the Bugs section, the report includes: "See existing report:
|
||||
production/bugs/bug-2026-03-30-player-stuck-ledge.md"
|
||||
4. Skill does NOT suggest creating a new bug report for this issue
|
||||
5. Report written; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Existing bug report is found and linked in the playtest report
|
||||
- [ ] `/bug-report` is NOT suggested for the already-reported issue
|
||||
- [ ] Cross-reference to existing file appears in the Bugs section
|
||||
- [ ] Verdict is COMPLETE
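
The spec does not say how a described bug is matched against files in `production/bugs/`. One possible approach, shown purely as an assumption, is keyword overlap between the description and the file name. The names `slug_words` and `find_existing_report`, the crude plural stripping, and the `min_overlap` threshold are all hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import List, Optional, Set

IGNORED = {"bug", "md"}  # file-name noise, not descriptive keywords


def slug_words(text: str) -> Set[str]:
    """Lowercase alphabetic words with a naive trailing-'s' strip."""
    words = re.findall(r"[a-z]+", text.lower())
    stemmed = {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}
    return stemmed - IGNORED


def find_existing_report(description: str, bug_files: List[str],
                         min_overlap: int = 2) -> Optional[str]:
    """Return the bug file sharing the most keywords, or None if too few match."""
    wanted = slug_words(description)
    best, best_score = None, 0
    for path in bug_files:
        score = len(wanted & slug_words(PurePosixPath(path).name))
        if score > best_score:
            best, best_score = path, score
    return best if best_score >= min_overlap else None
```

When a match is returned, the report would link that path in the Bugs section instead of suggesting `/bug-report`.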
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; CD-PLAYTEST is a separate invocation
|
||||
|
||||
**Fixture:**
|
||||
- Playtest notes provided
|
||||
|
||||
**Input:** `/playtest-report`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill generates and writes the playtest report
|
||||
2. No director agents are spawned (CD-PLAYTEST is not invoked here)
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No CD-PLAYTEST gate skip message appears
|
||||
- [ ] Verdict is COMPLETE without any gate check
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Structures output into all 4 sections (Feel, Bugs, Design Feedback, Next Steps)
|
||||
- [ ] Labels majority vs. minority opinions when multiple testers are involved
|
||||
- [ ] Cross-references existing bug reports when bugs match
|
||||
- [ ] Asks "May I write to `production/qa/playtest-[date].md`?" before writing
|
||||
- [ ] Verdict is COMPLETE when report is written
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The CD-PLAYTEST director gate (creative director reviews playtest insights
|
||||
for design implications) is a separate invocation and is not tested here.
|
||||
- Video recording or screenshot attachments are not tested; the report is a
|
||||
text-only document.
|
||||
- The case where a tester's identity is unknown (anonymous feedback) follows
|
||||
the same aggregation pattern as Case 3 without tester labels.
|
||||
@@ -0,0 +1,183 @@
|
||||
# Skill Test Spec: /project-stage-detect
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/project-stage-detect` automatically analyzes project artifacts to determine
|
||||
the current development stage. It runs on the Haiku model (read-only) and
|
||||
examines `production/stage.txt` (if present), design documents in `design/`,
|
||||
source code in `src/`, sprint and milestone files in `production/`, and the
|
||||
presence of engine configuration to classify the project into one of seven
|
||||
stages: Concept, Systems Design, Technical Setup, Pre-Production, Production,
|
||||
Polish, or Release.
|
||||
|
||||
The skill is advisory — it never writes `stage.txt`. That file is only updated
|
||||
when `/gate-check` passes and the user confirms advancement. The skill reports
|
||||
its confidence level (HIGH if stage.txt was read directly, MEDIUM if inferred
|
||||
from artifacts, LOW if conflicting signals were found).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains all seven stage names: Concept, Systems Design, Technical Setup, Pre-Production, Production, Polish, Release
|
||||
- [ ] Does NOT contain "May I write" language (skill is detection-only)
|
||||
- [ ] Has a next-step handoff (e.g., `/gate-check` to formally advance stage)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/project-stage-detect` is a read-only detection utility. No director
|
||||
gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: stage.txt Exists — Reads directly and cross-checks artifacts
|
||||
|
||||
**Fixture:**
|
||||
- `production/stage.txt` contains `Production`
|
||||
- `design/gdd/` has 4 GDD files
|
||||
- `src/` has source code files
|
||||
- `production/sprints/sprint-002.md` exists
|
||||
|
||||
**Input:** `/project-stage-detect`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `production/stage.txt` — detects stage `Production`
|
||||
2. Skill cross-checks artifacts: GDDs present, source code present, sprint present
|
||||
3. Artifacts are consistent with Production stage
|
||||
4. Skill reports: Stage = Production, Confidence = HIGH (from stage.txt, confirmed by artifacts)
|
||||
5. Next step: continue with `/sprint-plan` or `/dev-story`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Detected stage is Production
|
||||
- [ ] Confidence is reported as HIGH when stage.txt is present and the artifacts are consistent with it
|
||||
- [ ] Cross-check result (consistent vs. discrepant) is noted
|
||||
- [ ] No files are written
|
||||
- [ ] Verdict clearly states the detected stage
|
||||
|
||||
---
|
||||
|
||||
### Case 2: No stage.txt but GDDs and Epics Exist — Infers Production
|
||||
|
||||
**Fixture:**
|
||||
- No `production/stage.txt`
|
||||
- `design/gdd/` has 3 GDD files
|
||||
- `production/epics/` has 2 epic files
|
||||
- `src/` has source code files
|
||||
- `production/sprints/sprint-001.md` exists
|
||||
|
||||
**Input:** `/project-stage-detect`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds no stage.txt — switches to artifact inference mode
|
||||
2. Skill finds GDDs (Systems Design complete), epics (Pre-Production complete),
|
||||
source code and sprints (Production active)
|
||||
3. Skill infers: Stage = Production
|
||||
4. Confidence is MEDIUM (inferred from artifacts, not from stage.txt)
|
||||
5. Skill recommends running `/gate-check` to formalize and write stage.txt
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Inferred stage is Production
|
||||
- [ ] Confidence is MEDIUM (not HIGH, since stage.txt is absent)
|
||||
- [ ] Recommendation to run `/gate-check` is present
|
||||
- [ ] No stage.txt is written by this skill
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No stage.txt, No Docs, No Source — Infers Concept
|
||||
|
||||
**Fixture:**
|
||||
- No `production/stage.txt`
|
||||
- `design/` directory exists but is empty
|
||||
- `src/` exists but contains no code files
|
||||
- `technical-preferences.md` has placeholders only
|
||||
|
||||
**Input:** `/project-stage-detect`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds no stage.txt
|
||||
2. Artifact scan: no GDDs, no source, no epics, no sprints, engine unconfigured
|
||||
3. Skill infers: Stage = Concept
|
||||
4. Confidence is MEDIUM
|
||||
5. Skill suggests `/start` to begin the onboarding workflow
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Inferred stage is Concept
|
||||
- [ ] Output lists the artifacts that were checked (and found absent)
|
||||
- [ ] `/start` is suggested as the next step
|
||||
- [ ] No files are written
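
The artifact-inference rules exercised by Cases 2 and 3 can be sketched as below. Only the two outcomes covered by those fixtures are encoded; the `Artifacts` dataclass and the `infer_stage` name are hypothetical, and the sketch returns `None` for any combination the fixtures do not describe rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifacts:
    """Flags for what the artifact scan found (illustrative names)."""
    has_gdds: bool = False
    has_epics: bool = False
    has_source: bool = False
    has_sprints: bool = False
    engine_configured: bool = False


def infer_stage(found: Artifacts) -> Optional[str]:
    """Infer a stage from artifacts when stage.txt is absent."""
    # Case 2: GDDs, epics, source code, and sprints all present.
    if found.has_gdds and found.has_epics and found.has_source and found.has_sprints:
        return "Production"
    # Case 3: nothing found and the engine is unconfigured.
    if not (found.has_gdds or found.has_epics or found.has_source
            or found.has_sprints or found.engine_configured):
        return "Concept"
    # Other stages are not fixture-tested, so this sketch does not guess.
    return None
```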
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Discrepancy — stage.txt says Production but no source code
|
||||
|
||||
**Fixture:**
|
||||
- `production/stage.txt` contains `Production`
|
||||
- `design/gdd/` has GDD files
|
||||
- `src/` directory exists but contains no source code files
|
||||
- No sprint files exist
|
||||
|
||||
**Input:** `/project-stage-detect`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads stage.txt — detects `Production`
|
||||
2. Cross-check finds: no source code, no sprints — inconsistent with Production
|
||||
3. Skill flags discrepancy: "stage.txt says Production but no source code or sprints found"
|
||||
4. Skill reports detected stage as Production (honoring stage.txt) but
|
||||
confidence drops to LOW due to artifact mismatch
|
||||
5. Skill suggests reviewing stage.txt manually or running `/gate-check`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Discrepancy is flagged explicitly in the output
|
||||
- [ ] Confidence is LOW when artifacts contradict stage.txt
|
||||
- [ ] stage.txt value is not silently overridden
|
||||
- [ ] User is advised to verify the discrepancy manually
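
The confidence rules from the Skill Summary, as exercised by Cases 1 through 4, reduce to a small decision function. The name `confidence_level` and its two boolean parameters are assumptions made for this sketch.

```python
def confidence_level(stage_txt_present: bool, conflicting_signals: bool) -> str:
    """Map detection inputs to HIGH / MEDIUM / LOW."""
    if conflicting_signals:
        return "LOW"  # Case 4: artifacts contradict stage.txt
    # Case 1: read directly from stage.txt; Cases 2 and 3: inferred from artifacts.
    return "HIGH" if stage_txt_present else "MEDIUM"
```

Checking for conflicts first means a present but contradicted `stage.txt` never reports HIGH, which matches the Case 4 assertions.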
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; detection is advisory
|
||||
|
||||
**Fixture:**
|
||||
- Any project state with or without stage.txt
|
||||
|
||||
**Input:** `/project-stage-detect`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill completes full stage detection
|
||||
2. No director agents are spawned at any point
|
||||
3. No gate IDs appear in output
|
||||
4. No write tool is called
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No write tool is called
|
||||
- [ ] Detection output is purely advisory
|
||||
- [ ] Verdict names the detected stage without triggering any gate
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads stage.txt if present; falls back to artifact inference if absent
|
||||
- [ ] Always reports a confidence level (HIGH / MEDIUM / LOW)
|
||||
- [ ] Cross-checks stage.txt against artifacts and flags discrepancies
|
||||
- [ ] Does not write stage.txt (that is `/gate-check`'s responsibility)
|
||||
- [ ] Ends with a next-step recommendation appropriate to the detected stage
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The Technical Setup stage (engine configured, no GDDs yet) and Pre-Production
|
||||
stage (GDDs complete, no epics yet) follow the same artifact-inference pattern
|
||||
as Cases 2 and 3 and are not separately fixture-tested.
|
||||
- The Polish and Release stages are not fixture-tested here; they follow the
|
||||
same high-confidence (stage.txt present) or inference logic.
|
||||
- Confidence levels are advisory — the skill does not gate any actions on them.
|
||||
178
CCGS Skill Testing Framework/skills/utility/prototype.md
Normal file
@@ -0,0 +1,178 @@
|
||||
# Skill Test Spec: /prototype
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/prototype` manages a rapid prototyping workflow for validating a game mechanic
|
||||
before committing to full production implementation. Prototypes are created in
|
||||
`prototypes/[mechanic-name]/` and are intentionally disposable — coding standards
|
||||
are relaxed (no ADR required, AC can be minimal, hardcoded values acceptable).
|
||||
After implementation, the skill produces a findings document summarizing what
|
||||
was learned and recommending next steps.
|
||||
|
||||
The skill asks "May I write to `prototypes/[name]/`?" before creating files. If a
|
||||
prototype already exists, the skill offers to extend, replace, or archive. No
|
||||
director gates apply. Verdicts: PROTOTYPE COMPLETE (prototype built and findings
|
||||
documented) or PROTOTYPE ABANDONED (mechanic found to be unworkable).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: PROTOTYPE COMPLETE, PROTOTYPE ABANDONED
|
||||
- [ ] Contains "May I write" language before creating prototype files
|
||||
- [ ] Has a next-step handoff (e.g., `/design-system` to formalize, or archive)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Prototypes are throwaway validation artifacts. No director gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Mechanic concept prototyped, findings documented
|
||||
|
||||
**Fixture:**
|
||||
- `prototypes/` directory exists
|
||||
- No existing prototype for "grapple-hook"
|
||||
|
||||
**Input:** `/prototype grapple-hook`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill asks "May I write to `prototypes/grapple-hook/`?"
|
||||
2. After approval: creates `prototypes/grapple-hook/` directory and basic
|
||||
implementation skeleton (main scene, player controller extension)
|
||||
3. Skill implements a minimal grapple hook mechanic (intentionally rough — no
|
||||
polish, hardcoded values acceptable)
|
||||
4. Skill produces `prototypes/grapple-hook/findings.md` with:
|
||||
- What was tested
|
||||
- What worked
|
||||
- What didn't work
|
||||
- Recommendation (proceed / abandon / revise concept)
|
||||
5. Verdict is PROTOTYPE COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] "May I write to `prototypes/grapple-hook/`?" is asked before any files are created
|
||||
- [ ] Implementation is isolated to `prototypes/` (not `src/`)
|
||||
- [ ] `findings.md` is created with at minimum: tested/worked/didn't-work/recommendation
|
||||
- [ ] Verdict is PROTOTYPE COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Prototype Already Exists — Offers Extend, Replace, or Archive
|
||||
|
||||
**Fixture:**
|
||||
- `prototypes/grapple-hook/` already exists from a previous prototype session
|
||||
- It contains a basic implementation and a findings.md
|
||||
|
||||
**Input:** `/prototype grapple-hook`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill detects existing `prototypes/grapple-hook/` directory
|
||||
2. Skill reports: "Prototype already exists for grapple-hook"
|
||||
3. Skill presents 3 options:
|
||||
- Extend: add new features to the existing prototype
|
||||
- Replace: start fresh (asks "May I replace `prototypes/grapple-hook/`?")
|
||||
- Archive: move to `prototypes/archive/grapple-hook/` and start fresh
|
||||
4. User selects; skill proceeds accordingly
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Existing prototype is detected and reported
|
||||
- [ ] Exactly 3 options are presented (extend, replace, archive)
|
||||
- [ ] Replace path includes a "May I replace" confirmation
|
||||
- [ ] Archive path moves (not deletes) the existing prototype
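
The Archive option's "move, not delete" requirement could be implemented along the lines of this sketch. The function name `archive_prototype` and the choice to raise when an archived copy already exists are assumptions; the `prototypes/archive/<name>/` destination comes from the case above.

```python
import shutil
from pathlib import Path


def archive_prototype(root: Path, name: str) -> Path:
    """Move prototypes/<name>/ to prototypes/archive/<name>/ and return the new path."""
    source = root / name
    target = root / "archive" / name
    if not source.is_dir():
        raise FileNotFoundError(f"no prototype at {source}")
    if target.exists():
        # Assumption: never overwrite an earlier archive silently.
        raise FileExistsError(f"archive already contains {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))  # moves the whole directory tree
    return target
```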
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Prototype Validates Mechanic — Recommends Proceeding to Production
|
||||
|
||||
**Fixture:**
|
||||
- Prototype implementation complete
|
||||
- Findings: grapple hook mechanic is fun and technically feasible
|
||||
|
||||
**Input:** `/prototype grapple-hook` (prototype session complete)
|
||||
|
||||
**Expected behavior:**
|
||||
1. After prototype is built and tested, findings are summarized
|
||||
2. Recommendation in findings.md: "Mechanic validated — recommend proceeding
|
||||
to `/design-system` for full specification"
|
||||
3. Skill handoff message explicitly suggests `/design-system grapple-hook`
|
||||
4. Verdict is PROTOTYPE COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] `findings.md` contains an explicit recommendation
|
||||
- [ ] Recommendation references `/design-system` when mechanic is validated
|
||||
- [ ] Handoff message echoes the recommendation
|
||||
- [ ] Verdict is PROTOTYPE COMPLETE (not PROTOTYPE ABANDONED)
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Prototype Reveals Mechanic is Unworkable — PROTOTYPE ABANDONED
|
||||
|
||||
**Fixture:**
|
||||
- Prototype implemented for "procedural-dialogue"
|
||||
- After testing: the mechanic creates incoherent dialogue trees and is
|
||||
frustrating to play
|
||||
|
||||
**Input:** `/prototype procedural-dialogue`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Prototype is built
|
||||
2. Findings document the failure: incoherent output, player confusion, technical complexity
|
||||
3. Recommendation in findings.md: "Mechanic not viable — abandoning"
|
||||
4. `findings.md` documents the specific reasons the mechanic failed
|
||||
5. Skill suggests alternatives in the handoff (e.g., curated dialogue instead)
|
||||
6. Verdict is PROTOTYPE ABANDONED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is PROTOTYPE ABANDONED (not PROTOTYPE COMPLETE)
|
||||
- [ ] `findings.md` documents specific failure reasons (not vague)
|
||||
- [ ] Alternative approaches are suggested in the handoff
|
||||
- [ ] Prototype files are retained (not deleted) for reference
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; prototypes are validation artifacts
|
||||
|
||||
**Fixture:**
|
||||
- Mechanic concept provided
|
||||
|
||||
**Input:** `/prototype wall-jump`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill creates and documents the prototype
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Asks "May I write to `prototypes/[name]/`?" before creating any files
|
||||
- [ ] Creates all files under `prototypes/` (not `src/`)
|
||||
- [ ] Produces `findings.md` with tested/worked/didn't-work/recommendation
|
||||
- [ ] Notes that production coding standards are intentionally relaxed
|
||||
- [ ] Offers extend/replace/archive when prototype already exists
|
||||
- [ ] Verdict is PROTOTYPE COMPLETE or PROTOTYPE ABANDONED
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Prototype implementation quality (code style) is intentionally not tested —
|
||||
prototypes are throwaway artifacts and quality standards do not apply.
|
||||
- The archiving mechanism is mentioned in Case 2 but the archive format is
|
||||
not assertion-tested in detail.
|
||||
- Engine-specific prototype scaffolding (GDScript scenes vs. C# MonoBehaviour)
|
||||
follows the same flow with engine-appropriate file types.
|
||||
175
CCGS Skill Testing Framework/skills/utility/qa-plan.md
Normal file
@@ -0,0 +1,175 @@
|
||||
# Skill Test Spec: /qa-plan
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/qa-plan` generates a structured QA test plan for a feature or sprint milestone.
|
||||
It reads story files for the specified sprint, extracts acceptance criteria from
|
||||
each story, cross-references test standards from `coding-standards.md` to assign
|
||||
the appropriate test type (unit, integration, visual, UI, or config/data), and
|
||||
produces a prioritized QA plan document.
|
||||
|
||||
The skill asks "May I write to `production/qa/qa-plan-sprint-NNN.md`?" before
|
||||
persisting the output. If an existing test plan for the same sprint is found, the
|
||||
skill offers to update rather than replace. The verdict is COMPLETE when the plan
|
||||
is written. No director gates are used — gate-level story readiness is handled by
|
||||
`/story-readiness`.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing the plan
|
||||
- [ ] Has a next-step handoff (e.g., `/smoke-check` or `/story-readiness`)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/qa-plan` is a planning utility. Story readiness gates are separate.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Sprint with 4 stories generates full test plan
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-003.md` lists 4 stories with defined acceptance criteria
|
||||
- Stories span types: 1 logic (formula), 1 integration, 1 visual, 1 UI
|
||||
- `coding-standards.md` is present with test evidence table
|
||||
|
||||
**Input:** `/qa-plan sprint-003`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint-003.md and identifies 4 stories
|
||||
2. Skill reads each story's acceptance criteria
|
||||
3. Skill assigns test types per coding-standards.md table:
|
||||
- Logic story → Unit test (BLOCKING)
|
||||
- Integration story → Integration test (BLOCKING)
|
||||
- Visual story → Screenshot + lead sign-off (ADVISORY)
|
||||
- UI story → Manual walkthrough doc (ADVISORY)
|
||||
4. Skill drafts QA plan with story-by-story test type breakdown
|
||||
5. Skill asks "May I write to `production/qa/qa-plan-sprint-003.md`?"
|
||||
6. File is written on approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 4 stories are included in the plan
|
||||
- [ ] Test type is assigned per coding-standards.md (not guessed)
|
||||
- [ ] Gate level (BLOCKING vs ADVISORY) is noted for each story
|
||||
- [ ] "May I write" is asked with the correct file path
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Story With No Acceptance Criteria — Flagged as UNTESTABLE
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` lists 3 stories; one story has empty
|
||||
acceptance criteria section
|
||||
|
||||
**Input:** `/qa-plan sprint-004`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all 3 stories
|
||||
2. Skill detects the story with no AC
|
||||
3. Story is flagged as `UNTESTABLE — Acceptance Criteria required` in the plan
|
||||
4. Other 2 stories receive normal test type assignments
|
||||
5. Plan is written with the UNTESTABLE story flagged; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] UNTESTABLE label appears for the story with no AC
|
||||
- [ ] Plan is not blocked — the other stories are still planned
|
||||
- [ ] Output suggests adding AC to the flagged story (next step)
|
||||
- [ ] Verdict is COMPLETE (the plan is still generated)
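
The test-type assignment from Case 1 and the UNTESTABLE flag from Case 2 can be sketched together. The mapping values are taken from the Case 1 expected behavior; the dictionary name `TEST_EVIDENCE`, the `plan_entry` function, and the story-type keys are hypothetical, and the real table lives in `coding-standards.md`.

```python
from typing import Dict, List

# Story type -> (test type, gate level), as listed in Case 1.
TEST_EVIDENCE = {
    "logic": ("Unit test", "BLOCKING"),
    "integration": ("Integration test", "BLOCKING"),
    "visual": ("Screenshot + lead sign-off", "ADVISORY"),
    "ui": ("Manual walkthrough doc", "ADVISORY"),
}


def plan_entry(story_type: str, acceptance_criteria: List[str]) -> Dict[str, str]:
    """Build one QA plan row; stories without AC are flagged, not skipped."""
    if not acceptance_criteria:
        return {"test": "UNTESTABLE", "gate": "", "note": "Acceptance Criteria required"}
    test, gate = TEST_EVIDENCE[story_type]
    return {"test": test, "gate": gate, "note": ""}
```

Checking for empty acceptance criteria before the lookup keeps the plan from being blocked by a single incomplete story.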
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Existing Test Plan Found — Offers update rather than replace
|
||||
|
||||
**Fixture:**
|
||||
- `production/qa/qa-plan-sprint-003.md` already exists from a previous run
|
||||
- Sprint-003 has 2 new stories added since the last plan
|
||||
|
||||
**Input:** `/qa-plan sprint-003`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint-003.md and detects 2 stories not in the existing plan
|
||||
2. Skill reports: "Existing QA plan found for sprint-003 — offering to update"
|
||||
3. Skill presents the 2 new stories and their proposed test assignments
|
||||
4. Skill asks "May I update `production/qa/qa-plan-sprint-003.md`?" (not overwrite)
|
||||
5. Updated plan is written on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects the existing plan file
|
||||
- [ ] "update" language is used (not "overwrite")
|
||||
- [ ] Only new stories are proposed for addition — existing entries preserved
|
||||
- [ ] Verdict is COMPLETE
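
The update path in this case depends on finding stories that are in the sprint but not yet in the existing plan. A minimal sketch, assuming stories are identified by plain string IDs (the function name `stories_to_add` and the ID format are hypothetical):

```python
from typing import List


def stories_to_add(sprint_stories: List[str], planned_stories: List[str]) -> List[str]:
    """Return sprint stories missing from the existing plan, in sprint order."""
    planned = set(planned_stories)
    return [story for story in sprint_stories if story not in planned]
```

Only the returned stories would be proposed for addition, so existing plan entries stay untouched.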
|
||||
|
||||
---
|
||||
|
||||
### Case 4: No Stories Found for Sprint — Error with guidance
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-007.md` does not exist
|
||||
- No other sprint file matches sprint-007
|
||||
|
||||
**Input:** `/qa-plan sprint-007`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read sprint-007.md — file not found
|
||||
2. Skill outputs: "No sprint file found for sprint-007"
|
||||
3. Skill suggests running `/sprint-plan` to create the sprint first
|
||||
4. No plan is written; no "May I write" is asked
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Error message names the missing sprint file
|
||||
- [ ] `/sprint-plan` is suggested as the remediation step
|
||||
- [ ] No write tool is called
|
||||
- [ ] Verdict is not COMPLETE (error state)
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; QA planning is a utility
|
||||
|
||||
**Fixture:**
|
||||
- Sprint with valid stories and AC
|
||||
|
||||
**Input:** `/qa-plan sprint-003`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill generates and writes QA plan
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Skill reaches COMPLETE without any gate check
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads coding-standards.md test evidence table before assigning test types
|
||||
- [ ] Assigns BLOCKING or ADVISORY gate level per story type
|
||||
- [ ] Flags stories with no AC as UNTESTABLE (does not silently skip them)
|
||||
- [ ] Detects existing plan and offers update path
|
||||
- [ ] Asks "May I write" before creating or updating the plan file
|
||||
- [ ] Verdict is COMPLETE when plan is written
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where `coding-standards.md` is missing (skill cannot assign test types)
|
||||
is not fixture-tested; behavior would follow the BLOCKED pattern with a note
|
||||
to restore the standards file.
|
||||
- Multi-sprint planning (spanning 2 sprints) is not tested; the skill is designed
|
||||
for one sprint at a time.
|
||||
- Config/data story type (balance tuning → smoke check) follows the same
|
||||
assignment pattern as other types in Case 1 and is not separately tested.
|
||||
172
CCGS Skill Testing Framework/skills/utility/regression-suite.md
Normal file
@@ -0,0 +1,172 @@
|
||||
# Skill Test Spec: /regression-suite
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/regression-suite` maps test coverage to GDD requirements: it reads the
|
||||
acceptance criteria from story files in the current sprint (or a specified epic),
|
||||
then scans `tests/` for corresponding test files and checks whether each AC has
|
||||
a matching assertion. It produces a coverage report identifying which ACs are
|
||||
fully covered, partially covered, or untested, and which test files have no
|
||||
matching AC (orphan tests).
|
||||
|
||||
The skill may write a coverage report to `production/qa/` after a "May I write"
|
||||
ask. No director gates apply. Verdicts: FULL COVERAGE (all ACs have tests),
|
||||
GAPS FOUND (some ACs are untested), or CRITICAL GAPS (a critical-priority AC
|
||||
has no test).
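
The three verdicts can be read as a priority order: a critical untested AC outranks any other gap. A minimal Python sketch of that ordering, assuming each AC carries a boolean `covered` flag and a boolean `critical` flag (the `AC` class and `coverage_verdict` function are hypothetical names; how partially covered ACs affect the verdict is not stated in the spec and is not modeled here):

```python
from dataclasses import dataclass


@dataclass
class AC:
    text: str
    critical: bool   # True when the owning story is Priority: Critical
    covered: bool    # True when at least one test assertion matches


def coverage_verdict(acs):
    untested = [ac for ac in acs if not ac.covered]
    if any(ac.critical for ac in untested):
        return "CRITICAL GAPS"      # one critical untested AC is enough
    if untested:
        return "GAPS FOUND"
    return "FULL COVERAGE"
```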
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: FULL COVERAGE, GAPS FOUND, CRITICAL GAPS
|
||||
- [ ] Contains "May I write" language (skill may write coverage report)
|
||||
- [ ] Has a next-step handoff (e.g., `/test-setup` if framework missing, `/qa-plan` if plan missing)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/regression-suite` is a QA analysis utility. No director gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Full Coverage — All ACs in sprint have corresponding tests
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-004.md` lists 3 stories with 2 ACs each (6 total)
|
||||
- `tests/unit/` and `tests/integration/` contain test files that match all 6 ACs
|
||||
(by system name and scenario description)
|
||||
|
||||
**Input:** `/regression-suite sprint-004`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all 6 ACs from sprint-004 stories
|
||||
2. Skill scans test files and matches each AC to at least one test assertion
|
||||
3. All 6 ACs have coverage
|
||||
4. Skill produces coverage report: "6/6 ACs covered"
|
||||
5. Skill asks "May I write to `production/qa/regression-sprint-004.md`?"
|
||||
6. File is written on approval; verdict is FULL COVERAGE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 6 ACs appear in the coverage report
|
||||
- [ ] Each AC is marked as covered with the matching test file referenced
|
||||
- [ ] Verdict is FULL COVERAGE
|
||||
- [ ] "May I write" is asked before writing the report
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Gaps Found — 3 ACs have no tests
|
||||
|
||||
**Fixture:**
|
||||
- Sprint has 5 stories with 8 total ACs
|
||||
- Tests exist for 5 of the 8 ACs; 3 ACs have no corresponding test file or assertion
|
||||
|
||||
**Input:** `/regression-suite`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all 8 ACs
|
||||
2. Skill scans tests — 5 matched, 3 unmatched
|
||||
3. Coverage report lists the 3 untested ACs by story and AC text
|
||||
4. Skill asks "May I write to `production/qa/regression-[sprint]-[date].md`?"
|
||||
5. Report is written; verdict is GAPS FOUND
|
||||
|
||||
**Assertions:**
|
||||
- [ ] The 3 untested ACs are listed by name in the report
|
||||
- [ ] Matched ACs are also shown (not only the gaps)
|
||||
- [ ] Verdict is GAPS FOUND (not FULL COVERAGE)
|
||||
- [ ] Report is written after "May I write" approval
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Critical AC Untested — CRITICAL GAPS verdict, flagged prominently
|
||||
|
||||
**Fixture:**
|
||||
- Sprint has 4 stories; one story is Priority: Critical with 2 ACs
|
||||
- One of the critical-priority ACs has no test
|
||||
|
||||
**Input:** `/regression-suite`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all stories and ACs, noting which stories are critical priority
|
||||
2. Skill scans tests — the critical AC has no match
|
||||
3. Report prominently flags: "CRITICAL GAP: [AC text] — no test found (Critical priority story)"
|
||||
4. Skill recommends blocking story completion until test is added
|
||||
5. Verdict is CRITICAL GAPS
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is CRITICAL GAPS (not GAPS FOUND)
|
||||
- [ ] Critical priority AC is flagged more prominently than normal gaps
|
||||
- [ ] Recommendation to block story completion is included
|
||||
- [ ] Non-critical gaps (if any) are also listed
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Orphan Tests — Test file has no matching AC
|
||||
|
||||
**Fixture:**
|
||||
- `tests/unit/save_system_test.gd` exists with assertions for scenarios
|
||||
not present in any current story's AC list
|
||||
- Current sprint stories do not reference save system
|
||||
|
||||
**Input:** `/regression-suite`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans tests and cross-references ACs
|
||||
2. `save_system_test.gd` assertions do not match any current AC
|
||||
3. Test file is flagged as ORPHAN TEST in the coverage report
|
||||
4. Report notes: "Orphan tests may belong to a past or future sprint, or AC was renamed"
|
||||
5. Verdict is FULL COVERAGE or GAPS FOUND depending on overall AC coverage
|
||||
(orphan tests do not affect the verdict; they are advisory)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Orphan test is flagged in the report
|
||||
- [ ] Orphan flag includes the filename and suggestion (past sprint / renamed AC)
|
||||
- [ ] Orphan tests do not cause a GAPS FOUND verdict on their own
|
||||
- [ ] Overall verdict reflects AC coverage only
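
Orphan detection in Case 4 is the reverse lookup of AC matching: a test file is an orphan when none of its assertions match any current AC. The sketch below is one possible illustration in Python, assuming a crude keyword-overlap match. The real matching logic lives in the skill body; `keywords`, `find_orphans`, and the `min_overlap` threshold are hypothetical.

```python
import re


def keywords(text):
    """Lowercased alphanumeric tokens of a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_orphans(test_descriptions, ac_texts, min_overlap=2):
    """test_descriptions maps a test filename to its assertion descriptions."""
    ac_keys = [keywords(text) for text in ac_texts]
    orphans = []
    for filename, descriptions in test_descriptions.items():
        matched = any(
            len(keywords(desc) & ac_key) >= min_overlap
            for desc in descriptions
            for ac_key in ac_keys
        )
        if not matched:
            orphans.append(filename)   # advisory only; does not change the verdict
    return sorted(orphans)


# Hypothetical fixture data mirroring Case 4.
tests = {
    "tests/unit/save_system_test.gd": ["save slot persists after reload"],
    "tests/unit/health_test.gd": ["health clamps at zero after damage"],
}
acs = ["Health clamps at zero when damage exceeds current health"]
orphans = find_orphans(tests, acs)
```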
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; regression-suite is a QA utility
|
||||
|
||||
**Fixture:**
|
||||
- Sprint with stories and test files
|
||||
|
||||
**Input:** `/regression-suite`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill produces coverage report and writes it
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads story ACs from sprint files before scanning tests
|
||||
- [ ] Matches ACs to tests by system name and scenario (not file name alone)
|
||||
- [ ] Flags critical-priority untested ACs as CRITICAL GAPS
|
||||
- [ ] Flags orphan tests (exist in tests/ but no AC matches)
|
||||
- [ ] Asks "May I write" before persisting the coverage report
|
||||
- [ ] Verdict is FULL COVERAGE, GAPS FOUND, or CRITICAL GAPS
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The heuristic for matching an AC to a test (by system name + scenario keywords)
|
||||
is approximate; exact matching logic is defined in the skill body.
|
||||
- Integration test coverage is mapped the same way as unit test coverage; no
|
||||
distinction in verdicts is made between the two.
|
||||
- This skill does not run the tests — it maps AC text to test assertions. Test
|
||||
execution is handled by the CI pipeline.
|
||||
177
CCGS Skill Testing Framework/skills/utility/release-checklist.md
Normal file
@@ -0,0 +1,177 @@
|
||||
# Skill Test Spec: /release-checklist
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/release-checklist` generates an internal release readiness checklist covering:
|
||||
sprint story completion, open bug severity, QA sign-off status, build stability,
|
||||
and changelog readiness. It is an internal gate — not a platform/store checklist
|
||||
(that is `/launch-checklist`). When a previous release checklist exists, it shows
|
||||
a delta of resolved and newly introduced issues.
|
||||
|
||||
The skill writes its checklist report to `production/releases/release-checklist-[date].md`
|
||||
after a "May I write" ask. No director gates apply — `/gate-check` handles
|
||||
formal phase gate logic. Verdicts: RELEASE READY, RELEASE BLOCKED, or CONCERNS.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: RELEASE READY, RELEASE BLOCKED, CONCERNS
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing the report
|
||||
- [ ] Has a next-step handoff (e.g., `/launch-checklist` for external or `/gate-check` for phase)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/release-checklist` is an internal audit utility. Formal phase advancement
|
||||
is managed by `/gate-check`.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — All Sprint Stories Complete, QA Passed, RELEASE READY
|
||||
|
||||
**Fixture:**
|
||||
- `production/sprints/sprint-008.md` — all stories are `Status: Done`
|
||||
- No open bugs with severity HIGH or CRITICAL in `production/bugs/`
|
||||
- `production/qa/qa-plan-sprint-008.md` has QA sign-off annotation
|
||||
- Changelog entry for this version exists
|
||||
- `production/stage.txt` contains `Polish`
|
||||
|
||||
**Input:** `/release-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint-008: all stories Done
|
||||
2. Skill reads bugs: no HIGH or CRITICAL open bugs
|
||||
3. Skill confirms QA plan has sign-off
|
||||
4. Skill confirms changelog entry exists
|
||||
5. All checks pass; skill asks "May I write to
|
||||
`production/releases/release-checklist-2026-04-06.md`?"
|
||||
6. Report written; verdict is RELEASE READY
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 4 check categories are evaluated (stories, bugs, QA, changelog)
|
||||
- [ ] All items appear with PASS markers
|
||||
- [ ] Verdict is RELEASE READY
|
||||
- [ ] "May I write" is asked before writing
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Open HIGH Severity Bugs — RELEASE BLOCKED
|
||||
|
||||
**Fixture:**
|
||||
- All sprint stories are Done
|
||||
- `production/bugs/` contains 2 open bugs with severity HIGH
|
||||
|
||||
**Input:** `/release-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint — stories complete
|
||||
2. Skill reads bugs — 2 HIGH severity bugs open
|
||||
3. Skill reports: "RELEASE BLOCKED — 2 open HIGH severity bugs must be resolved"
|
||||
4. Both bug filenames are listed in the report
|
||||
5. Verdict is RELEASE BLOCKED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is RELEASE BLOCKED (not CONCERNS)
|
||||
- [ ] Both bug filenames are listed explicitly
|
||||
- [ ] Skill makes clear that HIGH severity bugs are blocking (not advisory)
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Changelog Not Generated — CONCERNS
|
||||
|
||||
**Fixture:**
|
||||
- All stories Done, no HIGH/CRITICAL bugs
|
||||
- No changelog entry found for the current version/sprint
|
||||
|
||||
**Input:** `/release-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks all items
|
||||
2. Changelog check fails: no changelog entry found
|
||||
3. Skill reports: "CONCERNS — Changelog not generated for this release"
|
||||
4. Skill suggests running `/changelog` to generate it
|
||||
5. Verdict is CONCERNS (advisory — not a hard block)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is CONCERNS (not RELEASE BLOCKED — changelog is advisory)
|
||||
- [ ] `/changelog` is suggested as the remediation
|
||||
- [ ] Other passing checks are shown in the report
|
||||
- [ ] Missing changelog is described as advisory, not blocking
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Previous Release Checklist Exists — Delta From Last Release
|
||||
|
||||
**Fixture:**
|
||||
- `production/releases/release-checklist-2026-03-20.md` exists
|
||||
- Previous: 1 story was incomplete, 1 HIGH bug open
|
||||
- Current: all stories Done, HIGH bug resolved, but now 1 MEDIUM bug appeared
|
||||
|
||||
**Input:** `/release-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds the previous checklist and loads it
|
||||
2. New checklist is generated and compared:
|
||||
- Newly resolved: "Story [X] — was open, now Done"
|
||||
- Newly resolved: "HIGH bug [filename] — was open, now closed"
|
||||
- New item: "1 MEDIUM bug appeared (advisory)"
|
||||
3. Delta section shows all changes prominently
|
||||
4. Verdict is CONCERNS (MEDIUM bug is advisory, not blocking)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Delta section appears in the report with resolved and new items
|
||||
- [ ] Newly resolved items from the previous checklist are noted
|
||||
- [ ] New items not present in the previous checklist are highlighted
|
||||
- [ ] Verdict reflects current state (not previous state)
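
The delta in Case 4 can be thought of as two set differences over the open items of each checklist. A minimal Python sketch, assuming each open item is reduced to a stable string key (the key format, the bug filenames, and the `checklist_delta` helper are hypothetical):

```python
def checklist_delta(previous_open, current_open):
    """Compare the open items of the previous and current checklists."""
    prev, curr = set(previous_open), set(current_open)
    return {
        "resolved": sorted(prev - curr),       # was open, now closed
        "new": sorted(curr - prev),            # not present last time
        "carried_over": sorted(prev & curr),   # still open
    }


# Hypothetical keys mirroring the Case 4 fixture.
delta = checklist_delta(
    previous_open={"story: X incomplete", "bug: HIGH bug-012"},
    current_open={"bug: MEDIUM bug-020"},
)
```

The verdict is still computed from the current items only; the delta is reporting, not an input to the verdict.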
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; release-checklist is an internal audit
|
||||
|
||||
**Fixture:**
|
||||
- Active sprint with stories and bug reports
|
||||
|
||||
**Input:** `/release-checklist`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill runs the full checklist and writes the report
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS — no gate verdict
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Checks sprint story completion status
|
||||
- [ ] Checks open bug severity (CRITICAL/HIGH = BLOCKED; MEDIUM/LOW = CONCERNS)
|
||||
- [ ] Checks QA plan sign-off status
|
||||
- [ ] Checks changelog existence
|
||||
- [ ] Compares against previous checklist when one exists
|
||||
- [ ] Asks "May I write" before writing the report
|
||||
- [ ] Verdict is RELEASE READY, RELEASE BLOCKED, or CONCERNS
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Build stability verification (no failed CI runs) is listed as a check category
|
||||
but relies on external CI system state; the skill notes this as a MANUAL CHECK
|
||||
if CI integration is not configured.
|
||||
- CRITICAL bugs always result in RELEASE BLOCKED regardless of other items;
|
||||
this is equivalent to the HIGH severity case in Case 2.
|
||||
- Stories with `Status: In Review` (not Done) are treated as incomplete
|
||||
and result in RELEASE BLOCKED; this edge case follows the same pattern
|
||||
as the HIGH bug case.
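
Taken together, the cases and notes above imply a two-tier rule: blocking conditions win over advisory ones. A minimal Python sketch of that rule, assuming story statuses and bug severities arrive as plain strings (`release_verdict` is a hypothetical helper; QA sign-off is left out because the spec does not say whether a missing sign-off blocks or only raises concerns):

```python
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}
ADVISORY_SEVERITIES = {"MEDIUM", "LOW"}


def release_verdict(story_statuses, open_bug_severities, changelog_exists):
    # Any story that is not Done (including "In Review") blocks the release.
    stories_incomplete = any(status != "Done" for status in story_statuses)
    blocking_bugs = any(sev in BLOCKING_SEVERITIES for sev in open_bug_severities)
    if stories_incomplete or blocking_bugs:
        return "RELEASE BLOCKED"

    # Advisory findings: lower-severity bugs or a missing changelog entry.
    advisory_bugs = any(sev in ADVISORY_SEVERITIES for sev in open_bug_severities)
    if advisory_bugs or not changelog_exists:
        return "CONCERNS"
    return "RELEASE READY"
```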
|
||||
180
CCGS Skill Testing Framework/skills/utility/reverse-document.md
Normal file
@@ -0,0 +1,180 @@
|
||||
# Skill Test Spec: /reverse-document
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/reverse-document` generates design or architecture documentation from existing
|
||||
source code. It reads the specified source file(s), infers design intent from
|
||||
class structure, method names, constants, and comments, and produces either a
|
||||
GDD skeleton (for gameplay systems) or an architecture overview (for technical
|
||||
systems). The output is a best-effort inference — magic numbers and undocumented
|
||||
logic may result in a PARTIAL verdict.
|
||||
|
||||
The skill asks "May I write to [inferred path]?" before creating the document.
|
||||
No director gates apply. Verdicts: COMPLETE (clean inference), PARTIAL (some
|
||||
fields are ambiguous and need human review).
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: COMPLETE, PARTIAL
|
||||
- [ ] Contains "May I write" collaborative protocol language before writing the doc
|
||||
- [ ] Has a next-step handoff (e.g., `/design-review` to validate the generated doc)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/reverse-document` is a documentation utility. No director gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Well-Structured Source — Accurate design doc skeleton produced
|
||||
|
||||
**Fixture:**
|
||||
- `src/gameplay/health_system.gd` exists with:
|
||||
- `@export var max_health: int = 100`
|
||||
- `func take_damage(amount: int)` with clamping logic
|
||||
- `signal health_changed(new_value: int)`
|
||||
- Docstrings on all public methods
|
||||
|
||||
**Input:** `/reverse-document src/gameplay/health_system.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the source file and identifies the health system
|
||||
2. Skill infers design intent: max health, take_damage behavior, health signal
|
||||
3. Skill produces GDD skeleton for health system with 8 required sections:
|
||||
Overview, Player Fantasy, Detailed Rules, Formulas, Edge Cases, Dependencies,
|
||||
Tuning Knobs, Acceptance Criteria
|
||||
4. Formulas section includes the inferred clamping formula
|
||||
5. Tuning Knobs notes `max_health = 100` as a configurable value
|
||||
6. Skill asks "May I write to `design/gdd/health-system.md`?"
|
||||
7. File written; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] All 8 required GDD sections are present in the output
|
||||
- [ ] `max_health = 100` appears as a Tuning Knob
|
||||
- [ ] Clamping formula is captured in the Formulas section
|
||||
- [ ] "May I write" is asked with the inferred path
|
||||
- [ ] Verdict is COMPLETE
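
For orientation, the clamping behavior that Case 1 expects in the Formulas section might look like the following. This is a Python sketch of an assumed formula, `health = clamp(health - amount, 0, max_health)`; the fixture only says that `take_damage` has clamping logic, so the exact bounds and the listener mechanism standing in for the `health_changed` signal are assumptions.

```python
def clamp(value, low, high):
    return max(low, min(value, high))


class HealthSystem:
    def __init__(self, max_health=100):
        self.max_health = max_health      # the tuning knob from the fixture
        self.health = max_health
        self.listeners = []               # stand-in for the health_changed signal

    def take_damage(self, amount):
        # Assumed formula: health never drops below 0 or exceeds max_health.
        self.health = clamp(self.health - amount, 0, self.max_health)
        for listener in self.listeners:
            listener(self.health)


events = []
hs = HealthSystem()
hs.listeners.append(events.append)
hs.take_damage(30)
hs.take_damage(500)
```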
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Ambiguous Source — Magic Numbers, PARTIAL Verdict
|
||||
|
||||
**Fixture:**
|
||||
- `src/gameplay/enemy_ai.gd` exists with:
|
||||
- Inline magic numbers: `if distance < 150:`, `speed = 3.5`
|
||||
- No comments or docstrings
|
||||
- Complex state machine logic that is not self-explanatory
|
||||
|
||||
**Input:** `/reverse-document src/gameplay/enemy_ai.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the file and detects magic numbers with no context
|
||||
2. Skill produces a GDD skeleton with notes: "AMBIGUOUS VALUE: 150 (unknown units —
|
||||
is this pixels, world units, or tiles?)"
|
||||
3. Skill marks the Formulas and Tuning Knobs sections as requiring human review
|
||||
4. Skill asks "May I write to `design/gdd/enemy-ai.md`?" with PARTIAL advisory
|
||||
5. File written with PARTIAL markers; verdict is PARTIAL
|
||||
|
||||
**Assertions:**
|
||||
- [ ] AMBIGUOUS VALUE annotations appear for magic numbers
|
||||
- [ ] Sections needing human review are marked explicitly
|
||||
- [ ] Verdict is PARTIAL (not COMPLETE)
|
||||
- [ ] File is still written — PARTIAL is not a blocking failure
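
One way to picture the magic-number detection behind the AMBIGUOUS VALUE annotations is a scan for numeric literals outside named declarations. The Python sketch below is a deliberately crude, hypothetical heuristic and not the skill's actual logic: it skips `const` and `@export` lines, strips `#` comments naively, and ignores the literals `0` and `1`.

```python
import re

# A number not glued to an identifier or another numeric part.
NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?![\w.])")


def find_magic_numbers(source, allowed=("0", "1")):
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]              # naive comment stripping
        if code.strip().startswith(("const ", "@export")):
            continue                               # named values are not "magic"
        for match in NUMBER.finditer(code):
            if match.group() not in allowed:
                findings.append((lineno, match.group()))
    return findings


sample = "\n".join([
    "@export var max_health: int = 100",
    "if distance < 150:",
    "    speed = 3.5",
])
findings = find_magic_numbers(sample)
```

Each finding could then be turned into an annotation of the form "AMBIGUOUS VALUE: 150 (unknown units)".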
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Multiple Interdependent Files — Cross-System Overview Produced
|
||||
|
||||
**Fixture:**
|
||||
- User provides 2 source files: `combat_system.gd` and `damage_resolver.gd`
|
||||
- The files reference each other (combat calls damage_resolver)
|
||||
|
||||
**Input:** `/reverse-document src/gameplay/combat_system.gd src/gameplay/damage_resolver.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads both files and detects the dependency relationship
|
||||
2. Skill produces a cross-system architecture overview (not individual GDDs)
|
||||
3. Overview describes: Combat System → Damage Resolver interaction, shared
|
||||
interfaces, data flow between the two
|
||||
4. Skill asks "May I write to `docs/architecture/combat-damage-overview.md`?"
|
||||
5. Overview written after approval; verdict is COMPLETE (or PARTIAL if ambiguous)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Both files are analyzed together (not as two separate docs)
|
||||
- [ ] Cross-system dependency is documented in the output
|
||||
- [ ] Output file is written to `docs/architecture/` (not `design/gdd/`)
|
||||
- [ ] Verdict is COMPLETE or PARTIAL
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Source File Not Found — Error
|
||||
|
||||
**Fixture:**
|
||||
- `src/gameplay/inventory_system.gd` does not exist
|
||||
|
||||
**Input:** `/reverse-document src/gameplay/inventory_system.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read the specified file — not found
|
||||
2. Skill outputs: "Source file not found: src/gameplay/inventory_system.gd"
|
||||
3. Skill suggests checking the path or running `/map-systems` to identify
|
||||
the correct source file
|
||||
4. No document is created
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Error message names the missing file with the full path
|
||||
- [ ] Alternative suggestion (check path or `/map-systems`) is provided
|
||||
- [ ] No write tool is called
|
||||
- [ ] No verdict is issued (error state)
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Director Gate Check — No gate; reverse-document is a utility
|
||||
|
||||
**Fixture:**
|
||||
- Well-structured source file exists
|
||||
|
||||
**Input:** `/reverse-document src/gameplay/health_system.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill generates and writes the design doc
|
||||
2. No director agents are spawned
|
||||
3. No gate IDs appear in output
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked
|
||||
- [ ] No gate skip messages appear
|
||||
- [ ] Verdict is COMPLETE or PARTIAL — no gate verdict involved
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads source file(s) before generating any content
|
||||
- [ ] Produces all 8 required GDD sections when target is a gameplay system
|
||||
- [ ] Annotates ambiguous values with AMBIGUOUS VALUE markers
|
||||
- [ ] Produces cross-system overview (not individual GDDs) for multiple files
|
||||
- [ ] Asks "May I write" before creating any output file
|
||||
- [ ] Verdict is COMPLETE (clean inference) or PARTIAL (ambiguous fields)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Architecture overview format (for technical/infrastructure systems) differs
|
||||
from GDD format; the inferred output type is determined by the nature of the
|
||||
source file (gameplay logic → GDD; engine/infra code → architecture doc).
|
||||
- The case where a source file is readable but contains only auto-generated
|
||||
boilerplate with no meaningful logic is not tested; skill would likely produce
|
||||
a near-empty skeleton with a PARTIAL verdict.
|
||||
- C# and Blueprint source files follow the same inference pattern as GDScript;
|
||||
language-specific differences are handled in the skill body.
|
||||
182
CCGS Skill Testing Framework/skills/utility/setup-engine.md
Normal file
@@ -0,0 +1,182 @@
|
||||
# Skill Test Spec: /setup-engine
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/setup-engine` configures the project's engine, language, rendering backend,
|
||||
physics engine, specialist agent assignments, and naming conventions by
|
||||
populating `technical-preferences.md`. It accepts an optional engine argument
|
||||
(e.g., `/setup-engine godot`) to skip the engine-selection step. For each
|
||||
section of `technical-preferences.md`, the skill presents a draft and asks
|
||||
"May I write to `technical-preferences.md`?" before updating.
|
||||
|
||||
The skill also populates the specialist routing table (file extension → agent
|
||||
mappings) based on the chosen engine. It has no director gates — configuration
|
||||
is a technical utility task. The verdict is always COMPLETE when the file is
|
||||
fully written.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keyword: COMPLETE
|
||||
- [ ] Contains "May I write" collaborative protocol language before updating technical-preferences.md
|
||||
- [ ] Has a next-step handoff (e.g., `/brainstorm` or `/start` depending on flow)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. `/setup-engine` is a technical configuration skill. No director gates apply.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Godot 4 + GDScript — Full engine configuration
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` contains only placeholders
|
||||
- Engine argument provided: `godot`
|
||||
|
||||
**Input:** `/setup-engine godot`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill skips engine-selection step (argument provided)
|
||||
2. Skill presents language options for Godot: GDScript or C#
|
||||
3. User selects GDScript
|
||||
4. Skill drafts all engine sections: engine/language/rendering/physics fields,
|
||||
naming conventions (snake_case for GDScript), specialist assignments
|
||||
(godot-specialist, gdscript-specialist, godot-shader-specialist, etc.)
|
||||
5. Skill populates the routing table: `.gd` → gdscript-specialist, `.gdshader` →
|
||||
godot-shader-specialist, `.tscn` → godot-specialist
|
||||
6. Skill asks "May I write to `technical-preferences.md`?"
|
||||
7. File is written after approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Engine field is set to Godot 4 (not a placeholder)
|
||||
- [ ] Language field is set to GDScript
|
||||
- [ ] Naming conventions are GDScript-appropriate (snake_case)
|
||||
- [ ] Routing table includes `.gd`, `.gdshader`, and `.tscn` entries
|
||||
- [ ] Specialists are assigned (not placeholders)
|
||||
- [ ] "May I write" is asked before writing
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Unity + C# — Unity-specific configuration
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` contains only placeholders
|
||||
- Engine argument provided: `unity`
|
||||
|
||||
**Input:** `/setup-engine unity`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill sets engine to Unity, language to C#
|
||||
2. Naming conventions are C#-appropriate (PascalCase for classes, camelCase for fields)
|
||||
3. Specialist assignments reference unity-specialist, csharp-specialist
|
||||
4. Routing table: `.cs` → csharp-specialist, `.asmdef` → unity-specialist,
|
||||
`.unity` (scene) → unity-specialist
|
||||
5. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Engine field is set to Unity (not Godot or Unreal)
|
||||
- [ ] Language field is set to C#
|
||||
- [ ] Naming conventions reflect C# conventions
|
||||
- [ ] Routing table includes `.cs` and `.unity` entries
|
||||
- [ ] Verdict is COMPLETE
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Unreal + Blueprint — Unreal-specific configuration
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` contains only placeholders
|
||||
- Engine argument provided: `unreal`
|
||||
|
||||
**Input:** `/setup-engine unreal`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill sets engine to Unreal Engine 5, primary language to Blueprint (Visual Scripting)
|
||||
2. Specialist assignments reference unreal-specialist, blueprint-specialist
|
||||
3. Routing table: `.uasset` → blueprint-specialist or unreal-specialist,
|
||||
`.umap` → unreal-specialist
|
||||
4. Performance budgets are pre-set with Unreal defaults (e.g., higher draw call budget)
|
||||
5. Skill asks "May I write" and writes on approval; verdict is COMPLETE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Engine field is set to Unreal Engine 5
|
||||
- [ ] Routing table includes `.uasset` and `.umap` entries
|
||||
- [ ] Blueprint specialist is assigned
|
||||
- [ ] Verdict is COMPLETE
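
The routing tables named in Cases 1 to 3 can be summarized as a per-engine lookup from file extension to specialist. A minimal Python sketch, assuming the table is held as a nested dictionary (`ROUTING` and `route` are hypothetical names; Case 3 allows `.uasset` to go to either blueprint-specialist or unreal-specialist, and picking blueprint-specialist here is an assumption):

```python
from pathlib import PurePosixPath

ROUTING = {
    "godot": {
        ".gd": "gdscript-specialist",
        ".gdshader": "godot-shader-specialist",
        ".tscn": "godot-specialist",
    },
    "unity": {
        ".cs": "csharp-specialist",
        ".asmdef": "unity-specialist",
        ".unity": "unity-specialist",
    },
    "unreal": {
        ".uasset": "blueprint-specialist",   # assumption: spec permits either agent
        ".umap": "unreal-specialist",
    },
}


def route(engine, path):
    """Return the specialist for a file, or None when no entry exists."""
    suffix = PurePosixPath(path).suffix.lower()
    return ROUTING.get(engine, {}).get(suffix)
```

For example, `route("godot", "src/gameplay/health_system.gd")` resolves to the GDScript specialist, while a file type with no entry returns `None` rather than guessing.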
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Engine Already Configured — Offers to reconfigure specific sections
|
||||
|
||||
**Fixture:**
|
||||
- `technical-preferences.md` has engine set to Godot 4 with all fields populated
|
||||
- No engine argument provided
|
||||
|
||||
**Input:** `/setup-engine`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `technical-preferences.md` and detects fully configured engine (Godot 4)
|
||||
2. Skill reports: "Engine already configured as Godot 4 + GDScript"
|
||||
3. Skill presents options: reconfigure all, or reconfigure a specific section only
|
||||
(Engine/Language, Naming Conventions, Specialists, Performance Budgets)
|
||||
4. User selects "Reconfigure Performance Budgets only"
|
||||
5. Only the performance budget section is updated; all other fields unchanged
|
||||
6. Skill asks "May I write to `technical-preferences.md`?" and writes on approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does NOT overwrite all fields when only a section update was requested
|
||||
- [ ] User is offered section-specific reconfiguration
|
||||
- [ ] Only the selected section is modified in the written file
|
||||
- [ ] Verdict is COMPLETE
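
The key property in Case 4 is that a section-level update leaves every other section untouched. A minimal Python sketch of that property, assuming the preferences are parsed into a dictionary keyed by section (the section keys, the `target_fps` field, and its values are placeholders, not defaults taken from the skill):

```python
import copy


def reconfigure_section(config, section, new_values):
    """Return a copy of config with exactly one section replaced."""
    if section not in config:
        raise KeyError(f"unknown section: {section}")
    updated = copy.deepcopy(config)      # never mutate the loaded config
    updated[section] = dict(new_values)
    return updated


current = {
    "engine": {"name": "Godot 4", "language": "GDScript"},
    "naming": {"files": "snake_case"},
    "performance_budgets": {"target_fps": 60},   # placeholder value
}
updated = reconfigure_section(current, "performance_budgets", {"target_fps": 30})
```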

---

### Case 5: Director Gate Check — No gate; setup-engine is a utility skill

**Fixture:**
- Fresh project with no engine configured

**Input:** `/setup-engine godot`

**Expected behavior:**
1. Skill completes full engine configuration
2. No director agents are spawned at any point
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Presents draft configuration before asking to write
- [ ] Asks "May I write to `technical-preferences.md`?" before writing
- [ ] Respects engine argument when provided (skips selection step)
- [ ] Detects existing config and offers partial reconfigure
- [ ] Routing table is populated for all key file types for the chosen engine
- [ ] Verdict is COMPLETE after file is written

---

## Coverage Notes

- Godot 4 + C# (instead of GDScript) follows the same flow as Case 1 with
  different naming conventions and the godot-csharp-specialist assignment.
  This variant is not separately tested.
- The engine-version-specific guidance (e.g., Godot 4.6 knowledge gap warning
  from VERSION.md) is surfaced by the skill but not assertion-tested here.
- Performance budget defaults per engine are noted as engine-specific but
  exact default values are not assertion-tested.
185
CCGS Skill Testing Framework/skills/utility/skill-improve.md
Normal file
@@ -0,0 +1,185 @@
# Skill Test Spec: /skill-improve

## Skill Summary

`/skill-improve` runs an automated test-fix-retest improvement loop on a skill
file. It invokes `/skill-test static` (and optionally `/skill-test category`) to
establish a baseline score, diagnoses the failing checks, proposes targeted fixes
to the SKILL.md file, asks "May I write the improvements to [skill path]?", applies
the fixes, and re-runs the tests to confirm improvement.

If the applied fix makes the skill worse (regression), the fix is reverted (with
user confirmation) rather than kept. If the skill is already perfect (0 failures),
the skill exits immediately without making changes. No director gates apply. Verdicts:
IMPROVED (score went up), NO CHANGE (no improvements possible or user declined), or
REVERTED (fix was applied but caused regression and was reverted).
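
The verdict rules above can be summarized in a small Python sketch. This is an illustration under stated assumptions, not the skill's actual logic: the function name `improve_verdict` is hypothetical, a re-test score equal to the baseline is assumed to count as NO CHANGE, and a lower re-test score is assumed to end in a user-confirmed revert.

```python
from typing import Optional


def improve_verdict(baseline: int, retest: Optional[int]) -> str:
    """Map a baseline/re-test score pair to a /skill-improve verdict.

    retest is None when no fix was applied (nothing to fix, or the user
    declined the write).
    """
    if retest is None or retest == baseline:
        return "NO CHANGE"   # assumption: an unchanged score is NO CHANGE
    if retest > baseline:
        return "IMPROVED"
    return "REVERTED"        # assumption: the user confirmed the revert
```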

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: IMPROVED, NO CHANGE, REVERTED
- [ ] Contains "May I write" collaborative protocol language before applying fixes
- [ ] Has a next-step handoff (e.g., run `/skill-test spec` to validate behavioral compliance)

---

## Director Gate Checks

None. `/skill-improve` is a meta-utility skill. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Skill With 2 Static Failures, Both Fixed, IMPROVED

**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 2 static failures:
  - Check 4: no "May I write" language despite having Write in allowed-tools
  - Check 5: no next-step handoff at the end

**Input:** `/skill-improve some-skill`

**Expected behavior:**
1. Skill runs `/skill-test static some-skill` — baseline: 5/7 checks pass
2. Skill diagnoses the 2 failing checks (4 and 5)
3. Skill proposes fixes:
   - Add "May I write" language to the appropriate phase
   - Add a next-step handoff section at the end
4. Skill asks "May I write improvements to `.claude/skills/some-skill/SKILL.md`?"
5. Fixes applied; `/skill-test static some-skill` re-run — now 7/7 checks pass
6. Verdict is IMPROVED (5→7)

**Assertions:**
- [ ] Baseline score is established before any changes (5/7)
- [ ] Both failing checks are diagnosed and addressed in the proposed fix
- [ ] "May I write" is asked before applying the fix
- [ ] Re-test confirms improvement (7/7)
- [ ] Verdict is IMPROVED with before/after score shown

---

### Case 2: Fix Causes Regression — Score Comparison Shows Regression, REVERTED

**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has 1 static failure (missing handoff)
- Proposed fix inadvertently removes the verdict keywords section
  (introducing a new failure)

**Input:** `/skill-improve some-skill`

**Expected behavior:**
1. Baseline: 6/7 checks pass (1 failure: missing handoff)
2. Skill proposes fix and asks "May I write improvements?"
3. Fix is applied; re-test runs
4. Re-test result: 5/7 (verdict keywords check now fails and the handoff check still does not pass)
5. Skill detects regression: score went DOWN
6. Skill asks user: "Fix caused a regression (6→5). May I revert the changes?"
7. User confirms; changes are reverted; verdict is REVERTED

**Assertions:**
- [ ] Re-test score is compared to baseline before finalizing
- [ ] Regression is detected when score decreases
- [ ] User is asked to confirm revert (not automatic)
- [ ] File is reverted on user confirmation
- [ ] Verdict is REVERTED

---

### Case 3: Skill With Category Assignment — Baseline Captures Both Scores

**Fixture:**
- `.claude/skills/gate-check/SKILL.md` is a gate skill with 1 static failure
  and 2 category (G-criteria) failures
- `tests/skills/quality-rubric.md` has a Gate Skills section

**Input:** `/skill-improve gate-check`

**Expected behavior:**
1. Skill runs both static and category tests for the baseline:
   - Static: 6/7 checks pass
   - Category: 3/5 G-criteria pass
2. Combined baseline: 9/12
3. Skill diagnoses all 3 failures and proposes fixes
4. Skill asks "May I write improvements to `.claude/skills/gate-check/SKILL.md`?"
5. Fixes applied; both test types re-run
6. Re-test: static 7/7, category 5/5 = 12/12
7. Verdict is IMPROVED (9→12)

**Assertions:**
- [ ] Both static and category scores are captured in the baseline
- [ ] Combined score is used for comparison (not just one type)
- [ ] All 3 failures are addressed in the proposed fix
- [ ] Re-test confirms improvement in both score types
- [ ] Verdict is IMPROVED with combined before/after
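
The combined score in this case is plain addition of passed and total counts across the two test types. A minimal Python sketch of that arithmetic follows; the `Score` type and `combine` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Score(NamedTuple):
    passed: int
    total: int


def combine(*scores: Score) -> Score:
    """Add passed and total counts across test types (static, category)."""
    return Score(
        passed=sum(s.passed for s in scores),
        total=sum(s.total for s in scores),
    )


# Numbers from Case 3: static 6/7 plus category 3/5, then 7/7 plus 5/5.
baseline = combine(Score(6, 7), Score(3, 5))
retest = combine(Score(7, 7), Score(5, 5))
```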

---

### Case 4: Skill Already Perfect — No Improvements Needed

**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` has no static failures
- Category score is also 5/5 (if applicable)

**Input:** `/skill-improve brainstorm`

**Expected behavior:**
1. Skill runs `/skill-test static brainstorm` — 7/7 checks pass
2. If category applies: 5/5 criteria pass
3. Skill outputs: "No improvements needed — brainstorm is fully compliant"
4. Skill exits without proposing any changes
5. No "May I write" is asked; no files are modified
6. Verdict is NO CHANGE

**Assertions:**
- [ ] Skill exits immediately after confirming 0 failures
- [ ] "No improvements needed" message is shown
- [ ] No changes are proposed
- [ ] No "May I write" is asked
- [ ] Verdict is NO CHANGE

---

### Case 5: Director Gate Check — No gate; skill-improve is a meta utility

**Fixture:**
- Skill with at least 1 static failure

**Input:** `/skill-improve some-skill`

**Expected behavior:**
1. Skill runs the test-fix-retest loop
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is IMPROVED, NO CHANGE, or REVERTED — no gate verdict

---

## Protocol Compliance

- [ ] Always establishes a baseline score before proposing any changes
- [ ] Shows before/after score comparison in the output
- [ ] Asks "May I write" before applying any fix
- [ ] Detects regressions by comparing re-test score to baseline
- [ ] Asks for user confirmation before reverting (not automatic)
- [ ] Ends with IMPROVED, NO CHANGE, or REVERTED verdict

---

## Coverage Notes

- The improvement loop is designed to run only one fix-retest cycle per
  invocation; running multiple iterations requires re-invoking `/skill-improve`.
- Behavioral compliance (spec-mode test results) is not included in the
  improvement loop — only structural (static) and category scores are automated.
- The case where the skill file cannot be read (permissions error or missing file)
  is not tested; this would result in an error before the baseline is established.
188
CCGS Skill Testing Framework/skills/utility/skill-test.md
Normal file
@@ -0,0 +1,188 @@
# Skill Test Spec: /skill-test

## Skill Summary

`/skill-test` validates skill files for structural correctness, behavioral
compliance, and category-rubric scoring. It operates in three modes:

- **static**: Checks a single skill file for structural requirements
  (frontmatter fields, phase headings, verdict keywords, "May I write" language,
  next-step handoff) without needing a fixture. Produces a per-check PASS/FAIL
  table.
- **spec**: Reads a test spec file from `tests/skills/` and evaluates the skill
  against each test case assertion, producing a case-by-case verdict.
- **audit**: Produces a coverage table of all skills in `.claude/skills/` and
  all agents in `.claude/agents/`, showing which have spec files and which do not.

An additional **category** mode reads the quality rubric for a skill category
(e.g., gate skills) and scores the skill against rubric criteria. The verdict
system differs by mode.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdicts: COMPLIANT, NON-COMPLIANT, WARNINGS (static mode); PASS, FAIL, PARTIAL (spec mode); COMPLETE (audit mode)
- [ ] Does NOT contain "May I write" language (skill is read-only in all modes)
- [ ] Has a next-step handoff (e.g., `/skill-improve` to fix issues found)

---

## Director Gate Checks

None. `/skill-test` is a meta-utility skill. No director gates apply.

---

## Test Cases

### Case 1: Static Mode — Well-formed skill, all 7 checks pass, COMPLIANT

**Fixture:**
- `.claude/skills/brainstorm/SKILL.md` exists and is well-formed:
  - Has all required frontmatter fields
  - Has ≥2 phase headings
  - Has verdict keywords
  - Has "May I write" language
  - Has a next-step handoff
  - Documents director gates
  - Documents gate mode behavior (lean/solo skips)

**Input:** `/skill-test static brainstorm`

**Expected behavior:**
1. Skill reads `.claude/skills/brainstorm/SKILL.md`
2. Skill runs all 7 structural checks
3. All 7 checks pass
4. Skill outputs a PASS/FAIL table with all 7 checks marked PASS
5. Verdict is COMPLIANT

**Assertions:**
- [ ] Exactly 7 structural checks are reported
- [ ] All 7 are marked PASS
- [ ] Verdict is COMPLIANT
- [ ] No files are written

---

### Case 2: Static Mode — Skill Missing "May I Write" Despite Write Tool in allowed-tools

**Fixture:**
- `.claude/skills/some-skill/SKILL.md` has `Write` in `allowed-tools` frontmatter
- The skill body has no "May I write" or "May I update" language

**Input:** `/skill-test static some-skill`

**Expected behavior:**
1. Skill reads `some-skill/SKILL.md`
2. Check 4 (collaborative write protocol) fails: `Write` in allowed-tools but no
   "May I write" language found
3. All other checks may pass
4. Verdict is NON-COMPLIANT with Check 4 as the failing assertion
5. Output lists Check 4 as FAIL with explanation

**Assertions:**
- [ ] Check 4 is marked FAIL
- [ ] Explanation identifies the specific mismatch (Write tool without "May I write" language)
- [ ] Verdict is NON-COMPLIANT
- [ ] Other passing checks are shown (not only the failure)
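
The mismatch that Check 4 looks for can be sketched in a few lines of Python. This is a hedged illustration only: the function name `check_write_protocol`, the regular expression, and the explanation strings are assumptions, and the real check is defined in the skill body. The sketch accepts either "May I write" or "May I update", matching the two phrasings named in the fixture, and returns an explanation alongside the result so the output can name the specific mismatch.

```python
import re

# Assumption: matching is case-insensitive and covers both phrasings from the fixture.
ASK_PATTERN = re.compile(r"may i (write|update)", re.IGNORECASE)


def check_write_protocol(allowed_tools: list, body: str) -> tuple:
    """Check 4 sketch: a skill that may use Write must ask before writing."""
    if "Write" not in allowed_tools:
        return True, "Write not in allowed-tools; no approval language required"
    if ASK_PATTERN.search(body):
        return True, "Write in allowed-tools and approval language found"
    return False, 'Write in allowed-tools but no "May I write" language found'
```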

---

### Case 3: Spec Mode — gate-check Skill Evaluated Against Spec

**Fixture:**
- `tests/skills/gate-check.md` exists with 5 test cases
- `.claude/skills/gate-check/SKILL.md` exists

**Input:** `/skill-test spec gate-check`

**Expected behavior:**
1. Skill reads both the skill file and the spec file
2. Skill evaluates each of the 5 test case assertions against the skill's behavior
3. For each case: PASS if skill behavior matches spec assertions, FAIL if not
4. Skill produces a case-by-case result table
5. Overall verdict: PASS (all 5), PARTIAL (some), or FAIL (majority failing)

**Assertions:**
- [ ] All 5 test cases from the spec are evaluated
- [ ] Each case has an individual PASS/FAIL result
- [ ] Overall verdict is PASS, PARTIAL, or FAIL based on case results
- [ ] No files are written
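
One possible reading of the overall-verdict rule in step 5 is sketched below in Python. The spec does not define exact thresholds, so this sketch assumes that "majority failing" means strictly more failing cases than passing ones and that any other mix is PARTIAL. The function name `spec_verdict` is hypothetical.

```python
def spec_verdict(case_results: list) -> str:
    """Fold per-case results (True = PASS, False = FAIL) into one verdict."""
    if not case_results:
        raise ValueError("spec has no test cases")
    passed = sum(1 for ok in case_results if ok)
    failed = len(case_results) - passed
    if failed == 0:
        return "PASS"
    if failed > passed:      # assumption: strict majority of cases failing
        return "FAIL"
    return "PARTIAL"
```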

---

### Case 4: Audit Mode — Coverage Table of All Skills and Agents

**Fixture:**
- `.claude/skills/` contains 72+ skill directories
- `.claude/agents/` contains 49+ agent files
- `tests/skills/` contains spec files for a subset of skills

**Input:** `/skill-test audit`

**Expected behavior:**
1. Skill enumerates all skills in `.claude/skills/` and all agents in `.claude/agents/`
2. Skill checks `tests/skills/` for a corresponding spec file for each
3. Skill produces a coverage table:
   - Each skill/agent listed
   - "Has Spec" column: YES or NO
   - Summary: "X of Y skills have specs; A of B agents have specs"
4. Verdict is COMPLETE

**Assertions:**
- [ ] All skill directories are enumerated (not just a sample)
- [ ] "Has Spec" column is accurate for each entry
- [ ] Summary counts are correct
- [ ] Verdict is COMPLETE
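
The enumeration and "Has Spec" lookup for skills can be sketched with `pathlib`. This is a minimal sketch under assumptions, not the skill's implementation: it assumes a spec for skill `name` lives at `<spec_dir>/<name>.md` (consistent with `tests/skills/gate-check.md` in Case 3), covers only skills rather than agents, and uses hypothetical helper names. The demo builds a throwaway directory tree so the example runs without a real project.

```python
import tempfile
from pathlib import Path


def skill_names(skills_dir: Path) -> list:
    """Every sub-directory of skills_dir counts as one skill."""
    return sorted(p.name for p in skills_dir.iterdir() if p.is_dir())


def coverage_rows(names: list, spec_dir: Path) -> list:
    """Pair each name with YES/NO depending on whether its spec file exists."""
    return [(n, "YES" if (spec_dir / f"{n}.md").is_file() else "NO") for n in names]


def summary_line(rows: list, label: str) -> str:
    have = sum(1 for _, has_spec in rows if has_spec == "YES")
    return f"{have} of {len(rows)} {label} have specs"


# Demo on a temporary tree with two skills, one of which has a spec.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("brainstorm", "gate-check"):
        (root / ".claude" / "skills" / name).mkdir(parents=True)
    specs = root / "tests" / "skills"
    specs.mkdir(parents=True)
    (specs / "gate-check.md").write_text("# Skill Test Spec: /gate-check\n")
    rows = coverage_rows(skill_names(root / ".claude" / "skills"), specs)
    line = summary_line(rows, "skills")
```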

---

### Case 5: Category Mode — Gate Skill Evaluated Against Quality Rubric

**Fixture:**
- `tests/skills/quality-rubric.md` exists with a "Gate Skills" section defining
  criteria G1-G5 (e.g., G1: has mode guard, G2: has verdict table, etc.)
- `.claude/skills/gate-check/SKILL.md` is a gate skill

**Input:** `/skill-test category gate-check`

**Expected behavior:**
1. Skill reads `quality-rubric.md` and identifies the Gate Skills section
2. Skill evaluates `gate-check/SKILL.md` against criteria G1-G5
3. Each criterion is scored: PASS, PARTIAL, or FAIL
4. Overall category score is computed (e.g., 4/5 criteria pass)
5. Verdict is COMPLIANT (all pass), WARNINGS (some partial), or NON-COMPLIANT (failures)

**Assertions:**
- [ ] All gate criteria (G1-G5) from quality-rubric.md are evaluated
- [ ] Each criterion has an individual score
- [ ] Overall verdict reflects the score distribution
- [ ] No files are written
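
The mapping from per-criterion scores to a category verdict in step 5 can be sketched as follows. This is a hedged Python illustration: the function name `category_verdict` is hypothetical, and the sketch assumes that a single FAIL outweighs any number of PARTIAL scores.

```python
def category_verdict(scores: dict) -> str:
    """Fold criterion scores (PASS / PARTIAL / FAIL) into a category verdict."""
    values = set(scores.values())
    if "FAIL" in values:
        return "NON-COMPLIANT"   # assumption: any failure dominates
    if "PARTIAL" in values:
        return "WARNINGS"
    return "COMPLIANT"


# Criterion IDs G1-G5 come from the fixture; the scores are made up.
example = {"G1": "PASS", "G2": "PASS", "G3": "PARTIAL", "G4": "PASS", "G5": "PASS"}
```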

---

## Protocol Compliance

- [ ] Static mode checks exactly 7 structural assertions
- [ ] Spec mode evaluates each test case from the spec file individually
- [ ] Audit mode covers all skills AND agents (not just one category)
- [ ] Category mode reads quality-rubric.md to get criteria (not hardcoded)
- [ ] Does not write any files in any mode
- [ ] Suggests `/skill-improve` as the next step when issues are found

---

## Coverage Notes

- The skill-test skill is self-referential (it can test itself). The static
  mode case for skill-test's own SKILL.md is not separately fixture-tested to
  avoid infinite recursion in test design.
- The specific 7 structural checks are defined in the skill body; only Check 4
  (May I write) is individually tested here because it has the most nuanced logic.
- Audit mode counts are approximate — the exact number of skills and agents will
  change as the system grows; assertions use "all" rather than fixed counts.
193
CCGS Skill Testing Framework/skills/utility/smoke-check.md
Normal file
@@ -0,0 +1,193 @@
# Skill Test Spec: /smoke-check

## Skill Summary

`/smoke-check` is the gate between implementation and QA hand-off. It detects the
test environment, runs the automated test suite (via Bash), scans test coverage
against sprint stories, and uses `AskUserQuestion` to batch-verify manual smoke
checks with the developer. It writes a report to `production/qa/smoke-[date].md`
after explicit user approval.

Verdicts: PASS (tests pass, all smoke checks pass, no missing test evidence),
PASS WITH WARNINGS (tests pass or NOT RUN, all critical checks pass, but advisory
gaps exist such as missing test coverage), or FAIL (any automated test failure or
any Batch 1/Batch 2 smoke check returns FAIL).
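
The three verdicts described above can be expressed as a short decision function. The Python sketch below is an illustration under assumptions: the function name and parameter shapes are hypothetical, and a NOT RUN automated result is assumed to produce PASS WITH WARNINGS on its own, in line with the summary's "tests pass or NOT RUN" wording.

```python
def smoke_verdict(automated: str, critical_checks: list, missing_coverage: int) -> str:
    """Derive the /smoke-check verdict.

    automated:        "PASS", "FAIL", or "NOT RUN"
    critical_checks:  Batch 1 / Batch 2 answers, each "PASS" or "FAIL"
    missing_coverage: number of stories with MISSING test coverage
    """
    if automated == "FAIL" or "FAIL" in critical_checks:
        return "FAIL"
    if automated == "NOT RUN" or missing_coverage > 0:
        return "PASS WITH WARNINGS"   # advisory gaps only, no critical failure
    return "PASS"
```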

No director gates apply. The skill does NOT invoke any director agents.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PASS, PASS WITH WARNINGS, FAIL
- [ ] Contains "May I write" collaborative protocol language before writing the report
- [ ] Has a next-step handoff (e.g., `/bug-report` on FAIL, QA hand-off guidance on PASS)

---

## Director Gate Checks

None. `/smoke-check` is a pre-QA utility skill. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Automated tests pass, manual items confirmed, PASS

**Fixture:**
- `tests/` directory exists with a GDUnit4 runner script
- Engine detected as Godot from `technical-preferences.md`
- `production/qa/qa-plan-sprint-005.md` exists
- Automated test runner reports 12 tests, 12 passing, 0 failing
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS
- All sprint stories have matching test files (no MISSING coverage)

**Input:** `/smoke-check`

**Expected behavior:**
1. Skill detects test directory and engine, notes QA plan found
2. Runs `godot --headless --script tests/gdunit4_runner.gd` via Bash
3. Parses output: 12/12 passing
4. Scans test coverage — all stories COVERED or EXPECTED
5. Uses `AskUserQuestion` for Batch 1 (core stability) and Batch 2 (sprint mechanics)
6. Developer selects PASS for all items
7. Report assembled: automated tests PASS, all smoke checks PASS, no MISSING coverage
8. Asks "May I write this smoke check report to `production/qa/smoke-[date].md`?"
9. Writes report after approval
10. Delivers verdict: PASS

**Assertions:**
- [ ] Automated test runner is invoked via Bash
- [ ] `AskUserQuestion` is used for manual smoke check batches
- [ ] "May I write" is asked before writing the report file
- [ ] Report is written to `production/qa/smoke-[date].md`
- [ ] Verdict is PASS

---

### Case 2: Failure Path — Automated test fails, FAIL verdict

**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports 10 tests run: 8 passing, 2 failing
- Failing tests: `test_health_clamp_at_zero`, `test_damage_calculation_negative`
- QA plan exists

**Input:** `/smoke-check`

**Expected behavior:**
1. Skill runs automated tests via Bash
2. Parses output — 2 failures detected
3. Records failing test names
4. Proceeds through manual smoke check batches
5. Report shows automated tests as FAIL with failing test names listed
6. Asks to write report; writes after approval
7. Delivers FAIL verdict with message: "The smoke check failed. Do not hand off to
   QA until these failures are resolved." Lists failing tests and suggests fixing
   then re-running `/smoke-check`

**Assertions:**
- [ ] Failing test names are listed in the report
- [ ] Verdict is FAIL
- [ ] Post-verdict message directs developer to fix failures before QA hand-off
- [ ] `/smoke-check` re-run is suggested after fixing

---

### Case 3: Manual Confirmation — AskUserQuestion used, PASS WITH WARNINGS

**Fixture:**
- `tests/` directory exists, engine is Godot
- Automated test runner reports all tests passing (8/8)
- One Logic story has no matching test file (MISSING coverage)
- Developer confirms all Batch 1 and Batch 2 smoke checks as PASS

**Input:** `/smoke-check`

**Expected behavior:**
1. Automated tests PASS
2. Coverage scan finds 1 MISSING entry for a Logic story
3. `AskUserQuestion` is used for Batch 1 and Batch 2 — developer confirms all PASS
4. Report shows: automated tests PASS, manual checks all PASS, 1 MISSING coverage entry
5. Verdict is PASS WITH WARNINGS — build ready for QA, but MISSING entry must be
   resolved before `/story-done` closes the affected story
6. Asks to write report; writes after approval

**Assertions:**
- [ ] `AskUserQuestion` is used for manual smoke check batches (not inline text prompts)
- [ ] MISSING test coverage entry appears in the report
- [ ] Verdict is PASS WITH WARNINGS (not PASS, not FAIL)
- [ ] Advisory note explains MISSING entry must be resolved before `/story-done`
- [ ] Report file is written to `production/qa/smoke-[date].md`

---

### Case 4: No Test Directory — Skill stops with guidance

**Fixture:**
- `tests/` directory does not exist
- Engine is configured as Godot

**Input:** `/smoke-check`

**Expected behavior:**
1. Phase 1 checks for `tests/` directory — not found
2. Skill outputs: "No test directory found at `tests/`. Run `/test-setup` to
   scaffold the testing infrastructure, or create the directory manually if
   tests live elsewhere."
3. Skill stops — no automated tests run, no manual smoke checks, no report written

**Assertions:**
- [ ] Error message references the missing `tests/` directory
- [ ] `/test-setup` is suggested as the remediation step
- [ ] Skill stops after this message (no further phases run)
- [ ] No report file is written

---

### Case 5: Director Gate Check — No gate; smoke-check is a QA pre-check utility

**Fixture:**
- Valid test setup, automated tests pass, manual smoke checks confirmed

**Input:** `/smoke-check`

**Expected behavior:**
1. Skill runs all phases and produces a PASS or PASS WITH WARNINGS verdict
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in output
4. No `/gate-check` is invoked

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is PASS, PASS WITH WARNINGS, or FAIL — no gate verdict involved

---

## Protocol Compliance

- [ ] Uses `AskUserQuestion` for all manual smoke check batches (Batch 1, Batch 2, Batch 3)
- [ ] Runs automated tests via Bash before asking any manual questions
- [ ] Asks "May I write" before creating the report file — never writes without approval
- [ ] Verdict vocabulary is strictly PASS / PASS WITH WARNINGS / FAIL — no other verdicts
- [ ] FAIL is triggered by automated test failures or Batch 1/Batch 2 FAIL responses
- [ ] PASS WITH WARNINGS is triggered when MISSING test coverage exists but no critical failures
- [ ] NOT RUN (engine binary unavailable) is recorded as a warning, not a FAIL
- [ ] Does not invoke director gates at any point

---

## Coverage Notes

- The `quick` argument (skips Phase 3 coverage scan and Batch 3) is not separately
  fixture-tested; it follows the same pattern as Case 1 with a coverage-skip note in output.
- The `--platform` argument adds platform-specific AskUserQuestion batches and a
  per-platform verdict table; not separately tested here.
- The case where the engine binary is not on PATH (NOT RUN) follows the PASS WITH
  WARNINGS pattern and is covered by the protocol compliance assertions above.
178
CCGS Skill Testing Framework/skills/utility/soak-test.md
Normal file
@@ -0,0 +1,178 @@
# Skill Test Spec: /soak-test

## Skill Summary

`/soak-test` generates a structured soak test protocol — an extended runtime
test plan designed to surface memory leaks, performance drift, and stability
issues that only appear under sustained gameplay. The skill produces a document
specifying the test duration, system under test, monitoring checkpoints (e.g.,
memory sample every 30 minutes), pass/fail thresholds, and conditions for early
termination.

The skill asks "May I write to `production/qa/soak-[slug]-[date].md`?" before
persisting. If a previous soak test for the same system exists, the skill offers
to extend the duration or add new conditions. No director gates apply. The verdict
is COMPLETE when the soak test protocol is written.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing the protocol
- [ ] Has a next-step handoff (e.g., `/regression-suite` or `/release-checklist`)

---

## Director Gate Checks

None. `/soak-test` is a QA planning utility. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Online gameplay feature, 2-hour soak protocol

**Fixture:**
- User specifies: system = "online multiplayer lobby", duration = "2 hours"
- `technical-preferences.md` has engine configured

**Input:** `/soak-test online-lobby 2h`

**Expected behavior:**
1. Skill generates a 2-hour soak test protocol for the online lobby system
2. Protocol includes: monitoring checkpoints every 30 minutes, metrics to track
   (memory usage, connection count, packet loss), pass thresholds, early termination
   conditions (crash or >20% memory growth)
3. Networking-specific checks are included (session drop rate, reconnect handling)
4. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
5. File is written on approval; verdict is COMPLETE

**Assertions:**
- [ ] Protocol duration matches the requested 2 hours
- [ ] Monitoring checkpoints are at reasonable intervals (e.g., every 30 minutes)
- [ ] Network-specific checks are included (not just generic memory checks)
- [ ] "May I write" is asked with the correct file path
- [ ] Verdict is COMPLETE
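
The checkpoint schedule and early-termination rule from this case can be made concrete with a small Python sketch. It is only an illustration: the skill produces a written plan rather than a runner, the helper names are hypothetical, and memory growth is assumed to be measured relative to the first sample taken at the start of the run.

```python
def checkpoint_minutes(duration_min: int, interval_min: int = 30) -> list:
    """Minute marks at which a monitoring sample is due, end of run included."""
    return list(range(interval_min, duration_min + 1, interval_min))


def should_terminate(baseline_mb: float, sample_mb: float,
                     crashed: bool = False, max_growth: float = 0.20) -> bool:
    """Early termination: a crash, or memory growth above max_growth (20%)."""
    if crashed:
        return True
    if baseline_mb <= 0:
        raise ValueError("baseline memory must be positive")
    growth = (sample_mb - baseline_mb) / baseline_mb
    return growth > max_growth


# A 2-hour run sampled every 30 minutes, as in Case 1.
schedule = checkpoint_minutes(120)
```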

---

### Case 2: No Target Defined — Prompts for system, duration, and conditions

**Fixture:**
- No arguments provided
- No soak test config in session state

**Input:** `/soak-test`

**Expected behavior:**
1. Skill detects no target system or duration specified
2. Skill asks: "What system or feature should be soak-tested?"
3. After user responds with system: Skill asks: "What duration? (e.g., 1h, 4h, 8h)"
4. After user responds with duration: Skill asks for specific conditions or
   uses defaults (normal gameplay loop, default player count)
5. Skill generates protocol from collected inputs and asks "May I write"

**Assertions:**
- [ ] At minimum 2 follow-up questions are asked (system + duration)
- [ ] Default conditions are applied when user doesn't specify custom ones
- [ ] Protocol is not generated until system and duration are known
- [ ] Verdict is COMPLETE after file is written

---

### Case 3: Previous Soak Test Exists — Offers to extend or add conditions

**Fixture:**
- `production/qa/soak-online-lobby-2026-03-15.md` exists with a 1-hour protocol
- User wants to extend to 4 hours with new memory threshold conditions

**Input:** `/soak-test online-lobby 4h`

**Expected behavior:**
1. Skill finds existing soak test for online-lobby
2. Skill reports: "Previous soak test found: soak-online-lobby-2026-03-15.md (1h)"
3. Skill presents options: create new protocol (4h standalone), or extend the
   existing protocol to 4h and add new conditions
4. User selects extend; existing checkpoints are preserved, new ones added
5. Skill asks "May I write to `production/qa/soak-online-lobby-2026-04-06.md`?"
   (new file, not overwriting old one)

**Assertions:**
- [ ] Existing soak test is surfaced and referenced
- [ ] User is offered extend vs. new options
- [ ] New file is created (old file is not overwritten)
- [ ] Extended protocol includes both old and new checkpoints
- [ ] Verdict is COMPLETE

---

### Case 4: Mobile Target Platform — Memory-specific checkpoints added

**Fixture:**
- `technical-preferences.md` specifies target platform: Mobile
- User requests soak test for "gameplay session" at 30 minutes

**Input:** `/soak-test gameplay 30m`

**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects mobile target platform
2. Soak test protocol includes mobile-specific memory checkpoints:
   - Check heap memory growth vs. device baseline
   - Check texture memory at checkpoint intervals
   - Add warning threshold at 300MB (mobile ceiling)
3. Protocol also includes thermal/battery drain advisory notes
4. Skill asks "May I write?" and writes on approval; verdict is COMPLETE

**Assertions:**
- [ ] Mobile platform is detected from technical-preferences.md
- [ ] Memory checkpoints include mobile-appropriate thresholds (not desktop)
- [ ] Thermal/battery notes are present in the protocol
- [ ] Verdict is COMPLETE
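
A minimal sketch of how the mobile-specific checkpoints in this case might be laid out, written in Python. Only the 300MB warning threshold and the two memory checks come from the spec; the `Checkpoint` class, the `build_checkpoints` function, and the 5-minute default interval are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MOBILE_WARN_MB = 300  # warning threshold named in the spec (mobile ceiling)


@dataclass
class Checkpoint:
    """One monitoring checkpoint in a soak protocol (hypothetical layout)."""
    minute: int
    checks: List[str] = field(default_factory=list)
    warn_memory_mb: Optional[int] = None


def build_checkpoints(duration_min: int, interval_min: int = 5,
                      platform: str = "mobile") -> List[Checkpoint]:
    """Return evenly spaced checkpoints; the interval default is an assumption."""
    if duration_min <= 0 or interval_min <= 0:
        raise ValueError("duration and interval must be positive")
    checkpoints = []
    for minute in range(interval_min, duration_min + 1, interval_min):
        cp = Checkpoint(minute=minute)
        if platform == "mobile":
            # Mobile-only checks listed in the expected behavior above.
            cp.checks = [
                "heap memory growth vs. device baseline",
                "texture memory",
            ]
            cp.warn_memory_mb = MOBILE_WARN_MB
        checkpoints.append(cp)
    return checkpoints


if __name__ == "__main__":
    for cp in build_checkpoints(30):
        print(cp.minute, cp.warn_memory_mb, cp.checks)
```

A desktop run under the same sketch would carry no mobile threshold, which is the distinction the second assertion is checking for.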

---

### Case 5: Director Gate Check — No gate; soak-test is a planning utility

**Fixture:**
- Valid system and duration provided

**Input:** `/soak-test combat 1h`

**Expected behavior:**
1. Skill generates and writes the soak test protocol
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Skill reaches COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Collects system, duration, and conditions before generating protocol
- [ ] Includes monitoring checkpoints at regular intervals
- [ ] Includes pass/fail thresholds and early termination conditions
- [ ] Adapts checkpoints to target platform (mobile vs. desktop)
- [ ] Asks "May I write" before creating the protocol file
- [ ] Verdict is COMPLETE when file is written
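
The inputs in the cases above pass durations such as `4h`, `30m`, and `1h`. A small sketch of how such an argument might be normalized to minutes before checkpoints are generated; the function name `parse_duration_minutes` and the restriction to `h` and `m` suffixes are assumptions, since the spec does not define the accepted grammar.

```python
import re


def parse_duration_minutes(arg: str) -> int:
    """Convert a duration like '4h' or '30m' to minutes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([hm])", arg.strip().lower())
    if not match:
        raise ValueError(f"unrecognized duration: {arg!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * 60 if unit == "h" else value


if __name__ == "__main__":
    for raw in ("4h", "30m", "1h"):
        print(raw, parse_duration_minutes(raw))
```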

---

## Coverage Notes

- Soak tests for specific engine subsystems (rendering pipeline, physics
  simulation) follow the same protocol structure and are not separately tested.
- The case where the user provides a duration shorter than the minimum useful
  soak period (e.g., 5 minutes) is not tested; the skill would note this is
  too short for meaningful results.
- Automated execution of the soak test protocol is outside this skill's scope —
  this skill generates the plan, not the runner.

173
CCGS Skill Testing Framework/skills/utility/start.md
Normal file
@@ -0,0 +1,173 @@
# Skill Test Spec: /start

## Skill Summary

`/start` is the first-time onboarding skill for new projects. It guides the
user through naming the project, choosing a game engine, and setting up the
initial directory structure. It creates stub configuration files (CLAUDE.md,
technical-preferences.md) and then routes to `/setup-engine` with the chosen
engine as an argument. Each file or directory created is gated behind a
"May I write" ask, following the collaborative protocol.

The skill detects whether a project is already configured and whether a
partial setup exists, offering to resume or restart as appropriate. It has
no director gates — it is a utility setup skill that runs before any agent
hierarchy exists.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, BLOCKED
- [ ] Contains "May I write" collaborative protocol language for each config file
- [ ] Has a next-step handoff at the end (routes to `/setup-engine`)
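
As a rough illustration, the structural checks listed above could be approximated in Python as follows. This is a sketch only, not the actual `/skill-test static` implementation, which this spec does not show. It assumes that frontmatter is a leading block delimited by `---` lines with `key: value` pairs and that a phase heading is any level-2 or level-3 heading containing the word "Phase"; the function name `check_skill_static` is hypothetical.

```python
import re
from typing import List

REQUIRED_FIELDS = ["name", "description", "argument-hint",
                   "user-invocable", "allowed-tools"]


def check_skill_static(text: str, verdicts: List[str]) -> List[str]:
    """Return a list of failed checks; an empty list means all passed."""
    failures = []
    # Assumption: frontmatter is a leading block delimited by '---' lines.
    match = re.match(r"\A---\n(.*?)\n---\n", text, re.DOTALL)
    front = match.group(1) if match else ""
    keys = {line.split(":", 1)[0].strip()
            for line in front.splitlines() if ":" in line}
    for name in REQUIRED_FIELDS:
        if name not in keys:
            failures.append(f"missing frontmatter field: {name}")
    # Assumption: a phase heading is a '##' or '###' heading containing 'Phase'.
    phases = re.findall(r"^#{2,3} .*Phase.*$", text, re.MULTILINE)
    if len(phases) < 2:
        failures.append("fewer than 2 phase headings")
    for verdict in verdicts:
        if verdict not in text:
            failures.append(f"missing verdict keyword: {verdict}")
    if "May I write" not in text:
        failures.append('missing "May I write" language')
    return failures


if __name__ == "__main__":
    print(check_skill_static("no frontmatter here", ["COMPLETE", "BLOCKED"]))
```

Passing the verdict keywords as a parameter keeps the same check usable for skills with different verdict sets, such as COMPLETE alone.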

---

## Director Gate Checks

None. `/start` is a utility setup skill. No director agents exist yet at the
point this skill runs.

---

## Test Cases

### Case 1: Happy Path — Fresh repo, no engine, full onboarding flow

**Fixture:**
- Empty repository: no CLAUDE.md overrides, no `production/stage.txt`, no
  `technical-preferences.md` content beyond placeholders
- No existing design docs or source code

**Input:** `/start`

**Expected behavior:**
1. Skill detects no existing configuration and begins fresh onboarding
2. Skill asks for project name
3. Skill presents 3 engine options: Godot 4, Unity, Unreal Engine 5
4. User selects an engine
5. Skill asks "May I write the initial directory structure?"
6. Skill creates all directories defined in `directory-structure.md`
7. Skill asks "May I write CLAUDE.md stub?" and writes it on approval
8. Skill routes to `/setup-engine [chosen-engine]` to complete technical config

**Assertions:**
- [ ] Project name is captured before any file is written
- [ ] Exactly 3 engine options are presented
- [ ] "May I write" is asked for each config file individually
- [ ] No file is written without explicit user approval
- [ ] Handoff to `/setup-engine` occurs at the end with the chosen engine argument
- [ ] Verdict is COMPLETE after all files are written and handoff is issued

---

### Case 2: Already Configured — Detects existing config, offers to skip or reconfigure

**Fixture:**
- `technical-preferences.md` has engine already set (not placeholder)
- `production/stage.txt` exists with `Concept`

**Input:** `/start`

**Expected behavior:**
1. Skill reads `technical-preferences.md` and detects configured engine
2. Skill reports: "This project is already configured with [engine]"
3. Skill presents options: skip (exit), reconfigure engine, or reconfigure specific sections
4. If user selects skip: skill exits cleanly with a summary of current config
5. If user selects reconfigure: skill proceeds to the engine-selection step

**Assertions:**
- [ ] Skill does NOT overwrite existing config without user choosing reconfigure
- [ ] Detected engine name is shown to the user in the status message
- [ ] User is offered at least 2 options (skip or reconfigure)
- [ ] Verdict is COMPLETE whether user skips or reconfigures

---

### Case 3: Engine Choice — User picks Godot 4, routes to /setup-engine godot

**Fixture:**
- Fresh repo — no existing configuration

**Input:** `/start`

**Expected behavior:**
1. Skill presents engine options and user selects Godot 4
2. Skill writes initial stubs (directory structure, CLAUDE.md) after approval
3. Skill explicitly routes to `/setup-engine godot` as the next step
4. Handoff message clearly names the engine and the next skill invocation

**Assertions:**
- [ ] Handoff command is `/setup-engine godot` (not generic `/setup-engine`)
- [ ] Handoff is issued after all initial stubs are written, not before
- [ ] Engine choice is echoed back to user before writing begins

---

### Case 4: Interrupted Setup — Partial config detected, offers resume or restart

**Fixture:**
- Directory structure exists (was created) but `technical-preferences.md` is
  still all placeholders (engine was never chosen — setup was interrupted)
- No `production/stage.txt`

**Input:** `/start`

**Expected behavior:**
1. Skill detects partial state: directories exist but engine is unconfigured
2. Skill reports: "A partial setup was detected — directories exist but engine is not configured"
3. Skill offers: resume from engine selection, or restart from scratch
4. If resume: skill skips directory creation, proceeds to engine choice
5. If restart: skill asks "May I overwrite existing structure?" before proceeding

**Assertions:**
- [ ] Partial state is correctly identified (directories present, engine absent)
- [ ] User is offered resume vs. restart choice — not forced into one path
- [ ] Resume path skips re-creating directories (no redundant "May I write" for structure)
- [ ] Restart path asks for permission to overwrite before touching any files
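
The three starting states exercised by Cases 1, 2, and 4 (fresh, configured, partial) can be sketched as a small classifier. This is an illustration under stated assumptions, not the skill's actual detection logic: it assumes `technical-preferences.md` sits at the repository root, that the engine is recorded on a line beginning with `Engine:`, that unset values carry the hypothetical marker `[TO BE CONFIGURED]`, and that an existing `production/` directory signals that the directory structure was already created.

```python
from pathlib import Path

PLACEHOLDER = "[TO BE CONFIGURED]"  # hypothetical placeholder marker


def detect_setup_state(root: Path) -> str:
    """Classify a repository as 'fresh', 'partial', or 'configured'."""
    prefs = root / "technical-preferences.md"  # assumed location
    engine_set = False
    if prefs.is_file():
        for line in prefs.read_text(encoding="utf-8").splitlines():
            # Assumption: the engine is stored as 'Engine: <value>'.
            if line.lower().startswith("engine:"):
                value = line.split(":", 1)[1].strip()
                engine_set = bool(value) and PLACEHOLDER not in value
    if engine_set:
        return "configured"
    # Assumption: production/ existing means directories were created earlier.
    if (root / "production").is_dir():
        return "partial"
    return "fresh"


if __name__ == "__main__":
    print(detect_setup_state(Path(".")))
```

Checking the engine first means a configured project is never misreported as partial, which matches the Case 2 requirement that existing configuration is not overwritten by default.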

---

### Case 5: Director Gate Check — No gate; start is a utility setup skill

**Fixture:**
- Any fixture

**Input:** `/start`

**Expected behavior:**
1. Skill completes full onboarding flow
2. No director agents are spawned at any point
3. No gate IDs (CD-*, TD-*, AD-*, PR-*) appear in the output

**Assertions:**
- [ ] No director gate is invoked during the skill execution
- [ ] No gate skip messages appear (gates are absent, not suppressed)
- [ ] Skill reaches COMPLETE without any gate verdict
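
One way the "no gate IDs in output" expectation could be checked mechanically is a regular-expression scan over the captured skill output. The four prefixes come from the case above; the exact shape of an ID after the hyphen is not given in this spec, so the pattern below (uppercase letters, digits, and hyphens) and the example ID `CD-PILLARS` used in the demo are assumptions.

```python
import re
from typing import List

# Prefixes listed in the spec: CD-*, TD-*, AD-*, PR-*.
# Assumption: the suffix consists of uppercase letters, digits, or hyphens.
GATE_ID = re.compile(r"\b(?:CD|TD|AD|PR)-[A-Z0-9][A-Z0-9-]*\b")


def find_gate_ids(output: str) -> List[str]:
    """Return every gate-ID-like token found in the skill output."""
    return GATE_ID.findall(output)


if __name__ == "__main__":
    # 'CD-PILLARS' is a made-up ID used only to exercise the pattern.
    print(find_gate_ids("Spawning gate CD-PILLARS before writing"))
    print(find_gate_ids("Verdict: COMPLETE"))
```

A pattern this loose would also flag strings such as pull request references written as `PR-42`, so a real check would likely need the actual list of gate IDs.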

---

## Protocol Compliance

- [ ] Asks for project name before any file is written
- [ ] Presents engine options as a structured choice (not free text)
- [ ] Asks "May I write" separately for directory structure and for CLAUDE.md stub
- [ ] Ends with a handoff to `/setup-engine` with the engine name as argument
- [ ] Verdict is clearly stated (COMPLETE or BLOCKED) at end of output

---

## Coverage Notes

- The case where the user rejects all engine options and provides a custom
  engine name is not tested — the skill is designed for the three supported
  engines only.
- Git initialization (if any) is not tested here; that is an infrastructure
  concern outside the skill boundary.
- Solo vs. lean mode behavior is not applicable — this skill has no gates and
  mode selection is irrelevant.

175
CCGS Skill Testing Framework/skills/utility/test-helpers.md
Normal file
@@ -0,0 +1,175 @@
# Skill Test Spec: /test-helpers

## Skill Summary

`/test-helpers` generates engine-specific test helper utilities for the project's
test suite. Helpers include factory functions (for creating test entities with
known state), fixture loaders, assertion helpers, and mock stubs for external
dependencies. Generated helpers follow the naming and structure conventions in
`coding-standards.md` and are written to `tests/helpers/`.

Each helper file is gated behind a "May I write" ask. If a helper file already
exists, the skill offers to extend it rather than replace. No director gates
apply. The verdict is COMPLETE when helper files are written.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before writing helpers
- [ ] Has a next-step handoff (e.g., write a test using the generated helper)

---

## Director Gate Checks

None. `/test-helpers` is a scaffolding utility. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Player factory helper generated for Godot/GDScript

**Fixture:**
- `technical-preferences.md` has engine Godot 4, language GDScript
- `tests/` directory exists (test-setup has been run)
- `design/gdd/player.md` exists with defined player properties
- No existing helpers in `tests/helpers/`

**Input:** `/test-helpers player-factory`

**Expected behavior:**
1. Skill reads engine (Godot 4 / GDScript) and player GDD for property context
2. Skill generates a deterministic `PlayerFactory` helper in GDScript:
   - `create_player(health: int = 100, speed: float = 200.0)` function
   - Returns a player node pre-configured to a known state
   - Uses dependency injection (no singletons)
3. Skill asks "May I write to `tests/helpers/player_factory.gd`?"
4. File is written on approval; verdict is COMPLETE

**Assertions:**
- [ ] Generated helper is in GDScript (not C# or Blueprint)
- [ ] Factory function parameters use defaults matching GDD values
- [ ] Helper uses dependency injection (no Autoload/singleton references)
- [ ] Filename follows snake_case convention for GDScript
- [ ] Verdict is COMPLETE

---

### Case 2: No Test Setup Exists — Redirects to /test-setup

**Fixture:**
- `tests/` directory does not exist

**Input:** `/test-helpers player-factory`

**Expected behavior:**
1. Skill checks for `tests/` directory — not found
2. Skill reports: "Test directory not found — test framework must be set up first"
3. Skill suggests running `/test-setup` before generating helpers
4. No helper file is created

**Assertions:**
- [ ] Error message identifies the missing tests/ directory
- [ ] `/test-setup` is suggested as the prerequisite step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)

---

### Case 3: Helper Already Exists — Offers to extend rather than replace

**Fixture:**
- `tests/helpers/player_factory.gd` already exists with a `create_player()` function
- User requests a new `create_enemy()` function be added to the factory

**Input:** `/test-helpers enemy-factory`

**Expected behavior:**
1. Skill finds an existing `player_factory.gd` and checks if it's the right file
   to extend (or if a separate `enemy_factory.gd` should be created)
2. Skill presents options: add `create_enemy()` to existing factory or create
   `tests/helpers/enemy_factory.gd`
3. User selects extend; skill drafts the `create_enemy()` function
4. Skill asks "May I extend `tests/helpers/player_factory.gd`?"
5. Function is added on approval; verdict is COMPLETE

**Assertions:**
- [ ] Existing helper is detected and surfaced
- [ ] User is given extend vs. new file choice
- [ ] "May I extend" language is used (not "May I write" for replacement)
- [ ] Existing `create_player()` is preserved in the extended file
- [ ] Verdict is COMPLETE

---

### Case 4: System Has No GDD — Notes missing design context in helper

**Fixture:**
- `technical-preferences.md` has Godot 4 / GDScript
- `tests/` exists
- User requests a helper for the "inventory system" but no `design/gdd/inventory.md` exists

**Input:** `/test-helpers inventory-factory`

**Expected behavior:**
1. Skill looks for `design/gdd/inventory.md` — not found
2. Skill notes: "No GDD found for inventory — generating helper with placeholder defaults"
3. Skill generates an `inventory_factory.gd` with generic placeholder values
   (item_count = 0, max_capacity = 20) and a comment: "# TODO: align defaults
   with inventory GDD when written"
4. Skill asks "May I write to `tests/helpers/inventory_factory.gd`?"
5. File is written; verdict is COMPLETE with advisory note

**Assertions:**
- [ ] Skill proceeds without GDD (does not block)
- [ ] Generated helper has placeholder defaults with TODO comment
- [ ] Missing GDD is noted in the output (advisory warning)
- [ ] Verdict is COMPLETE

---

### Case 5: Director Gate Check — No gate; test-helpers is a scaffolding utility

**Fixture:**
- Engine configured, tests/ exists

**Input:** `/test-helpers player-factory`

**Expected behavior:**
1. Skill generates and writes the helper file
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Reads engine before generating any helper (helpers are engine-specific)
- [ ] Reads GDD for default values when available
- [ ] Notes missing GDD context rather than blocking
- [ ] Detects existing helper files and offers extend rather than replace
- [ ] Asks "May I write" (or "May I extend") before any file operation
- [ ] Verdict is COMPLETE when helper is written

---

## Coverage Notes

- Mock/stub helper generation (for dependencies like save systems or audio buses)
  follows the same pattern as factory helpers and is not separately tested.
- Unity C# helper generation (using NSubstitute or custom mocks) follows the
  same logic as Case 1 with language-appropriate output.
- The case where the requested helper type is not recognized is not tested;
  the skill would ask the user to clarify the helper type.

173
CCGS Skill Testing Framework/skills/utility/test-setup.md
Normal file
@@ -0,0 +1,173 @@
# Skill Test Spec: /test-setup

## Skill Summary

`/test-setup` scaffolds the test framework for the project based on the
configured engine. It creates the `tests/` directory structure defined in
`coding-standards.md` (unit/, integration/, performance/, playtest/) and
generates the appropriate test runner configuration for the detected engine:
GdUnit4 config for Godot, Unity Test Runner asmdef for Unity, or Unreal headless
runner for Unreal Engine.

Each file or directory created is gated behind a "May I write" ask. If the test
framework already exists, the skill verifies the configuration rather than
reinitializing. No director gates apply. The verdict is COMPLETE when the
scaffold is in place.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keyword: COMPLETE
- [ ] Contains "May I write" collaborative protocol language before creating files
- [ ] Has a next-step handoff (e.g., `/test-helpers` to generate helper utilities)

---

## Director Gate Checks

None. `/test-setup` is a scaffolding utility. No director gates apply.

---

## Test Cases

### Case 1: Happy Path — Godot project, scaffolds GdUnit4 test structure

**Fixture:**
- `technical-preferences.md` has engine set to Godot 4, language GDScript
- `tests/` directory does not exist yet

**Input:** `/test-setup`

**Expected behavior:**
1. Skill reads engine from `technical-preferences.md` → Godot 4 + GDScript
2. Skill drafts the test directory structure: tests/unit/, tests/integration/,
   tests/performance/, tests/playtest/, and a GdUnit4 runner config file
3. Skill asks "May I write the tests/ directory structure?"
4. Directories and GdUnit4 runner script are created on approval
5. Skill confirms the runner script matches the CI command in coding-standards.md:
   `godot --headless --script tests/gdunit4_runner.gd`
6. Verdict is COMPLETE

**Assertions:**
- [ ] All 4 subdirectories (unit/, integration/, performance/, playtest/) are created
- [ ] GdUnit4 runner config is generated
- [ ] Runner script path matches coding-standards.md CI command
- [ ] "May I write" is asked before creating any files
- [ ] Verdict is COMPLETE

---

### Case 2: Unity Project — Scaffolds Unity Test Runner with asmdef

**Fixture:**
- `technical-preferences.md` has engine set to Unity, language C#
- `tests/` directory does not exist

**Input:** `/test-setup`

**Expected behavior:**
1. Skill reads engine → Unity + C#
2. Skill drafts a `Tests/` directory following Unity conventions (capitalized)
3. Skill drafts `Tests/Tests.asmdef` and `Tests/Editor/EditorTests.asmdef`
4. EditMode and PlayMode test runner modes are configured in the draft
5. Skill asks "May I write the Tests/ directory structure?" and writes on approval
6. Verdict is COMPLETE

**Assertions:**
- [ ] Unity-specific `Tests/` structure is created (not the Godot structure)
- [ ] `.asmdef` files are generated
- [ ] EditMode and PlayMode runner config is present
- [ ] Verdict is COMPLETE

---

### Case 3: Test Framework Already Exists — Verifies config, not re-initialized

**Fixture:**
- `tests/unit/`, `tests/integration/` exist
- GdUnit4 runner script exists (Godot project)

**Input:** `/test-setup`

**Expected behavior:**
1. Skill detects existing tests/ structure
2. Skill reports: "Test framework already exists — verifying configuration"
3. Skill checks: runner script path, directory completeness, CI command alignment
4. If all checks pass: reports "Configuration verified — no changes needed"
5. If checks fail (e.g., missing tests/performance/): reports specific gap and
   asks "May I add the missing directories?"

**Assertions:**
- [ ] Skill does NOT reinitialize when framework exists
- [ ] Verification checks are performed on existing structure
- [ ] Only missing parts trigger a "May I write" ask
- [ ] Verdict is COMPLETE whether everything was OK or gaps were fixed
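
The directory-completeness part of the verification in this case can be sketched in a few lines of Python. The four subdirectory names come from the skill summary; the function name `missing_test_dirs` is hypothetical, and the runner-script and CI-command checks are left out because their exact form is not given here.

```python
from pathlib import Path
from typing import List

# Subdirectories named in the skill summary.
EXPECTED_SUBDIRS = ("unit", "integration", "performance", "playtest")


def missing_test_dirs(root: Path) -> List[str]:
    """Return the expected tests/ subdirectories that do not exist yet."""
    tests = root / "tests"
    return [name for name in EXPECTED_SUBDIRS if not (tests / name).is_dir()]


if __name__ == "__main__":
    gaps = missing_test_dirs(Path("."))
    if gaps:
        print("Missing:", ", ".join(f"tests/{name}/" for name in gaps))
    else:
        print("Configuration verified")
```

With the fixture above (only `tests/unit/` and `tests/integration/` present), this sketch would report `performance` and `playtest` as the gaps, and only those would need a "May I write" ask.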

---

### Case 4: No Engine Configured — Redirects to /setup-engine

**Fixture:**
- `technical-preferences.md` contains only placeholders (engine not set)

**Input:** `/test-setup`

**Expected behavior:**
1. Skill reads `technical-preferences.md` and finds engine placeholder
2. Skill reports: "Engine not configured — cannot scaffold engine-specific test framework"
3. Skill suggests running `/setup-engine` first
4. No directories or files are created

**Assertions:**
- [ ] Error message explicitly states engine is not configured
- [ ] `/setup-engine` is suggested as the next step
- [ ] No write tool is called
- [ ] Verdict is not COMPLETE (blocked state)

---

### Case 5: Director Gate Check — No gate; test-setup is a scaffolding utility

**Fixture:**
- Engine configured, tests/ does not exist

**Input:** `/test-setup`

**Expected behavior:**
1. Skill scaffolds and writes all test framework files
2. No director agents are spawned
3. No gate IDs appear in output

**Assertions:**
- [ ] No director gate is invoked
- [ ] No gate skip messages appear
- [ ] Verdict is COMPLETE without any gate check

---

## Protocol Compliance

- [ ] Reads engine from `technical-preferences.md` before generating any scaffold
- [ ] Generates engine-appropriate test runner config (not generic)
- [ ] Creates all 4 subdirectories from coding-standards.md
- [ ] Asks "May I write" before creating files
- [ ] Detects existing framework and offers verification (not reinitialization)
- [ ] Verdict is COMPLETE when scaffold is in place

---

## Coverage Notes

- Unreal Engine test scaffolding (headless runner with `-nullrhi`) follows the
  same pattern as Cases 1 and 2 and is not separately fixture-tested.
- CI integration file generation (e.g., `.github/workflows/test.yml`) is
  referenced but not assertion-tested here — it may be a separate skill concern.
- The case where tests/ exists but is from a different engine (e.g., Unity tests
  in a now-Godot project) is not tested; the skill would detect the mismatch
  and offer to reconcile.