Add Claude Code Game Studios to the project
170
CCGS Skill Testing Framework/skills/analysis/asset-audit.md
Normal file
@@ -0,0 +1,170 @@
# Skill Test Spec: /asset-audit

## Skill Summary

`/asset-audit` audits the `assets/` directory for naming convention compliance,
missing metadata, and format/size issues. It reads asset files against the
conventions and budgets defined in `technical-preferences.md`. No director gates
are invoked. The skill does not write without user approval. Verdicts: COMPLIANT,
WARNINGS, or NON-COMPLIANT.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLIANT, WARNINGS, NON-COMPLIANT
- [ ] Does NOT require "May I write" language (read-only; optional report requires approval)
- [ ] Has a next-step handoff (what to do after audit results)

---
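The frontmatter assertion in the static checks above could be verified mechanically along the following lines. This is a minimal sketch, not the actual `/skill-test static` implementation: the function name `missing_frontmatter_fields` is hypothetical, and it assumes the skill file opens with a `---`-delimited block of top-level `key: value` lines.

```python
# Field names come from the static assertion; everything else is an assumption.
REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]


def missing_frontmatter_fields(skill_text):
    """Return the required frontmatter keys that are absent from a skill file.

    Assumes (hypothetically) that frontmatter is the block between the first
    two lines that consist solely of '---'.
    """
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)  # no frontmatter block at all
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Only top-level keys count; indented lines are nested values.
        if ":" in line and not line.startswith((" ", "\t")):
            keys.add(line.split(":", 1)[0].strip())
    return [field for field in REQUIRED_FIELDS if field not in keys]
```

A key that only appears in the body of the skill file, after the closing `---`, is deliberately not counted.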

## Director Gate Checks

None. Asset auditing is a read-only analysis skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — All assets follow naming conventions

**Fixture:**
- `technical-preferences.md` specifies naming convention: `snake_case`, e.g., `enemy_grunt_idle.png`
- `assets/art/characters/` contains: `enemy_grunt_idle.png`, `enemy_sniper_run.png`
- `assets/audio/sfx/` contains: `sfx_jump_land.ogg`, `sfx_item_pickup.ogg`
- All files are within size budget (textures ≤2MB, audio ≤500KB)

**Input:** `/asset-audit`

**Expected behavior:**
1. Skill reads naming conventions and size budgets from `technical-preferences.md`
2. Skill scans `assets/` recursively
3. All files match `snake_case` convention; all within budget
4. Audit table shows all rows PASS
5. Verdict is COMPLIANT

**Assertions:**
- [ ] Audit covers both art and audio asset directories
- [ ] Each file is checked against naming convention and size budget
- [ ] All rows show PASS when compliant
- [ ] Verdict is COMPLIANT
- [ ] No files are written

---
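The per-file checks described in Case 1 (naming convention plus size budget) might look roughly like this. It is an illustrative sketch only: `check_asset`, the regular expression, and the extension-to-budget table are assumptions, the budgets are copied from the Case 1 fixture, and MB/KB are treated as binary units.

```python
import os
import re

# snake_case stem: lowercase letters/digits separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")

# Budgets from the Case 1 fixture (textures <= 2MB, audio <= 500KB).
# Mapping them to file extensions is an assumption for this sketch.
SIZE_BUDGET_BYTES = {".png": 2 * 1024 * 1024, ".ogg": 500 * 1024}


def check_asset(filename, size_bytes):
    """Return one audit row with a naming result and a size result."""
    stem, ext = os.path.splitext(filename)
    naming_ok = bool(SNAKE_CASE.match(stem))
    budget = SIZE_BUDGET_BYTES.get(ext.lower())
    # Files with no known budget are not failed on size in this sketch.
    size_ok = budget is None or size_bytes <= budget
    return {
        "file": filename,
        "naming": "PASS" if naming_ok else "FAIL",
        "size": "PASS" if size_ok else "FAIL",
    }
```

In a real skill the convention and budgets would be read from `technical-preferences.md` rather than hardcoded.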

### Case 2: Non-Compliant — Textures exceed size budget

**Fixture:**
- `assets/art/environment/` contains 5 texture files
- 3 texture files are 4MB each (budget: ≤2MB)
- 2 texture files are within budget

**Input:** `/asset-audit`

**Expected behavior:**
1. Skill reads size budget from `technical-preferences.md` (2MB for textures)
2. Skill scans `assets/art/environment/` — finds 3 oversized textures
3. Audit table lists each oversized file with actual size and budget
4. Verdict is NON-COMPLIANT
5. Skill recommends compression or resolution reduction for flagged files

**Assertions:**
- [ ] All 3 oversized files are listed by name with actual size and budget size
- [ ] Verdict is NON-COMPLIANT when any file exceeds its budget
- [ ] Optimization recommendation is given for oversized files
- [ ] Within-budget files are also listed (showing PASS) for completeness

---

### Case 3: Format Issue — Audio in wrong format

**Fixture:**
- `technical-preferences.md` specifies audio format: OGG
- `assets/audio/music/theme_main.wav` exists (WAV format)
- `assets/audio/sfx/sfx_footstep.ogg` exists (correct OGG format)

**Input:** `/asset-audit`

**Expected behavior:**
1. Skill reads audio format requirement: OGG
2. Skill scans `assets/audio/` — finds `theme_main.wav` in wrong format
3. Audit table flags `theme_main.wav` as FORMAT ISSUE (expected OGG, found WAV)
4. `sfx_footstep.ogg` shows PASS
5. Verdict is WARNINGS (format issues are correctable)

**Assertions:**
- [ ] `theme_main.wav` is flagged as FORMAT ISSUE with expected and actual format noted
- [ ] Verdict is WARNINGS (not NON-COMPLIANT) for format issues, which are correctable
- [ ] Correct-format assets are shown as PASS
- [ ] Skill does not modify or convert any asset files

---

### Case 4: Missing Asset — Asset referenced by GDD but absent from assets/

**Fixture:**
- `design/gdd/enemies.md` references `enemy_boss_idle.png`
- `assets/art/characters/boss/` directory is empty — file does not exist

**Input:** `/asset-audit`

**Expected behavior:**
1. Skill reads GDD references to find expected assets (cross-references with `/content-audit` scope)
2. Skill scans `assets/art/characters/boss/` — file not found
3. Audit table flags `enemy_boss_idle.png` as MISSING ASSET
4. Verdict is NON-COMPLIANT (missing critical art asset)

**Assertions:**
- [ ] Skill checks GDD references to identify expected assets
- [ ] Missing assets are flagged as MISSING ASSET with the GDD reference noted
- [ ] Verdict is NON-COMPLIANT when critical assets are missing
- [ ] Skill does not create or add placeholder assets

---

### Case 5: Gate Compliance — No gate; technical-artist may be consulted separately

**Fixture:**
- 2 files have naming convention violations (CamelCase instead of snake_case)
- `review-mode.txt` contains `full`

**Input:** `/asset-audit`

**Expected behavior:**
1. Skill scans assets and finds 2 naming violations
2. No director gate is invoked regardless of review mode
3. Verdict is WARNINGS
4. Output notes: "Consider having a Technical Artist review naming conventions"
5. Skill presents findings; offers optional audit report write
6. If user opts in: "May I write to `production/qa/asset-audit-[date].md`?"

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Technical artist consultation is suggested (not mandated)
- [ ] Findings table is presented before any write prompt
- [ ] Optional audit report write asks "May I write" before writing

---
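Taken together, Cases 1 through 5 imply a mapping from finding types to the overall verdict: oversized files and missing assets lead to NON-COMPLIANT, while format and naming issues lead to WARNINGS. The sketch below encodes that reading of the cases. The function name `overall_verdict` and the labels `SIZE` and `NAMING` are hypothetical; `FORMAT ISSUE` and `MISSING ASSET` are the flags named in the cases.

```python
# Finding types that force NON-COMPLIANT (Cases 2 and 4).
BLOCKING = {"SIZE", "MISSING ASSET"}
# Finding types that only produce WARNINGS (Cases 3 and 5).
ADVISORY = {"FORMAT ISSUE", "NAMING"}


def overall_verdict(finding_types):
    """Collapse a list of finding types into a single audit verdict."""
    kinds = set(finding_types)
    if kinds & BLOCKING:
        return "NON-COMPLIANT"
    if kinds & ADVISORY:
        return "WARNINGS"
    return "COMPLIANT"
```

The most severe finding wins, so a run with both a naming violation and an oversized texture would still be NON-COMPLIANT under this reading.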

## Protocol Compliance

- [ ] Reads `technical-preferences.md` for naming conventions, formats, and size budgets
- [ ] Scans `assets/` directory recursively
- [ ] Audit table shows file name, check type, expected value, actual value, and result
- [ ] Does not modify any asset files
- [ ] No director gates are invoked
- [ ] Verdict is one of: COMPLIANT, WARNINGS, NON-COMPLIANT

---

## Coverage Notes

- Metadata checks (e.g., missing texture import settings in Godot `.import` files)
  are not explicitly tested here; they follow the same FORMAT ISSUE flagging pattern.
- The interaction between `/asset-audit` and `/content-audit` (both check GDD
  references vs. assets) is intentional overlap; `/asset-audit` focuses on
  compliance while `/content-audit` focuses on completeness.
172
CCGS Skill Testing Framework/skills/analysis/balance-check.md
Normal file
@@ -0,0 +1,172 @@
# Skill Test Spec: /balance-check

## Skill Summary

`/balance-check` reads balance data files (JSON or YAML in `assets/data/`) and
checks each value against the design formulas defined in GDDs under `design/gdd/`.
It produces a findings table with columns: Value → Formula → Deviation → Severity.
No director gates are invoked (read-only analysis). The skill may optionally write
a balance report but asks "May I write" before doing so. Verdicts: BALANCED,
CONCERNS, or OUT OF BALANCE.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: BALANCED, CONCERNS, OUT OF BALANCE
- [ ] Contains "May I write" language (optional report write)
- [ ] Has a next-step handoff (what to do after findings are reviewed)

---

## Director Gate Checks

None. Balance check is a read-only analysis skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — All balance values within formula tolerances

**Fixture:**
- `assets/data/combat-balance.json` exists with 6 stat values
- `design/gdd/combat-system.md` contains formulas for all 6 stats with ±10% tolerance
- All 6 values fall within tolerance

**Input:** `/balance-check`

**Expected behavior:**
1. Skill reads all balance data files in `assets/data/`
2. Skill reads GDD formulas from `design/gdd/`
3. Skill computes deviation for each value against its formula
4. All deviations are within ±10% tolerance
5. Skill outputs findings table with all rows showing PASS
6. Verdict is BALANCED

**Assertions:**
- [ ] Findings table is shown for all checked values
- [ ] Each row shows: stat name, formula target, actual value, deviation percentage
- [ ] All rows show PASS or equivalent when within tolerance
- [ ] Verdict is BALANCED
- [ ] No files are written without user approval

---

### Case 2: Out of Balance — Player damage 40% above formula target

**Fixture:**
- `assets/data/combat-balance.json` has `player_damage_base: 140`
- `design/gdd/combat-system.md` formula specifies `player_damage_base = 100` (±10%)
- All other stats are within tolerance

**Input:** `/balance-check`

**Expected behavior:**
1. Skill reads combat-balance.json and computes deviation for `player_damage_base`
2. Deviation is +40% — far outside ±10% tolerance
3. Skill flags this row as severity HIGH in the findings table
4. Verdict is OUT OF BALANCE
5. Skill surfaces the HIGH severity item prominently before the table

**Assertions:**
- [ ] `player_damage_base` row shows deviation of +40%
- [ ] Severity is HIGH for deviations exceeding tolerance by more than 2×
- [ ] Verdict is OUT OF BALANCE when any stat has HIGH severity deviation
- [ ] The HIGH severity item is called out explicitly, not buried in table rows

---
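The deviation arithmetic behind Case 2 can be worked through in a few lines. This is a sketch under stated assumptions, not the skill's actual logic: `classify_deviation` is a hypothetical name, "exceeding tolerance by more than 2×" is read as an absolute deviation greater than twice the tolerance, and the intermediate `CONCERNS` tier is an assumed label for deviations between one and two times the tolerance.

```python
def classify_deviation(actual, target, tolerance=0.10):
    """Return (signed relative deviation, severity) for one balance value.

    Assumed tiers: within tolerance -> PASS, up to 2x tolerance -> CONCERNS,
    beyond 2x tolerance -> HIGH.
    """
    deviation = (actual - target) / target
    magnitude = abs(deviation)
    if magnitude <= tolerance:
        severity = "PASS"
    elif magnitude <= 2 * tolerance:
        severity = "CONCERNS"
    else:
        severity = "HIGH"
    return deviation, severity


# Case 2 fixture: actual 140 against a target of 100 with a 10% tolerance.
deviation, severity = classify_deviation(140, 100)
print(f"{deviation:+.0%} {severity}")
```

For the Case 2 fixture the deviation is (140 - 100) / 100 = +40%, which is four times the tolerance and therefore lands in the HIGH tier.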

### Case 3: No GDD Formulas — Cannot validate, guidance given

**Fixture:**
- `assets/data/economy-balance.yaml` exists with 10 stat values
- No GDD in `design/gdd/` contains formula definitions for economy stats

**Input:** `/balance-check`

**Expected behavior:**
1. Skill reads balance data files
2. Skill searches GDDs for formula definitions — finds none for economy stats
3. Skill outputs: "Cannot validate economy stats — no formulas defined. Run /design-system first."
4. No findings table is generated for the economy stats
5. Verdict is CONCERNS (data exists but cannot be validated)

**Assertions:**
- [ ] Skill does not fabricate formula targets when none exist in GDDs
- [ ] Output explicitly names the missing formula source
- [ ] Output recommends running `/design-system` to define formulas
- [ ] Verdict is CONCERNS (not BALANCED, since validation was impossible)

---

### Case 4: Orphan Reference — Balance file references an undefined stat

**Fixture:**
- `assets/data/combat-balance.json` contains a stat `legacy_armor_mult: 1.5`
- `design/gdd/combat-system.md` has no formula for `legacy_armor_mult`
- All other stats have formula definitions and pass validation

**Input:** `/balance-check`

**Expected behavior:**
1. Skill reads all stats from combat-balance.json
2. Skill cannot find a formula for `legacy_armor_mult` in any GDD
3. Skill flags `legacy_armor_mult` as ORPHAN REFERENCE in the findings table
4. Other stats are evaluated normally; those within tolerance show PASS
5. Verdict is CONCERNS (orphan reference prevents full validation)

**Assertions:**
- [ ] `legacy_armor_mult` appears in findings table with status ORPHAN REFERENCE
- [ ] Orphan references are distinguished from formula deviations in the table
- [ ] Verdict is CONCERNS when any orphan references are found
- [ ] Skill does not skip orphan stats silently

---
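Orphan detection as described in Case 4 amounts to a set difference between the stats in the data file and the stats that have a formula. A minimal sketch follows, assuming both sides have already been parsed into dictionaries keyed by stat name. The function name `orphan_rows` and the placeholder status `HAS FORMULA` are hypothetical.

```python
def orphan_rows(balance_stats, formula_targets):
    """Build findings rows, marking stats that have no GDD formula.

    balance_stats:   {stat_name: value} parsed from the balance data file
    formula_targets: {stat_name: target} parsed from the GDDs
    """
    rows = []
    for name, value in balance_stats.items():
        if name in formula_targets:
            # Deviation checking would happen here; omitted in this sketch.
            rows.append((name, value, "HAS FORMULA"))
        else:
            # Never dropped silently: the orphan gets its own row.
            rows.append((name, value, "ORPHAN REFERENCE"))
    return rows


rows = orphan_rows(
    {"player_damage_base": 100, "legacy_armor_mult": 1.5},
    {"player_damage_base": 100},
)
has_orphans = any(status == "ORPHAN REFERENCE" for _, _, status in rows)
```

Keeping the orphan as a distinct row status, rather than folding it into a deviation severity, matches the assertion that orphans are distinguished from formula deviations.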

### Case 5: Gate Compliance — Read-only; no gate; optional report requires approval

**Fixture:**
- Balance data and GDD formulas exist; 1 stat has CONCERNS-level deviation (15% above target)
- `review-mode.txt` contains `full`

**Input:** `/balance-check`

**Expected behavior:**
1. Skill reads data and GDDs; generates findings table
2. Verdict is CONCERNS (one stat slightly out of range)
3. No director gate is invoked
4. Skill presents findings table to user
5. Skill offers to write an optional balance report
6. If user says yes: skill asks "May I write to `production/qa/balance-report-[date].md`?"
7. If user says no: skill ends without writing

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Findings table is presented without writing anything automatically
- [ ] Optional report write is offered but not forced
- [ ] "May I write" prompt appears only if user opts in to the report

---

## Protocol Compliance

- [ ] Reads both balance data files and GDD formulas before analysis
- [ ] Findings table shows Value, Formula, Deviation, and Severity columns
- [ ] Does not write any files without explicit user approval
- [ ] No director gates are invoked
- [ ] Verdict is one of: BALANCED, CONCERNS, OUT OF BALANCE

---

## Coverage Notes

- The case where `assets/data/` is entirely empty is not tested; behavior
  follows the CONCERNS pattern with a message that no data files were found.
- Tolerance thresholds (±10%, ±20%) are implementation details of the skill;
  the tests verify that deviations are detected and classified, not the
  exact threshold values.
172
CCGS Skill Testing Framework/skills/analysis/code-review.md
Normal file
@@ -0,0 +1,172 @@
# Skill Test Spec: /code-review

## Skill Summary

`/code-review` performs an architectural code review of source files in `src/`,
checking coding standards from `CLAUDE.md` (doc comments on public APIs,
dependency injection over singletons, data-driven values, testability). Findings
are advisory. No director gates are invoked. No code edits are made. Verdicts:
APPROVED, CONCERNS, or NEEDS CHANGES.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: APPROVED, CONCERNS, NEEDS CHANGES
- [ ] Does NOT require "May I write" language (read-only; findings are advisory output)
- [ ] Has a next-step handoff (what to do with findings)

---

## Director Gate Checks

None. Code review is a read-only advisory skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — Source file follows all coding standards

**Fixture:**
- `src/gameplay/health_component.gd` exists with:
  - All public methods have doc comments (`##` notation)
  - No singletons used; dependencies injected via constructor
  - No hardcoded values; all constants reference `assets/data/`
  - ADR reference in file header: `# ADR: docs/architecture/adr-004-health.md`
- Referenced ADR has `Status: Accepted`

**Input:** `/code-review src/gameplay/health_component.gd`

**Expected behavior:**
1. Skill reads the source file
2. Skill checks all coding standards: doc comments, DI, data-driven, ADR status
3. All checks pass
4. Skill outputs findings summary with all checks PASS
5. Verdict is APPROVED

**Assertions:**
- [ ] Each coding standard check is listed in the output
- [ ] All checks show PASS when standards are met
- [ ] Skill reads referenced ADR to confirm its status
- [ ] Verdict is APPROVED
- [ ] No edits are made to any file

---

### Case 2: Needs Changes — Missing doc comment and singleton usage

**Fixture:**
- `src/ui/inventory_ui.gd` has:
  - 2 public methods without doc comments
  - Uses `GameManager.instance` (singleton pattern)
  - All other standards met

**Input:** `/code-review src/ui/inventory_ui.gd`

**Expected behavior:**
1. Skill reads the source file
2. Skill detects: 2 missing doc comments on public methods
3. Skill detects: singleton usage at specific lines (e.g., line 42, line 87)
4. Findings list the exact method names and line numbers
5. Verdict is NEEDS CHANGES

**Assertions:**
- [ ] Missing doc comments are listed with method names
- [ ] Singleton usage is flagged with file and line number
- [ ] Verdict is NEEDS CHANGES when BLOCKING-level standard violations exist
- [ ] Skill does not edit the file — findings are for the developer to act on
- [ ] Output suggests replacing singleton with dependency injection

---
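To make the two detections in Case 2 concrete, here is a rough line-based scan of GDScript source written in Python. It is only a sketch of the idea, not how `/code-review` works internally: `review_gdscript` is a hypothetical helper, "public" is assumed to mean a top-level function whose name does not start with an underscore, a doc comment is assumed to be a `##` line directly above the `func` line, and singleton usage is approximated as any `SomeClass.instance` access.

```python
import re

# Top-level function definition, e.g. "func open_panel(...)".
FUNC_DEF = re.compile(r"^func\s+([A-Za-z_]\w*)\s*\(")
# Rough singleton heuristic: CapitalizedName.instance
SINGLETON_USE = re.compile(r"\b[A-Z]\w*\.instance\b")


def review_gdscript(source):
    """Return method names lacking doc comments and 1-based singleton line numbers."""
    missing_docs = []
    singleton_lines = []
    lines = source.splitlines()
    for index, line in enumerate(lines):
        match = FUNC_DEF.match(line)
        if match and not match.group(1).startswith("_"):
            previous = lines[index - 1].strip() if index > 0 else ""
            if not previous.startswith("##"):
                missing_docs.append(match.group(1))
        if SINGLETON_USE.search(line):
            singleton_lines.append(index + 1)
    return {"missing_docs": missing_docs, "singleton_lines": singleton_lines}
```

Reporting method names and line numbers, rather than a bare count, is what lets the findings satisfy the "exact method names and line numbers" expectation.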

### Case 3: Architecture Risk — ADR reference is Proposed, not Accepted

**Fixture:**
- `src/core/save_system.gd` has a header comment: `# ADR: docs/architecture/adr-010-save.md`
- `adr-010-save.md` exists but has `Status: Proposed`
- Code itself follows all other coding standards

**Input:** `/code-review src/core/save_system.gd`

**Expected behavior:**
1. Skill reads the source file
2. Skill reads referenced ADR — finds `Status: Proposed`
3. Skill flags this as ARCHITECTURE RISK (code is implementing an unaccepted ADR)
4. Other coding standard checks pass
5. Verdict is CONCERNS (risk flag is advisory, not a hard NEEDS CHANGES)

**Assertions:**
- [ ] Skill reads referenced ADR file to check its status
- [ ] ARCHITECTURE RISK is flagged when ADR status is Proposed
- [ ] Verdict is CONCERNS (not NEEDS CHANGES) for ADR risk — advisory severity
- [ ] Output recommends resolving the ADR before the code goes to production

---
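The ADR check in Case 3 has two steps: extract the ADR path from the source header, then read the status line of that ADR. A hedged sketch of both steps is below. The helper names `adr_reference` and `adr_finding` are hypothetical, the `# ADR: <path>` and `Status: <word>` formats are taken from the fixture, and treating every status other than `Accepted` (including a missing one) as a risk is an assumption that goes beyond what the case states.

```python
import re

# Header comment as shown in the fixture: "# ADR: docs/architecture/adr-010-save.md"
ADR_HEADER = re.compile(r"^#\s*ADR:\s*(\S+)", re.MULTILINE)
# Status line as shown in the fixture: "Status: Proposed"
ADR_STATUS = re.compile(r"^Status:\s*(\w+)", re.MULTILINE)


def adr_reference(source):
    """Return the ADR path named in the source header, or None."""
    match = ADR_HEADER.search(source)
    return match.group(1) if match else None


def adr_finding(adr_text):
    """Return PASS for an Accepted ADR, ARCHITECTURE RISK otherwise (assumption)."""
    match = ADR_STATUS.search(adr_text)
    status = match.group(1) if match else None
    return "PASS" if status == "Accepted" else "ARCHITECTURE RISK"
```

Per the case, an ARCHITECTURE RISK finding feeds a CONCERNS verdict rather than NEEDS CHANGES.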

### Case 4: Edge Case — No source files found at specified path

**Fixture:**
- User calls `/code-review src/networking/`
- `src/networking/` directory does not exist

**Input:** `/code-review src/networking/`

**Expected behavior:**
1. Skill attempts to read files in `src/networking/`
2. Directory or files not found
3. Skill outputs an error: "No source files found at `src/networking/`"
4. Skill suggests checking `src/` for valid directories
5. No verdict is emitted (nothing was reviewed)

**Assertions:**
- [ ] Skill does not crash when path does not exist
- [ ] Output names the attempted path in the error message
- [ ] Output suggests checking `src/` for valid file paths
- [ ] No verdict is emitted when there is nothing to review

---

### Case 5: Gate Compliance — No gate; LP may be consulted separately

**Fixture:**
- Source file follows most standards but has 1 CONCERNS-level finding (a magic number)
- `review-mode.txt` contains `full`

**Input:** `/code-review src/gameplay/loot_system.gd`

**Expected behavior:**
1. Skill reads and reviews the source file
2. No director gate is invoked (code review findings are advisory)
3. Skill presents findings with the CONCERNS verdict
4. Output notes: "Consider requesting a Lead Programmer review for architecture concerns"
5. Skill does not invoke any agent automatically

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] LP consultation is suggested (not mandated) in the output
- [ ] No code edits are made
- [ ] Verdict is CONCERNS for advisory-level findings

---

## Protocol Compliance

- [ ] Reads source file(s) and coding standards before reviewing
- [ ] Lists each coding standard check in findings output
- [ ] Does not edit any source files (read-only skill)
- [ ] No director gates are invoked
- [ ] Verdict is one of: APPROVED, CONCERNS, NEEDS CHANGES

---

## Coverage Notes

- Batch review of all files in a directory is not explicitly tested; behavior
  is assumed to apply the same checks file by file and aggregate the verdict.
- Test coverage checks (verifying corresponding test files exist) are a stretch
  goal not tested here; that is primarily the domain of `/test-evidence-review`.
@@ -0,0 +1,176 @@
# Skill Test Spec: /consistency-check

## Skill Summary

`/consistency-check` scans all GDDs in `design/gdd/` and checks for internal
conflicts across documents. It produces a structured findings table with columns:
System A vs System B, Conflict Type, Severity (HIGH / MEDIUM / LOW). Conflict
types include: formula mismatch, competing ownership, stale reference, and
dependency gap.

The skill is read-only during analysis. It has no director gates. An optional
consistency report can be written to `design/consistency-report-[date].md` if the
user requests it, but the skill asks "May I write" before doing so.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
- [ ] Does NOT require "May I write" language during analysis (read-only scan)
- [ ] Has a next-step handoff at the end
- [ ] Documents that report writing is optional and requires approval

---

## Director Gate Checks

No director gates — this skill spawns no director gate agents. Consistency
checking is a mechanical scan; no creative or technical director review is
required as part of the scan itself.

---

## Test Cases

### Case 1: Happy Path — 4 GDDs with no conflicts

**Fixture:**
- `design/gdd/` contains exactly 4 system GDDs
- All GDDs have consistent formulas (no overlapping variables with different values)
- No two GDDs claim ownership of the same game entity or mechanic
- All dependency references point to GDDs that exist

**Input:** `/consistency-check`

**Expected behavior:**
1. Skill reads all 4 GDDs in `design/gdd/`
2. Runs cross-GDD consistency checks (formulas, ownership, references)
3. No conflicts found
4. Outputs structured findings table showing 0 issues
5. Verdict: CONSISTENT

**Assertions:**
- [ ] All 4 GDDs are read before producing output
- [ ] Findings table is present (even if empty — shows "No conflicts found")
- [ ] Verdict is CONSISTENT when no conflicts exist
- [ ] Skill does NOT write any files without user approval
- [ ] Next-step handoff is present

---

### Case 2: Failure Path — Two GDDs with conflicting damage formulas

**Fixture:**
- GDD-A defines damage formula: `damage = attack * 1.5`
- GDD-B defines damage formula: `damage = attack * 2.0` for the same entity type
- Both GDDs refer to the same "attack" variable

**Input:** `/consistency-check`

**Expected behavior:**
1. Skill reads all GDDs and detects the formula mismatch
2. Findings table includes an entry: GDD-A vs GDD-B | Formula Mismatch | HIGH
3. Specific conflicting formulas are shown (not just "formula conflict exists")
4. Verdict: CONFLICTS FOUND

**Assertions:**
- [ ] Verdict is CONFLICTS FOUND (not CONSISTENT)
- [ ] Conflict entry names both GDD filenames
- [ ] Conflict type is "Formula Mismatch"
- [ ] Severity is HIGH for a direct formula contradiction
- [ ] Both conflicting formulas are shown in the findings table
- [ ] Skill does NOT auto-resolve the conflict

---
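One way to picture the formula-mismatch detection from Case 2 is a pass that collects every `name = expression` formula per GDD and compares definitions of the same variable pairwise. The sketch below assumes formulas appear as inline code spans in backticks, which is a simplification, and `formula_conflicts` is a hypothetical name. It compares expressions as normalized text, so it shares the limitation that informally worded formulas go undetected.

```python
import re
from itertools import combinations

# Inline code span of the form `variable = expression` (assumed notation).
FORMULA = re.compile(r"`(\w+)\s*=\s*([^`]+)`")


def formula_conflicts(gdds):
    """Return findings rows for variables defined differently across GDDs.

    gdds: {filename: markdown_text}
    """
    definitions = {}  # variable -> {filename: normalized expression}
    for filename, text in gdds.items():
        for variable, expression in FORMULA.findall(text):
            definitions.setdefault(variable, {})[filename] = " ".join(expression.split())
    rows = []
    for variable, by_file in definitions.items():
        for (file_a, expr_a), (file_b, expr_b) in combinations(sorted(by_file.items()), 2):
            if expr_a != expr_b:
                rows.append((
                    f"{file_a} vs {file_b}",
                    "Formula Mismatch",
                    "HIGH",
                    f"{variable} = {expr_a}",
                    f"{variable} = {expr_b}",
                ))
    return rows
```

Each row carries both conflicting formulas and both filenames, and nothing in the function attempts to pick a winner.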

### Case 3: Partial Path — GDD references a system with no GDD

**Fixture:**
- GDD-A's Dependencies section lists "system-B" as a dependency
- No GDD for system-B exists in `design/gdd/`
- All other GDDs are consistent

**Input:** `/consistency-check`

**Expected behavior:**
1. Skill reads all GDDs and checks dependency references
2. GDD-A's reference to "system-B" cannot be resolved — no GDD exists for it
3. Findings table includes: GDD-A vs (missing) | Dependency Gap | MEDIUM
4. Verdict: DEPENDENCY GAP (not CONSISTENT, not CONFLICTS FOUND)

**Assertions:**
- [ ] Verdict is DEPENDENCY GAP (distinct from CONSISTENT and CONFLICTS FOUND)
- [ ] Findings entry names GDD-A and the missing system-B
- [ ] Severity is MEDIUM for an unresolved dependency reference
- [ ] Skill suggests running `/design-system system-B` to create the missing GDD

---
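The dependency check from Case 3 reduces to looking up each declared dependency in the set of systems that have a GDD. A small sketch, assuming the Dependencies sections have already been parsed into lists of system names. `dependency_gaps` and its argument shapes are hypothetical, while the row layout and MEDIUM severity follow the case.

```python
def dependency_gaps(dependencies, existing_systems):
    """Return one findings row per dependency that has no GDD.

    dependencies:     {gdd_name: [system names it depends on]}
    existing_systems: set of system names that have a GDD in design/gdd/
    """
    rows = []
    for gdd_name, deps in sorted(dependencies.items()):
        for dep in deps:
            if dep not in existing_systems:
                rows.append((f"{gdd_name} vs (missing)", "Dependency Gap", "MEDIUM", dep))
    return rows


gaps = dependency_gaps({"system-A": ["system-B"]}, {"system-A"})
for _, _, _, missing in gaps:
    # Next-step suggestion named in the assertions.
    print(f"Run /design-system {missing} to create the missing GDD")
```

The fourth column keeps the name of the missing system so the output can name both the referencing GDD and the unresolved target.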

### Case 4: Edge Case — No GDDs found

**Fixture:**
- `design/gdd/` directory is empty or does not exist

**Input:** `/consistency-check`

**Expected behavior:**
1. Skill attempts to read files in `design/gdd/`
2. No GDD files found
3. Skill outputs an error: "No GDDs found in `design/gdd/`. Run `/design-system` to create GDDs first."
4. No findings table is produced
5. No verdict is issued

**Assertions:**
- [ ] Skill outputs a clear error message when no GDDs are found
- [ ] No verdict is produced (CONSISTENT / CONFLICTS FOUND / DEPENDENCY GAP)
- [ ] Skill recommends the correct next action (`/design-system`)
- [ ] Skill does NOT crash or produce a partial report

---

### Case 5: Director Gate — No gate spawned; no review-mode.txt read

**Fixture:**
- `design/gdd/` contains ≥2 GDDs
- `production/session-state/review-mode.txt` exists with `full`

**Input:** `/consistency-check`

**Expected behavior:**
1. Skill reads all GDDs and runs the consistency scan
2. Skill does NOT read `production/session-state/review-mode.txt`
3. No director gate agents are spawned at any point
4. Findings table and verdict are produced normally

**Assertions:**
- [ ] No director gate agents are spawned (no CD-, TD-, PR-, AD- prefixed gates)
- [ ] Skill does NOT read `production/session-state/review-mode.txt`
- [ ] Output contains no "Gate: [GATE-ID]" or gate-skipped entries
- [ ] Review mode has no effect on this skill's behavior

---

## Protocol Compliance

- [ ] Reads all GDDs before producing the findings table
- [ ] Findings table shown in full before any write ask (if report is requested)
- [ ] Verdict is one of exactly: CONSISTENT, CONFLICTS FOUND, DEPENDENCY GAP
- [ ] No director gates — no review-mode.txt read
- [ ] Report writing (if requested) gated by "May I write" approval
- [ ] Ends with next-step handoff appropriate to verdict

---

## Coverage Notes

- This skill checks for structural consistency between GDDs. Deep design theory
  analysis (pillar drift, dominant strategies) is handled by `/review-all-gdds`.
- Formula conflict detection relies on consistent formula notation across GDDs —
  informal descriptions of the same mechanic may not be detected.
- The conflict severity rubric (HIGH / MEDIUM / LOW) is defined in the skill body
  and not re-enumerated here.
164
CCGS Skill Testing Framework/skills/analysis/content-audit.md
Normal file
@@ -0,0 +1,164 @@
# Skill Test Spec: /content-audit

## Skill Summary

`/content-audit` reads GDDs in `design/gdd/` and checks whether all content
items specified there (enemies, items, levels, etc.) are accounted for in
`assets/`. It produces a gap table: Content Type → Specified Count → Found Count
→ Missing Items. No director gates are invoked. The skill does not write without
user approval. Verdicts: COMPLETE, GAPS FOUND, or MISSING CRITICAL CONTENT.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT
- [ ] Does NOT require "May I write" language (read-only output; write is optional report)
- [ ] Has a next-step handoff (what to do after gap table is reviewed)

---

## Director Gate Checks

None. Content audit is a read-only analysis skill; no gates are invoked.

---

## Test Cases

### Case 1: Happy Path — All specified content present

**Fixture:**
- `design/gdd/enemies.md` specifies 4 enemy types: Grunt, Sniper, Tank, Boss
- `assets/art/characters/` contains folders: `grunt/`, `sniper/`, `tank/`, `boss/`
- `design/gdd/items.md` specifies 3 item types; all 3 found in `assets/data/items/`

**Input:** `/content-audit`

**Expected behavior:**
1. Skill reads all GDDs in `design/gdd/`
2. Skill scans `assets/` for each specified content item
3. All 4 enemy types and 3 item types are found
4. Gap table shows: all rows have Found Count = Specified Count, no missing items
5. Verdict is COMPLETE

**Assertions:**
- [ ] Gap table covers all content types found in GDDs
- [ ] Each row shows Specified Count and Found Count
- [ ] No missing items when counts match
- [ ] Verdict is COMPLETE
- [ ] No files are written

---
|
||||
|
||||
### Case 2: Gaps Found — Enemy type missing from assets
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/enemies.md` specifies 3 enemy types: Grunt, Sniper, Boss
|
||||
- `assets/art/characters/` contains: `grunt/`, `sniper/` only (Boss folder missing)
|
||||
|
||||
**Input:** `/content-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads GDD — finds 3 enemy types specified
|
||||
2. Skill scans `assets/art/characters/` — finds only 2
|
||||
3. Gap table row for enemies: Specified 3, Found 2, Missing: Boss
|
||||
4. Verdict is GAPS FOUND
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Gap table row identifies "Boss" as the missing item by name
|
||||
- [ ] Specified Count (3) and Found Count (2) are both shown
|
||||
- [ ] Verdict is GAPS FOUND when any content item is missing
|
||||
- [ ] Skill does not assume the asset will be added later — it flags it now
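
The gap table in Cases 1 and 2 amounts to a per-type set difference between what the GDDs specify and what exists under `assets/`. A minimal sketch, assuming the specified and found item names have already been extracted into dictionaries (the `GapRow` type and both function names are hypothetical, and matching names case-insensitively is an assumption):

```python
from dataclasses import dataclass


@dataclass
class GapRow:
    content_type: str
    specified: int
    found: int
    missing: list[str]


def build_gap_table(specified: dict[str, list[str]],
                    found: dict[str, set[str]]) -> list[GapRow]:
    """Build one row per content type listed in the GDDs."""
    rows = []
    for content_type, names in specified.items():
        present = {n.lower() for n in found.get(content_type, set())}
        # Assumption: the GDD name "Boss" corresponds to the folder "boss/".
        missing = [n for n in names if n.lower() not in present]
        rows.append(GapRow(content_type, len(names),
                           len(names) - len(missing), missing))
    return rows


def verdict(rows: list[GapRow]) -> str:
    # MISSING CRITICAL CONTENT is left out: it depends on "critical" tags
    # in the GDD, which this sketch does not model.
    return "GAPS FOUND" if any(r.missing for r in rows) else "COMPLETE"


# Case 2 fixture: Boss is specified but has no asset folder.
rows = build_gap_table(
    {"enemies": ["Grunt", "Sniper", "Boss"]},
    {"enemies": {"grunt", "sniper"}},
)
for row in rows:
    print(row.content_type, row.specified, row.found, row.missing)
print(verdict(rows))
```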
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No GDD Content Specs Found — Guidance given
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/` contains only `core-loop.md` which has no content inventory section
|
||||
- No other GDDs exist with content specifications
|
||||
|
||||
**Input:** `/content-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads all GDDs — finds no content inventory sections
|
||||
2. Skill outputs: "No content specifications found in GDDs — run /design-system first to define content lists"
|
||||
3. No gap table is produced
|
||||
4. Verdict is GAPS FOUND (cannot confirm completeness without specs)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not produce a gap table when no GDD content specs exist
|
||||
- [ ] Output recommends running `/design-system`
|
||||
- [ ] Verdict reflects inability to confirm completeness
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — Asset in wrong format for target platform
|
||||
|
||||
**Fixture:**
|
||||
- `design/gdd/audio.md` specifies audio assets as OGG format
|
||||
- `assets/audio/sfx/jump.wav` exists (WAV format, not OGG)
|
||||
- `assets/audio/sfx/land.ogg` exists (correct format)
|
||||
- `technical-preferences.md` specifies audio format: OGG
|
||||
|
||||
**Input:** `/content-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads GDD audio spec and technical preferences for format requirements
|
||||
2. Skill finds `jump.wav` — present but in wrong format
|
||||
3. Gap table row for audio: Specified 2, Found 2 (by name), but `jump.wav` flagged as FORMAT ISSUE
|
||||
4. Verdict is GAPS FOUND (format compliance is part of content completeness)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks asset format against GDD or technical preferences when format is specified
|
||||
- [ ] `jump.wav` is flagged as FORMAT ISSUE with expected format (OGG) noted
|
||||
- [ ] Format issues are distinct from missing content in the gap table
|
||||
- [ ] Verdict is GAPS FOUND when format issues exist
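
The format check in Case 4 can be illustrated as a file-extension comparison. This is a sketch under the assumption that format is judged by extension alone; the function name and the message wording are hypothetical.

```python
from pathlib import PurePosixPath


def find_format_issues(asset_paths: list[str], expected_ext: str) -> list[str]:
    """Flag assets whose extension differs from the expected format."""
    expected = "." + expected_ext.lower().lstrip(".")
    label = expected.lstrip(".").upper()
    issues = []
    for path in asset_paths:
        # Assumption: the extension reliably reflects the file's format.
        if PurePosixPath(path).suffix.lower() != expected:
            issues.append(f"FORMAT ISSUE: {path} (expected {label})")
    return issues


# Case 4 fixture: one WAV file where OGG is required.
issues = find_format_issues(
    ["assets/audio/sfx/jump.wav", "assets/audio/sfx/land.ogg"], "OGG"
)
print(issues)
```

Keeping format issues in their own list, separate from missing items, matches the assertion that the two are reported distinctly.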
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — Read-only; no gate; gap table for human review
|
||||
|
||||
**Fixture:**
|
||||
- GDDs specify 10 content items; 9 are found in assets; 1 is missing
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/content-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads GDDs and scans assets; produces gap table
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Skill presents gap table to user as read-only output
|
||||
4. Verdict is GAPS FOUND
|
||||
5. Skill offers to write an audit report but does not write automatically
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Gap table is presented without auto-writing any file
|
||||
- [ ] Optional report write is offered but not forced
|
||||
- [ ] Skill does not modify any asset files
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads GDDs and asset directory before producing gap table
|
||||
- [ ] Gap table shows Content Type, Specified Count, Found Count, Missing Items
|
||||
- [ ] Does not write files without explicit user approval
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: COMPLETE, GAPS FOUND, MISSING CRITICAL CONTENT
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The MISSING CRITICAL CONTENT verdict (vs. GAPS FOUND) is triggered when the
  missing item is tagged as critical in the GDD; this is not explicitly tested
  but follows the same detection path.
- The case where the `assets/` directory does not exist is not tested; the skill
  would produce a MISSING CRITICAL CONTENT verdict for all specified items.
|
||||
168
CCGS Skill Testing Framework/skills/analysis/estimate.md
Normal file
@@ -0,0 +1,168 @@
|
||||
# Skill Test Spec: /estimate
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/estimate` estimates task or story effort using a relative-size scale (S / M /
L / XL) based on story complexity, acceptance criteria count, and historical
sprint velocity from past sprint files. Estimates are advisory and are never
written automatically. No director gates are invoked. Verdicts are effort ranges,
not pass/fail; every run produces an estimate.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains size labels: S, M, L, XL (the "verdict" equivalents for this skill)
|
||||
- [ ] Does NOT require "May I write" language (advisory output only)
|
||||
- [ ] Has a next-step handoff (how to use the estimate in sprint planning)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Estimation is an advisory informational skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Clear story with known tech stack
|
||||
|
||||
**Fixture:**
|
||||
- `production/epics/combat/story-hitbox-detection.md` exists with:
  - 4 clear Acceptance Criteria
  - ADR reference (Accepted status)
  - No "unknown" or "TBD" language in story body
|
||||
- `production/sprints/sprint-003.md` through `sprint-005.md` exist with velocity data
|
||||
- Tech stack is GDScript (well-understood by team per sprint history)
|
||||
|
||||
**Input:** `/estimate production/epics/combat/story-hitbox-detection.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the story file — assesses clarity, AC count, tech stack
|
||||
2. Skill reads sprint history to determine average velocity
|
||||
3. Skill outputs estimate: M (1–2 days) with reasoning
|
||||
4. No files are written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Estimate is M for a clear, well-scoped story with known tech
|
||||
- [ ] Reasoning references AC count, tech stack familiarity, and velocity data
|
||||
- [ ] Estimate is presented as a range (e.g., "1–2 days"), not a single point
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 2: High Uncertainty — Unknown system, no ADR yet
|
||||
|
||||
**Fixture:**
|
||||
- `production/epics/online/story-lobby-matchmaking.md` exists with:
  - 2 vague Acceptance Criteria (using "should" and "TBD")
  - No ADR reference (matchmaking architecture not yet decided)
  - References new subsystem ("online/matchmaking") with no existing source files
|
||||
|
||||
**Input:** `/estimate production/epics/online/story-lobby-matchmaking.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads story — finds vague AC, no ADR, no existing source
|
||||
2. Skill flags multiple uncertainty factors
|
||||
3. Estimate is L–XL with an explicit risk note: "Estimate range is wide due to architectural unknowns"
|
||||
4. Skill recommends creating an ADR before development begins
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Estimate is L or XL (not S or M) when significant unknowns exist
|
||||
- [ ] Risk note explains the specific unknowns driving the wide range
|
||||
- [ ] Output recommends resolving architectural questions first
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Sprint Velocity Data — Conservative defaults used
|
||||
|
||||
**Fixture:**
|
||||
- Story file exists and is well-defined
|
||||
- `production/sprints/` is empty — no historical sprints
|
||||
|
||||
**Input:** `/estimate production/epics/core/story-save-load.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads story — assesses complexity
|
||||
2. Skill attempts to read sprint velocity data — finds none
|
||||
3. Skill notes: "No sprint history found — using conservative defaults for velocity"
|
||||
4. Estimate is produced using default assumptions (e.g., 1 story point = 1 day)
|
||||
5. No files are written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not error when no sprint history exists
|
||||
- [ ] Output explicitly notes that conservative defaults are being used
|
||||
- [ ] Estimate is still produced (not blocked by missing velocity)
|
||||
- [ ] Conservative defaults produce a higher (not lower) estimate range
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Multiple Stories — Each estimated individually plus sprint total
|
||||
|
||||
**Fixture:**
|
||||
- User provides a sprint file: `production/sprints/sprint-007.md` with 4 stories
|
||||
- Sprint history exists (3 previous sprints)
|
||||
|
||||
**Input:** `/estimate production/sprints/sprint-007.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads sprint file — identifies 4 stories
|
||||
2. Skill estimates each story individually: S, M, M, L
|
||||
3. Skill computes sprint total: approximately 6–8 story points
|
||||
4. Skill presents per-story estimates followed by sprint total
|
||||
5. No files are written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Each story receives its own estimate label
|
||||
- [ ] Sprint total is presented after individual estimates
|
||||
- [ ] Total is a sum range derived from individual ranges
|
||||
- [ ] Skill handles sprint files (not just single story files) as input
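
The "sum range derived from individual ranges" in Case 4 can be shown with a small worked example. The spec does not define how size labels map to story points, so the `SIZE_POINTS` table below is purely hypothetical and was chosen only so that S, M, M, L reproduces the 6 to 8 point total from the fixture.

```python
# Hypothetical (low, high) story-point range per size label.
# The real calibration comes from sprint history and is not specified here.
SIZE_POINTS = {
    "S": (1, 1),
    "M": (1, 2),
    "L": (3, 3),
    "XL": (5, 8),
}


def sprint_total(sizes: list[str]) -> tuple[int, int]:
    """Sum the low ends and the high ends of each story's range separately."""
    low = sum(SIZE_POINTS[s][0] for s in sizes)
    high = sum(SIZE_POINTS[s][1] for s in sizes)
    return low, high


# Case 4 fixture: four stories estimated as S, M, M, L.
total = sprint_total(["S", "M", "M", "L"])
print(total)
```

Summing the low and high bounds independently keeps the total a range, consistent with the requirement that estimates are never a single point.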
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; estimates are informational
|
||||
|
||||
**Fixture:**
|
||||
- Story file exists with medium complexity
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/estimate production/epics/core/story-item-pickup.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads story and sprint history; computes estimate
|
||||
2. No director gate is invoked in any review mode
|
||||
3. Estimate is presented as advisory output only
|
||||
4. Skill notes: "Use this estimate in /sprint-plan when selecting stories for the next sprint"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked regardless of review mode
|
||||
- [ ] Output is purely informational — no approval or write prompt
|
||||
- [ ] Next-step recommendation references `/sprint-plan`
|
||||
- [ ] Estimate does not change based on review mode
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads story file before estimating
|
||||
- [ ] Reads sprint velocity history when available
|
||||
- [ ] Produces effort range (S/M/L/XL), not a single number
|
||||
- [ ] Does not write any files
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Always produces an estimate (never blocked by missing data; uses defaults instead)
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The skill does not produce PASS/FAIL verdicts; the "verdict" here is the
  effort range itself. Test assertions focus on the accuracy of the range
  and the quality of the reasoning, not a binary outcome.
- Team-specific velocity calibration (what "M" means for this team) is an
  implementation detail not tested here; it is configured via sprint history.
|
||||
171
CCGS Skill Testing Framework/skills/analysis/perf-profile.md
Normal file
@@ -0,0 +1,171 @@
|
||||
# Skill Test Spec: /perf-profile
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/perf-profile` is a structured performance profiling workflow that identifies
bottlenecks and recommends optimizations. If profiler data or performance logs
are provided, it analyzes them directly. If not, it guides the user through a
manual profiling checklist. No director gates are invoked. The skill asks
"May I write to `production/qa/perf-[date].md`?" before persisting a report.
Verdicts: WITHIN BUDGET, CONCERNS, or OVER BUDGET.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: WITHIN BUDGET, CONCERNS, OVER BUDGET
|
||||
- [ ] Contains "May I write" language (skill writes perf report)
|
||||
- [ ] Has a next-step handoff (what to do after performance findings are reviewed)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Performance profiling is an advisory analysis skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Frame data provided, draw call spike found
|
||||
|
||||
**Fixture:**
|
||||
- User provides `production/qa/profiler-export-2026-03-15.json` with frame time data
|
||||
- Data shows: average frame time 14ms (within 16.6ms budget), but frames 42–48 spike to 28ms
|
||||
- Spike correlates with a scene with 450 draw calls (budget: 200)
|
||||
|
||||
**Input:** `/perf-profile production/qa/profiler-export-2026-03-15.json`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads profiler data
|
||||
2. Skill identifies average frame time is within budget
|
||||
3. Skill identifies draw call spike on frames 42–48 (450 calls vs 200 budget)
|
||||
4. Verdict is CONCERNS (average OK, but spikes indicate an issue)
|
||||
5. Skill recommends batching or culling for the identified scene
|
||||
6. Skill asks "May I write to `production/qa/perf-2026-04-06.md`?"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Spike frames are identified by frame number
|
||||
- [ ] Draw call count and budget are compared explicitly
|
||||
- [ ] Verdict is CONCERNS when spikes exceed budget even if average is OK
|
||||
- [ ] At least one specific optimization recommendation is given
|
||||
- [ ] "May I write" prompt appears before writing report
|
||||
|
||||
---
|
||||
|
||||
### Case 2: No Profiler Data — Manual checklist output
|
||||
|
||||
**Fixture:**
|
||||
- User runs `/perf-profile` with no arguments
|
||||
- No profiler data files exist in `production/qa/`
|
||||
|
||||
**Input:** `/perf-profile`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds no profiler data
|
||||
2. Skill outputs a manual profiling checklist for the user to work through:
   - Enable the Godot profiler or the target engine's profiler
   - Record a 60-second play session
   - Export frame time data
   - Note any dropped frames or hitches
|
||||
3. Skill asks user to provide data once collected before running analysis
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when no data is provided
|
||||
- [ ] Manual profiling checklist is output (actionable steps, not just an error)
|
||||
- [ ] No verdict is emitted (there is nothing to assess yet)
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Over Budget — Frame budget exceeded for target platform
|
||||
|
||||
**Fixture:**
|
||||
- Profiler data shows consistent 22ms frame times (target: 16.6ms for 60fps)
|
||||
- All frames exceed budget; no single spike — systemic issue
|
||||
- `technical-preferences.md` specifies target platform: PC, 60fps
|
||||
|
||||
**Input:** `/perf-profile production/qa/profiler-export-2026-03-20.json`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads profiler data and technical preferences for performance budget
|
||||
2. All frames are over the 16.6ms budget
|
||||
3. Verdict is OVER BUDGET
|
||||
4. Skill outputs a prioritized optimization list (e.g., LOD system, shader complexity, physics tick rate)
|
||||
5. Skill asks "May I write" before writing report
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Verdict is OVER BUDGET when all or most frames exceed budget
|
||||
- [ ] Target frame budget is read from `technical-preferences.md` (not hardcoded)
|
||||
- [ ] Optimization priority list is provided, not just the raw verdict
|
||||
- [ ] "May I write" prompt appears before report write
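
Cases 1 and 3 together imply a three-way rule: no frames over budget, isolated spikes, or a systemic overrun. A minimal sketch of that rule follows. The function names are hypothetical, and the 50% cutoff for "all or most frames" is an assumption, since the spec does not state a threshold.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Per-frame time budget in milliseconds for a target frame rate."""
    return 1000.0 / target_fps


def classify(frame_times_ms: list[float], budget_ms: float,
             over_budget_ratio: float = 0.5) -> tuple[str, list[int]]:
    """Return (verdict, indices of frames that exceed the budget)."""
    if not frame_times_ms:
        # Case 2: with no data there is nothing to assess, so no verdict.
        raise ValueError("no frame data to classify")
    over = [i for i, t in enumerate(frame_times_ms) if t > budget_ms]
    if not over:
        return "WITHIN BUDGET", over
    # Assumption: "most frames" means at least over_budget_ratio of them.
    if len(over) / len(frame_times_ms) >= over_budget_ratio:
        return "OVER BUDGET", over
    return "CONCERNS", over


budget = frame_budget_ms(60)  # roughly 16.67 ms

# Case 1 shape: 14 ms frames with a 28 ms spike on frames 42 through 48.
spiky = [14.0] * 100
for i in range(42, 49):
    spiky[i] = 28.0
print(classify(spiky, budget))

# Case 3 shape: every frame takes 22 ms.
print(classify([22.0] * 100, budget)[0])
```

In practice the target frame rate would be read from `technical-preferences.md` rather than passed as a literal, as the assertions above require.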
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Previous Perf Report Exists — Delta comparison
|
||||
|
||||
**Fixture:**
|
||||
- `production/qa/perf-2026-03-28.md` exists with prior results (avg 15ms, max 19ms)
|
||||
- New profiler export shows: avg 13ms, max 17ms
|
||||
- Both reports are for the same scene
|
||||
|
||||
**Input:** `/perf-profile production/qa/profiler-export-2026-04-05.json`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads new profiler data
|
||||
2. Skill detects prior report for the same scene
|
||||
3. Skill computes deltas: avg improved 2ms, max improved 2ms
|
||||
4. Skill presents regression check: no regressions detected
|
||||
5. Verdict is WITHIN BUDGET; report notes improvement since last profile
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks `production/qa/` for prior perf reports before writing
|
||||
- [ ] Delta comparison is shown (prior vs. current for key metrics)
|
||||
- [ ] Verdict is WITHIN BUDGET when current metrics are within budget
|
||||
- [ ] Improvement trend is noted positively in the report
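
The delta comparison in Case 4 is a per-metric subtraction between the prior report and the new data. A short sketch, assuming both have been reduced to dictionaries of metric values in milliseconds (the metric keys and the function name are hypothetical):

```python
def compare_reports(prior: dict[str, float],
                    current: dict[str, float]) -> tuple[dict[str, float], list[str]]:
    """Return per-metric deltas (current minus prior) and regressed metrics."""
    deltas = {k: current[k] - prior[k] for k in prior if k in current}
    # For timing metrics a positive delta means the frame got slower.
    regressions = [k for k, d in deltas.items() if d > 0]
    return deltas, regressions


# Case 4 fixture values.
prior = {"avg_ms": 15.0, "max_ms": 19.0}
current = {"avg_ms": 13.0, "max_ms": 17.0}

deltas, regressions = compare_reports(prior, current)
print(deltas)
print(regressions)
```

Only metrics present in both reports are compared, which avoids reporting a spurious regression when a report lacks a field.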
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; performance-analyst separate
|
||||
|
||||
**Fixture:**
|
||||
- Profiler data shows CONCERNS-level findings (some spikes)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/perf-profile production/qa/profiler-export-2026-04-01.json`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill analyzes profiler data; verdict is CONCERNS
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Output notes: "For in-depth analysis, consider running `/perf-profile` with the performance-analyst agent"
|
||||
4. Skill asks "May I write" and writes report on user approval
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Performance-analyst consultation is suggested (not mandated)
|
||||
- [ ] "May I write" prompt appears before report write
|
||||
- [ ] Verdict is CONCERNS for spike-based findings
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads profiler data when provided; outputs checklist when not
|
||||
- [ ] Reads `technical-preferences.md` for target platform frame budget
|
||||
- [ ] Checks for prior perf reports to enable delta comparison
|
||||
- [ ] Always asks "May I write" before writing report
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: WITHIN BUDGET, CONCERNS, OVER BUDGET
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Platform-specific profiling workflows (console, mobile) are not tested here;
  the checklist output in Case 2 would be platform-specific in practice.
- The delta comparison in Case 4 assumes reports cover the same scene; cross-scene
  comparisons are not explicitly handled.
|
||||
168
CCGS Skill Testing Framework/skills/analysis/scope-check.md
Normal file
@@ -0,0 +1,168 @@
|
||||
# Skill Test Spec: /scope-check
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/scope-check` is a Haiku-tier read-only skill that analyzes a feature, sprint,
or story for scope creep risk. It reads sprint and story files and compares them
against the active milestone goals. It is designed for fast, low-cost checks
before or during planning. No director gates are invoked. No files are written.
Verdicts: ON SCOPE, CONCERNS, or SCOPE CREEP DETECTED.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED
|
||||
- [ ] Does NOT require "May I write" language (read-only skill)
|
||||
- [ ] Has a next-step handoff (what to do based on verdict)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Scope check is a read-only advisory skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Sprint stories align with milestone goals
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-03.md` lists 3 goals: combat system, enemy AI, level loading
|
||||
- `production/sprints/sprint-006.md` contains 5 stories, all tagged to one of the 3 goals
|
||||
- `production/session-state/active.md` references milestone-03 as the active milestone
|
||||
|
||||
**Input:** `/scope-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads active milestone goals from milestone-03
|
||||
2. Skill reads sprint-006 stories and checks each against milestone goals
|
||||
3. All 5 stories map to one of the 3 goals
|
||||
4. Skill outputs a mapping table: story → milestone goal
|
||||
5. Verdict is ON SCOPE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Each story is mapped to a milestone goal in the output
|
||||
- [ ] Verdict is ON SCOPE when all stories map to milestone goals
|
||||
- [ ] No files are written
|
||||
- [ ] Skill does not modify sprint or milestone files
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Scope Creep Detected — Stories introducing systems not in milestone
|
||||
|
||||
**Fixture:**
|
||||
- `production/milestones/milestone-03.md` goals: combat, enemy AI, level loading
|
||||
- `production/sprints/sprint-006.md` contains 5 stories:
  - 3 stories map to milestone goals
  - 2 stories reference "online leaderboard" and "achievement system" (not in milestone-03)
|
||||
|
||||
**Input:** `/scope-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads milestone goals and sprint stories
|
||||
2. Skill identifies 2 stories with no matching milestone goal
|
||||
3. Skill names the out-of-scope stories: "Online Leaderboard Feature", "Achievement System Setup"
|
||||
4. Verdict is SCOPE CREEP DETECTED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Out-of-scope stories are named explicitly in the output
|
||||
- [ ] Verdict is SCOPE CREEP DETECTED when any story has no milestone goal match
|
||||
- [ ] Skill does not automatically remove the stories — findings are advisory
|
||||
- [ ] Output recommends deferring the out-of-scope stories to a later milestone
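
The story-to-goal mapping in Cases 1 through 3 can be sketched as a membership test per story. This assumes each story carries a single goal tag that is compared by exact match; the function name and the three in-scope story titles are hypothetical, while the two out-of-scope titles come from Case 2.

```python
from typing import Optional


def check_scope(stories: dict[str, Optional[str]],
                goals: set[str]) -> tuple[str, list[str]]:
    """Return (verdict, titles of stories with no matching milestone goal)."""
    if not goals:
        # Case 3: without milestone goals, scope cannot be validated.
        return "CONCERNS", []
    unmapped = [title for title, goal in stories.items() if goal not in goals]
    verdict = "SCOPE CREEP DETECTED" if unmapped else "ON SCOPE"
    return verdict, unmapped


goals = {"combat system", "enemy AI", "level loading"}
stories = {
    "Hitbox Detection": "combat system",        # hypothetical title
    "Patrol Behaviour": "enemy AI",             # hypothetical title
    "Async Level Streaming": "level loading",   # hypothetical title
    "Online Leaderboard Feature": "online leaderboard",
    "Achievement System Setup": "achievement system",
}

print(check_scope(stories, goals))
print(check_scope(stories, set()))
```

The function only reports unmapped stories and never alters its inputs, in line with the advisory, read-only behaviour the assertions describe.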
|
||||
|
||||
---
|
||||
|
||||
### Case 3: No Milestone Defined — CONCERNS; scope cannot be validated
|
||||
|
||||
**Fixture:**
|
||||
- `production/session-state/active.md` has no milestone reference
|
||||
- `production/milestones/` directory exists but is empty
|
||||
- `production/sprints/sprint-006.md` has 4 stories
|
||||
|
||||
**Input:** `/scope-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads active.md — finds no milestone reference
|
||||
2. Skill checks `production/milestones/` — no milestone files found
|
||||
3. Skill outputs: "No active milestone defined — scope cannot be validated"
|
||||
4. Verdict is CONCERNS
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not error when no milestone is defined
|
||||
- [ ] Output explicitly states that scope validation requires a milestone reference
|
||||
- [ ] Verdict is CONCERNS (not ON SCOPE or SCOPE CREEP DETECTED without data)
|
||||
- [ ] Output suggests running `/milestone-review` or creating a milestone
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Single Story Check — Evaluated against its parent epic
|
||||
|
||||
**Fixture:**
|
||||
- User targets a single story: `production/epics/combat/story-parry-timing.md`
|
||||
- Story references parent epic: `epic-combat.md`
|
||||
- `production/epics/combat/epic-combat.md` has scope: "melee combat mechanics"
|
||||
- Story title: "Implement parry timing window" — matches epic scope
|
||||
|
||||
**Input:** `/scope-check production/epics/combat/story-parry-timing.md`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the specified story file
|
||||
2. Skill reads the parent epic to get scope definition
|
||||
3. Skill evaluates story against epic scope — "parry timing" matches "melee combat"
|
||||
4. Verdict is ON SCOPE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Single-file argument is accepted (story path, not sprint)
|
||||
- [ ] Skill reads the parent epic referenced in the story file
|
||||
- [ ] Story is evaluated against epic scope (not milestone scope) in single-story mode
|
||||
- [ ] Verdict is ON SCOPE when story matches epic scope
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; Producer may be consulted separately
|
||||
|
||||
**Fixture:**
|
||||
- Sprint has 2 SCOPE CREEP stories and 3 ON SCOPE stories
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/scope-check`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads milestone and sprint; identifies 2 scope creep items
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Skill presents findings with SCOPE CREEP DETECTED verdict
|
||||
4. Output notes: "Consider raising scope concerns with the Producer before sprint begins"
|
||||
5. Skill ends without writing any files
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Producer consultation is suggested (not mandated)
|
||||
- [ ] No files are written
|
||||
- [ ] Verdict is SCOPE CREEP DETECTED
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads milestone goals and sprint/story files before analysis
|
||||
- [ ] Maps each story to a milestone goal (or flags as unmapped)
|
||||
- [ ] Does not write any files
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Runs on Haiku model tier (fast, low-cost)
|
||||
- [ ] Verdict is one of: ON SCOPE, CONCERNS, SCOPE CREEP DETECTED
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where the sprint file itself does not exist is not tested; the
  skill would output a CONCERNS verdict with a message about missing sprint data.
- Partial scope overlap (story touches a milestone goal but also introduces
  new scope) is not explicitly tested; implementation may classify this as
  CONCERNS rather than SCOPE CREEP DETECTED.
|
||||
167
CCGS Skill Testing Framework/skills/analysis/security-audit.md
Normal file
@@ -0,0 +1,167 @@
|
||||
# Skill Test Spec: /security-audit
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/security-audit` audits the game for security risks, including save data
integrity, network communication, anti-cheat exposure, and data privacy. It
reads source files in `src/` for security patterns and checks whether sensitive
data is handled correctly. No director gates are invoked. The skill does not
write files (findings report only). Verdicts: SECURE, CONCERNS, or
VULNERABILITIES FOUND.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: SECURE, CONCERNS, VULNERABILITIES FOUND
|
||||
- [ ] Does NOT require "May I write" language (read-only; findings report only)
|
||||
- [ ] Has a next-step handoff (what to do with findings)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Security audit is a read-only advisory skill; no gates are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Save data encrypted, no hardcoded credentials
|
||||
|
||||
**Fixture:**
|
||||
- `src/core/save_system.gd` uses `Crypto` class to encrypt save data before writing
|
||||
- No hardcoded API keys, passwords, or credentials in any `src/` file
|
||||
- No version numbers or internal build IDs exposed in client-facing output
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans `src/` for security patterns: encryption usage, hardcoded credentials, exposed internals
|
||||
2. All checks pass: save data encrypted, no credentials found, no exposed internals
|
||||
3. Findings report shows all checks PASS
|
||||
4. Verdict is SECURE
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks save data handling for encryption usage
|
||||
- [ ] Skill scans for hardcoded credentials (API keys, passwords, tokens)
|
||||
- [ ] Skill checks for version/build numbers exposed to players
|
||||
- [ ] All checks shown in findings report
|
||||
- [ ] Verdict is SECURE when all checks pass
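
The hardcoded-credential check can be approximated with a line-by-line pattern scan. The sketch below is a heuristic illustration only: the regular expression, the function name, and the sample file paths and contents are all hypothetical, and a real audit would need a broader rule set and would read files from `src/` instead of an in-memory dictionary.

```python
import re

# Heuristic: a credential-like name assigned a quoted literal of 4+ characters.
CREDENTIAL_PATTERN = re.compile(
    r"""(api[_-]?key|password|secret|token)\w*\s*[:=]\s*["'][^"']{4,}["']""",
    re.IGNORECASE,
)


def scan_for_credentials(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for each suspicious line."""
    findings = []
    for path, text in files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            if CREDENTIAL_PATTERN.search(line):
                findings.append((path, line_no, line.strip()))
    return findings


# Hypothetical source contents used only to exercise the scan.
sources = {
    "src/core/save_system.gd": 'var save_path = "user://save.dat"\n',
    "src/net/client.gd": 'var retries = 3\nconst API_KEY = "sk-12345678"\n',
}

findings = scan_for_credentials(sources)
print(findings)
print("SECURE" if not findings else "VULNERABILITIES FOUND")
```

Reporting the file and line number for each match supports the requirement in Case 2 that findings cite a file and approximate line.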
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Vulnerabilities Found — Unencrypted save data and exposed version
|
||||
|
||||
**Fixture:**
|
||||
- `src/core/save_system.gd` writes save data as plain JSON (no encryption)
|
||||
- `src/ui/debug_overlay.gd` contains: `label.text = "Build: " + ProjectSettings.get("application/config/version")`
  (exposes internal build version to player)
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans `src/` — finds unencrypted save write in `save_system.gd`
|
||||
2. Skill finds exposed version string in `debug_overlay.gd`
|
||||
3. Both findings are flagged as VULNERABILITIES
|
||||
4. Verdict is VULNERABILITIES FOUND
|
||||
5. Skill provides remediation recommendations for each vulnerability
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Unencrypted save data is flagged as a vulnerability with file and approximate line
|
||||
- [ ] Exposed version string is flagged as a vulnerability
|
||||
- [ ] Remediation suggestion is given for each vulnerability
|
||||
- [ ] Verdict is VULNERABILITIES FOUND when any vulnerability is detected
|
||||
- [ ] No files are written or modified
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Online Features Without Authentication — CONCERNS
|
||||
|
||||
**Fixture:**
|
||||
- `src/networking/lobby.gd` exists with functions: `join_lobby()`, `send_chat()`
|
||||
- No authentication check is found before `send_chat()` — players can call it without being verified
|
||||
- Game has online multiplayer features (inferred from file presence)
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans `src/networking/` — detects online feature code
|
||||
2. Skill checks for authentication guard before network calls — finds none on `send_chat()`
|
||||
3. Flags: "Online feature without authentication check — CONCERNS"
|
||||
4. Verdict is CONCERNS (not VULNERABILITIES FOUND, as this is a missing control, not an exploit)
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill detects online features by scanning for networking source files
|
||||
- [ ] Missing authentication checks before network operations are flagged
|
||||
- [ ] Verdict is CONCERNS (advisory severity) for missing authentication guards
|
||||
- [ ] Output recommends adding authentication before network calls
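
The severity split in Cases 1 to 3 can be sketched as a small roll-up function. This is an illustrative sketch only, not the skill's implementation: the `Severity` enum and the `security_verdict` name are assumptions introduced here, and the empty-`src/` situation from Case 4 (no verdict at all) is deliberately left out.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity levels, named after the spec's verdicts.
    CONCERN = 1        # advisory: a missing control, e.g. no auth guard
    VULNERABILITY = 2  # e.g. unencrypted save data, exposed version string


def security_verdict(severities):
    """Return the overall verdict for the severities of all findings."""
    if not severities:
        return "SECURE"
    if max(severities) == Severity.VULNERABILITY:
        return "VULNERABILITIES FOUND"
    return "CONCERNS"
```

Under these assumptions a single vulnerability outranks any number of advisory findings, which matches the Case 2 assertion that the verdict is VULNERABILITIES FOUND when any vulnerability is detected.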
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No Source Files to Analyze
|
||||
|
||||
**Fixture:**
|
||||
- `src/` directory does not exist or is completely empty
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to scan `src/` — no files found
|
||||
2. Skill outputs an error: "No source files found in `src/` — nothing to audit"
|
||||
3. No findings report is generated
|
||||
4. No verdict is emitted
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when `src/` is empty or absent
|
||||
- [ ] Output clearly states that no source files were found
|
||||
- [ ] No verdict is emitted (there is nothing to assess)
|
||||
- [ ] Skill suggests verifying the `src/` directory path
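
A minimal sketch of the guard this case describes, assuming the audit starts by collecting files under `src/`. The function names and the shape of the returned dictionary are hypothetical; the message text is paraphrased rather than the exact wording the skill emits.

```python
from pathlib import Path


def collect_sources(root):
    """Return every file under root, or an empty list if root is absent."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())


def prepare_audit(root="src"):
    """Stop early, with no verdict and no report, when nothing can be audited."""
    files = collect_sources(root)
    if not files:
        return {
            "files": [],
            "verdict": None,  # nothing to assess, so no verdict is emitted
            "message": f"No source files found in `{root}/`. "
                       "Verify the directory path.",
        }
    return {"files": files, "verdict": None, "message": None}
```

Treating a missing directory and an empty directory the same way keeps both fixture variants on a single code path.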
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; security-engineer invoked separately
|
||||
|
||||
**Fixture:**
|
||||
- Source files exist; 1 CONCERNS-level finding detected (debug logging enabled in release build)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/security-audit`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans source; finds debug logging active in release path
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Verdict is CONCERNS
|
||||
4. Output notes: "For formal security review, consider engaging a security-engineer agent"
|
||||
5. Findings are presented as a read-only report; no files written
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Security-engineer consultation is suggested (not mandated)
|
||||
- [ ] No files are written
|
||||
- [ ] Verdict is CONCERNS for advisory-level security findings
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads source files in `src/` before auditing
|
||||
- [ ] Checks save data encryption, hardcoded credentials, exposed internals, auth guards
|
||||
- [ ] Provides remediation recommendations for each finding
|
||||
- [ ] Does not write any files (read-only skill)
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: SECURE, CONCERNS, VULNERABILITIES FOUND
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Anti-cheat analysis (client-side value validation, server authority) is not
|
||||
explicitly tested here; it follows the CONCERNS or VULNERABILITIES pattern
|
||||
depending on severity.
|
||||
- Data privacy compliance (GDPR, COPPA) is out of scope for this spec; those
|
||||
require legal review beyond code scanning.
|
||||
171
CCGS Skill Testing Framework/skills/analysis/tech-debt.md
Normal file
@@ -0,0 +1,171 @@
|
||||
# Skill Test Spec: /tech-debt
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/tech-debt` tracks, categorizes, and prioritizes technical debt across the
|
||||
codebase. It reads `docs/tech-debt-register.md` for the existing debt register
|
||||
and scans source files in `src/` for inline `TODO` and `FIXME` comments. It
|
||||
merges and sorts items by severity. No director gates are invoked. The skill
|
||||
asks "May I write to `docs/tech-debt-register.md`?" before updating. Verdicts:
|
||||
REGISTER UPDATED or NO NEW DEBT FOUND.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: REGISTER UPDATED, NO NEW DEBT FOUND
|
||||
- [ ] Contains "May I write" language (skill writes to debt register)
|
||||
- [ ] Has a next-step handoff (what to do after register is updated)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Tech debt tracking is an internal codebase analysis skill; no gates are
|
||||
invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Inline TODOs plus existing register items merged
|
||||
|
||||
**Fixture:**
|
||||
- `docs/tech-debt-register.md` exists with 2 items (LOW and MEDIUM severity)
|
||||
- `src/gameplay/combat.gd` has 2 `# TODO` comments and 1 `# FIXME` comment
|
||||
- `src/ui/hud.gd` has 0 inline debt comments
|
||||
|
||||
**Input:** `/tech-debt`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads `docs/tech-debt-register.md` — finds 2 existing items
|
||||
2. Skill scans `src/` — finds 3 inline comments (2 TODOs, 1 FIXME)
|
||||
3. Skill checks whether inline comments already exist in the register (deduplication)
|
||||
4. Skill presents combined list sorted by severity (FIXME before TODO by default)
|
||||
5. Skill asks "May I write to `docs/tech-debt-register.md`?"
|
||||
6. User approves; register updated; verdict REGISTER UPDATED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Inline comments are found by scanning `src/` recursively
|
||||
- [ ] Existing register items are not duplicated
|
||||
- [ ] Combined list is sorted by severity
|
||||
- [ ] "May I write" prompt appears before any write
|
||||
- [ ] Verdict is REGISTER UPDATED
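
The scan, deduplication, and sort steps above could look roughly like the following. This is a sketch under stated assumptions: the regular expression, the tuple layout, and the `(path, note)` deduplication key are all hypothetical choices, not taken from the skill itself.

```python
import re

# Matches `# TODO: ...`, `# FIXME ...`, and tagged forms like `# FIXME(CRITICAL): ...`.
DEBT_COMMENT = re.compile(r"#\s*(TODO|FIXME)\b(?:\([^)]*\))?:?\s*(.*)")


def scan_text(path, text):
    """Return (kind, path, line_no, note) for each inline debt comment."""
    items = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = DEBT_COMMENT.search(line)
        if match:
            items.append((match.group(1), path, line_no, match.group(2).strip()))
    return items


def merge(existing_keys, found):
    """Drop items already in the register, then sort FIXME before TODO."""
    fresh = [item for item in found if (item[1], item[3]) not in existing_keys]
    return sorted(fresh, key=lambda item: (item[0] != "FIXME", item[1], item[2]))
```

Keying deduplication on path plus note text (rather than line number) is an assumption made so that an item stays matched when unrelated edits shift it up or down the file.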
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Register Doesn't Exist — Offered to create it
|
||||
|
||||
**Fixture:**
|
||||
- `docs/tech-debt-register.md` does NOT exist
|
||||
- `src/` contains 4 inline TODO/FIXME comments
|
||||
|
||||
**Input:** `/tech-debt`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read `docs/tech-debt-register.md` — not found
|
||||
2. Skill informs user: "No tech-debt-register.md found"
|
||||
3. Skill offers to create the register with the inline items it found
|
||||
4. Skill asks "May I write to `docs/tech-debt-register.md`?" (create)
|
||||
5. User approves; register created with 4 items; verdict REGISTER UPDATED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when register file is absent
|
||||
- [ ] User is offered register creation (not silently skipping)
|
||||
- [ ] "May I write" prompt reflects file creation (not update)
|
||||
- [ ] Verdict is REGISTER UPDATED after creation
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Resolved Item Detected — Marked resolved in register
|
||||
|
||||
**Fixture:**
|
||||
- `docs/tech-debt-register.md` has 3 items; item 2 references `src/gameplay/legacy_input.gd`
|
||||
- `src/gameplay/legacy_input.gd` has been deleted (refactored away)
|
||||
- The referenced TODO comment no longer exists in source
|
||||
|
||||
**Input:** `/tech-debt`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads register — finds 3 items
|
||||
2. Skill scans `src/` — does not find the source location referenced by item 2
|
||||
3. Skill flags item 2 as RESOLVED (source is gone)
|
||||
4. Skill presents the resolved item to user for confirmation
|
||||
5. On approval, register is updated with item 2 marked `Status: Resolved`
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill checks whether each register item's source reference still exists
|
||||
- [ ] Missing source locations result in items being flagged as RESOLVED
|
||||
- [ ] User confirms before resolved items are written
|
||||
- [ ] RESOLVED items are kept in the register (not deleted) for audit history
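
One way to picture the resolved-item check, assuming each register entry carries a `source` path and a `status` field (both field names are hypothetical). The sketch only proposes changes; writing them still requires the user confirmation described above.

```python
from pathlib import Path


def flag_resolved(register_items, project_root):
    """Return copies of the items, proposing `Resolved` where the source is gone.

    Entries are never removed, so the register keeps its audit history.
    """
    root = Path(project_root)
    proposed = []
    for item in register_items:
        updated = dict(item)  # copy, so the caller's data is left untouched
        if not (root / item["source"]).exists():
            updated["status"] = "Resolved"
        proposed.append(updated)
    return proposed
```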
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — CRITICAL debt item surfaces prominently
|
||||
|
||||
**Fixture:**
|
||||
- `src/core/network_sync.gd` has a comment: `# FIXME(CRITICAL): race condition in sync buffer — can corrupt save data`
|
||||
- `docs/tech-debt-register.md` exists with 5 lower-severity items
|
||||
|
||||
**Input:** `/tech-debt`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans source and finds the CRITICAL-tagged FIXME
|
||||
2. Skill presents the CRITICAL item at the top of the output — before the full table
|
||||
3. Skill asks user to acknowledge the critical item before proceeding
|
||||
4. After acknowledgment, skill presents full debt table and asks to write
|
||||
5. Register is updated with CRITICAL item at top; verdict REGISTER UPDATED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] CRITICAL items appear at the top of the output, not buried in the table
|
||||
- [ ] Skill surfaces CRITICAL items before asking to write
|
||||
- [ ] User acknowledgment of the CRITICAL item is requested
|
||||
- [ ] CRITICAL severity is preserved in the written register entry
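
A hedged sketch of how a `FIXME(CRITICAL)` tag might be parsed and pulled ahead of the main table. The `ORDER` ranking is an assumption (in particular, the `HIGH` level does not appear in this spec), and untagged comments fall back to LOW as the Coverage Notes describe for TODO comments.

```python
import re

TAG = re.compile(r"#\s*(?:TODO|FIXME)\((\w+)\)")

# Hypothetical ranking; HIGH is assumed and not named in the spec.
ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def severity_of(comment):
    """Read the severity tag from a debt comment, defaulting to LOW."""
    match = TAG.search(comment)
    tag = match.group(1).upper() if match else "LOW"
    return tag if tag in ORDER else "LOW"


def split_critical(comments):
    """Return (critical items, remaining items sorted by severity)."""
    critical = [c for c in comments if severity_of(c) == "CRITICAL"]
    rest = sorted(
        (c for c in comments if severity_of(c) != "CRITICAL"),
        key=lambda c: ORDER[severity_of(c)],
    )
    return critical, rest
```

Returning the critical items as a separate list mirrors the expected behavior: they are shown and acknowledged first, and only then is the full table presented.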
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; register updated only with approval
|
||||
|
||||
**Fixture:**
|
||||
- Inline scan finds 2 new TODOs; register has 3 existing items
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/tech-debt`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill scans source and reads register; compiles combined debt list
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Skill presents sorted debt table to user
|
||||
4. Skill asks "May I write to `docs/tech-debt-register.md`?"
|
||||
5. User approves; register updated; verdict REGISTER UPDATED
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Debt table is presented before any write prompt
|
||||
- [ ] "May I write" prompt appears before file update
|
||||
- [ ] Write only occurs with explicit user approval
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads `docs/tech-debt-register.md` and scans `src/` before compiling
|
||||
- [ ] Deduplicates inline comments against existing register items
|
||||
- [ ] Sorts combined list by severity
|
||||
- [ ] Always asks "May I write" before updating register
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is REGISTER UPDATED or NO NEW DEBT FOUND
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The case where `src/` is empty or absent is not tested; behavior follows
|
||||
the NO NEW DEBT FOUND path for the inline scan, but register items would
|
||||
still be read and presented.
|
||||
- TODO comments without severity tags are treated as LOW severity by default;
|
||||
this classification detail is an implementation concern, not tested here.
|
||||
@@ -0,0 +1,175 @@
|
||||
# Skill Test Spec: /test-evidence-review
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/test-evidence-review` performs a quality review of test files in `tests/`,
|
||||
checking test naming conventions, Arrange/Act/Assert structure, determinism, isolation, and absence of
|
||||
hardcoded magic numbers — all against the project's test standards defined in
|
||||
`coding-standards.md`. Findings may be flagged for qa-lead review. No director
|
||||
gates are invoked. The skill does not write without user approval. Verdicts:
|
||||
PASS, WARNINGS, or FAIL.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: PASS, WARNINGS, FAIL
|
||||
- [ ] Does NOT require "May I write" language (read-only; the optional flagging report requires approval)
|
||||
- [ ] Has a next-step handoff (what to do after findings are reviewed)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate
|
||||
is a separate skill invocation and is NOT triggered here.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Tests follow all standards
|
||||
|
||||
**Fixture:**
|
||||
- `tests/unit/combat/health_system_take_damage_test.gd` exists with:
|
||||
- Naming: `test_health_system_take_damage_reduces_health()` (follows `test_[system]_[scenario]_[expected]`)
|
||||
- Arrange/Act/Assert structure present
|
||||
- No `sleep()`, `await` with time values, or random seeds
|
||||
- No calls to external APIs or file I/O
|
||||
- No inline magic numbers (uses constants from `tests/unit/combat/fixtures/`)
|
||||
|
||||
**Input:** `/test-evidence-review tests/unit/combat/`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads test standards from `coding-standards.md`
|
||||
2. Skill reads the test file; checks all 5 standards
|
||||
3. All checks pass: naming, structure, determinism, isolation, no hardcoded data
|
||||
4. Verdict is PASS
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Each of the 5 test standards is checked and reported
|
||||
- [ ] All checks show PASS when standards are met
|
||||
- [ ] Verdict is PASS
|
||||
- [ ] No files are written
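
The naming check can be approximated with a regular expression. This is only a sketch: a pattern cannot tell which words belong to `[system]`, `[scenario]`, or `[expected]`, so the assumed rule here is simply `test_` followed by at least three lowercase snake_case words. The `follows_naming` name is hypothetical.

```python
import re

# Assumed reading of `test_[system]_[scenario]_[expected]`:
# the `test_` prefix plus three or more lowercase words joined by underscores.
TEST_NAME = re.compile(r"^test_[a-z0-9]+(?:_[a-z0-9]+){2,}$")


def follows_naming(func_name):
    """Return True if the function name fits the assumed naming shape."""
    return TEST_NAME.match(func_name) is not None
```

For example, `test_health_system_take_damage_reduces_health` from the fixture fits this shape, while a name such as `testDamage` does not.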
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Fail — Timing dependency detected
|
||||
|
||||
**Fixture:**
|
||||
- `tests/unit/ui/hud_update_test.gd` contains:
|
||||
```gdscript
await get_tree().create_timer(1.0).timeout
assert_eq(label.text, "Ready")
```
|
||||
- Real-time wait of 1 second used instead of mock or signal-based assertion
|
||||
|
||||
**Input:** `/test-evidence-review tests/unit/ui/hud_update_test.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the test file
|
||||
2. Skill detects real-time wait (`create_timer(1.0)`) — non-deterministic timing dependency
|
||||
3. Skill flags this as a FAIL-level finding
|
||||
4. Verdict is FAIL
|
||||
5. Skill recommends replacing the timer with a signal-based assertion or mock
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Real-time wait usage is detected as a non-deterministic timing dependency
|
||||
- [ ] Finding is classified as FAIL severity (blocking — violates determinism standard)
|
||||
- [ ] Verdict is FAIL
|
||||
- [ ] Remediation suggestion references signal-based or mock-based approach
|
||||
- [ ] Skill does not edit the test file
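
Detection of the real-time wait in this fixture might be sketched as a line scan. The pattern below is an assumption covering only the `await ... create_timer(` shape shown in the fixture; other waits mentioned in Case 1, such as `sleep()`, would need their own patterns.

```python
import re

# Flags lines that await a timer created with create_timer(...).
TIMER_WAIT = re.compile(r"\bawait\b.*\bcreate_timer\s*\(")


def find_timer_waits(text):
    """Return the 1-based line numbers that contain a real-time timer wait."""
    return [
        line_no
        for line_no, line in enumerate(text.splitlines(), start=1)
        if TIMER_WAIT.search(line)
    ]
```

The function only reports line numbers, in keeping with the assertion that the skill never edits the test file.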
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Fail — Test calls external API directly
|
||||
|
||||
**Fixture:**
|
||||
- `tests/unit/networking/auth_test.gd` contains:
|
||||
```gdscript
var result = HTTPRequest.new().request("https://api.example.com/auth")
```
|
||||
- Direct HTTP call to external API without a mock
|
||||
|
||||
**Input:** `/test-evidence-review tests/unit/networking/auth_test.gd`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads the test file
|
||||
2. Skill detects direct external API call (HTTPRequest to live URL)
|
||||
3. Skill flags this as a FAIL-level finding — violates isolation standard
|
||||
4. Verdict is FAIL
|
||||
5. Skill recommends injecting a mock HTTP client
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Direct external API call is detected and flagged
|
||||
- [ ] Finding is classified as FAIL severity (violates isolation standard)
|
||||
- [ ] Verdict is FAIL
|
||||
- [ ] Remediation references dependency injection with a mock HTTP client
|
||||
- [ ] Skill does not modify the test file
|
||||
|
||||
---
|
||||
|
||||
### Case 4: Edge Case — No Test Files Found
|
||||
|
||||
**Fixture:**
|
||||
- User calls `/test-evidence-review tests/unit/audio/`
|
||||
- `tests/unit/audio/` directory does not exist
|
||||
|
||||
**Input:** `/test-evidence-review tests/unit/audio/`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill attempts to read files in `tests/unit/audio/` — not found
|
||||
2. Skill outputs: "No test files found at `tests/unit/audio/` — run `/test-setup` to scaffold test directories"
|
||||
3. No verdict is emitted
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill does not crash when path does not exist
|
||||
- [ ] Output names the attempted path in the message
|
||||
- [ ] Output recommends `/test-setup` for scaffolding
|
||||
- [ ] No verdict is emitted when there is nothing to review
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill
|
||||
|
||||
**Fixture:**
|
||||
- Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/test-evidence-review tests/unit/combat/`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reviews tests; finds 1 WARNINGS-level finding
|
||||
2. No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
|
||||
3. Verdict is WARNINGS
|
||||
4. Output notes: "For full test coverage gate, run `/gate-check` which invokes QL-TEST-COVERAGE"
|
||||
5. Skill offers optional report write; asks "May I write" if user opts in
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
|
||||
- [ ] Optional report requires "May I write" before writing
|
||||
- [ ] Verdict is WARNINGS for advisory-level test quality issues
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads `coding-standards.md` test standards before reviewing test files
|
||||
- [ ] Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
|
||||
- [ ] Does not edit any test files (read-only skill)
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: PASS, WARNINGS, FAIL
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- Batch review of all test files in `tests/` is not explicitly tested; behavior
|
||||
is assumed to apply the same checks file by file and aggregate the verdict.
|
||||
- The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is
|
||||
a separate concern and is intentionally NOT invoked by this skill.
|
||||
177
CCGS Skill Testing Framework/skills/analysis/test-flakiness.md
Normal file
@@ -0,0 +1,177 @@
|
||||
# Skill Test Spec: /test-flakiness
|
||||
|
||||
## Skill Summary
|
||||
|
||||
`/test-flakiness` detects non-deterministic tests by analyzing test history logs
|
||||
(if available) or scanning test source code for common flakiness patterns (random
|
||||
numbers without seeds, real-time waits, external I/O). No director gates are
|
||||
invoked. The skill does not write without user approval. Verdicts: NO FLAKINESS,
|
||||
SUSPECT TESTS FOUND, or CONFIRMED FLAKY.
|
||||
|
||||
---
|
||||
|
||||
## Static Assertions (Structural)
|
||||
|
||||
Verified automatically by `/skill-test static` — no fixture needed.
|
||||
|
||||
- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
|
||||
- [ ] Has ≥2 phase headings
|
||||
- [ ] Contains verdict keywords: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY
|
||||
- [ ] Does NOT require "May I write" language (read-only; optional report requires approval)
|
||||
- [ ] Has a next-step handoff (what to do after flakiness findings)
|
||||
|
||||
---
|
||||
|
||||
## Director Gate Checks
|
||||
|
||||
None. Flakiness detection is an advisory quality skill for the QA lead; no gates
|
||||
are invoked.
|
||||
|
||||
---
|
||||
|
||||
## Test Cases
|
||||
|
||||
### Case 1: Happy Path — Clean test history, no flakiness
|
||||
|
||||
**Fixture:**
|
||||
- `production/qa/test-history/` contains logs for 10 test runs
|
||||
- All tests pass consistently across all 10 runs (100% pass rate per test)
|
||||
- No test has a failure pattern
|
||||
|
||||
**Input:** `/test-flakiness`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads test history logs from `production/qa/test-history/`
|
||||
2. Skill computes per-test pass rate across 10 runs
|
||||
3. All tests pass all 10 runs — no inconsistency detected
|
||||
4. Verdict is NO FLAKINESS
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill reads test history logs when available
|
||||
- [ ] Per-test pass rate is computed across all available runs
|
||||
- [ ] Verdict is NO FLAKINESS when all tests pass consistently
|
||||
- [ ] No files are written
|
||||
|
||||
---
|
||||
|
||||
### Case 2: Suspect Tests Found — Test fails intermittently in history
|
||||
|
||||
**Fixture:**
|
||||
- `production/qa/test-history/` contains logs for 10 test runs
|
||||
- `test_combat_damage_applies_crit_multiplier` passes 7 times, fails 3 times
|
||||
- Failure messages differ (sometimes timeout, sometimes wrong value)
|
||||
|
||||
**Input:** `/test-flakiness`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill reads test history logs — computes pass rates
|
||||
2. `test_combat_damage_applies_crit_multiplier` has 70% pass rate (threshold: 95%)
|
||||
3. Skill flags it as SUSPECT with pass rate (7/10) and failure pattern noted
|
||||
4. Verdict is SUSPECT TESTS FOUND
|
||||
5. Skill recommends investigating the test for timing or state dependencies
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Tests below the pass-rate threshold are flagged by name
|
||||
- [ ] Pass rate (fraction and percentage) is shown for each suspect test
|
||||
- [ ] Failure pattern (e.g., inconsistent error messages) is noted if detectable
|
||||
- [ ] Verdict is SUSPECT TESTS FOUND
|
||||
- [ ] Skill recommends investigation steps
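
The pass-rate computation can be illustrated as follows. The input shape (one `{test_name: passed}` dictionary per run) is an assumption about how history logs might be parsed, and the 95% default mirrors the threshold suggested in this case rather than a fixed requirement.

```python
from collections import defaultdict


def pass_rates(runs):
    """Compute the per-test pass rate across runs.

    `runs` is a list with one {test_name: passed_bool} dict per test run.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for run in runs:
        for name, ok in run.items():
            total[name] += 1
            if ok:
                passed[name] += 1
    return {name: passed[name] / total[name] for name in total}


def suspects(runs, threshold=0.95):
    """Return the tests whose pass rate falls below the threshold."""
    return {
        name: rate
        for name, rate in pass_rates(runs).items()
        if rate < threshold
    }
```

With the fixture above, a test that passes 7 of 10 runs has a rate of 0.7 and is flagged, while a test that passes all 10 runs is not.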
|
||||
|
||||
---
|
||||
|
||||
### Case 3: Source Pattern — Random number used without seed
|
||||
|
||||
**Fixture:**
|
||||
- No test history logs exist
|
||||
- `tests/unit/loot/loot_drop_test.gd` contains:
|
||||
```gdscript
var roll = randf()  # unseeded random — non-deterministic
assert_gt(roll, 0.5, "Loot should drop above 50%")
```
|
||||
|
||||
**Input:** `/test-flakiness`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill finds no test history logs
|
||||
2. Skill falls back to source code analysis
|
||||
3. Skill detects `randf()` call without a preceding `seed()` call
|
||||
4. Skill flags the test as FLAKINESS RISK (source pattern, not confirmed)
|
||||
5. Verdict is SUSPECT TESTS FOUND (pattern detected, not confirmed by history)
|
||||
6. Skill recommends seeding random before the call or mocking the random function
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Source code analysis is used as fallback when no history logs exist
|
||||
- [ ] Unseeded random number usage is detected as a flakiness risk
|
||||
- [ ] Verdict is SUSPECT TESTS FOUND (not CONFIRMED FLAKY — no history to confirm)
|
||||
- [ ] Remediation recommends seeding or mocking
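
A rough sketch of the source-pattern fallback for unseeded randomness. It assumes a simple rule: a call to `randf()`, `randi()`, `randf_range()`, or `randi_range()` is a risk unless a `seed(` call appears earlier in the same file. Comment stripping is naive (everything after `#`), so this is a heuristic, not a parser.

```python
import re

RANDOM_CALL = re.compile(r"\b(?:randf|randi|randf_range|randi_range)\s*\(")
SEED_CALL = re.compile(r"\bseed\s*\(")


def unseeded_random_lines(text):
    """Return 1-based line numbers of random calls with no earlier seed() call."""
    seeded = False
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive: drop trailing comments
        if SEED_CALL.search(code):
            seeded = True
        if RANDOM_CALL.search(code) and not seeded:
            hits.append(line_no)
    return hits
```

Because a hit here comes from source patterns alone, it supports at most a SUSPECT TESTS FOUND verdict, as the assertions above require.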
|
||||
|
||||
---
|
||||
|
||||
### Case 4: No Test History — Source-only analysis with common patterns
|
||||
|
||||
**Fixture:**
|
||||
- `production/qa/test-history/` does not exist
|
||||
- `tests/` contains 15 test files
|
||||
- Scan finds 2 tests using `OS.get_ticks_msec()` for timing assertions
|
||||
- No other flakiness patterns found
|
||||
|
||||
**Input:** `/test-flakiness`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill checks for test history — not found
|
||||
2. Skill notes: "No test history available — analyzing source code for flakiness patterns only"
|
||||
3. Skill scans all test files for known patterns: unseeded random, real-time waits, system clock usage, external I/O
|
||||
4. Finds 2 tests using `OS.get_ticks_msec()` — flags as FLAKINESS RISK
|
||||
5. Verdict is SUSPECT TESTS FOUND
|
||||
|
||||
**Assertions:**
|
||||
- [ ] Skill notes clearly that source-only analysis is being performed (no history)
|
||||
- [ ] Common flakiness patterns are scanned: random, time-based assertions, external I/O
|
||||
- [ ] `OS.get_ticks_msec()` usage for assertions is flagged as a flakiness risk
|
||||
- [ ] Verdict is SUSPECT TESTS FOUND when source patterns are found
|
||||
|
||||
---
|
||||
|
||||
### Case 5: Gate Compliance — No gate; flakiness report is advisory
|
||||
|
||||
**Fixture:**
|
||||
- Test history shows 1 CONFIRMED FLAKY test (fails 6 out of 10 runs)
|
||||
- `review-mode.txt` contains `full`
|
||||
|
||||
**Input:** `/test-flakiness`
|
||||
|
||||
**Expected behavior:**
|
||||
1. Skill analyzes test history; identifies 1 confirmed flaky test
|
||||
2. No director gate is invoked regardless of review mode
|
||||
3. Verdict is CONFIRMED FLAKY
|
||||
4. Skill presents findings and offers optional written report
|
||||
5. If user opts in: "May I write to `production/qa/flakiness-report-[date].md`?"
|
||||
|
||||
**Assertions:**
|
||||
- [ ] No director gate is invoked in any review mode
|
||||
- [ ] CONFIRMED FLAKY verdict requires history-based evidence (not just source patterns)
|
||||
- [ ] Optional report requires "May I write" before writing
|
||||
- [ ] Flakiness report is advisory for qa-lead; skill does not auto-disable tests
|
||||
|
||||
---
|
||||
|
||||
## Protocol Compliance
|
||||
|
||||
- [ ] Reads test history logs when available; falls back to source analysis when not
|
||||
- [ ] Notes clearly which analysis mode is being used (history vs. source-only)
|
||||
- [ ] Flakiness threshold (e.g., 95% pass rate) is used for SUSPECT classification
|
||||
- [ ] CONFIRMED FLAKY requires history evidence; source patterns alone yield at most SUSPECT TESTS FOUND
|
||||
- [ ] Does not disable or modify any test files
|
||||
- [ ] No director gates are invoked
|
||||
- [ ] Verdict is one of: NO FLAKINESS, SUSPECT TESTS FOUND, CONFIRMED FLAKY
|
||||
|
||||
---
|
||||
|
||||
## Coverage Notes
|
||||
|
||||
- The pass-rate threshold for SUSPECT classification (95% suggested above) is an
|
||||
implementation detail; the tests verify that intermittent failures are flagged,
|
||||
not the exact threshold value.
|
||||
- Tests that fail due to environment issues (missing assets, wrong platform) are
|
||||
not flakiness — the skill distinguishes environment failures from non-determinism
|
||||
in the test itself; this distinction is not explicitly tested here.
|
||||