6.0 KiB
Skill Test Spec: /test-evidence-review
Skill Summary
/test-evidence-review performs a quality review of test files in tests/,
checking test naming conventions, determinism, isolation, and absence of
hardcoded magic numbers — all against the project's test standards defined in
coding-standards.md. Findings may be flagged for qa-lead review. No director
gates are invoked. The skill does not write without user approval. Verdicts:
PASS, WARNINGS, or FAIL.
Static Assertions (Structural)
Verified automatically by /skill-test static — no fixture needed.
- Has required frontmatter fields:
name,description,argument-hint,user-invocable,allowed-tools - Has ≥2 phase headings
- Contains verdict keywords: PASS, WARNINGS, FAIL
- Does NOT require "May I write" language (read-only; write is optional flagging report)
- Has a next-step handoff (what to do after findings are reviewed)
Director Gate Checks
None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate is a separate skill invocation and is NOT triggered here.
Test Cases
Case 1: Happy Path — Tests follow all standards
Fixture:
tests/unit/combat/health_system_take_damage_test.gdexists with:- Naming:
test_health_system_take_damage_reduces_health()(followstest_[system]_[scenario]_[expected]) - Arrange/Act/Assert structure present
- No
sleep(),awaitwith time values, or random seeds - No calls to external APIs or file I/O
- No inline magic numbers (uses constants from
tests/unit/combat/fixtures/)
- Naming:
Input: /test-evidence-review tests/unit/combat/
Expected behavior:
- Skill reads test standards from
coding-standards.md - Skill reads the test file; checks all 5 standards
- All checks pass: naming, structure, determinism, isolation, no hardcoded data
- Verdict is PASS
Assertions:
- Each of the 5 test standards is checked and reported
- All checks show PASS when standards are met
- Verdict is PASS
- No files are written
Case 2: Fail — Timing dependency detected
Fixture:
tests/unit/ui/hud_update_test.gdcontains:await get_tree().create_timer(1.0).timeout assert_eq(label.text, "Ready")- Real-time wait of 1 second used instead of mock or signal-based assertion
Input: /test-evidence-review tests/unit/ui/hud_update_test.gd
Expected behavior:
- Skill reads the test file
- Skill detects real-time wait (
create_timer(1.0)) — non-deterministic timing dependency - Skill flags this as a FAIL-level finding
- Verdict is FAIL
- Skill recommends replacing the timer with a signal-based assertion or mock
Assertions:
- Real-time wait usage is detected as a non-deterministic timing dependency
- Finding is classified as FAIL severity (blocking — violates determinism standard)
- Verdict is FAIL
- Remediation suggestion references signal-based or mock-based approach
- Skill does not edit the test file
Case 3: Fail — Test calls external API directly
Fixture:
tests/unit/networking/auth_test.gdcontains:var result = HTTPRequest.new().request("https://api.example.com/auth")- Direct HTTP call to external API without a mock
Input: /test-evidence-review tests/unit/networking/auth_test.gd
Expected behavior:
- Skill reads the test file
- Skill detects direct external API call (HTTPRequest to live URL)
- Skill flags this as a FAIL-level finding — violates isolation standard
- Verdict is FAIL
- Skill recommends injecting a mock HTTP client
Assertions:
- Direct external API call is detected and flagged
- Finding is classified as FAIL severity (violates isolation standard)
- Verdict is FAIL
- Remediation references dependency injection with a mock HTTP client
- Skill does not modify the test file
Case 4: Edge Case — No Test Files Found
Fixture:
- User calls
/test-evidence-review tests/unit/audio/ tests/unit/audio/directory does not exist
Input: /test-evidence-review tests/unit/audio/
Expected behavior:
- Skill attempts to read files in
tests/unit/audio/— not found - Skill outputs: "No test files found at
tests/unit/audio/— run/test-setupto scaffold test directories" - No verdict is emitted
Assertions:
- Skill does not crash when path does not exist
- Output names the attempted path in the message
- Output recommends
/test-setupfor scaffolding - No verdict is emitted when there is nothing to review
Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill
Fixture:
- Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
review-mode.txtcontainsfull
Input: /test-evidence-review tests/unit/combat/
Expected behavior:
- Skill reviews tests; finds 1 WARNINGS-level finding
- No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
- Verdict is WARNINGS
- Output notes: "For full test coverage gate, run
/gate-checkwhich invokes QL-TEST-COVERAGE" - Skill offers optional report write; asks "May I write" if user opts in
Assertions:
- No director gate is invoked in any review mode
- Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
- Optional report requires "May I write" before writing
- Verdict is WARNINGS for advisory-level test quality issues
Protocol Compliance
- Reads
coding-standards.mdtest standards before reviewing test files - Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
- Does not edit any test files (read-only skill)
- No director gates are invoked
- Verdict is one of: PASS, WARNINGS, FAIL
Coverage Notes
- Batch review of all test files in
tests/is not explicitly tested; behavior is assumed to apply the same checks file by file and aggregate the verdict. - The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is a separate concern and is intentionally NOT invoked by this skill.