Files

panw a16fe4bff7 添加 claude code game studios 到项目

2026-05-15 14:52:29 +08:00

6.0 KiB

Raw Blame History

Skill Test Spec: /test-evidence-review

Skill Summary

/test-evidence-review performs a quality review of test files in tests/, checking test naming conventions, determinism, isolation, and absence of hardcoded magic numbers — all against the project's test standards defined in coding-standards.md. Findings may be flagged for qa-lead review. No director gates are invoked. The skill does not write without user approval. Verdicts: PASS, WARNINGS, or FAIL.

Static Assertions (Structural)

Verified automatically by /skill-test static — no fixture needed.

Has required frontmatter fields: name, description, argument-hint, user-invocable, allowed-tools
Has ≥2 phase headings
Contains verdict keywords: PASS, WARNINGS, FAIL
Does NOT require "May I write" language (read-only; write is optional flagging report)
Has a next-step handoff (what to do after findings are reviewed)

Director Gate Checks

None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate is a separate skill invocation and is NOT triggered here.

Test Cases

Case 1: Happy Path — Tests follow all standards

Fixture:

tests/unit/combat/health_system_take_damage_test.gd exists with:
- Naming: test_health_system_take_damage_reduces_health() (follows test_[system]_[scenario]_[expected])
- Arrange/Act/Assert structure present
- No sleep(), await with time values, or random seeds
- No calls to external APIs or file I/O
- No inline magic numbers (uses constants from tests/unit/combat/fixtures/)

Input: /test-evidence-review tests/unit/combat/

Expected behavior:

Skill reads test standards from coding-standards.md
Skill reads the test file; checks all 5 standards
All checks pass: naming, structure, determinism, isolation, no hardcoded data
Verdict is PASS

Assertions:

Each of the 5 test standards is checked and reported
All checks show PASS when standards are met
Verdict is PASS
No files are written

Case 2: Fail — Timing dependency detected

Fixture:

tests/unit/ui/hud_update_test.gd contains:

await get_tree().create_timer(1.0).timeout
assert_eq(label.text, "Ready")

Real-time wait of 1 second used instead of mock or signal-based assertion

Input: /test-evidence-review tests/unit/ui/hud_update_test.gd

Expected behavior:

Skill reads the test file
Skill detects real-time wait (create_timer(1.0)) — non-deterministic timing dependency
Skill flags this as a FAIL-level finding
Verdict is FAIL
Skill recommends replacing the timer with a signal-based assertion or mock

Assertions:

Real-time wait usage is detected as a non-deterministic timing dependency
Finding is classified as FAIL severity (blocking — violates determinism standard)
Verdict is FAIL
Remediation suggestion references signal-based or mock-based approach
Skill does not edit the test file

Case 3: Fail — Test calls external API directly

Fixture:

tests/unit/networking/auth_test.gd contains:

var result = HTTPRequest.new().request("https://api.example.com/auth")

Direct HTTP call to external API without a mock

Input: /test-evidence-review tests/unit/networking/auth_test.gd

Expected behavior:

Skill reads the test file
Skill detects direct external API call (HTTPRequest to live URL)
Skill flags this as a FAIL-level finding — violates isolation standard
Verdict is FAIL
Skill recommends injecting a mock HTTP client

Assertions:

Direct external API call is detected and flagged
Finding is classified as FAIL severity (violates isolation standard)
Verdict is FAIL
Remediation references dependency injection with a mock HTTP client
Skill does not modify the test file

Case 4: Edge Case — No Test Files Found

Fixture:

User calls /test-evidence-review tests/unit/audio/
tests/unit/audio/ directory does not exist

Input: /test-evidence-review tests/unit/audio/

Expected behavior:

Skill attempts to read files in tests/unit/audio/ — not found
Skill outputs: "No test files found at tests/unit/audio/ — run /test-setup to scaffold test directories"
No verdict is emitted

Assertions:

Skill does not crash when path does not exist
Output names the attempted path in the message
Output recommends /test-setup for scaffolding
No verdict is emitted when there is nothing to review

Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill

Fixture:

Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
review-mode.txt contains full

Input: /test-evidence-review tests/unit/combat/

Expected behavior:

Skill reviews tests; finds 1 WARNINGS-level finding
No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
Verdict is WARNINGS
Output notes: "For full test coverage gate, run /gate-check which invokes QL-TEST-COVERAGE"
Skill offers optional report write; asks "May I write" if user opts in

Assertions:

No director gate is invoked in any review mode
Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
Optional report requires "May I write" before writing
Verdict is WARNINGS for advisory-level test quality issues

Protocol Compliance

Reads coding-standards.md test standards before reviewing test files
Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
Does not edit any test files (read-only skill)
No director gates are invoked
Verdict is one of: PASS, WARNINGS, FAIL

Coverage Notes

Batch review of all test files in tests/ is not explicitly tested; behavior is assumed to apply the same checks file by file and aggregate the verdict.
The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is a separate concern and is intentionally NOT invoked by this skill.

6.0 KiB Raw Blame History

Skill Test Spec: /test-evidence-review

Skill Summary

Static Assertions (Structural)

Director Gate Checks

Test Cases

Case 1: Happy Path — Tests follow all standards

Case 2: Fail — Timing dependency detected

Case 3: Fail — Test calls external API directly

Case 4: Edge Case — No Test Files Found

Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill

Protocol Compliance

Coverage Notes

6.0 KiB

Raw Blame History