# Skill Test Spec: /test-evidence-review

## Skill Summary

`/test-evidence-review` performs a quality review of test files in `tests/`,
checking test naming conventions, determinism, isolation, and absence of
hardcoded magic numbers — all against the project's test standards defined in
`coding-standards.md`. Findings may be flagged for qa-lead review. No director
gates are invoked. The skill does not write without user approval. Verdicts:
PASS, WARNINGS, or FAIL.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings
- [ ] Contains verdict keywords: PASS, WARNINGS, FAIL
- [ ] Does NOT require "May I write" language (read-only; write is optional flagging report)
- [ ] Has a next-step handoff (what to do after findings are reviewed)

---

## Director Gate Checks

None. Test evidence review is an advisory quality skill; QL-TEST-COVERAGE gate
is a separate skill invocation and is NOT triggered here.

---

## Test Cases

### Case 1: Happy Path — Tests follow all standards

**Fixture:**
- `tests/unit/combat/health_system_take_damage_test.gd` exists with:
  - Naming: `test_health_system_take_damage_reduces_health()` (follows `test_[system]_[scenario]_[expected]`)
  - Arrange/Act/Assert structure present
  - No `sleep()`, `await` with time values, or random seeds
  - No calls to external APIs or file I/O
  - No inline magic numbers (uses constants from `tests/unit/combat/fixtures/`)

**Input:** `/test-evidence-review tests/unit/combat/`

**Expected behavior:**
1. Skill reads test standards from `coding-standards.md`
2. Skill reads the test file; checks all 5 standards
3. All checks pass: naming, structure, determinism, isolation, no hardcoded data
4. Verdict is PASS

**Assertions:**
- [ ] Each of the 5 test standards is checked and reported
- [ ] All checks show PASS when standards are met
- [ ] Verdict is PASS
- [ ] No files are written

---

### Case 2: Fail — Timing dependency detected

**Fixture:**
- `tests/unit/ui/hud_update_test.gd` contains:
  ```gdscript
  await get_tree().create_timer(1.0).timeout
  assert_eq(label.text, "Ready")
  ```
- Real-time wait of 1 second used instead of mock or signal-based assertion

**Input:** `/test-evidence-review tests/unit/ui/hud_update_test.gd`

**Expected behavior:**
1. Skill reads the test file
2. Skill detects real-time wait (`create_timer(1.0)`) — non-deterministic timing dependency
3. Skill flags this as a FAIL-level finding
4. Verdict is FAIL
5. Skill recommends replacing the timer with a signal-based assertion or mock

**Assertions:**
- [ ] Real-time wait usage is detected as a non-deterministic timing dependency
- [ ] Finding is classified as FAIL severity (blocking — violates determinism standard)
- [ ] Verdict is FAIL
- [ ] Remediation suggestion references signal-based or mock-based approach
- [ ] Skill does not edit the test file

---

### Case 3: Fail — Test calls external API directly

**Fixture:**
- `tests/unit/networking/auth_test.gd` contains:
  ```gdscript
  var result = HTTPRequest.new().request("https://api.example.com/auth")
  ```
- Direct HTTP call to external API without a mock

**Input:** `/test-evidence-review tests/unit/networking/auth_test.gd`

**Expected behavior:**
1. Skill reads the test file
2. Skill detects direct external API call (HTTPRequest to live URL)
3. Skill flags this as a FAIL-level finding — violates isolation standard
4. Verdict is FAIL
5. Skill recommends injecting a mock HTTP client

**Assertions:**
- [ ] Direct external API call is detected and flagged
- [ ] Finding is classified as FAIL severity (violates isolation standard)
- [ ] Verdict is FAIL
- [ ] Remediation references dependency injection with a mock HTTP client
- [ ] Skill does not modify the test file

---

### Case 4: Edge Case — No Test Files Found

**Fixture:**
- User calls `/test-evidence-review tests/unit/audio/`
- `tests/unit/audio/` directory does not exist

**Input:** `/test-evidence-review tests/unit/audio/`

**Expected behavior:**
1. Skill attempts to read files in `tests/unit/audio/` — not found
2. Skill outputs: "No test files found at `tests/unit/audio/` — run `/test-setup` to scaffold test directories"
3. No verdict is emitted

**Assertions:**
- [ ] Skill does not crash when path does not exist
- [ ] Output names the attempted path in the message
- [ ] Output recommends `/test-setup` for scaffolding
- [ ] No verdict is emitted when there is nothing to review

---

### Case 5: Gate Compliance — No gate; QL-TEST-COVERAGE is a separate skill

**Fixture:**
- Test file has 1 WARNINGS-level finding (magic number in a non-boundary test)
- `review-mode.txt` contains `full`

**Input:** `/test-evidence-review tests/unit/combat/`

**Expected behavior:**
1. Skill reviews tests; finds 1 WARNINGS-level finding
2. No director gate is invoked (QL-TEST-COVERAGE is invoked separately, not here)
3. Verdict is WARNINGS
4. Output notes: "For full test coverage gate, run `/gate-check` which invokes QL-TEST-COVERAGE"
5. Skill offers optional report write; asks "May I write" if user opts in

**Assertions:**
- [ ] No director gate is invoked in any review mode
- [ ] Output distinguishes this skill from the QL-TEST-COVERAGE gate invocation
- [ ] Optional report requires "May I write" before writing
- [ ] Verdict is WARNINGS for advisory-level test quality issues

---

## Protocol Compliance

- [ ] Reads `coding-standards.md` test standards before reviewing test files
- [ ] Checks naming, Arrange/Act/Assert structure, determinism, isolation, no hardcoded data
- [ ] Does not edit any test files (read-only skill)
- [ ] No director gates are invoked
- [ ] Verdict is one of: PASS, WARNINGS, FAIL

---

## Coverage Notes

- Batch review of all test files in `tests/` is not explicitly tested; behavior
  is assumed to apply the same checks file by file and aggregate the verdict.
- The QL-TEST-COVERAGE director gate (which checks test coverage percentage) is
  a separate concern and is intentionally NOT invoked by this skill.