# Agent Test Spec: producer

## Agent Summary
**Domain owned:** Scope management, sprint planning validation, milestone tracking, epic prioritization, production phase gate.
**Does NOT own:** Game design decisions (creative-director / game-designer), technical architecture (technical-director), creative direction.
**Model tier:** Opus (multi-document synthesis, high-stakes phase gate verdicts).
**Gate IDs handled:** PR-SCOPE, PR-SPRINT, PR-MILESTONE, PR-EPIC, PR-PHASE-GATE.

---

## Static Assertions (Structural)

Verified by reading the agent's `.claude/agents/producer.md` frontmatter:

- [ ] `description:` field is present and domain-specific (references scope, sprint, milestone, production — not generic)
- [ ] `allowed-tools:` list is primarily read-focused; Bash only if sprint/milestone files require parsing
- [ ] Model tier is `claude-opus-4-6` per coordination-rules.md (directors with gate synthesis = Opus)
- [ ] Agent definition does not claim authority over design decisions or technical architecture

---

## Test Cases

### Case 1: In-domain request — appropriate output format
**Scenario:** A sprint plan is submitted for Sprint 7. The plan includes 12 story points across 4 team members over 2 weeks. Historical velocity from the last 3 sprints averages 11.5 points. Request is tagged PR-SPRINT.
**Expected:** Returns `PR-SPRINT: REALISTIC` with rationale noting the plan is within one standard deviation of historical velocity and capacity appears matched.
**Assertions:**
- [ ] Verdict is exactly one of REALISTIC / CONCERNS / UNREALISTIC
- [ ] Verdict token is formatted as `PR-SPRINT: REALISTIC`
- [ ] Rationale references the specific story point count and historical velocity figures
- [ ] Output stays within production scope — does not comment on whether the stories are well-designed or technically sound

### Case 2: Out-of-domain request — redirects or escalates
**Scenario:** Team member asks producer to evaluate whether the game's "weight-based inventory" mechanic feels fun and engaging.
**Expected:** Agent declines to evaluate game feel and redirects to game-designer or creative-director.
**Assertions:**
- [ ] Does not make any binding assessment of the mechanic's design quality
- [ ] Explicitly names `game-designer` or `creative-director` as the correct handler
- [ ] May note if the mechanic's scope has production implications (e.g., dependencies on other systems), but defers all design evaluation

### Case 3: Gate verdict — correct vocabulary
**Scenario:** A new feature proposal adds three new systems (crafting, weather, and faction reputation) to a milestone that was scoped for two systems only. None of these additions appear in the current milestone plan. Request is tagged PR-SCOPE.
**Expected:** Returns `PR-SCOPE: CONCERNS` with specific identification of the three unplanned systems and their absence from the milestone scope document.
**Assertions:**
- [ ] Verdict is exactly one of REALISTIC / CONCERNS / UNREALISTIC — not freeform text
- [ ] Verdict token is formatted as `PR-SCOPE: CONCERNS`
- [ ] Rationale names the three specific systems being added out of scope
- [ ] Does not evaluate whether the systems are good design — only whether they fit the plan

### Case 4: Conflict escalation — correct parent
**Scenario:** game-designer wants to add a late-breaking mechanic (dynamic weather affecting all gameplay systems) that technical-director warns will require 3 additional sprints. game-designer and technical-director are in disagreement about whether to proceed.
**Expected:** Producer does not take a side on whether the mechanic is worth adding (design decision) or feasible (technical decision). Producer quantifies the production impact (3 sprints of delay, milestone slip risk), presents the trade-off to the user, and follows coordination-rules.md conflict resolution: escalate to the shared parent (in this case, surface the conflict for user decision since creative-director and technical-director are both top-tier).
**Assertions:**
- [ ] Quantifies the production impact in concrete terms (sprint count, milestone date slip)
- [ ] Does not make a binding design or technical decision
- [ ] Surfaces the conflict to the user with the scope implications clearly stated
- [ ] References coordination-rules.md conflict resolution protocol (escalate to shared parent or user)

### Case 5: Context pass — uses provided context
**Scenario:** Agent receives a gate context block that includes the current milestone deadline (8 weeks away) and velocity data from the last 4 sprints (8, 10, 9, 11 points). A sprint plan is submitted with 14 story points.
**Expected:** Assessment uses the provided velocity data to project whether 14 points is achievable, and references the 8-week milestone window to assess whether the current sprint's scope leaves adequate buffer.
**Assertions:**
- [ ] Uses the specific velocity figures from the provided context (not generic estimates)
- [ ] References the 8-week deadline in the capacity assessment
- [ ] Calculates or estimates remaining sprint count within the milestone window
- [ ] Does not give generic scope advice disconnected from the supplied deadline and velocity data

---

## Protocol Compliance

- [ ] Returns verdicts using REALISTIC / CONCERNS / UNREALISTIC vocabulary only
- [ ] Stays within declared production domain
- [ ] Escalates design/technical conflicts by quantifying scope impact and presenting to user
- [ ] Uses gate IDs in output (e.g., `PR-SPRINT: REALISTIC`) not inline prose verdicts
- [ ] Does not make binding game design or technical architecture decisions

---

## Coverage Notes
- PR-EPIC (epic-level prioritization) is not covered — a dedicated case should be added when the /create-epics skill produces structured epic documents.
- PR-MILESTONE (milestone health review) is not covered — deferred to integration with /milestone-review skill.
- PR-PHASE-GATE (full production phase advancement) involving synthesis of multiple sub-gate results is deferred.
- Multi-sprint burn-down and velocity trend analysis are not covered here.