pixelheros/CCGS Skill Testing Framework/agents/operations/economy-designer.md

# Agent Test Spec: economy-designer

## Agent Summary
- **Domain**: Resource economy design, loot table design, progression curves (XP, level, unlock), in-game market and shop design, economic balance analysis, sink and faucet mechanics, inflation/deflation risk assessment
- **Does NOT own**: Live ops event scheduling and structure (live-ops-designer), code implementation, analytics tracking design (analytics-engineer), narrative justification for economy systems (writer)
- **Model tier**: Sonnet
- **Gate IDs**: None; escalates economy-breaking design conflicts to creative-director or producer

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references economy, loot tables, progression curves, balance)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for `design/balance/` documents; no code or analytics tools)
- [ ] Model tier is Sonnet (default for design specialists)
- [ ] Agent definition does not claim authority over live ops scheduling, code, or narrative

---

## Test Cases

### Case 1: In-domain request — loot table design for a chest
**Input**: "Design the loot table for a standard treasure chest in our dungeon game."
**Expected behavior**:
- Produces a probability table with distinct rarity tiers: Common, Uncommon, Rare, Epic, Legendary (or project-equivalent tiers)
- Each tier has: probability percentage, example item categories, and expected gold equivalent value range
- Probabilities sum to 100%
- Includes a brief rationale for each tier's probability (for example, why Common and Legendary are set at their chosen values)
- Does NOT produce a single flat list of items — uses tiered probability structure to reflect meaningful rarity
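
When reviewing the agent's output for this case, the "probabilities sum to 100%" check can be done mechanically. The sketch below is illustrative only: the tier percentages and the helper names `probabilities_sum_to_100` and `roll_tier` are assumptions made for this example, not values or functions defined by the project.

```python
import random

# Hypothetical tier probabilities in percent (placeholder values, not from the spec).
LOOT_TABLE = {
    "Common": 60.0,
    "Uncommon": 25.0,
    "Rare": 10.0,
    "Epic": 4.0,
    "Legendary": 1.0,
}


def probabilities_sum_to_100(table, tolerance=1e-9):
    """Return True if the tier percentages add up to 100 within a tolerance."""
    return abs(sum(table.values()) - 100.0) <= tolerance


def roll_tier(table, rng):
    """Pick one rarity tier, weighted by its percentage."""
    tiers = list(table)
    weights = [table[tier] for tier in tiers]
    return rng.choices(tiers, weights=weights, k=1)[0]


if __name__ == "__main__":
    print("valid table:", probabilities_sum_to_100(LOOT_TABLE))
    print("sample roll:", roll_tier(LOOT_TABLE, random.Random()))
```

A table that fails the sum check, or that collapses to a single tier, would not satisfy the expected behavior above.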

### Case 2: Out-of-domain request — seasonal event schedule
**Input**: "Design the schedule for our summer event and fall event. When should they run and how long should each last?"
**Expected behavior**:
- Does not produce an event schedule or content cadence plan
- States clearly: "Live ops event scheduling is owned by live-ops-designer; I design the economic structure of rewards within events once the event schedule is defined"
- Offers to produce the reward value design for events once live-ops-designer defines the structure

### Case 3: Domain boundary — inflation risk from new currency
**Input**: "We're adding a new 'Prestige Coins' currency earned by completing all seasonal content. Players can spend them in a Prestige Shop."
**Expected behavior**:
- Identifies the inflation risk: if Prestige Coins accumulate faster than the shop provides sinks, the coins lose perceived value and players hoard them with nothing worthwhile to spend them on
- Flags the specific risk: seasonal content completion is a finite faucet, but if the shop catalog is exhausted before the season ends, late-season coins have no value
- Proposes a sink mechanic: rotating limited-time shop items, consumable items in the Prestige Shop, or a currency conversion option to keep coins draining
- Does NOT approve the design as economically sound without addressing the sink question
- Produces a structured risk assessment: faucet rate (estimated coins/week), sink capacity (estimated coins required to exhaust catalog), surplus projection
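
The structured risk assessment in this case reduces to simple faucet-versus-sink arithmetic. The following is a minimal sketch under stated assumptions: the function name `surplus_projection` and every number in the usage example are hypothetical placeholders, and a real assessment would use the project's own faucet and catalog estimates.

```python
import math


def surplus_projection(coins_per_week, season_weeks, catalog_cost):
    """Compare a finite faucet against a fixed-size sink.

    Returns (exhausted_week, surplus):
      exhausted_week: first week in which total earnings cover the whole
                      catalog, or None if the catalog is never exhausted.
      surplus:        coins earned over the season with nothing left to buy.
    """
    total_earned = coins_per_week * season_weeks
    surplus = max(0, total_earned - catalog_cost)
    if coins_per_week <= 0 or total_earned < catalog_cost:
        exhausted_week = None
    else:
        exhausted_week = math.ceil(catalog_cost / coins_per_week)
    return exhausted_week, surplus


if __name__ == "__main__":
    # Placeholder inputs: 100 coins/week, 12-week season, 800-coin catalog.
    week, surplus = surplus_projection(100, 12, 800)
    print(f"catalog exhausted in week {week}, surplus {surplus} coins")
```

A non-zero surplus is the signal that a sink (rotating items, consumables, or a conversion option) is needed before the design can be called sound.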

### Case 4: Mid-game progression curve issue
**Input**: "Players are reporting the mid-game XP grind (levels 20-35) feels like a wall. They need 3x more XP per level but rewards don't increase proportionally."
**Expected behavior**:
- Identifies this as a progression curve problem: the XP cost growth rate outpaces the reward growth rate
- Produces a revised XP formula or curve adjustment: either reduce the XP cost multiplier for levels 20-35, increase reward XP in that range, or introduce a catch-up mechanic (bonus XP for completing content significantly below the player's level)
- Shows the math: current curve vs. proposed curve, with specific numbers for levels 20, 25, 30, 35
- Flags that any curve change affects time-to-level-cap projections — notes the downstream impact on end-game content pacing
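
To make "shows the math" concrete, a response might tabulate the current and proposed curves side by side. The sketch below assumes a simple geometric XP curve; the formula, the `base`, `growth`, and `mid_growth` parameters, and the function names are all hypothetical and stand in for whatever curve the project actually uses.

```python
def xp_cost(level, base=100, growth=1.10):
    """Hypothetical current curve: XP needed for a level grows geometrically."""
    return round(base * growth ** (level - 1))


def proposed_xp_cost(level, base=100, growth=1.10,
                     mid_growth=1.05, mid_start=20, mid_end=35):
    """Same curve, but with a flatter growth rate inside the mid-game band."""
    cost = float(base)
    for lvl in range(2, level + 1):
        rate = mid_growth if mid_start <= lvl <= mid_end else growth
        cost *= rate
    return round(cost)


if __name__ == "__main__":
    print("level  current  proposed")
    for level in (20, 25, 30, 35):
        print(f"{level:>5}  {xp_cost(level):>7}  {proposed_xp_cost(level):>8}")
```

Because the flattened band lowers every later level's cost as well, the same table can be extended to the level cap to estimate the downstream effect on time-to-cap.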

### Case 5: Context pass — balance analysis using current economy data
**Input context**: Current economy data: average player earns 450 Gold/hour, average shop item costs 2,000 Gold, average session length is 40 minutes. Premium items cost 5,000 Gold.
**Input**: "Is our current Gold economy healthy? Should we adjust prices or earn rates?"
**Expected behavior**:
- Uses the specific numbers provided: 450 Gold/hour = 300 Gold per 40-minute session; a 2,000 Gold item requires ~6.7 sessions (~4.4 hours of play) to afford; a 5,000 Gold premium item requires ~16.7 sessions (~11.1 hours of play)
- Evaluates whether these ratios feel rewarding or frustrating based on economy design principles
- Produces a concrete recommendation using the actual numbers: e.g., "At current earn rates, premium items take ~11.1 hours of play to afford, which is at the high end of acceptable; consider either increasing earn rate to 550 Gold/hour or reducing premium item cost to 4,000 Gold"
- Does NOT produce generic advice ("prices may be too high") without anchoring to the provided data
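
When checking the agent's arithmetic for this case, it helps to keep per-hour and per-session figures separate, since they differ by the 40/60 session-length factor. The sketch below works through the context numbers given above; the helper names are assumptions made for illustration.

```python
def gold_per_session(gold_per_hour, session_minutes):
    """Gold earned in one session of the given length."""
    return gold_per_hour * session_minutes / 60


def sessions_to_afford(price, gold_per_hour, session_minutes):
    """Number of sessions needed to earn the item's price."""
    return price / gold_per_session(gold_per_hour, session_minutes)


def hours_to_afford(price, gold_per_hour):
    """Hours of play needed to earn the item's price."""
    return price / gold_per_hour


if __name__ == "__main__":
    # Context data from the case: 450 Gold/hour, 40-minute sessions.
    for label, price in (("shop item", 2000), ("premium item", 5000)):
        sessions = sessions_to_afford(price, 450, 40)
        hours = hours_to_afford(price, 450)
        print(f"{label}: {sessions:.1f} sessions, {hours:.1f} hours")
```

With these inputs a session yields 300 Gold, so the 2,000 Gold item takes about 6.7 sessions (4.4 hours) and the 5,000 Gold item about 16.7 sessions (11.1 hours).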

---

## Protocol Compliance

- [ ] Stays within declared domain (loot tables, progression curves, resource economy, inflation/deflation analysis)
- [ ] Redirects live ops scheduling requests to live-ops-designer without producing schedules
- [ ] Flags inflation/deflation risks proactively with quantified sink/faucet analysis
- [ ] Produces explicit math for progression curves — no vague curve adjustments without numbers
- [ ] Uses actual economy data from context; does not produce generic benchmarks when specifics are provided

---

## Coverage Notes
- Case 3 (inflation risk) is an economic health test — missed inflation risks cause long-term economy damage in live games
- Case 4 requires the agent to produce actual numbers, not curve shapes — verify math is present, not just a narrative
- Case 5 is the most important context-awareness test; agent must use provided data, not placeholder values
- No automated runner; review manually or via `/skill-test`