5.9 KiB
Agent Test Spec: economy-designer
Agent Summary
- Domain: Resource economy design, loot table design, progression curves (XP, level, unlock), in-game market and shop design, economic balance analysis, sink and faucet mechanics, inflation/deflation risk assessment
- Does NOT own: Live ops event scheduling and structure (live-ops-designer), code implementation, analytics tracking design (analytics-engineer), narrative justification for economy systems (writer)
- Model tier: Sonnet
- Gate IDs: None; escalates economy-breaking design conflicts to creative-director or producer
Static Assertions (Structural)
description:field is present and domain-specific (references economy, loot tables, progression curves, balance)allowed-tools:list matches the agent's role (Read/Write for design/balance/ documents; no code or analytics tools)- Model tier is Sonnet (default for design specialists)
- Agent definition does not claim authority over live ops scheduling, code, or narrative
Test Cases
Case 1: In-domain request — loot table design for a chest
Input: "Design the loot table for a standard treasure chest in our dungeon game." Expected behavior:
- Produces a probability table with distinct rarity tiers: Common, Uncommon, Rare, Epic, Legendary (or project-equivalent tiers)
- Each tier has: probability percentage, example item categories, and expected gold equivalent value range
- Probabilities sum to 100%
- Includes a brief rationale for each tier's probability: why Common is set at its value, why Legendary is set at its value
- Does NOT produce a single flat list of items — uses tiered probability structure to reflect meaningful rarity
Case 2: Out-of-domain request — seasonal event schedule
Input: "Design the schedule for our summer event and fall event. When should they run and how long should each last?" Expected behavior:
- Does not produce an event schedule or content cadence plan
- States clearly: "Live ops event scheduling is owned by live-ops-designer; I design the economic structure of rewards within events once the event schedule is defined"
- Offers to produce the reward value design for events once live-ops-designer defines the structure
Case 3: Domain boundary — inflation risk from new currency
Input: "We're adding a new 'Prestige Coins' currency earned by completing all seasonal content. Players can spend them in a Prestige Shop." Expected behavior:
- Identifies the inflation risk: if Prestige Coins accumulate faster than the shop provides sinks, the shop loses perceived value and players hoard coins without spending
- Flags the specific risk: seasonal content completion is a finite faucet, but if the shop catalog is exhausted before the season ends, late-season coins have no value
- Proposes a sink mechanic: rotating limited-time shop items, consumable items in the Prestige Shop, or a currency conversion option to keep coins draining
- Does NOT approve the design as economically sound without addressing the sink question
- Produces a structured risk assessment: faucet rate (estimated coins/week), sink capacity (estimated coins required to exhaust catalog), surplus projection
Case 4: Mid-game progression curve issue
Input: "Players are reporting the mid-game XP grind (levels 20-35) feels like a wall. They need 3x more XP per level but rewards don't increase proportionally." Expected behavior:
- Identifies this as a progression curve problem: the XP cost growth rate outpaces the reward growth rate
- Produces a revised XP formula or curve adjustment: either reduce the XP cost multiplier for levels 20-35, increase reward XP in that range, or introduce a catch-up mechanic (bonus XP for completing content significantly below the player's level)
- Shows the math: current curve vs. proposed curve, with specific numbers for levels 20, 25, 30, 35
- Flags that any curve change affects time-to-level-cap projections — notes the downstream impact on end-game content pacing
Case 5: Context pass — balance analysis using current economy data
Input context: Current economy data: average player earns 450 Gold/hour, average shop item costs 2,000 Gold, average session length is 40 minutes. Premium items cost 5,000 Gold. Input: "Is our current Gold economy healthy? Should we adjust prices or earn rates?" Expected behavior:
- Uses the specific numbers provided: 450 Gold/hour = 300 Gold/40-min session; 2,000 Gold item requires ~4.4 sessions to afford; 5,000 Gold premium item requires ~11 sessions
- Evaluates whether these ratios feel rewarding or frustrating based on economy design principles
- Produces a concrete recommendation using the actual numbers: e.g., "At current earn rates, premium items take ~7.3 hours of play to afford — this is at the high end of acceptable; consider either increasing earn rate to 550 Gold/hour or reducing premium item cost to 4,000 Gold"
- Does NOT produce generic advice ("prices may be too high") without anchoring to the provided data
Protocol Compliance
- Stays within declared domain (loot tables, progression curves, resource economy, inflation/deflation analysis)
- Redirects live ops scheduling requests to live-ops-designer without producing schedules
- Flags inflation/deflation risks proactively with quantified sink/faucet analysis
- Produces explicit math for progression curves — no vague curve adjustments without numbers
- Uses actual economy data from context; does not produce generic benchmarks when specifics are provided
Coverage Notes
- Case 3 (inflation risk) is an economic health test — missed inflation risks cause long-term economy damage in live games
- Case 4 requires the agent to produce actual numbers, not curve shapes — verify math is present, not just a narrative
- Case 5 is the most important context-awareness test; agent must use provided data, not placeholder values
- No automated runner; review manually or via
/skill-test