添加 claude code game studios 到项目

This commit is contained in:
panw
2026-05-15 14:52:29 +08:00
parent dff559462d
commit a16fe4bff7
415 changed files with 78609 additions and 0 deletions

View File

@@ -0,0 +1,80 @@
# Agent Test Spec: ue-blueprint-specialist
## Agent Summary
- **Domain**: Blueprint architecture, the Blueprint/C++ boundary, Blueprint graph quality, Blueprint performance optimization, Blueprint Function Library design
- **Does NOT own**: C++ implementation (engine-programmer or gameplay-programmer), art assets or shaders, UI/UX flow design (ux-designer)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers to unreal-specialist or lead-programmer for cross-domain rulings
---
## Static Assertions (Structural)
- [ ] `description:` field is present and domain-specific (references Blueprint architecture and optimization)
- [ ] `allowed-tools:` list matches the agent's role (Read for Blueprint project files; no server or deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over C++ implementation decisions
---
## Test Cases
### Case 1: In-domain request — Blueprint graph performance review
**Input**: "Review our AI behavior Blueprint. It has tick-based logic running every frame that checks line-of-sight for 30 NPCs simultaneously."
**Expected behavior**:
- Identifies tick-heavy logic as a performance problem
- Recommends switching from EventTick to event-driven patterns (perception system events, timers, or polling on a reduced interval)
- Flags the per-NPC cost of simultaneous line-of-sight checks
- Suggests alternatives: AIPerception component events, staggered tick groups, or moving the system to C++ if Blueprint overhead is measured to be significant
- Output is structured: problem identified, impact estimated, alternatives listed
### Case 2: Out-of-domain request — C++ implementation
**Input**: "Write the C++ implementation for this ability cooldown system."
**Expected behavior**:
- Does not produce C++ implementation code
- Provides the Blueprint equivalent of the cooldown logic (e.g., using a Timeline or GameplayEffect if GAS is in use)
- States clearly: "C++ implementation is handled by engine-programmer or gameplay-programmer; I can show the Blueprint approach or describe the boundary where Blueprint calls into C++"
- Optionally notes when the cooldown complexity warrants a C++ backend
### Case 3: Domain boundary — unsafe raw pointer access in Blueprint
**Input**: "Our Blueprint calls GetOwner() and then immediately accesses a component on the result without checking if it's valid."
**Expected behavior**:
- Flags this as a runtime crash risk: GetOwner() can return null in some lifecycle states
- Provides the correct Blueprint pattern: IsValid() node before any property/component access
- Notes that Blueprint's null checks are not optional on Actor-derived references
- Does NOT silently fix the code without explaining why the original was unsafe
### Case 4: Blueprint graph complexity — readiness for Function Library refactor
**Input**: "Our main GameMode Blueprint has 600+ nodes in a single graph with duplicated damage calculation logic in 8 places."
**Expected behavior**:
- Diagnoses this as a maintainability and testability problem
- Recommends extracting duplicated logic into a Blueprint Function Library (BFL)
- Describes how to structure the BFL: pure functions for calculations, static calls from any Blueprint
- Notes that if the damage logic is performance-sensitive or shared with C++, it may be a candidate for migration to unreal-specialist review
- Output is a concrete refactor plan, not a vague recommendation
### Case 5: Context pass — Blueprint complexity budget
**Input context**: Project conventions specify a maximum of 100 nodes per Blueprint event graph before a mandatory Function Library extraction.
**Input**: "Here is our inventory Blueprint graph [150 nodes shown]. Is it ready to ship?"
**Expected behavior**:
- References the stated 150-node count against the 100-node budget from project conventions
- Flags the graph as exceeding the complexity threshold
- Does NOT approve it as-is
- Produces a list of candidate subgraphs for Function Library extraction to bring the main graph within budget
---
## Protocol Compliance
- [ ] Stays within declared domain (Blueprint architecture, performance, graph quality)
- [ ] Redirects C++ implementation requests to engine-programmer or gameplay-programmer
- [ ] Returns structured findings (problem/impact/alternatives format) rather than freeform opinions
- [ ] Enforces Blueprint safety patterns (null checks, IsValid) proactively
- [ ] References project conventions when evaluating graph complexity
---
## Coverage Notes
- Case 3 (null pointer safety) is a safety-critical test — this is a common source of shipping crashes
- Case 5 requires that project conventions include a stated node budget; if none is configured, the agent should note the absence and recommend setting one
- No automated runner; review manually or via `/skill-test`

View File

@@ -0,0 +1,81 @@
# Agent Test Spec: ue-gas-specialist
## Agent Summary
- **Domain**: Gameplay Ability System (GAS) — abilities (UGameplayAbility), gameplay effects (UGameplayEffect), attribute sets (UAttributeSet), gameplay tags, ability tasks (UAbilityTask), ability specs (FGameplayAbilitySpec), GAS prediction and latency compensation
- **Does NOT own**: UI display of ability state (ue-umg-specialist), net replication of GAS data beyond built-in GAS prediction (ue-replication-specialist), art or VFX for ability feedback (vfx-artist)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers cross-domain calls to the appropriate specialist
---
## Static Assertions (Structural)
- [ ] `description:` field is present and domain-specific (references GAS, abilities, GameplayEffects, AttributeSets)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for GAS source files; no deployment or server tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over UI implementation or low-level net serialization
---
## Test Cases
### Case 1: In-domain request — dash ability with cooldown
**Input**: "Implement a dash ability that moves the player forward 500 units and has a 1.5 second cooldown."
**Expected behavior**:
- Produces a GAS AbilitySpec structure or outline: UGameplayAbility subclass with ActivateAbility logic, an AbilityTask for movement (e.g., AbilityTask_ApplyRootMotionMoveToForce or custom root motion), and a UGameplayEffect for the cooldown
- Cooldown GameplayEffect uses Duration policy with the 1.5s duration and a GameplayTag to block re-activation
- Tags clearly named following a hierarchy convention (e.g., Ability.Dash, Cooldown.Ability.Dash)
- Output includes both the ability class outline and the GameplayEffect definition
### Case 2: Out-of-domain request — GAS state replication
**Input**: "How do I replicate the player's ability cooldown state to all clients so the UI updates correctly?"
**Expected behavior**:
- Clarifies that GAS has built-in replication for AbilitySpecs and GameplayEffects via the AbilitySystemComponent's replication mode
- Explains the three ASC replication modes (Full, Mixed, Minimal) and when to use each
- For custom replication needs beyond GAS built-ins, explicitly states: "For custom net serialization of GAS data, coordinate with ue-replication-specialist"
- Does NOT attempt to write custom replication code outside GAS's own systems without flagging the domain boundary
### Case 3: Domain boundary — incorrect GameplayTag hierarchy
**Input**: "We have an ability that applies a tag called 'Stunned' and another that checks for 'Status.Stunned'. They're not matching."
**Expected behavior**:
- Identifies the root cause: tag names must be exact or use hierarchical matching via TagContainer queries
- Flags the naming inconsistency: 'Stunned' is a root-level tag; 'Status.Stunned' is a child tag under 'Status' — these are different tags
- Recommends a project tag naming convention: all status effects under Status.*, all abilities under Ability.*
- Provides the fix: either rename the applied tag to 'Status.Stunned' or update the query to match 'Stunned'
- Notes where tag definitions should live (DefaultGameplayTags.ini or a DataTable)
### Case 4: Conflict — attribute set conflict between two abilities
**Input**: "Our Shield ability and our Armor ability both modify a 'DefenseValue' attribute. They're stacking in ways that aren't intended — after both are active, defense goes well above maximum."
**Expected behavior**:
- Identifies this as a GameplayEffect stacking and magnitude calculation problem
- Proposes a resolution using Execution Calculations (UGameplayEffectExecutionCalculation) or Modifier Aggregators to cap the combined result
- Alternatively recommends using Gameplay Effect Stacking policies (Aggregate, None) to prevent unintended additive stacking
- Produces a concrete resolution: either an Execution Calculation class outline or a change to the Modifier Op (Override instead of Additive for the cap)
- Does NOT propose removing one of the abilities as the solution
### Case 5: Context pass — designing against an existing attribute set
**Input context**: Project has an existing AttributeSet with attributes: Health, MaxHealth, Stamina, MaxStamina, Defense, AttackPower.
**Input**: "Design a Berserker ability that increases AttackPower by 50% when Health drops below 30%."
**Expected behavior**:
- Uses the existing Health, MaxHealth, and AttackPower attributes — does NOT invent new attributes
- Designs a Passive GameplayAbility (or triggered Effect) that fires on Health change, checks Health/MaxHealth ratio via a GameplayEffectExecutionCalculation or Attribute-Based magnitude
- Uses a Gameplay Cue or Gameplay Tag to track the Berserker active state
- References the actual attribute names from the provided AttributeSet (AttackPower, not "Damage" or "Strength")
---
## Protocol Compliance
- [ ] Stays within declared domain (GAS: abilities, effects, attributes, tags, ability tasks)
- [ ] Redirects custom replication requests to ue-replication-specialist with clear explanation of boundary
- [ ] Returns structured findings (ability outline + GameplayEffect definition) rather than vague descriptions
- [ ] Enforces tag hierarchy naming conventions proactively
- [ ] Uses only attributes and tags present in the provided context; does not invent new ones without noting it
---
## Coverage Notes
- Case 3 (tag hierarchy) is a frequent source of subtle bugs; test whenever tag naming conventions change
- Case 4 requires knowledge of GAS stacking policies — verify this case if the GAS integration depth changes
- Case 5 is the most important context-awareness test; failing it means the agent ignores project state
- No automated runner; review manually or via `/skill-test`

View File

@@ -0,0 +1,82 @@
# Agent Test Spec: ue-replication-specialist
## Agent Summary
- **Domain**: Property replication (UPROPERTY Replicated/ReplicatedUsing), RPCs (Server/Client/NetMulticast), client prediction and reconciliation, net relevancy and always-relevant settings, net serialization (FArchive/NetSerialize), bandwidth optimization and replication frequency tuning
- **Does NOT own**: Gameplay logic being replicated (gameplay-programmer), server infrastructure and hosting (devops-engineer), GAS-specific prediction (ue-gas-specialist handles GAS net prediction)
- **Model tier**: Sonnet
- **Gate IDs**: None; escalates security-relevant replication concerns to lead-programmer
---
## Static Assertions (Structural)
- [ ] `description:` field is present and domain-specific (references replication, RPCs, client prediction, bandwidth)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for C++ and Blueprint source files; no infrastructure or deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over server infrastructure, game server architecture, or gameplay logic correctness
---
## Test Cases
### Case 1: In-domain request — replicated player health with client prediction
**Input**: "Set up replicated player health that clients can predict locally (e.g., when taking self-inflicted damage) and have corrected by the server."
**Expected behavior**:
- Produces a UPROPERTY(ReplicatedUsing=OnRep_Health) declaration in the appropriate Character or AttributeSet class
- Describes the OnRep_Health function: apply visual/audio feedback, reconcile predicted value with server-authoritative value
- Explains the client prediction pattern: local client applies tentative damage immediately, server authoritative value arrives via OnRep and corrects any discrepancy
- Notes that if GAS is in use, the built-in GAS prediction handles this — recommend coordinating with ue-gas-specialist
- Output is a concrete code structure (property declaration + OnRep outline), not a conceptual description only
### Case 2: Out-of-domain request — game server architecture
**Input**: "Design our game server infrastructure — how many dedicated servers we need, regional deployment, and matchmaking architecture."
**Expected behavior**:
- Does not produce server infrastructure architecture, hosting recommendations, or matchmaking design
- States clearly: "Server infrastructure and deployment architecture is owned by devops-engineer; I handle the Unreal replication layer within a running game session"
- Does not conflate in-game replication with server hosting concerns
### Case 3: Domain boundary — RPC without server authority validation
**Input**: "We have a Server RPC called ServerSpendCurrency that deducts in-game currency. The client calls it and the server just deducts without checking anything."
**Expected behavior**:
- Flags this as a critical security vulnerability: unvalidated server RPCs are exploitable by cheaters sending arbitrary RPC calls
- Provides the required fix: server-side validation before the deduct — check that the player actually has the currency, verify the transaction is valid, reject and log if not
- Uses the pattern: `if (!HasAuthority()) return;` guard plus explicit state validation before mutation
- Notes this should be reviewed by lead-programmer given the economy implications
- Does NOT produce the "fixed" code without explaining why the original was dangerous
### Case 4: Bandwidth optimization — high-frequency movement replication
**Input**: "Our player movement is replicated using a Vector3 position every tick. With 32 players, we're exceeding our bandwidth budget."
**Expected behavior**:
- Identifies tick-rate replication of full-precision Vector3 as bandwidth-expensive
- Proposes quantized replication: use FVector_NetQuantize or FVector_NetQuantize100 instead of raw FVector to reduce bytes per update
- Recommends reducing replication frequency via SetNetUpdateFrequency() for non-owning clients
- Notes that Unreal's built-in Character Movement Component already has optimized movement replication — recommends using or extending it rather than rolling a custom system
- Produces a concrete bandwidth estimate comparison if possible, or explains the tradeoff
### Case 5: Context pass — designing within a network budget
**Input context**: Project network budget is 64 KB/s per player, with 32 players = 2 MB/s total server outbound. Current movement replication already uses 40 KB/s per player.
**Input**: "We want to add real-time inventory replication so all clients can see other players' equipment changes immediately."
**Expected behavior**:
- Acknowledges the existing 40 KB/s movement cost leaves only 24 KB/s for everything else per player
- Does NOT design a naive full-inventory replication approach (would exceed budget)
- Recommends a delta-only or event-driven approach: replicate only changed slots rather than the full inventory array
- Uses FGameplayItemSlot or equivalent with ReplicatedUsing to trigger targeted updates
- Explicitly states the proposed approach's bandwidth estimate relative to the remaining 24 KB/s budget
---
## Protocol Compliance
- [ ] Stays within declared domain (property replication, RPCs, client prediction, bandwidth)
- [ ] Redirects server infrastructure requests to devops-engineer without producing infrastructure design
- [ ] Flags unvalidated server RPCs as security issues and recommends lead-programmer review
- [ ] Returns structured findings (property declarations, bandwidth estimates, optimization options) not freeform advice
- [ ] Uses project-provided bandwidth budget numbers when evaluating replication design choices
---
## Coverage Notes
- Case 3 (RPC security) is a shipping-critical test — unvalidated RPCs are a top-ten multiplayer exploit vector
- Case 5 is the most important context-awareness test; agent must use actual budget numbers, not generic advice
- Case 1 GAS branch: if GAS is configured, agent should detect it and defer to ue-gas-specialist for GAS-managed attributes
- No automated runner; review manually or via `/skill-test`

View File

@@ -0,0 +1,79 @@
# Agent Test Spec: ue-umg-specialist
## Agent Summary
- **Domain**: UMG widget hierarchy design, data binding patterns, CommonUI input routing and action tags, widget styling (WidgetStyle assets), UI optimization (widget pooling, ListView, invalidation)
- **Does NOT own**: UX flow and screen navigation design (ux-designer), gameplay logic (gameplay-programmer), backend data sources (game code), server communication
- **Model tier**: Sonnet
- **Gate IDs**: None; defers UX flow decisions to ux-designer
---
## Static Assertions (Structural)
- [ ] `description:` field is present and domain-specific (references UMG, widget hierarchy, CommonUI)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for UI assets and Blueprint files; no server or gameplay source tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over UX flow, navigation architecture, or gameplay data logic
---
## Test Cases
### Case 1: In-domain request — inventory widget with data binding
**Input**: "Create an inventory widget that shows a grid of item slots. Each slot should display item icon, quantity, and rarity color. It needs to update when the inventory changes."
**Expected behavior**:
- Produces a UMG widget structure: a parent WBP_Inventory containing a UniformGridPanel or TileView, with a child WBP_InventorySlot widget per item
- Describes data binding approach: either Event Dispatchers on an Inventory Component triggering a refresh, or a ListView with a UObject item data class implementing IUserObjectListEntry
- Specifies how rarity color is driven: a WidgetStyle asset or a data table lookup, not hardcoded color values
- Output includes the widget hierarchy, binding pattern, and the refresh trigger mechanism
### Case 2: Out-of-domain request — UX flow design
**Input**: "Design the full navigation flow for our inventory system — how the player opens it, transitions to character stats, and exits to the pause menu."
**Expected behavior**:
- Does not produce a navigation flow or screen transition architecture
- States clearly: "Navigation flow and screen transition design is owned by ux-designer; I can implement the UMG widget structure once the flow is defined"
- Does not make UX decisions (back button behavior, transition animations, modal vs. fullscreen) without a UX spec
### Case 3: Domain boundary — CommonUI input action mismatch
**Input**: "Our inventory widget isn't responding to the controller Back button. We're using CommonUI."
**Expected behavior**:
- Identifies the likely cause: the widget's Back input action tag does not match the project's registered CommonUI InputAction data asset
- Explains the CommonUI input routing model: widgets declare input actions via `CommonUI_InputAction` tags; the CommonActivatableWidget handles routing
- Provides the fix: verify that the widget's Back action tag matches the registered tag in the project's CommonUI input action data table
- Distinguishes this from a hardware input binding issue (which would be Enhanced Input territory)
### Case 4: Widget performance issue — many widget instances per frame
**Input**: "Our leaderboard widget creates 500 individual WBP_LeaderboardRow instances at once. The game hitches for 300ms when opening the leaderboard."
**Expected behavior**:
- Identifies the root cause: 500 widget instantiations in a single frame causes a construction hitch
- Recommends switching to ListView or TileView with virtualization — only visible rows are constructed
- Explains the IUserObjectListEntry interface requirement for ListView data objects
- If ListView is not appropriate, recommends pooling: pre-instantiate a fixed number of rows and recycle them with new data
- Output is a concrete recommendation with the specific UMG component to use, not a vague "optimize it"
### Case 5: Context pass — CommonUI setup already configured
**Input context**: Project uses CommonUI with the following registered InputAction tags: UI.Action.Confirm, UI.Action.Back, UI.Action.Pause, UI.Action.Secondary.
**Input**: "Add a 'Sort Inventory' button to the inventory widget that works with CommonUI."
**Expected behavior**:
- Uses UI.Action.Secondary (or recommends registering a new tag like UI.Action.Sort if Secondary is already allocated)
- Does NOT invent a new InputAction tag without noting that it must be registered in the CommonUI data table
- Does NOT use a non-CommonUI input binding approach (e.g., raw key press in Event Graph) when CommonUI is the established pattern
- References the provided tag list explicitly in the recommendation
---
## Protocol Compliance
- [ ] Stays within declared domain (UMG structure, data binding, CommonUI, widget performance)
- [ ] Redirects UX flow and navigation design requests to ux-designer
- [ ] Returns structured findings (widget hierarchy + binding pattern) rather than freeform opinions
- [ ] Uses existing CommonUI InputAction tags from context; does not invent new ones without flagging registration requirement
- [ ] Recommends virtualized lists (ListView/TileView) before widget pooling for large collections
---
## Coverage Notes
- Case 3 (CommonUI input routing) requires project to have CommonUI configured; test is skipped if project does not use CommonUI
- Case 4 (performance) is a high-impact failure mode — 300ms hitches are shipping-blocking; prioritize this test case
- Case 5 is the most important context-awareness test for UI pipeline consistency
- No automated runner; review manually or via `/skill-test`

View File

@@ -0,0 +1,80 @@
# Agent Test Spec: unreal-specialist
## Agent Summary
- **Domain**: Unreal Engine patterns and architecture — Blueprint vs C++ decisions, UE subsystems (GAS, Enhanced Input, Niagara), UE project structure, plugin integration, and engine-level configuration
- **Does NOT own**: Art style and visual direction (art-director), server infrastructure and deployment (devops-engineer), UI/UX flow design (ux-designer)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers gate verdicts to technical-director
---
## Static Assertions (Structural)
- [ ] `description:` field is present and domain-specific (references Unreal Engine)
- [ ] `allowed-tools:` list matches the agent's role (Read, Write for UE project files; no deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority outside its declared domain (no art, no server infra)
---
## Test Cases
### Case 1: In-domain request — Blueprint vs C++ decision criteria
**Input**: "Should I implement our combo attack system in Blueprint or C++?"
**Expected behavior**:
- Provides structured decision criteria: complexity, reuse frequency, team skill, and performance requirements
- Recommends C++ for systems called every frame or shared across 5+ ability types
- Recommends Blueprint for designer-tunable values and one-off logic
- Does NOT render a final verdict without knowing project context — asks clarifying questions if context is absent
- Output is structured (criteria table or bullet list), not a freeform opinion
### Case 2: Out-of-domain request — Unity C# code
**Input**: "Write me a C# MonoBehaviour that handles player health and fires a Unity event on death."
**Expected behavior**:
- Does not produce Unity C# code
- States clearly: "This project uses Unreal Engine; the Unity equivalent would be an Actor Component in UE C++ or a Blueprint Actor Component"
- Optionally offers to provide the UE equivalent if requested
- Does not redirect to a Unity specialist (none exists in the framework)
### Case 3: Domain boundary — UE5.4 API requirement
**Input**: "I need to use the new Motion Matching API introduced in UE5.4."
**Expected behavior**:
- Flags that UE5.4 is a specific version with potentially limited LLM training coverage
- Recommends cross-referencing official Unreal docs or the project's engine-reference directory before trusting any API suggestions
- Provides best-effort API guidance with explicit uncertainty markers (e.g., "Verify this against UE5.4 release notes")
- Does NOT silently produce stale or incorrect API signatures without a caveat
### Case 4: Conflict — Blueprint spaghetti in a core system
**Input**: "Our replication logic is entirely in a deeply nested Blueprint event graph with 300+ nodes and no functions. It's becoming unmaintainable."
**Expected behavior**:
- Identifies this as a Blueprint architecture problem, not a minor style issue
- Recommends migrating core replication logic to C++ ActorComponent or GameplayAbility system
- Notes the coordination required: changes to replication architecture must involve lead-programmer
- Does NOT unilaterally declare "migrate to C++" without surfacing the scope of the refactor to the user
- Produces a concrete migration recommendation, not a vague suggestion
### Case 5: Context pass — version-appropriate API suggestions
**Input context**: Project engine-reference file states Unreal Engine 5.3.
**Input**: "How do I set up Enhanced Input actions for a new character?"
**Expected behavior**:
- Uses UE5.3-era Enhanced Input API (InputMappingContext, UEnhancedInputComponent::BindAction)
- Does NOT reference APIs introduced after UE5.3 without flagging them as potentially unavailable
- References the project's stated engine version in its response
- Provides concrete, version-anchored code or Blueprint node names
---
## Protocol Compliance
- [ ] Stays within declared domain (Unreal patterns, Blueprint/C++, UE subsystems)
- [ ] Redirects Unity or other-engine requests without producing wrong-engine code
- [ ] Returns structured findings (criteria tables, decision trees, migration plans) rather than freeform opinions
- [ ] Flags version uncertainty explicitly before producing API suggestions
- [ ] Coordinates with lead-programmer for architecture-scale refactors rather than deciding unilaterally
---
## Coverage Notes
- No automated runner exists for agent behavior tests — these are reviewed manually or via `/skill-test`
- Version-awareness (Case 3, Case 5) is the highest-risk failure mode for this agent; test regularly when engine version changes
- Case 4 integration with lead-programmer is a coordination test, not a technical correctness test