Add claude code game studios to the project
@@ -0,0 +1,80 @@
# Agent Test Spec: ue-blueprint-specialist

## Agent Summary

- **Domain**: Blueprint architecture, the Blueprint/C++ boundary, Blueprint graph quality, Blueprint performance optimization, Blueprint Function Library design
- **Does NOT own**: C++ implementation (engine-programmer or gameplay-programmer), art assets or shaders, UI/UX flow design (ux-designer)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers to unreal-specialist or lead-programmer for cross-domain rulings

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references Blueprint architecture and optimization)
- [ ] `allowed-tools:` list matches the agent's role (Read for Blueprint project files; no server or deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over C++ implementation decisions

---

## Test Cases

### Case 1: In-domain request — Blueprint graph performance review

**Input**: "Review our AI behavior Blueprint. It has tick-based logic running every frame that checks line-of-sight for 30 NPCs simultaneously."

**Expected behavior**:

- Identifies tick-heavy logic as a performance problem
- Recommends switching from EventTick to event-driven patterns (perception system events, timers, or polling on a reduced interval)
- Flags the per-NPC cost of simultaneous line-of-sight checks
- Suggests alternatives: AIPerception component events, staggered tick groups, or moving the system to C++ if Blueprint overhead is measured to be significant
- Output is structured: problem identified, impact estimated, alternatives listed
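
The staggered alternative in Case 1 can be pictured as spreading the 30 checks across several frames instead of running all of them on every tick. The following is a minimal, engine-independent sketch in standard C++; the function name and the bucket scheme are hypothetical and are not part of any Unreal API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the indices of the NPCs whose line-of-sight
// check should run on the given frame when checks are spread over
// `bucketCount` frames (bucketCount = 1 reproduces the every-frame case).
std::vector<std::size_t> NpcsToCheckThisFrame(std::size_t npcCount,
                                              std::size_t bucketCount,
                                              std::size_t frameIndex) {
    std::vector<std::size_t> due;
    if (bucketCount == 0) {
        return due;  // Nothing is scheduled with an invalid bucket count.
    }
    const std::size_t bucket = frameIndex % bucketCount;
    for (std::size_t npc = 0; npc < npcCount; ++npc) {
        if (npc % bucketCount == bucket) {
            due.push_back(npc);
        }
    }
    return due;
}
```

With 30 NPCs and 10 buckets, each frame runs 3 checks and every NPC is re-checked once per 10 frames. Whether that latency is acceptable is a design question the agent's answer should surface.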

### Case 2: Out-of-domain request — C++ implementation

**Input**: "Write the C++ implementation for this ability cooldown system."

**Expected behavior**:

- Does not produce C++ implementation code
- Provides the Blueprint equivalent of the cooldown logic (e.g., using a Timeline or GameplayEffect if GAS is in use)
- States clearly: "C++ implementation is handled by engine-programmer or gameplay-programmer; I can show the Blueprint approach or describe the boundary where Blueprint calls into C++"
- Optionally notes when the cooldown complexity warrants a C++ backend

### Case 3: Domain boundary — unsafe raw pointer access in Blueprint

**Input**: "Our Blueprint calls GetOwner() and then immediately accesses a component on the result without checking if it's valid."

**Expected behavior**:

- Flags this as a runtime crash risk: GetOwner() can return null in some lifecycle states
- Provides the correct Blueprint pattern: IsValid() node before any property/component access
- Notes that Blueprint's null checks are not optional on Actor-derived references
- Does NOT silently fix the code without explaining why the original was unsafe

### Case 4: Blueprint graph complexity — readiness for Function Library refactor

**Input**: "Our main GameMode Blueprint has 600+ nodes in a single graph with duplicated damage calculation logic in 8 places."

**Expected behavior**:

- Diagnoses this as a maintainability and testability problem
- Recommends extracting duplicated logic into a Blueprint Function Library (BFL)
- Describes how to structure the BFL: pure functions for calculations, static calls from any Blueprint
- Notes that if the damage logic is performance-sensitive or shared with C++, it may be a candidate for migration to C++, subject to unreal-specialist review
- Output is a concrete refactor plan, not a vague recommendation
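
The "pure functions, static calls" structure expected in Case 4 can be sketched outside the engine as a namespace of stateless functions. This is only an illustration in standard C++: the `DamageLibrary` name and the damage formula are invented here, since the spec does not state the project's actual calculation.

```cpp
#include <algorithm>

// Hypothetical stand-in for a Blueprint Function Library: a namespace of
// stateless functions. The damage formula is invented for illustration.
namespace DamageLibrary {

// Pure function: the result depends only on the arguments, no hidden state.
inline float ComputeDamage(float baseDamage, float attackMultiplier, float targetArmor) {
    const float raw = baseDamage * attackMultiplier - targetArmor;
    return std::max(0.0f, raw);  // Damage never goes negative.
}

}  // namespace DamageLibrary
```

Each of the 8 duplicated call sites would then call the single function, so a formula change happens in one place.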

### Case 5: Context pass — Blueprint complexity budget

**Input context**: Project conventions specify a maximum of 100 nodes per Blueprint event graph before a mandatory Function Library extraction.

**Input**: "Here is our inventory Blueprint graph [150 nodes shown]. Is it ready to ship?"

**Expected behavior**:

- References the stated 150-node count against the 100-node budget from project conventions
- Flags the graph as exceeding the complexity threshold
- Does NOT approve it as-is
- Produces a list of candidate subgraphs for Function Library extraction to bring the main graph within budget

---

## Protocol Compliance

- [ ] Stays within declared domain (Blueprint architecture, performance, graph quality)
- [ ] Redirects C++ implementation requests to engine-programmer or gameplay-programmer
- [ ] Returns structured findings (problem/impact/alternatives format) rather than freeform opinions
- [ ] Enforces Blueprint safety patterns (null checks, IsValid) proactively
- [ ] References project conventions when evaluating graph complexity

---

## Coverage Notes

- Case 3 (null pointer safety) is a safety-critical test — this is a common source of shipping crashes
- Case 5 requires that project conventions include a stated node budget; if none is configured, the agent should note the absence and recommend setting one
- No automated runner; review manually or via `/skill-test`
@@ -0,0 +1,81 @@
# Agent Test Spec: ue-gas-specialist

## Agent Summary

- **Domain**: Gameplay Ability System (GAS) — abilities (UGameplayAbility), gameplay effects (UGameplayEffect), attribute sets (UAttributeSet), gameplay tags, ability tasks (UAbilityTask), ability specs (FGameplayAbilitySpec), GAS prediction and latency compensation
- **Does NOT own**: UI display of ability state (ue-umg-specialist), net replication of GAS data beyond built-in GAS prediction (ue-replication-specialist), art or VFX for ability feedback (vfx-artist)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers cross-domain calls to the appropriate specialist

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references GAS, abilities, GameplayEffects, AttributeSets)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for GAS source files; no deployment or server tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over UI implementation or low-level net serialization

---

## Test Cases

### Case 1: In-domain request — dash ability with cooldown

**Input**: "Implement a dash ability that moves the player forward 500 units and has a 1.5 second cooldown."

**Expected behavior**:

- Produces a GAS AbilitySpec structure or outline: UGameplayAbility subclass with ActivateAbility logic, an AbilityTask for movement (e.g., AbilityTask_ApplyRootMotionMoveToForce or custom root motion), and a UGameplayEffect for the cooldown
- Cooldown GameplayEffect uses the Has Duration policy with a 1.5s duration and grants a GameplayTag that blocks re-activation
- Tags are clearly named following a hierarchy convention (e.g., Ability.Dash, Cooldown.Ability.Dash)
- Output includes both the ability class outline and the GameplayEffect definition
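
The tag-gated cooldown described in Case 1 can be modelled in a few lines of standard C++. This is a hypothetical sketch of the idea only (a tag that blocks activation until a timestamp passes); `CooldownTracker` is an invented name and does not reproduce the GAS API.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical model of tag-based cooldown gating (not the GAS API).
class CooldownTracker {
public:
    // Records that `cooldownTag` is active until now + duration.
    void Apply(const std::string& cooldownTag, double nowSeconds, double durationSeconds) {
        expiry_[cooldownTag] = nowSeconds + durationSeconds;
    }

    // An ability may activate only when its cooldown tag is absent or expired.
    bool CanActivate(const std::string& cooldownTag, double nowSeconds) const {
        const auto it = expiry_.find(cooldownTag);
        return it == expiry_.end() || nowSeconds >= it->second;
    }

private:
    std::unordered_map<std::string, double> expiry_;
};
```

Applying `Cooldown.Ability.Dash` at t = 10.0 with a 1.5 s duration blocks activation until t = 11.5.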

### Case 2: Out-of-domain request — GAS state replication

**Input**: "How do I replicate the player's ability cooldown state to all clients so the UI updates correctly?"

**Expected behavior**:

- Clarifies that GAS has built-in replication for AbilitySpecs and GameplayEffects via the AbilitySystemComponent's replication mode
- Explains the three ASC replication modes (Full, Mixed, Minimal) and when to use each
- For custom replication needs beyond GAS built-ins, explicitly states: "For custom net serialization of GAS data, coordinate with ue-replication-specialist"
- Does NOT attempt to write custom replication code outside GAS's own systems without flagging the domain boundary

### Case 3: Domain boundary — incorrect GameplayTag hierarchy

**Input**: "We have an ability that applies a tag called 'Stunned' and another that checks for 'Status.Stunned'. They're not matching."

**Expected behavior**:

- Identifies the root cause: tag names must be exact or use hierarchical matching via TagContainer queries
- Flags the naming inconsistency: 'Stunned' is a root-level tag; 'Status.Stunned' is a child tag under 'Status' — these are different tags
- Recommends a project tag naming convention: all status effects under Status.*, all abilities under Ability.*
- Provides the fix: either rename the applied tag to 'Status.Stunned' or update the query to match 'Stunned'
- Notes where tag definitions should live (DefaultGameplayTags.ini or a DataTable)
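
Why 'Stunned' and 'Status.Stunned' never match can be shown with a small model of hierarchical matching. The sketch below is standard C++ written for this spec and only loosely modelled on how gameplay tag matching behaves; `MatchesTag` here is a free function on strings, not the engine's implementation.

```cpp
#include <string>

// Hypothetical model of hierarchical tag matching: `tag` matches `query`
// when it is the same tag or a dot-separated descendant of it.
bool MatchesTag(const std::string& tag, const std::string& query) {
    if (tag == query) {
        return true;  // Exact match.
    }
    // Descendant match: tag must start with "<query>.".
    return tag.size() > query.size() &&
           tag.compare(0, query.size(), query) == 0 &&
           tag[query.size()] == '.';
}
```

Under this model, `Status.Stunned` matches a query for `Status`, but the root-level `Stunned` matches neither `Status` nor `Status.Stunned`, which is the mismatch reported in the input.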

### Case 4: Conflict — attribute set conflict between two abilities

**Input**: "Our Shield ability and our Armor ability both modify a 'DefenseValue' attribute. They're stacking in ways that aren't intended — after both are active, defense goes well above maximum."

**Expected behavior**:

- Identifies this as a GameplayEffect stacking and magnitude calculation problem
- Proposes a resolution using Execution Calculations (UGameplayEffectExecutionCalculation) or Modifier Aggregators to cap the combined result
- Alternatively recommends using GameplayEffect stacking types (None, AggregateBySource, AggregateByTarget) to prevent unintended additive stacking
- Produces a concrete resolution: either an Execution Calculation class outline or a change to the Modifier Op (Override instead of Additive for the cap)
- Does NOT propose removing one of the abilities as the solution
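
The cap that Case 4 asks for reduces to clamping the combined value. A minimal sketch in standard C++, assuming both bonuses are additive; the function and parameter names are hypothetical, and in a real GAS project the clamp would sit in whichever calculation or attribute-set hook the project chooses.

```cpp
#include <algorithm>

// Hypothetical attribute model: two additive bonuses applied to a base
// value, with the combined result kept between 0 and a maximum.
float CombinedDefense(float baseDefense, float shieldBonus, float armorBonus,
                      float maxDefense) {
    const float uncapped = baseDefense + shieldBonus + armorBonus;
    return std::max(0.0f, std::min(uncapped, maxDefense));
}
```

With a base of 50, two bonuses of 40, and a maximum of 100, the result is 100 rather than 130.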

### Case 5: Context pass — designing against an existing attribute set

**Input context**: Project has an existing AttributeSet with attributes: Health, MaxHealth, Stamina, MaxStamina, Defense, AttackPower.

**Input**: "Design a Berserker ability that increases AttackPower by 50% when Health drops below 30%."

**Expected behavior**:

- Uses the existing Health, MaxHealth, and AttackPower attributes — does NOT invent new attributes
- Designs a Passive GameplayAbility (or triggered Effect) that fires on Health change, checks Health/MaxHealth ratio via a GameplayEffectExecutionCalculation or Attribute-Based magnitude
- Uses a Gameplay Tag to track the Berserker active state (a Gameplay Cue may drive the cosmetic feedback)
- References the actual attribute names from the provided AttributeSet (AttackPower, not "Damage" or "Strength")
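
The Berserker rule in Case 5 is a threshold on the Health/MaxHealth ratio followed by a 50% multiplier. The sketch below expresses that arithmetic in standard C++ using the attribute names from the input context; the struct and function names are hypothetical and stand in for whatever GAS construct the agent proposes.

```cpp
// Hypothetical attribute snapshot using names from the provided AttributeSet.
struct Attributes {
    float Health;
    float MaxHealth;
    float AttackPower;
};

// Berserker is active while Health is below 30% of MaxHealth.
bool IsBerserkerActive(const Attributes& a) {
    return a.MaxHealth > 0.0f && (a.Health / a.MaxHealth) < 0.3f;
}

// Effective AttackPower: +50% while Berserker is active, unchanged otherwise.
float EffectiveAttackPower(const Attributes& a) {
    return IsBerserkerActive(a) ? a.AttackPower * 1.5f : a.AttackPower;
}
```

For example, at 20 of 100 Health an AttackPower of 40 becomes 60, while at 50 of 100 Health it stays 40.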

---

## Protocol Compliance

- [ ] Stays within declared domain (GAS: abilities, effects, attributes, tags, ability tasks)
- [ ] Redirects custom replication requests to ue-replication-specialist with clear explanation of boundary
- [ ] Returns structured findings (ability outline + GameplayEffect definition) rather than vague descriptions
- [ ] Enforces tag hierarchy naming conventions proactively
- [ ] Uses only attributes and tags present in the provided context; does not invent new ones without noting it

---

## Coverage Notes

- Case 3 (tag hierarchy) is a frequent source of subtle bugs; test whenever tag naming conventions change
- Case 4 requires knowledge of GAS stacking policies — verify this case if the GAS integration depth changes
- Case 5 is the most important context-awareness test; failing it means the agent ignores project state
- No automated runner; review manually or via `/skill-test`
@@ -0,0 +1,82 @@
# Agent Test Spec: ue-replication-specialist

## Agent Summary

- **Domain**: Property replication (UPROPERTY Replicated/ReplicatedUsing), RPCs (Server/Client/NetMulticast), client prediction and reconciliation, net relevancy and always-relevant settings, net serialization (FArchive/NetSerialize), bandwidth optimization and replication frequency tuning
- **Does NOT own**: Gameplay logic being replicated (gameplay-programmer), server infrastructure and hosting (devops-engineer), GAS-specific prediction (ue-gas-specialist handles GAS net prediction)
- **Model tier**: Sonnet
- **Gate IDs**: None; escalates security-relevant replication concerns to lead-programmer

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references replication, RPCs, client prediction, bandwidth)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for C++ and Blueprint source files; no infrastructure or deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over server infrastructure, game server architecture, or gameplay logic correctness

---

## Test Cases

### Case 1: In-domain request — replicated player health with client prediction

**Input**: "Set up replicated player health that clients can predict locally (e.g., when taking self-inflicted damage) and have corrected by the server."

**Expected behavior**:

- Produces a UPROPERTY(ReplicatedUsing=OnRep_Health) declaration in the appropriate Character or AttributeSet class
- Describes the OnRep_Health function: apply visual/audio feedback, reconcile predicted value with server-authoritative value
- Explains the client prediction pattern: local client applies tentative damage immediately, server authoritative value arrives via OnRep and corrects any discrepancy
- Notes that if GAS is in use, the built-in GAS prediction handles this — recommend coordinating with ue-gas-specialist
- Output is a concrete code structure (property declaration + OnRep outline), not a conceptual description only

### Case 2: Out-of-domain request — game server architecture

**Input**: "Design our game server infrastructure — how many dedicated servers we need, regional deployment, and matchmaking architecture."

**Expected behavior**:

- Does not produce server infrastructure architecture, hosting recommendations, or matchmaking design
- States clearly: "Server infrastructure and deployment architecture is owned by devops-engineer; I handle the Unreal replication layer within a running game session"
- Does not conflate in-game replication with server hosting concerns

### Case 3: Domain boundary — RPC without server authority validation

**Input**: "We have a Server RPC called ServerSpendCurrency that deducts in-game currency. The client calls it and the server just deducts without checking anything."

**Expected behavior**:

- Flags this as a critical security vulnerability: unvalidated server RPCs are exploitable by cheaters sending arbitrary RPC calls
- Provides the required fix: server-side validation before the deduct — check that the player actually has the currency, verify the transaction is valid, reject and log if not
- Uses the pattern: `if (!HasAuthority()) return;` guard plus explicit state validation before mutation
- Notes this should be reviewed by lead-programmer given the economy implications
- Does NOT produce the "fixed" code without explaining why the original was dangerous
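
The "explicit state validation before mutation" step in Case 3 can be sketched independently of the engine. The following standard C++ models only the server-side check; `Wallet`, `SpendResult`, and `TrySpendCurrency` are hypothetical names, and the authority guard and RPC plumbing from the spec are assumed to wrap this call.

```cpp
#include <cstdint>

// Hypothetical server-side wallet; names are illustrative, not an engine API.
struct Wallet {
    std::int64_t balance = 0;
};

enum class SpendResult { Ok, InvalidAmount, InsufficientFunds };

// Validates the request against server-owned state before mutating it.
// Rejected requests leave the balance untouched so the caller can log them.
SpendResult TrySpendCurrency(Wallet& wallet, std::int64_t amount) {
    if (amount <= 0) {
        return SpendResult::InvalidAmount;  // Rejects zero and negative "spends".
    }
    if (amount > wallet.balance) {
        return SpendResult::InsufficientFunds;
    }
    wallet.balance -= amount;
    return SpendResult::Ok;
}
```

The negative-amount check matters as much as the balance check: without it, a crafted request could increase the balance instead of reducing it.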

### Case 4: Bandwidth optimization — high-frequency movement replication

**Input**: "Our player movement is replicated using a Vector3 position every tick. With 32 players, we're exceeding our bandwidth budget."

**Expected behavior**:

- Identifies tick-rate replication of full-precision Vector3 as bandwidth-expensive
- Proposes quantized replication: use FVector_NetQuantize or FVector_NetQuantize100 instead of raw FVector to reduce bytes per update
- Recommends reducing replication frequency via SetNetUpdateFrequency() for non-owning clients
- Notes that Unreal's built-in Character Movement Component already has optimized movement replication — recommends using or extending it rather than rolling a custom system
- Produces a concrete bandwidth estimate comparison if possible, or explains the tradeoff
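
A bandwidth comparison of the kind Case 4 asks for is simple arithmetic once the payload size and update rate are known. The sketch below, in standard C++, shows a generic fixed-point quantization and a per-player estimate; all names are hypothetical, and the payload sizes used in the example afterwards are placeholders, not the actual wire sizes of FVector or FVector_NetQuantize.

```cpp
#include <cmath>
#include <cstdint>

// Generic fixed-point quantization: rounds a coordinate to the nearest
// 1/scale unit and stores it as an integer.
std::int32_t Quantize(float value, float scale) {
    return static_cast<std::int32_t>(std::lround(value * scale));
}

// Inverse of Quantize; the result differs from the input by at most
// half a quantization step.
float Dequantize(std::int32_t quantized, float scale) {
    return static_cast<float>(quantized) / scale;
}

// Hypothetical estimate of one player's outbound movement traffic, in bytes
// per second, ignoring packet headers and any engine-side compression.
std::uint64_t MovementBytesPerSecond(std::uint64_t bytesPerUpdate,
                                     std::uint64_t updatesPerSecond,
                                     std::uint64_t receivingClients) {
    return bytesPerUpdate * updatesPerSecond * receivingClients;
}
```

As an illustration with placeholder numbers, a 12-byte position sent 60 times per second to 31 other clients costs 22,320 bytes per second, while a 6-byte quantized position sent 20 times per second costs 3,720.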

### Case 5: Context pass — designing within a network budget

**Input context**: Project network budget is 64 KB/s per player, with 32 players = 2 MB/s total server outbound. Current movement replication already uses 40 KB/s per player.

**Input**: "We want to add real-time inventory replication so all clients can see other players' equipment changes immediately."

**Expected behavior**:

- Acknowledges the existing 40 KB/s movement cost leaves only 24 KB/s for everything else per player
- Does NOT design a naive full-inventory replication approach (would exceed budget)
- Recommends a delta-only or event-driven approach: replicate only changed slots rather than the full inventory array
- Uses a per-slot struct (e.g., a project-defined FGameplayItemSlot or equivalent) with ReplicatedUsing to trigger targeted updates
- Explicitly states the proposed approach's bandwidth estimate relative to the remaining 24 KB/s budget
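
The budget arithmetic and the delta-only idea from Case 5 can be sketched in standard C++. This is an engine-independent illustration: both function names are hypothetical, and representing a slot as a plain integer item ID is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Remaining per-player budget after existing costs, in KB/s (never negative).
double RemainingBudgetKBps(double budgetKBps, double usedKBps) {
    const double remaining = budgetKBps - usedKBps;
    return remaining > 0.0 ? remaining : 0.0;
}

// Delta pass over a fixed-size inventory: returns the indices of slots whose
// item ID changed, so only those slots need to be sent. Slots that exist only
// in `current` are treated as changed.
std::vector<std::size_t> ChangedSlots(const std::vector<int>& previous,
                                      const std::vector<int>& current) {
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < current.size(); ++i) {
        if (i >= previous.size() || previous[i] != current[i]) {
            changed.push_back(i);
        }
    }
    return changed;
}
```

With the numbers from the input context, `RemainingBudgetKBps(64.0, 40.0)` gives 24 KB/s, and an equipment swap in one slot produces a single changed index rather than a resend of the whole inventory.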

---

## Protocol Compliance

- [ ] Stays within declared domain (property replication, RPCs, client prediction, bandwidth)
- [ ] Redirects server infrastructure requests to devops-engineer without producing infrastructure design
- [ ] Flags unvalidated server RPCs as security issues and recommends lead-programmer review
- [ ] Returns structured findings (property declarations, bandwidth estimates, optimization options) not freeform advice
- [ ] Uses project-provided bandwidth budget numbers when evaluating replication design choices

---

## Coverage Notes

- Case 3 (RPC security) is a shipping-critical test — unvalidated RPCs are a common multiplayer exploit vector
- Case 5 is the most important context-awareness test; agent must use actual budget numbers, not generic advice
- Case 1 GAS branch: if GAS is configured, agent should detect it and defer to ue-gas-specialist for GAS-managed attributes
- No automated runner; review manually or via `/skill-test`
@@ -0,0 +1,79 @@
# Agent Test Spec: ue-umg-specialist

## Agent Summary

- **Domain**: UMG widget hierarchy design, data binding patterns, CommonUI input routing and action tags, widget styling (WidgetStyle assets), UI optimization (widget pooling, ListView, invalidation)
- **Does NOT own**: UX flow and screen navigation design (ux-designer), gameplay logic (gameplay-programmer), backend data sources (game code), server communication
- **Model tier**: Sonnet
- **Gate IDs**: None; defers UX flow decisions to ux-designer

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references UMG, widget hierarchy, CommonUI)
- [ ] `allowed-tools:` list matches the agent's role (Read/Write for UI assets and Blueprint files; no server or gameplay source tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority over UX flow, navigation architecture, or gameplay data logic

---

## Test Cases

### Case 1: In-domain request — inventory widget with data binding

**Input**: "Create an inventory widget that shows a grid of item slots. Each slot should display item icon, quantity, and rarity color. It needs to update when the inventory changes."

**Expected behavior**:

- Produces a UMG widget structure: a parent WBP_Inventory containing a UniformGridPanel or TileView, with a child WBP_InventorySlot widget per item
- Describes data binding approach: either Event Dispatchers on an Inventory Component triggering a refresh, or a ListView fed with UObject item data whose entry widget implements IUserObjectListEntry
- Specifies how rarity color is driven: a WidgetStyle asset or a data table lookup, not hardcoded color values
- Output includes the widget hierarchy, binding pattern, and the refresh trigger mechanism

### Case 2: Out-of-domain request — UX flow design

**Input**: "Design the full navigation flow for our inventory system — how the player opens it, transitions to character stats, and exits to the pause menu."

**Expected behavior**:

- Does not produce a navigation flow or screen transition architecture
- States clearly: "Navigation flow and screen transition design is owned by ux-designer; I can implement the UMG widget structure once the flow is defined"
- Does not make UX decisions (back button behavior, transition animations, modal vs. fullscreen) without a UX spec

### Case 3: Domain boundary — CommonUI input action mismatch

**Input**: "Our inventory widget isn't responding to the controller Back button. We're using CommonUI."

**Expected behavior**:

- Identifies the likely cause: the widget's Back input action tag does not match the project's registered CommonUI InputAction data asset
- Explains the CommonUI input routing model: widgets declare input actions via `CommonUI_InputAction` tags; the CommonActivatableWidget handles routing
- Provides the fix: verify that the widget's Back action tag matches the registered tag in the project's CommonUI input action data table
- Distinguishes this from a hardware input binding issue (which would be Enhanced Input territory)

### Case 4: Widget performance issue — many widget instances per frame

**Input**: "Our leaderboard widget creates 500 individual WBP_LeaderboardRow instances at once. The game hitches for 300ms when opening the leaderboard."

**Expected behavior**:

- Identifies the root cause: 500 widget instantiations in a single frame causes a construction hitch
- Recommends switching to ListView or TileView with virtualization — only visible rows are constructed
- Explains the IUserObjectListEntry interface requirement for ListView entry widgets
- If ListView is not appropriate, recommends pooling: pre-instantiate a fixed number of rows and recycle them with new data
- Output is a concrete recommendation with the specific UMG component to use, not a vague "optimize it"
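
The benefit of virtualization in Case 4 comes from constructing only the rows that intersect the viewport. The sketch below computes that range in standard C++ for fixed-height rows; `RowRange` and `VisibleRows` are hypothetical names, and the pixel values in the example afterwards are illustrative rather than taken from the project.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Half-open range [first, last) of row indices that intersect the viewport.
struct RowRange {
    std::size_t first;
    std::size_t last;
};

// Hypothetical virtualization helper: given a scroll offset and viewport
// height in pixels, computes which fixed-height rows need a live widget.
RowRange VisibleRows(std::size_t totalRows, double rowHeight,
                     double scrollOffset, double viewportHeight) {
    if (totalRows == 0 || rowHeight <= 0.0 || viewportHeight <= 0.0) {
        return {0, 0};
    }
    const double total = static_cast<double>(totalRows);
    const double top = std::max(0.0, scrollOffset);
    // Clamp in floating point first so the integer casts stay in range.
    const double first = std::min(total, std::floor(top / rowHeight));
    // Round up so a partially visible last row is still included.
    const double last = std::min(total, std::ceil((top + viewportHeight) / rowHeight));
    return {static_cast<std::size_t>(first), static_cast<std::size_t>(last)};
}
```

For 500 rows of 40 px in a 600 px viewport, at most 16 rows are live at any scroll position, instead of all 500 being constructed in one frame.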

### Case 5: Context pass — CommonUI setup already configured

**Input context**: Project uses CommonUI with the following registered InputAction tags: UI.Action.Confirm, UI.Action.Back, UI.Action.Pause, UI.Action.Secondary.

**Input**: "Add a 'Sort Inventory' button to the inventory widget that works with CommonUI."

**Expected behavior**:

- Uses UI.Action.Secondary (or recommends registering a new tag like UI.Action.Sort if Secondary is already allocated)
- Does NOT invent a new InputAction tag without noting that it must be registered in the CommonUI data table
- Does NOT use a non-CommonUI input binding approach (e.g., raw key press in Event Graph) when CommonUI is the established pattern
- References the provided tag list explicitly in the recommendation

---

## Protocol Compliance

- [ ] Stays within declared domain (UMG structure, data binding, CommonUI, widget performance)
- [ ] Redirects UX flow and navigation design requests to ux-designer
- [ ] Returns structured findings (widget hierarchy + binding pattern) rather than freeform opinions
- [ ] Uses existing CommonUI InputAction tags from context; does not invent new ones without flagging registration requirement
- [ ] Recommends virtualized lists (ListView/TileView) before widget pooling for large collections

---

## Coverage Notes

- Case 3 (CommonUI input routing) requires project to have CommonUI configured; test is skipped if project does not use CommonUI
- Case 4 (performance) is a high-impact failure mode — 300ms hitches are shipping-blocking; prioritize this test case
- Case 5 is the most important context-awareness test for UI pipeline consistency
- No automated runner; review manually or via `/skill-test`
@@ -0,0 +1,80 @@
# Agent Test Spec: unreal-specialist

## Agent Summary

- **Domain**: Unreal Engine patterns and architecture — Blueprint vs C++ decisions, UE subsystems (GAS, Enhanced Input, Niagara), UE project structure, plugin integration, and engine-level configuration
- **Does NOT own**: Art style and visual direction (art-director), server infrastructure and deployment (devops-engineer), UI/UX flow design (ux-designer)
- **Model tier**: Sonnet
- **Gate IDs**: None; defers gate verdicts to technical-director

---

## Static Assertions (Structural)

- [ ] `description:` field is present and domain-specific (references Unreal Engine)
- [ ] `allowed-tools:` list matches the agent's role (Read, Write for UE project files; no deployment tools)
- [ ] Model tier is Sonnet (default for specialists)
- [ ] Agent definition does not claim authority outside its declared domain (no art, no server infra)

---

## Test Cases

### Case 1: In-domain request — Blueprint vs C++ decision criteria

**Input**: "Should I implement our combo attack system in Blueprint or C++?"

**Expected behavior**:

- Provides structured decision criteria: complexity, reuse frequency, team skill, and performance requirements
- Recommends C++ for systems called every frame or shared across 5+ ability types
- Recommends Blueprint for designer-tunable values and one-off logic
- Does NOT render a final verdict without knowing project context — asks clarifying questions if context is absent
- Output is structured (criteria table or bullet list), not a freeform opinion

### Case 2: Out-of-domain request — Unity C# code

**Input**: "Write me a C# MonoBehaviour that handles player health and fires a Unity event on death."

**Expected behavior**:

- Does not produce Unity C# code
- States clearly: "This project uses Unreal Engine; the Unity equivalent would be an Actor Component in UE C++ or a Blueprint Actor Component"
- Optionally offers to provide the UE equivalent if requested
- Does not redirect to a Unity specialist (none exists in the framework)

### Case 3: Domain boundary — UE5.4 API requirement

**Input**: "I need to use the new Motion Matching API introduced in UE5.4."

**Expected behavior**:

- Flags that UE5.4 is a specific version with potentially limited LLM training coverage
- Recommends cross-referencing official Unreal docs or the project's engine-reference directory before trusting any API suggestions
- Provides best-effort API guidance with explicit uncertainty markers (e.g., "Verify this against UE5.4 release notes")
- Does NOT silently produce stale or incorrect API signatures without a caveat

### Case 4: Conflict — Blueprint spaghetti in a core system

**Input**: "Our replication logic is entirely in a deeply nested Blueprint event graph with 300+ nodes and no functions. It's becoming unmaintainable."

**Expected behavior**:

- Identifies this as a Blueprint architecture problem, not a minor style issue
- Recommends migrating core replication logic to C++ ActorComponent or GameplayAbility system
- Notes the coordination required: changes to replication architecture must involve lead-programmer
- Does NOT unilaterally declare "migrate to C++" without surfacing the scope of the refactor to the user
- Produces a concrete migration recommendation, not a vague suggestion

### Case 5: Context pass — version-appropriate API suggestions

**Input context**: Project engine-reference file states Unreal Engine 5.3.

**Input**: "How do I set up Enhanced Input actions for a new character?"

**Expected behavior**:

- Uses UE5.3-era Enhanced Input API (InputMappingContext, UEnhancedInputComponent::BindAction)
- Does NOT reference APIs introduced after UE5.3 without flagging them as potentially unavailable
- References the project's stated engine version in its response
- Provides concrete, version-anchored code or Blueprint node names

---

## Protocol Compliance

- [ ] Stays within declared domain (Unreal patterns, Blueprint/C++, UE subsystems)
- [ ] Redirects Unity or other-engine requests without producing wrong-engine code
- [ ] Returns structured findings (criteria tables, decision trees, migration plans) rather than freeform opinions
- [ ] Flags version uncertainty explicitly before producing API suggestions
- [ ] Coordinates with lead-programmer for architecture-scale refactors rather than deciding unilaterally

---

## Coverage Notes

- No automated runner exists for agent behavior tests — these are reviewed manually or via `/skill-test`
- Version-awareness (Case 3, Case 5) is the highest-risk failure mode for this agent; test regularly when engine version changes
- Case 4 integration with lead-programmer is a coordination test, not a technical correctness test