pixelheros/CCGS Skill Testing Framework/skills/team/team-polish.md
2026-05-15 14:52:29 +08:00

Skill Test Spec: /team-polish

Skill Summary

Orchestrates the polish team through a six-phase pipeline: Phase 1 performance assessment (performance-analyst) → Phase 2 optimization (performance-analyst, joined by engine-programmer only when Phase 1 identifies engine-level root causes) → Phase 3 visual polish (technical-artist) and Phase 4 audio polish (sound-designer), both run in parallel with Phase 2 → Phase 5 hardening (qa-tester) → Phase 6 sign-off, in which the orchestrator collects all results and issues the verdict. AskUserQuestion is used at each phase transition. The verdict is exactly READY FOR RELEASE or NEEDS MORE WORK.
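The phase graph described above can be sketched as data. This is a hypothetical structure for illustration: the agent and phase names come from the summary, but the field names (`conditional_agents`, `parallel_with`) are assumptions, not framework API.

```python
# Hypothetical sketch of the six-phase pipeline; field names are illustrative.
PIPELINE = [
    {"phase": 1, "name": "performance assessment", "agents": ["performance-analyst"]},
    {"phase": 2, "name": "optimization", "agents": ["performance-analyst"],
     # engine-programmer joins only when Phase 1 finds engine-level root causes
     "conditional_agents": ["engine-programmer"],
     "parallel_with": [3, 4]},
    {"phase": 3, "name": "visual polish", "agents": ["technical-artist"], "parallel_with": [2, 4]},
    {"phase": 4, "name": "audio polish", "agents": ["sound-designer"], "parallel_with": [2, 3]},
    {"phase": 5, "name": "hardening", "agents": ["qa-tester"]},
    {"phase": 6, "name": "sign-off", "agents": ["orchestrator"]},
]

# The only two verdict values the skill may emit.
VERDICTS = {"READY FOR RELEASE", "NEEDS MORE WORK"}
```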


Static Assertions (Structural)

  • Has required frontmatter fields: name, description, argument-hint, user-invocable, allowed-tools
  • Has ≥2 phase headings
  • Contains verdict keywords: READY FOR RELEASE, NEEDS MORE WORK
  • Contains "File Write Protocol" section
  • File writes are delegated to sub-agents — orchestrator does not write files directly
  • Sub-agents enforce "May I write to [path]?" before any write
  • Has a next-step handoff at the end (references /release-checklist, /sprint-plan update, /gate-check)
  • Error Recovery Protocol section is present
  • AskUserQuestion is used at phase transitions before proceeding
  • Phase 3 (visual polish) and Phase 4 (audio polish) are explicitly run in parallel with Phase 2
  • engine-programmer is conditionally spawned in Phase 2 only when Phase 1 identifies engine-level root causes
  • Phase 6 sign-off compares metrics against budgets before issuing verdict
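A minimal checker for a few of these structural assertions might look like the following sketch. The frontmatter parsing and the phase-heading regex are assumptions about the skill file's layout, not part of the framework:

```python
import re

REQUIRED_FRONTMATTER = {"name", "description", "argument-hint", "user-invocable", "allowed-tools"}
VERDICT_KEYWORDS = ("READY FOR RELEASE", "NEEDS MORE WORK")

def check_static(skill_text: str) -> list[str]:
    """Return a list of failed structural assertions (empty list means pass)."""
    failures = []
    # Assume YAML-style frontmatter delimited by --- lines at the top of the file.
    m = re.match(r"---\n(.*?)\n---", skill_text, re.DOTALL)
    fields = set()
    if m:
        fields = {line.split(":", 1)[0].strip()
                  for line in m.group(1).splitlines() if ":" in line}
    for field in REQUIRED_FRONTMATTER - fields:
        failures.append(f"missing frontmatter field: {field}")
    # Assume phase headings are markdown headings containing the word "Phase".
    if len(re.findall(r"(?m)^#+ .*Phase", skill_text)) < 2:
        failures.append("fewer than 2 phase headings")
    for kw in VERDICT_KEYWORDS:
        if kw not in skill_text:
            failures.append(f"missing verdict keyword: {kw}")
    if "File Write Protocol" not in skill_text:
        failures.append('missing "File Write Protocol" section')
    return failures
```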

Test Cases

Case 1: Happy Path — Full pipeline completes, READY FOR RELEASE verdict

Fixture:

  • Feature exists and is functionally complete (e.g., combat system)
  • Performance budgets are defined in technical-preferences.md (e.g., target 60fps, 16ms frame budget)
  • No frame budget violations exist before polishing begins
  • No audio events are missing; VFX assets are complete
  • No regressions are introduced by polish changes

Input: /team-polish combat

Expected behavior:

  1. Phase 1: performance-analyst is spawned; profiles the combat system, measures frame budget, checks memory usage; output: performance report showing all metrics within budget, no violations
  2. AskUserQuestion presents performance report; user approves before Phases 2, 3, and 4 begin
  3. Phase 2: performance-analyst applies minor optimizations (e.g., draw call batching); no engine-programmer needed (no engine-level root causes identified)
  4. Phases 3 and 4 are launched in parallel alongside Phase 2:
    • Phase 3: technical-artist reviews VFX for quality, optimizes particle systems, adds screen shake and visual juice
    • Phase 4: sound-designer reviews audio events for completeness, checks mix levels, adds ambient audio layers
  5. All three parallel phases complete; AskUserQuestion presents results; user approves before Phase 5 begins
  6. Phase 5: qa-tester runs edge case tests, soak tests, stress tests, and regression tests; all pass
  7. AskUserQuestion presents test results; user approves before Phase 6
  8. Phase 6: orchestrator collects all results; compares before/after performance metrics against budgets; all metrics pass
  9. Sub-agent asks "May I write the polish report to production/qa/evidence/polish-combat-[date].md?" before writing
  10. Verdict: READY FOR RELEASE

Assertions:

  • performance-analyst is spawned first in Phase 1 before any other agents
  • AskUserQuestion appears after Phase 1 output and before Phases 2/3/4 launch
  • Phases 3 and 4 Task calls are issued at the same time as Phase 2 (not after Phase 2 completes)
  • engine-programmer is NOT spawned when Phase 1 finds no engine-level root causes
  • qa-tester (Phase 5) is not launched until the parallel phases complete and user approves
  • Phase 6 verdict is based on comparison of metrics against defined budgets
  • Summary report includes: before/after performance metrics, visual polish changes, audio polish changes, test results
  • No files are written by the orchestrator directly
  • Verdict is READY FOR RELEASE
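The parallelism assertions above could be validated against an ordered event log along these lines. Event names such as `task:phase3` are hypothetical; a real harness would derive them from the transcript:

```python
def phases_parallel(events: list[str]) -> bool:
    """Check that Phase 3 and 4 Task calls are issued before Phase 2 completes.

    `events` is an ordered log like ["task:phase2", "task:phase3", ...].
    """
    def idx(name: str) -> int:
        # Events that never occur sort after everything else.
        return events.index(name) if name in events else len(events)

    launch_3, launch_4 = idx("task:phase3"), idx("task:phase4")
    done_2 = idx("done:phase2")
    return launch_3 < done_2 and launch_4 < done_2
```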

Case 2: Performance Blocker — Frame budget violation cannot be fully resolved

Fixture:

  • Feature being polished: particle-storm VFX system
  • Phase 1 identifies a frame budget violation: particle-storm costs 12ms on target hardware (budget is 6ms for this system)
  • Phase 2 performance-analyst applies optimizations reducing cost to 9ms — still over the 6ms budget
  • Phase 2 cannot fully resolve the violation without a fundamental design change

Input: /team-polish particle-storm

Expected behavior:

  1. Phase 1: performance-analyst identifies the 12ms frame cost vs. 6ms budget; reports "FRAME BUDGET VIOLATION: particle-storm costs 12ms, budget is 6ms"
  2. AskUserQuestion presents the violation; user chooses to proceed with optimization attempt
  3. Phase 2: performance-analyst applies optimizations; achieves 9ms — reduced but still over budget; reports "Optimization reduced cost to 9ms (was 12ms) — 3ms over budget. No further gains achievable without design changes."
  4. Phases 3 and 4 run in parallel with Phase 2 (visual and audio polish)
  5. Phase 5: qa-tester runs regression and edge case tests; all pass
  6. Phase 6: orchestrator collects results; frame budget violation (9ms vs 6ms budget) remains unresolved
  7. Verdict: NEEDS MORE WORK
  8. Report lists the specific unresolved issue: "particle-storm frame cost (9ms) exceeds budget (6ms) by 3ms — requires design scope reduction or budget renegotiation"
  9. Next Steps: schedule the remaining issue in /sprint-plan update; re-run /team-polish after fix

Assertions:

  • Frame budget violation is flagged in Phase 1 with specific numbers (actual vs. budget)
  • Phase 2 reports the post-optimization metric explicitly (9ms achieved, 3ms still over)
  • Verdict is NEEDS MORE WORK (not READY FOR RELEASE) when a budget violation remains
  • The specific unresolved issue is listed by name with the remaining gap quantified
  • Next Steps references /sprint-plan update for scheduling the remaining fix
  • Phases 3 and 4 still run (polish work is not abandoned due to a Phase 2 partial resolution)
  • Phase 5 qa-tester still runs (regression testing is independent of the performance outcome)
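The budget comparison at sign-off reduces to a simple rule: any metric over its budget forces NEEDS MORE WORK, with the remaining gap quantified. A sketch, assuming metrics and budgets arrive as name-to-milliseconds maps (a hypothetical representation):

```python
def sign_off(metrics: dict[str, float], budgets: dict[str, float]) -> tuple[str, list[str]]:
    """Compare measured frame costs (ms) against budgets; any overage blocks release."""
    issues = [
        f"{name}: {cost}ms exceeds budget ({budgets[name]}ms) by {cost - budgets[name]:g}ms"
        for name, cost in metrics.items()
        if name in budgets and cost > budgets[name]
    ]
    verdict = "NEEDS MORE WORK" if issues else "READY FOR RELEASE"
    return verdict, issues
```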

Case 3: No Argument — Usage guidance shown

Fixture:

  • Any project state

Input: /team-polish (no argument)

Expected behavior:

  1. Skill detects no argument is provided
  2. Outputs usage guidance: e.g., "Usage: /team-polish [feature or area] — specify the feature or area to polish (e.g., combat, main menu, inventory system, level-1)"
  3. Skill exits without spawning any agents

Assertions:

  • Skill does NOT spawn any agents when no argument is provided
  • Usage message includes the correct invocation format with argument examples
  • Skill does NOT attempt to guess a feature from project files
  • No AskUserQuestion is used — output is direct guidance

Case 4: Engine-Level Bottleneck — engine-programmer spawned conditionally in Phase 2

Fixture:

  • Feature being polished: open-world environment streaming
  • Phase 1 identifies a performance bottleneck with a root cause in the rendering pipeline: "draw call overhead is caused by the engine's scene tree traversal in the spatial indexer — this is an engine-level issue, not a game code issue"
  • Performance budgets are defined; the rendering overhead exceeds target frame budget

Input: /team-polish open-world

Expected behavior:

  1. Phase 1: performance-analyst profiles the environment; identifies frame budget violation; root cause analysis points to engine-level rendering pipeline (spatial indexer traversal overhead)
  2. Phase 1 output explicitly classifies the root cause as engine-level
  3. AskUserQuestion presents the performance report including the engine-level root cause; user approves before Phase 2
  4. Phase 2: performance-analyst is spawned for game-code-level optimizations AND engine-programmer is spawned in parallel for the engine-level rendering fix
  5. Phases 3 and 4 also run in parallel with Phase 2 (visual and audio polish)
  6. engine-programmer addresses the spatial indexer traversal; provides profiler validation showing the fix reduces overhead
  7. Phase 5: qa-tester runs regression tests including tests for the engine-level fix
  8. Phase 6: orchestrator collects all results; if metrics are now within budget, verdict is READY FOR RELEASE; if not, NEEDS MORE WORK

Assertions:

  • engine-programmer is NOT spawned in Phase 2 unless Phase 1 explicitly identifies an engine-level root cause
  • engine-programmer is spawned in Phase 2 when Phase 1 identifies an engine-level root cause
  • engine-programmer and performance-analyst Task calls in Phase 2 are issued simultaneously (not sequentially)
  • Phases 3 and 4 also run in parallel with Phase 2 (not deferred until Phase 2 completes)
  • engine-programmer's output includes profiler validation of the fix
  • qa-tester in Phase 5 runs regression tests that cover the engine-level change
  • Verdict correctly reflects whether all metrics including the engine fix now meet budgets
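The conditional spawn rule is a single branch on the Phase 1 report. A sketch, assuming the report carries a `root_cause_level` field (a hypothetical name, not a framework field):

```python
def phase2_agents(phase1_report: dict) -> list[str]:
    """Spawn engine-programmer only when Phase 1 flags an engine-level root cause."""
    agents = ["performance-analyst"]
    if phase1_report.get("root_cause_level") == "engine":
        agents.append("engine-programmer")
    return agents
```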

Case 5: Regression Found — Polish change broke an existing feature

Fixture:

  • Feature being polished: inventory-ui
  • Phases 1–4 complete successfully; performance and polish changes are applied
  • Phase 5: qa-tester runs regression tests and finds that a shader optimization applied in Phase 3 broke the item highlight glow effect on hover — an existing feature that was working before the polish pass

Input: /team-polish inventory-ui (Phase 5 scenario)

Expected behavior:

  1. Phases 1–4 complete; polish changes include a shader optimization from technical-artist
  2. Phase 5: qa-tester runs regression tests and detects "Item highlight glow on hover no longer renders — regression introduced by shader optimization in Phase 3"
  3. qa-tester returns test results with the regression noted
  4. Orchestrator surfaces the regression immediately: "qa-tester: REGRESSION FOUND — item-highlight-hover glow broken by Phase 3 shader optimization"
  5. Sub-agent asks "May I write the bug report to production/qa/evidence/bug-polish-inventory-ui-[date].md?" before writing
  6. Bug report is written after approval; it includes: the broken behavior, the polish change that caused it, reproduction steps, and severity
  7. AskUserQuestion presents the regression with options:
    • Revert the shader optimization and find an alternative approach
    • Fix the shader optimization to preserve the glow effect
    • Accept the regression and schedule a fix in the next sprint
  8. Verdict: NEEDS MORE WORK (regression present regardless of user's chosen resolution path, unless fix is applied within the current session)

Assertions:

  • Regression is surfaced before Phase 6 sign-off
  • The specific broken behavior and the responsible change are both named in the report
  • Sub-agent asks "May I write the bug report to [path]?" before filing
  • Bug report includes: broken behavior, causal change, reproduction steps, severity
  • AskUserQuestion offers options including revert, fix in place, and schedule later
  • Verdict is NEEDS MORE WORK when a regression is present and unresolved
  • Verdict may become READY FOR RELEASE only if the regression is fixed within the current polish session and qa-tester re-runs to confirm
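The regression gating above can be sketched as a hypothetical helper (the framework's actual sign-off logic may differ): an unresolved regression blocks release exactly like a budget miss.

```python
def final_verdict(metrics_within_budget: bool, unresolved_regressions: list[str]) -> str:
    """Any unresolved regression or budget miss forces NEEDS MORE WORK."""
    if metrics_within_budget and not unresolved_regressions:
        return "READY FOR RELEASE"
    return "NEEDS MORE WORK"
```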

Protocol Compliance

  • Phase 1 (assessment) must complete before any other phase begins
  • AskUserQuestion is used after every phase output before the next phase launches
  • Phases 3 and 4 are always launched in parallel with Phase 2 (not deferred)
  • engine-programmer is only spawned when Phase 1 explicitly identifies engine-level root causes
  • No files are written by the orchestrator directly — all writes are delegated to sub-agents
  • Each sub-agent enforces the "May I write to [path]?" protocol before any write
  • BLOCKED status from any agent is surfaced immediately — not silently skipped
  • A partial report is always produced when some agents complete and others block
  • Verdict is exactly READY FOR RELEASE or NEEDS MORE WORK — no other verdict values used
  • NEEDS MORE WORK verdict always lists specific remaining issues with severity
  • Next Steps handoff references /release-checklist (on success) and /sprint-plan update + /gate-check (on failure)

Coverage Notes

  • The tools-programmer optional agent (for content pipeline tool verification) is not separately tested — it follows the same conditional spawn pattern as engine-programmer and is invoked only when content authoring tools are involved in the polished area.
  • The "Retry with narrower scope" and "Skip this agent" resolution paths from the Error Recovery Protocol are not separately tested — they follow the same AskUserQuestion → partial-report pattern validated in Cases 2 and 5.
  • Phase 6 sign-off logic (collecting and comparing all metrics) is validated implicitly by Cases 1 and 2. The distinction between READY FOR RELEASE and NEEDS MORE WORK is exercised in both directions across these cases.
  • Soak testing and stress testing (Phase 5) are validated implicitly by Case 1's qa-tester output. Case 5 focuses on the regression detection aspect of Phase 5.
  • The "minimum spec hardware" test path in Phase 5 is not separately tested — it follows the same qa-tester delegation pattern when the hardware is available.