What is it?
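A multi-dimensional evaluation framework for Claude Code agents, supporting the LLM-as-Judge pattern and research-backed analysis of performance variance.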
How to use it?
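Once the skill is installed, Claude applies it automatically when it detects an agent evaluation task. You can also invoke it directly by referencing its name in a prompt.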
Key Features
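- Multi-dimensional agent evaluation scoring
- LLM-as-Judge evaluation mode
- Research-backed performance variance analysis
- Automated scoring against evaluation criteria
- Checklists to assist human evaluation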
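As a rough illustration of how multi-dimensional scoring and variance analysis can fit together, the sketch below combines per-dimension judge scores into one weighted score and then summarizes the spread across repeated runs. It is not taken from the skill itself: the dimension names, the weights, and the helper names `weighted_score` and `summarize_runs` are all hypothetical, and the per-dimension scores are assumed to come from an LLM judge or a human reviewer on a 0 to 1 scale.

```python
from statistics import mean, pstdev

# Hypothetical rubric: dimension names and weights are illustrative only,
# not the skill's actual criteria.
WEIGHTS = {"correctness": 0.5, "completeness": 0.3, "efficiency": 0.2}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (each in 0..1) into one weighted score."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def summarize_runs(runs: list[dict[str, float]]) -> dict[str, float]:
    """Report the mean and population standard deviation across repeated runs."""
    totals = [weighted_score(run) for run in runs]
    return {"mean": mean(totals), "stdev": pstdev(totals)}


# Two made-up runs of the same agent on the same task.
runs = [
    {"correctness": 1.0, "completeness": 0.5, "efficiency": 0.5},
    {"correctness": 0.5, "completeness": 0.5, "efficiency": 0.5},
]
summary = summarize_runs(runs)
print(summary)
```

Scoring the same task several times and reporting the spread alongside the mean helps distinguish a real difference between agents from run-to-run noise.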
Related Skills
More from AI & ML
Context Engineering Guide
Comprehensive context engineering tutorial covering attention mechanics, progressive disclosure, context budget management, and quality vs quantity trade-offs for AI agent development
433 NeoLabHQ
AI & ML
Developer Tools
Multi-Perspective Critique
Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building
433 NeoLabHQ
AI & ML
Developer Tools
Create Claude Code Agent
Complete guide for creating Claude Code agents with YAML frontmatter structure, agent file format, trigger condition design, and system prompt writing
433 NeoLabHQ
AI & ML
Developer Tools