Agent Evaluation Framework

Name: Agent Evaluation Framework
Author: NeoLabHQ

A comprehensive Claude Code agent evaluation framework with multi-dimensional scoring, an LLM-as-Judge mode, and research-backed performance variance analysis.

What is it?

A comprehensive evaluation framework for Claude Code agents. It provides multi-dimensional scoring, an LLM-as-Judge mode, and research-based performance variance analysis.

How to use it?

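Once installed in your Claude environment, the skill's guidelines are applied automatically whenever you work on tasks related to the Agent Evaluation Framework.

The complete source and documentation are available on GitHub.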

Key Features

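Multi-dimensional scoring, an LLM-as-Judge mode, and research-based performance variance analysis for evaluating Claude Code agents
Seamless integration with Claude development workflows
Comprehensive guidelines and best practices for the Agent Evaluation Framework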

GitHub Stats

Author: NeoLabHQ
License: GPL-3.0
Version: 1.0.0

Related Skills

Context Engineering Guide

Comprehensive context engineering tutorial covering attention mechanics, progressive disclosure, context budget management, and quality vs quantity trade-offs for AI agent development

433 NeoLabHQ

AI & ML

Developer Tools

Multi-Perspective Critique

Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building

433 NeoLabHQ

AI & ML

Developer Tools

Create Claude Code Agent

Complete guide for creating Claude Code agents with YAML frontmatter structure, agent file format, trigger condition design, and system prompt writing

433 NeoLabHQ

AI & ML

Developer Tools