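Agent Evaluation Framework

Name: Agent Evaluation Framework
Author: NeoLabHQ

A comprehensive evaluation framework for Claude Code agents, with multi-dimensional scoring, an LLM-as-Judge mode, and research-based performance variance analysis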

What is it?

A comprehensive evaluation framework for Claude Code agents. It provides multi-dimensional scoring, an LLM-as-Judge mode, and research-based analysis of performance variance.

How to use it?

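Once installed in your Claude environment, the skill's instructions are applied automatically whenever you work on tasks related to the Agent Evaluation Framework.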

Key Features

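Agent evaluation framework with multi-dimensional scoring
LLM-as-Judge mode support
Research-based performance variance analysis
Automated criteria-based scoring
Seamless integration with Claude development workflows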

GitHub Stats

Author: NeoLabHQ
License: GPL-3.0
Version: 1.0.0

Related Skills

Context Engineering Guide

Comprehensive context engineering tutorial covering attention mechanics, progressive disclosure, context budget management, and quality vs quantity trade-offs for AI agent development

433 NeoLabHQ

AI & ML

Developer Tools

Multi-Perspective Critique

Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building

433 NeoLabHQ

AI & ML

Developer Tools

Create Claude Code Agent

Complete guide for creating Claude Code agents with YAML frontmatter structure, agent file format, trigger condition design, and system prompt writing

433 NeoLabHQ

AI & ML

Developer Tools
