LLM-as-judge implementations. Each judge implements the `Judge` interface:
```ts
interface Judge {
  name: string
  initialize(config: JudgeConfig): Promise<void>
  evaluate(input: JudgeInput): Promise<JudgeResult>
  getPromptForQuestionType(questionType: string, providerPrompts?: ProviderPrompts): string
  getModel(): LanguageModel
}
```

To add a new judge (a full sketch follows the helper list below):

- Create `src/judges/myjudge.ts`
- Implement the `Judge` interface
- Register it in `src/judges/index.ts`
- Add it to the `JudgeName` type in `src/types/judge.ts`
- Add a default model in `src/utils/models.ts` (`DEFAULT_JUDGE_MODELS`)
Required behavior:

- `initialize()`: set up the client with the API key and model
- `evaluate()`: return `{ score: 0|1, label: "correct"|"incorrect", explanation: string }`
- `getPromptForQuestionType()`: return the prompt string for the question type
- `getModel()`: return the initialized `LanguageModel`
Use these helpers from `./base.ts`:

- `buildJudgePrompt(input)`: builds the full prompt from a `JudgeInput`
- `parseJudgeResponse(text)`: extracts a `JudgeResult` from the LLM response
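Putting the interface, required behavior, and helpers together, a minimal sketch of `src/judges/myjudge.ts` might look like this. The `JudgeConfig` fields (`apiKey`, `model`), the `ProviderPrompts` shape, and the type import paths are assumptions, not confirmed here; `createOpenAI` and `generateText` are standard AI SDK calls matching the `@ai-sdk/openai` entry in the table below.

```ts
import { generateText, type LanguageModel } from "ai"
import { createOpenAI } from "@ai-sdk/openai"
import { buildJudgePrompt, parseJudgeResponse } from "./base"
// Import path below is assumed from the layout described above.
import type { Judge, JudgeConfig, JudgeInput, JudgeResult, ProviderPrompts } from "../types/judge"

export class MyJudge implements Judge {
  name = "myjudge"
  private model!: LanguageModel

  async initialize(config: JudgeConfig): Promise<void> {
    // Set up the client with the API key and model; these field names are assumptions.
    const openai = createOpenAI({ apiKey: config.apiKey })
    this.model = openai(config.model ?? "gpt-4o")
  }

  async evaluate(input: JudgeInput): Promise<JudgeResult> {
    // buildJudgePrompt assembles the full prompt from the JudgeInput.
    const prompt = buildJudgePrompt(input)
    const { text } = await generateText({ model: this.model, prompt })
    // parseJudgeResponse extracts { score, label, explanation } from the raw text.
    return parseJudgeResponse(text)
  }

  getPromptForQuestionType(questionType: string, providerPrompts?: ProviderPrompts): string {
    // Prefer a provider override when present; the keyed-by-question-type shape is assumed.
    return providerPrompts?.[questionType] ?? `Grade the answer to this ${questionType} question.`
  }

  getModel(): LanguageModel {
    return this.model
  }
}
```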
Add the model config in `src/utils/models.ts`:

```ts
interface ModelConfig {
  id: string
  provider: "openai" | "anthropic" | "google"
  displayName: string
  supportsTemperature: boolean
  defaultTemperature: number
  maxTokensParam: "maxTokens" | "max_completion_tokens"
  defaultMaxTokens: number
}
```

Providers can override judge prompts; see `providers/README.md`.
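For illustration, a `ModelConfig` entry for the default OpenAI judge model could look like the following; the `id` and `provider` come from the table below, while the remaining field values are placeholder guesses rather than the repo's actual defaults.

```ts
// Illustrative entry; values beyond id/provider are assumptions.
const gpt4o: ModelConfig = {
  id: "gpt-4o",
  provider: "openai",
  displayName: "GPT-4o",
  supportsTemperature: true,
  defaultTemperature: 0,
  maxTokensParam: "maxTokens",
  defaultMaxTokens: 1024,
}
```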
| Judge | SDK | Default Model |
|---|---|---|
| openai | `@ai-sdk/openai` | gpt-4o |
| anthropic | `@ai-sdk/anthropic` | sonnet-4 |
| google | `@ai-sdk/google` | gemini-2.5-flash |
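Tying the table back to the `DEFAULT_JUDGE_MODELS` step above, that map in `src/utils/models.ts` presumably keys each `JudgeName` to its default model id; the exact shape below is an assumption, but the values mirror the table.

```ts
// Assumed shape for DEFAULT_JUDGE_MODELS; the values mirror the table above.
const DEFAULT_JUDGE_MODELS: Record<JudgeName, string> = {
  openai: "gpt-4o",
  anthropic: "sonnet-4",
  google: "gemini-2.5-flash",
}
```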