Class FactCheckingEvaluator
- All Implemented Interfaces:
  - Evaluator

An Evaluator used to evaluate the factual accuracy of Large Language Model (LLM) responses against provided context.
This evaluator addresses a specific type of potential error in LLM outputs known as "hallucination" in the context of grounded factuality. It verifies whether a given statement (the "claim") is logically supported by a provided context (the "document").
Key concepts:
- Document: The context or grounding information against which the claim is checked.
- Claim: The statement to be verified against the document.
The evaluator uses a prompt-based approach with a separate, typically smaller and more efficient LLM to perform the fact-checking. This design choice allows for cost-effective and rapid verification, which is crucial when evaluating longer LLM outputs that may require multiple verification steps.
Implementation note: For efficient and accurate fact-checking, consider using a specialized model such as Bespoke-Minicheck, a grounded factuality checking model developed by Bespoke Labs and available in Ollama. Such models are purpose-built to fact-check responses generated by other models, helping to detect and reduce hallucinations. For more information, see Reduce Hallucinations with Bespoke-Minicheck and the research paper MiniCheck: An Efficient Method for LLM Hallucination Detection.
Note: This evaluator is designed specifically to fact-check statements against supplied reference material. It is not intended for other kinds of accuracy testing, such as quizzing a model on facts without providing any reference material to work with (so-called "closed book" scenarios).
The evaluation process aims to determine if the claim is supported by the document, returning a boolean result indicating whether the fact-check passed or failed.
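The evaluation flow described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes Spring AI's `ChatClient`, `EvaluationRequest`, and `EvaluationResponse` types are on the classpath, and that `EvaluationResponse` exposes an `isPass()` accessor; exact package names and signatures may vary by Spring AI version.

```java
import java.util.Collections;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.evaluation.EvaluationRequest;
import org.springframework.ai.evaluation.EvaluationResponse;
import org.springframework.ai.evaluation.FactCheckingEvaluator;

class FactCheckExample {

    boolean check(ChatClient.Builder chatClientBuilder) {
        FactCheckingEvaluator evaluator = new FactCheckingEvaluator(chatClientBuilder);

        String document = "The Earth is the third planet from the Sun.";
        String claim = "The Earth is the fourth planet from the Sun.";

        // The grounding document is passed as the request's user text, and the
        // claim as the response content to be verified against it.
        EvaluationRequest request =
                new EvaluationRequest(document, Collections.emptyList(), claim);
        EvaluationResponse response = evaluator.evaluate(request);

        // Expected to be false here, since the claim contradicts the document.
        return response.isPass();
    }
}
```

The `ChatClient.Builder` would typically be obtained from a configured `ChatModel` bean in a Spring application.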
- Since:
- 1.0.0
- Author:
- EddĂș MelĂ©ndez, Mark Pollack
Constructor Summary

- FactCheckingEvaluator(ChatClient.Builder chatClientBuilder)
  Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder.
- FactCheckingEvaluator(ChatClient.Builder chatClientBuilder, String evaluationPrompt)
  Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder and evaluation prompt.

Method Summary

- evaluate(EvaluationRequest evaluationRequest)
  Evaluates whether the response content in the EvaluationRequest is factually supported by the context provided in the same request.
- static FactCheckingEvaluator forBespokeMinicheck(ChatClient.Builder chatClientBuilder)
  Creates a FactCheckingEvaluator configured for use with the Bespoke Minicheck model.

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.springframework.ai.evaluation.Evaluator:
doGetSupportingData
Constructor Details

- FactCheckingEvaluator(ChatClient.Builder chatClientBuilder)
  Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder. Uses the default evaluation prompt suitable for general-purpose LLMs.
  Parameters:
  - chatClientBuilder - The builder for the ChatClient used to perform the evaluation

- FactCheckingEvaluator(ChatClient.Builder chatClientBuilder, String evaluationPrompt)
  Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder and evaluation prompt.
  Parameters:
  - chatClientBuilder - The builder for the ChatClient used to perform the evaluation
  - evaluationPrompt - The prompt text to use for evaluation
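As a sketch of the two-argument constructor, a caller might supply a custom evaluation prompt. The prompt text and the `{document}`/`{claim}` placeholder syntax below are assumptions for illustration only, not a documented template contract of this class:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.evaluation.FactCheckingEvaluator;

class CustomPromptExample {

    FactCheckingEvaluator create(ChatClient.Builder chatClientBuilder) {
        // Hypothetical prompt text; the placeholder names are an assumption
        // about the template variables, not a verified API contract.
        String evaluationPrompt = """
                Document: {document}
                Claim: {claim}
                Respond "yes" if the claim is supported by the document, otherwise "no".
                """;
        return new FactCheckingEvaluator(chatClientBuilder, evaluationPrompt);
    }
}
```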
Method Details

- forBespokeMinicheck(ChatClient.Builder chatClientBuilder)
  Creates a FactCheckingEvaluator configured for use with the Bespoke Minicheck model.
  Parameters:
  - chatClientBuilder - The builder for the ChatClient used to perform the evaluation
  Returns:
  - A FactCheckingEvaluator configured for Bespoke Minicheck

- evaluate(EvaluationRequest evaluationRequest)
  Evaluates whether the response content in the EvaluationRequest is factually supported by the context provided in the same request.
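A hedged sketch of the forBespokeMinicheck factory method in use. It assumes a `ChatModel` backed by Ollama running the bespoke-minicheck model is available, and that `ChatClient.builder(ChatModel)` and `EvaluationResponse.isPass()` exist in the Spring AI version in use; these surrounding names are assumptions, not guarantees:

```java
import java.util.Collections;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.evaluation.EvaluationRequest;
import org.springframework.ai.evaluation.FactCheckingEvaluator;

class MinicheckExample {

    boolean verify(ChatModel ollamaChatModel) {
        // ollamaChatModel is assumed to be configured for "bespoke-minicheck".
        ChatClient.Builder builder = ChatClient.builder(ollamaChatModel);
        FactCheckingEvaluator evaluator =
                FactCheckingEvaluator.forBespokeMinicheck(builder);

        EvaluationRequest request = new EvaluationRequest(
                "Paris is the capital of France.",  // grounding document
                Collections.emptyList(),
                "The capital of France is Paris."); // claim to verify

        return evaluator.evaluate(request).isPass();
    }
}
```

Routing fact-checks through a small dedicated model like this keeps per-claim verification cheap, which matters when a long response is decomposed into many claims.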