Class FactCheckingEvaluator

java.lang.Object
org.springframework.ai.evaluation.FactCheckingEvaluator
All Implemented Interfaces:
Evaluator

public class FactCheckingEvaluator extends Object implements Evaluator
Implementation of Evaluator used to evaluate the factual accuracy of Large Language Model (LLM) responses against a provided context.

This evaluator addresses a specific type of potential error in LLM outputs known as "hallucination" in the context of grounded factuality. It verifies whether a given statement (the "claim") is logically supported by a provided context (the "document").

Key concepts:
  • Document: The context or grounding information against which the claim is checked.
  • Claim: The statement to be verified against the document.

The evaluator uses a prompt-based approach with a separate, typically smaller and more efficient LLM to perform the fact-checking. This design choice allows for cost-effective and rapid verification, which is crucial when evaluating longer LLM outputs that may require multiple verification steps.

Implementation note: For efficient and accurate fact-checking, consider using specialized models such as Bespoke-Minicheck, a grounded factuality checking model developed by Bespoke Labs and available in Ollama. Such models are specifically designed to fact-check responses generated by other models, helping to detect and reduce hallucinations. For more information, see "Reduce Hallucinations with Bespoke-Minicheck" and the research paper "MiniCheck: An Efficient Method for LLM Hallucination Detection".
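
As an illustration, a ChatClient.Builder backed by Bespoke-Minicheck served locally through Ollama could be assembled roughly as follows. This is a minimal sketch, assuming the builder-style API of the Spring AI Ollama module and a local Ollama instance into which the bespoke-minicheck model has already been pulled:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.ollama.api.OllamaApi;
import org.springframework.ai.ollama.api.OllamaOptions;

class BespokeMinicheckSetup {

    // Sketch only: assumes Ollama is running locally and bespoke-minicheck is available
    ChatClient.Builder bespokeMinicheckBuilder() {
        OllamaApi ollamaApi = OllamaApi.builder()
                .baseUrl("http://localhost:11434")
                .build();

        OllamaChatModel chatModel = OllamaChatModel.builder()
                .ollamaApi(ollamaApi)
                .defaultOptions(OllamaOptions.builder()
                        .model("bespoke-minicheck")
                        .temperature(0.0)
                        .build())
                .build();

        return ChatClient.builder(chatModel);
    }
}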

Note: This evaluator is specifically designed to fact-check statements against provided reference information. It is not intended for other kinds of accuracy testing, such as quizzing a model on obscure facts without giving it any reference material to work with (so-called "closed-book" scenarios).

The evaluation process aims to determine if the claim is supported by the document, returning a boolean result indicating whether the fact-check passed or failed.
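
For example, a minimal usage sketch (variable names are illustrative; it assumes the grounding context is passed as Document instances in the request's data list and the claim as the request's response content):

import java.util.List;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.evaluation.EvaluationRequest;
import org.springframework.ai.evaluation.EvaluationResponse;
import org.springframework.ai.evaluation.FactCheckingEvaluator;

class FactCheckExample {

    // chatClientBuilder is assumed to be supplied by the application,
    // e.g. the auto-configured ChatClient.Builder in a Spring Boot app
    boolean isClaimSupported(ChatClient.Builder chatClientBuilder) {
        var evaluator = new FactCheckingEvaluator(chatClientBuilder);

        // Document: the grounding context the claim is checked against
        var document = new Document(
                "The Earth is the third planet from the Sun and the only "
                        + "astronomical object known to harbor life.");

        // Claim: the statement to be verified against the document
        String claim = "The Earth is the third planet from the Sun.";

        // Context goes in the data list, the claim in the response content
        var request = new EvaluationRequest(List.of(document), claim);

        EvaluationResponse response = evaluator.evaluate(request);
        return response.isPass(); // true if the fact-check passed
    }
}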

Since:
1.0.0
Author:
EddĂș MelĂ©ndez, Mark Pollack
  • Constructor Details

    • FactCheckingEvaluator

      public FactCheckingEvaluator(ChatClient.Builder chatClientBuilder)
      Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder, using the default evaluation prompt suitable for general-purpose LLMs.
      Parameters:
      chatClientBuilder - The builder for the ChatClient used to perform the evaluation
    • FactCheckingEvaluator

      public FactCheckingEvaluator(ChatClient.Builder chatClientBuilder, String evaluationPrompt)
      Constructs a new FactCheckingEvaluator with the provided ChatClient.Builder and evaluation prompt; see the usage sketch below.
      Parameters:
      chatClientBuilder - The builder for the ChatClient used to perform the evaluation
      evaluationPrompt - The prompt text to use for evaluation
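
      A sketch of this constructor with a custom evaluation prompt. The {document} and {claim} placeholders are an assumption based on the key concepts above; check the default prompt in the class source for the exact template variables:

      import org.springframework.ai.chat.client.ChatClient;
      import org.springframework.ai.evaluation.FactCheckingEvaluator;

      class CustomPromptExample {

          // Assumed placeholder names; verify against the default evaluation prompt
          private static final String CUSTOM_EVALUATION_PROMPT = """
                  You are a strict fact checker. Answer "yes" only if every part of
                  the claim is stated in the document, otherwise answer "no".
                  Document: {document}
                  Claim: {claim}
                  """;

          FactCheckingEvaluator customEvaluator(ChatClient.Builder chatClientBuilder) {
              return new FactCheckingEvaluator(chatClientBuilder, CUSTOM_EVALUATION_PROMPT);
          }
      }
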
  • Method Details

    • forBespokeMinicheck

      public static FactCheckingEvaluator forBespokeMinicheck(ChatClient.Builder chatClientBuilder)
      Creates a FactCheckingEvaluator configured for use with the Bespoke Minicheck model; see the usage sketch at the end of the method details.
      Parameters:
      chatClientBuilder - The builder for the ChatClient used to perform the evaluation
      Returns:
      A FactCheckingEvaluator configured for Bespoke Minicheck
    • evaluate

      public EvaluationResponse evaluate(EvaluationRequest evaluationRequest)
      Evaluates whether the response content in the EvaluationRequest is factually supported by the context provided in the same request.
      Specified by:
      evaluate in interface Evaluator
      Parameters:
      evaluationRequest - The request containing the response to be evaluated and the supporting context
      Returns:
      An EvaluationResponse indicating whether the claim is supported by the document
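
      A sketch combining forBespokeMinicheck and evaluate. It assumes a ChatClient.Builder backed by the bespoke-minicheck model (for instance, one built as in the implementation note above) and the EvaluationRequest constructor taking a data list and the response content:

      import java.util.List;

      import org.springframework.ai.chat.client.ChatClient;
      import org.springframework.ai.document.Document;
      import org.springframework.ai.evaluation.EvaluationRequest;
      import org.springframework.ai.evaluation.EvaluationResponse;
      import org.springframework.ai.evaluation.FactCheckingEvaluator;

      class BespokeMinicheckExample {

          // ollamaBackedBuilder is assumed to wrap a ChatModel pointed at bespoke-minicheck
          boolean check(ChatClient.Builder ollamaBackedBuilder) {
              FactCheckingEvaluator evaluator = FactCheckingEvaluator.forBespokeMinicheck(ollamaBackedBuilder);

              var document = new Document("The capital of France is Paris.");
              String claim = "Paris is the capital of Germany.";

              // Expect a failed fact-check: the claim contradicts the document
              EvaluationResponse response = evaluator.evaluate(
                      new EvaluationRequest(List.of(document), claim));
              return response.isPass();
          }
      }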