Class TokenCountBatchingStrategy

java.lang.Object
org.springframework.ai.embedding.TokenCountBatchingStrategy
All Implemented Interfaces:
BatchingStrategy

public class TokenCountBatchingStrategy extends Object implements BatchingStrategy
Token-count-based implementation of BatchingStrategy. The default max input token count is the OpenAI limit: https://platform.openai.com/docs/guides/embeddings/embedding-models.

This strategy incorporates a reserve percentage to provide a buffer for potential overhead or unexpected increases in token count during processing. The actual max input token count used is calculated as:

actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)

For example, with the default reserve percentage of 10% (0.1) and the default max input token count of 8191, the actual max input token count used will be 7371.

The strategy batches documents based on their token counts, ensuring that each batch does not exceed the calculated max input token count.
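The reserve calculation above can be checked with a few lines of plain Java. This is only a sketch: the class `ReserveExample` and the method `effectiveMaxInputTokenCount` are hypothetical names that are not part of Spring AI, and truncating the product to an int is an assumption chosen because it reproduces the documented figure of 7371. The library's own rounding may differ.

```java
final class ReserveExample {

    // Hypothetical helper, not part of Spring AI.
    // Applies: actual = original * (1 - reservePercentage).
    // Assumption: the fractional part is truncated, which matches the
    // documented example (8191 with a 10% reserve gives 7371).
    static int effectiveMaxInputTokenCount(int originalMaxInputTokenCount, double reservePercentage) {
        return (int) (originalMaxInputTokenCount * (1 - reservePercentage));
    }

    public static void main(String[] args) {
        // Documented defaults: 8191 tokens with a 10% reserve.
        System.out.println(effectiveMaxInputTokenCount(8191, 0.1));
    }
}
```

A larger reserve lowers the per-batch budget, which trades more embedding requests for a wider safety margin against token-count underestimates.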
Since:
1.0.0
Author:
Soby Chacko, Mark Pollack, Laura Trotta, Jihoon Kim
  • Constructor Details

    • TokenCountBatchingStrategy

      public TokenCountBatchingStrategy()
    • TokenCountBatchingStrategy

      public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage)
      Parameters:
      encodingType - the EncodingType to be used for token counting.
      maxInputTokenCount - upper limit for input tokens
      reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
    • TokenCountBatchingStrategy

      public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
      Parameters:
      encodingType - The EncodingType to be used for token counting.
      maxInputTokenCount - The initial upper limit for input tokens.
      reservePercentage - The percentage of tokens to reserve from the max input token count. This creates a buffer for potential token count increases during processing.
      contentFormatter - the ContentFormatter to be used for formatting content.
      metadataMode - The MetadataMode to be used for handling metadata.
    • TokenCountBatchingStrategy

      public TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
      Constructs a TokenCountBatchingStrategy with the specified parameters.
      Parameters:
      tokenCountEstimator - the TokenCountEstimator to be used for estimating token counts.
      maxInputTokenCount - the initial upper limit for input tokens.
      reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
      contentFormatter - the ContentFormatter to be used for formatting content.
      metadataMode - the MetadataMode to be used for handling metadata.
  • Method Details

    • batch

      public List<List<Document>> batch(List<Document> documents)
      Description copied from interface: BatchingStrategy
      EmbeddingModel implementations can call this method to optimize embedding tokens. The incoming collection of Documents is split into sub-batches. The order of the Documents must be preserved when batching, because they are mapped to their corresponding embeddings by position.
      Specified by:
      batch in interface BatchingStrategy
      Parameters:
      documents - the Documents to batch
      Returns:
      a list of sub-batches that contain Documents.
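
The contract described for batch (order-preserving sub-batches, each within a token limit) can be illustrated with a greedy, single-pass sketch in plain Java. Everything here is an assumption made for illustration: `GreedyTokenBatcher`, its `batch` method, and the whitespace-based `countWords` estimator are hypothetical and are not the Spring AI implementation, which counts tokens with an EncodingType or TokenCountEstimator. Rejecting a single oversized item with an exception is also an assumed policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class GreedyTokenBatcher {

    // Hypothetical helper, not part of Spring AI.
    // Walks the items in order and starts a new batch whenever adding the
    // next item would push the running token total past maxTokens.
    static <T> List<List<T>> batch(List<T> items, ToIntFunction<T> tokenCounter, int maxTokens) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        int currentTokens = 0;
        for (T item : items) {
            int tokens = tokenCounter.applyAsInt(item);
            if (tokens > maxTokens) {
                // Assumed policy: an item that cannot fit in any batch is an error.
                throw new IllegalArgumentException("Item exceeds the max token count: " + tokens);
            }
            if (currentTokens + tokens > maxTokens) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(item);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Stand-in estimator for the sketch: counts whitespace-separated words.
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("a b c", "d e", "f g h i", "j");
        System.out.println(batch(docs, GreedyTokenBatcher::countWords, 5));
    }
}
```

Because items are only ever appended to the current batch or to a fresh one, concatenating the returned batches yields the input list in its original order, which is the property the embedding mapping relies on.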