org.springframework.ai.embedding.TokenCountBatchingStrategy

All Implemented Interfaces: BatchingStrategy

public class TokenCountBatchingStrategy extends Object implements BatchingStrategy

Token-count-based implementation of BatchingStrategy. By default it uses the OpenAI max input token count: https://platform.openai.com/docs/guides/embeddings/embedding-models.

This strategy incorporates a reserve percentage to provide a buffer for potential overhead or unexpected increases in token count during processing. The actual max input token count used is calculated as:

actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)

For example, with the default reserve percentage of 10% (0.1) and the default max input token count of 8191, the actual max input token count used will be 7371.

The strategy batches documents based on their token counts, ensuring that each batch does not exceed the calculated max input token count.
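The arithmetic behind the reserve can be checked with a few lines of plain Java. This is only a sketch: the class and method names (ReserveExample, effectiveMaxTokens) are hypothetical and not part of Spring AI, and truncating the fractional result is an assumption chosen because it reproduces the 7371 figure quoted above (8191 * 0.9 = 7371.9). The class itself may round differently.

```java
public class ReserveExample {

    // Hypothetical helper: applies the formula from the class description.
    // Assumption: the fractional part is truncated, which matches the
    // documented example value of 7371.
    static int effectiveMaxTokens(int originalMaxInputTokenCount, double reservePercentage) {
        return (int) (originalMaxInputTokenCount * (1 - reservePercentage));
    }

    public static void main(String[] args) {
        // Default values named in the description: 8191 tokens, 10% reserve.
        System.out.println(effectiveMaxTokens(8191, 0.1)); // prints 7371
    }
}
```

A reserve of 0.0 leaves the limit unchanged, while larger reserves trade batch capacity for more headroom against token-count estimates that turn out too low.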

Since: 1.0.0
Author: Soby Chacko, Mark Pollack, Laura Trotta, Jihoon Kim

Constructor Summary

- TokenCountBatchingStrategy()
- TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage)
- TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
- TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
  
  Constructs a TokenCountBatchingStrategy with the specified parameters.

Method Summary

- List<List<Document>> batch(List<Document> documents)
  
  EmbeddingModel implementations can call this method to optimize embedding tokens.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- TokenCountBatchingStrategy
  
  public TokenCountBatchingStrategy()
- TokenCountBatchingStrategy
  
  public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage)
  
  Parameters:
  
  encodingType - the EncodingType to be used for token counting.
  
  maxInputTokenCount - the initial upper limit for input tokens, before the reserve is applied.
  
  reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
- TokenCountBatchingStrategy
  
  public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
  
  Parameters:
  
  encodingType - The EncodingType to be used for token counting.
  
  maxInputTokenCount - The initial upper limit for input tokens.
  
  reservePercentage - The percentage of tokens to reserve from the max input token count. This creates a buffer for potential token count increases during processing.
  
  contentFormatter - The ContentFormatter to be used for formatting content.
  
  metadataMode - The MetadataMode to be used for handling metadata.
- TokenCountBatchingStrategy
  
  public TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
  
  Constructs a TokenCountBatchingStrategy with the specified parameters.
  
  Parameters:
  
  tokenCountEstimator - the TokenCountEstimator to be used for estimating token counts.
  
  maxInputTokenCount - the initial upper limit for input tokens.
  
  reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
  
  contentFormatter - the ContentFormatter to be used for formatting content.
  
  metadataMode - the MetadataMode to be used for handling metadata.
Method Details
- batch
  
  public List<List<Document>> batch(List<Document> documents)
  
  Description copied from interface: BatchingStrategy
  
  EmbeddingModel implementations can call this method to optimize embedding tokens. The incoming collection of Documents is split into sub-batches. It is important to preserve the order of the list of Documents when batching, because each Document is mapped to its corresponding embedding by its position in the list.
  
  Specified by:
  
  batch in interface BatchingStrategy
  
  Parameters:
  
  documents - the Documents to batch
  
  Returns:
  
  a list of sub-batches that contain Documents.
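
The batching behavior described above (sub-batches that stay under a token limit while preserving input order) can be illustrated with a greedy, order-preserving sketch in plain Java. This is not the Spring AI implementation: the class name GreedyBatchingSketch, the generic batch helper, and the whitespace word counter standing in for a token estimator are all hypothetical. Rejecting a single item that exceeds the limit is likewise an assumption of this sketch, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

public class GreedyBatchingSketch {

    // Hypothetical helper: walks the items in order and starts a new
    // sub-batch whenever adding the next item would exceed maxTokens.
    static <T> List<List<T>> batch(List<T> items, ToIntFunction<T> tokenCounter, int maxTokens) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        int currentTokens = 0;
        for (T item : items) {
            int tokens = tokenCounter.applyAsInt(item);
            // Assumption of this sketch: an item that cannot fit in any batch is an error.
            if (tokens > maxTokens) {
                throw new IllegalArgumentException("Single item exceeds the token limit: " + tokens);
            }
            if (currentTokens + tokens > maxTokens) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(item);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("a b c", "d e", "f g h i", "j");
        // Stand-in estimator: counts whitespace-separated words as "tokens".
        ToIntFunction<String> wordCount = s -> s.split("\\s+").length;
        System.out.println(batch(docs, wordCount, 5));
        // prints [[a b c, d e], [f g h i, j]]
    }
}
```

Because items are never reordered, concatenating the sub-batches yields the original list, which is what allows embeddings to be matched back to their Documents by position.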
