Package org.springframework.ai.embedding
Class TokenCountBatchingStrategy
java.lang.Object
org.springframework.ai.embedding.TokenCountBatchingStrategy
- All Implemented Interfaces:
BatchingStrategy
A token-count-based implementation of
BatchingStrategy. The default max input
token count is the OpenAI limit documented at
https://platform.openai.com/docs/guides/embeddings/embedding-models.
This strategy incorporates a reserve percentage to provide a buffer for potential
overhead or unexpected increases in token count during processing. The actual max input
token count used is calculated as: actualMaxInputTokenCount =
originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)
For example, with the default reserve percentage of 10% (0.1) and the default max input
token count of 8191, the actual max input token count used will be 7371.
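The arithmetic above can be checked with a few lines of plain Java. This is a minimal sketch, not the library's code: the class and method names are hypothetical, and truncating toward zero is an assumption chosen because it reproduces the 7371 figure quoted above (this page does not say how the library converts the product to an integer).

```java
public class ReserveExample {

    // Hypothetical helper: effective limit after holding back a reserve.
    // Assumption: the fractional part is truncated, not rounded.
    static int effectiveMaxTokens(int originalMaxInputTokenCount, double reservePercentage) {
        return (int) (originalMaxInputTokenCount * (1 - reservePercentage));
    }

    public static void main(String[] args) {
        // 8191 * 0.9 is about 7371.9, which truncates to 7371.
        System.out.println(effectiveMaxTokens(8191, 0.1)); // prints 7371
    }
}
```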
The strategy batches documents based on their token counts, ensuring that each batch
does not exceed the calculated max input token count.
- Since:
- 1.0.0
- Author:
- Soby Chacko, Mark Pollack, Laura Trotta, Jihoon Kim, Yanming Zhou
-
Constructor Summary
Constructors
Constructor - Description
TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage)
TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode) - Constructs a TokenCountBatchingStrategy with the specified parameters.
-
Method Summary
-
Constructor Details
-
TokenCountBatchingStrategy
public TokenCountBatchingStrategy() -
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage)
- Parameters:
encodingType - the EncodingType to be used for token counting.
maxInputTokenCount - the upper limit for input tokens.
reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
-
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
- Parameters:
encodingType - the EncodingType to be used for token counting.
maxInputTokenCount - the initial upper limit for input tokens.
reservePercentage - the percentage of tokens to reserve from the max input token count. This creates a buffer for potential token count increases during processing.
contentFormatter - the ContentFormatter to be used for formatting content.
metadataMode - the MetadataMode to be used for handling metadata.
-
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
Constructs a TokenCountBatchingStrategy with the specified parameters.
- Parameters:
tokenCountEstimator - the TokenCountEstimator to be used for estimating token counts.
maxInputTokenCount - the initial upper limit for input tokens.
reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
contentFormatter - the ContentFormatter to be used for formatting content.
metadataMode - the MetadataMode to be used for handling metadata.
-
-
Method Details
-
batch
Description copied from interface: BatchingStrategy
EmbeddingModel implementations can call this method to optimize embedding tokens. The incoming collection of Documents is split into sub-batches. The order of the list of Documents must be preserved when batching, because each Document is mapped to its corresponding embedding by position.
- Specified by:
batch in interface BatchingStrategy
- Parameters:
documents - the Documents to batch
- Returns:
- a list of sub-batches that contain Documents.
-
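The order-preserving split described for batch can be illustrated with a greedy pass over per-document token counts. This is only a sketch under stated assumptions, not the library's implementation: it works on plain integers instead of Document objects, the class and method names are hypothetical, and rejecting an item that alone exceeds the limit is an assumption about how such a case might be handled.

```java
import java.util.ArrayList;
import java.util.List;

public class GreedyBatchingSketch {

    // Splits token counts into consecutive batches whose totals stay within maxTokens.
    // Input order is preserved both across and within batches.
    static List<List<Integer>> batch(List<Integer> tokenCounts, int maxTokens) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentTotal = 0;
        for (int count : tokenCounts) {
            if (count > maxTokens) {
                // Assumption: an oversized item is rejected rather than split.
                throw new IllegalArgumentException(
                        "Item with " + count + " tokens exceeds the limit of " + maxTokens);
            }
            if (currentTotal + count > maxTokens) {
                // The current batch is full: close it and start a new one.
                batches.add(current);
                current = new ArrayList<>();
                currentTotal = 0;
            }
            current.add(count);
            currentTotal += count;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // 7371 is the effective limit from the class description (8191 with a 10% reserve).
        System.out.println(batch(List.of(3000, 3000, 3000, 500, 7000), 7371));
        // prints [[3000, 3000], [3000, 500], [7000]]
    }
}
```

Because batches are filled strictly in input order, concatenating the sub-batches yields the original sequence, which is what allows embeddings to be matched back to their documents by position.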