Package org.springframework.ai.embedding
Class TokenCountBatchingStrategy
java.lang.Object
org.springframework.ai.embedding.TokenCountBatchingStrategy
- All Implemented Interfaces:
BatchingStrategy
Token-count-based implementation of BatchingStrategy. The default max input token
count is taken from OpenAI's embedding models:
https://platform.openai.com/docs/guides/embeddings/embedding-models.
This strategy incorporates a reserve percentage to provide a buffer for potential
overhead or unexpected increases in token count during processing. The actual max input
token count used is calculated as:
actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)
For example, with the default reserve percentage of 10% (0.1) and the default max input
token count of 8191, the actual max input token count used will be 7371.
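The arithmetic above can be checked with a minimal sketch. The class and method names here (`ReserveExample`, `effectiveLimit`) are hypothetical and are not part of Spring AI. Truncating the product to an integer is an assumption chosen because it reproduces the documented value of 7371; this page does not state how the library itself rounds.

```java
// Illustrative only: mirrors the documented formula, not the library source.
final class ReserveExample {

    // actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - reservePercentage)
    // Assumption: the fractional part is dropped, which matches the documented 7371.
    static int effectiveLimit(int originalMaxInputTokenCount, double reservePercentage) {
        return (int) (originalMaxInputTokenCount * (1 - reservePercentage));
    }

    public static void main(String[] args) {
        // 8191 * 0.9 = 7371.9, truncated to 7371
        System.out.println(effectiveLimit(8191, 0.1)); // prints 7371
    }
}
```

With a reserve of 0.0 the effective limit equals the original limit, so the reserve only ever lowers the per-batch budget.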
The strategy batches documents based on their token counts, ensuring that each batch
does not exceed the calculated max input token count.
- Since:
- 1.0.0
- Author:
- Soby Chacko, Mark Pollack, Laura Trotta
-
Constructor Summary
Constructor and Description
- TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double thresholdFactor)
- TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
- TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
  Constructs a TokenCountBatchingStrategy with the specified parameters.
Method Summary
-
Constructor Details
-
TokenCountBatchingStrategy
public TokenCountBatchingStrategy()
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double thresholdFactor)
- Parameters:
encodingType - the EncodingType
thresholdFactor - the threshold factor to use on top of the max input token count
maxInputTokenCount - the upper limit for input tokens
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(com.knuddels.jtokkit.api.EncodingType encodingType, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
- Parameters:
encodingType - the EncodingType to be used for token counting.
maxInputTokenCount - the initial upper limit for input tokens.
reservePercentage - the percentage of tokens to reserve from the max input token count. This creates a buffer for potential token count increases during processing.
contentFormatter - the ContentFormatter to be used for formatting content.
metadataMode - the MetadataMode to be used for handling metadata.
TokenCountBatchingStrategy
public TokenCountBatchingStrategy(TokenCountEstimator tokenCountEstimator, int maxInputTokenCount, double reservePercentage, ContentFormatter contentFormatter, MetadataMode metadataMode)
Constructs a TokenCountBatchingStrategy with the specified parameters.
- Parameters:
tokenCountEstimator - the TokenCountEstimator to be used for estimating token counts.
maxInputTokenCount - the initial upper limit for input tokens.
reservePercentage - the percentage of tokens to reserve from the max input token count to create a buffer.
contentFormatter - the ContentFormatter to be used for formatting content.
metadataMode - the MetadataMode to be used for handling metadata.
Method Details
-
batch
Description copied from interface: BatchingStrategy
EmbeddingModel implementations can call this method to optimize embedding tokens. The incoming collection of Documents is split into sub-batches.
- Specified by:
batch in interface BatchingStrategy
- Parameters:
documents - the documents to batch
- Returns:
a list of sub-batches that contain Documents.
-
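The batching behavior described for this class (each sub-batch stays within the calculated max input token count) can be sketched as a greedy, order-preserving split. This is a minimal illustration and not the Spring AI implementation: `GreedyTokenBatcher`, its `batch` signature, and the whitespace-based `countWords` counter are all hypothetical. Plain strings stand in for Document, and content formatting and metadata handling are left out. Rejecting a single document that exceeds the limit is also an assumption made here for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Illustrative only: a greedy token-budget batcher over plain strings.
final class GreedyTokenBatcher {

    // Splits documents, in order, into batches whose summed token counts
    // never exceed maxTokensPerBatch.
    static List<List<String>> batch(List<String> documents, int maxTokensPerBatch,
            ToIntFunction<String> tokenCounter) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String doc : documents) {
            int tokens = tokenCounter.applyAsInt(doc);
            // Assumption: a document that cannot fit in any batch is an error.
            if (tokens > maxTokensPerBatch) {
                throw new IllegalArgumentException(
                        "Document exceeds the per-batch token limit: " + tokens);
            }
            // Close the current batch when adding this document would overflow it.
            if (currentTokens + tokens > maxTokensPerBatch) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(doc);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Hypothetical stand-in for a real tokenizer: counts whitespace-separated words.
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("a b c", "d e", "f g h i", "j");
        // With a limit of 5 "tokens": [[a b c, d e], [f g h i, j]]
        System.out.println(batch(docs, 5, GreedyTokenBatcher::countWords));
    }
}
```

Passing the counter as a function loosely mirrors the constructor that accepts a TokenCountEstimator: the splitting logic stays the same while the way tokens are counted can be swapped.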