Package org.springframework.ai.document
Class Document
java.lang.Object
org.springframework.ai.document.Document
A Document is a container for content and its associated metadata. It also carries
 the document's unique ID.
 A Document can hold either text content or media content, but not both.
 It is intended to carry data taken from external sources through Spring AI's ETL
 pipeline.
 
Example of creating a text document:
 // Using constructor
 Document textDoc = new Document("Sample text content", Map.of("source", "user-input"));
 // Using builder
 Document builtTextDoc = Document.builder()
     .text("Sample text content")
     .metadata("source", "user-input")
     .build();
 Example of creating a media document:
 // Using constructor
 Media imageContent = new Media(MediaType.IMAGE_PNG, new byte[] {...});
 Document mediaDoc = new Document(imageContent, Map.of("filename", "sample.png"));
 // Using builder
 Document builtMediaDoc = Document.builder()
     .media(new Media(MediaType.IMAGE_PNG, new byte[] {...}))
     .metadata("filename", "sample.png")
     .build();
 Example of checking content type and accessing content:
 if (document.isText()) {
     String textContent = document.getText();
     // Process text content
 } else {
     Media mediaContent = document.getMedia();
     // Process media content
 }
Nested Class Summary
Field Summary
Constructor Summary
Method Summary
Modifier and Type  Method
     Description
 static Document.Builder  builder()
 boolean  equals(Object o)
 ContentFormatter  getContentFormatter()
     Returns the content formatter associated with this document.
 String  getFormattedContent()
 String  getFormattedContent(ContentFormatter formatter, MetadataMode metadataMode)
     Helper content extractor that uses an external ContentFormatter.
 String  getFormattedContent(MetadataMode metadataMode)
 String  getId()
     Returns the unique identifier for this document.
 Media  getMedia()
     Returns the document's media content, if any.
 Map<String, Object>  getMetadata()
     Returns the metadata associated with this document.
 Double  getScore()
 String  getText()
     Returns the document's text content, if any.
 int  hashCode()
 boolean  isText()
     Determines whether this document contains text or media content.
 Document.Builder  mutate()
 void  setContentFormatter(ContentFormatter contentFormatter)
     Replace the document's ContentFormatter.
 String  toString()
Field Details
DEFAULT_CONTENT_FORMATTER

Constructor Details
Document
Document
Document
Document
Document
 
Method Details
builder

getId
Returns the unique identifier for this document.
This ID is either explicitly provided during document creation or generated using the configured IdGenerator (defaults to RandomIdGenerator).
Returns:
 the unique identifier of this document
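
The fallback described for getId() (use the explicit ID if one was given, otherwise ask a generator) can be sketched with the standard library alone. This is a hypothetical stand-in, not Spring AI's code: IdFallbackSketch, resolveId and RANDOM_ID are illustrative names, and it is an assumption of this sketch that the default generator produces a random UUID string and that a blank ID counts as missing.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical stand-in for illustration; not Spring AI's Document or IdGenerator.
final class IdFallbackSketch {

    // Assumption: the default generator yields a random UUID string.
    static final Supplier<String> RANDOM_ID = () -> UUID.randomUUID().toString();

    /** Returns the explicit ID when one is present, otherwise asks the generator. */
    static String resolveId(String explicitId, Supplier<String> generator) {
        // Assumption of this sketch: null and blank IDs both count as "not provided".
        if (explicitId != null && !explicitId.isBlank()) {
            return explicitId;
        }
        return generator.get();
    }

    public static void main(String[] args) {
        System.out.println(resolveId("doc-42", RANDOM_ID)); // the explicit ID wins
        System.out.println(resolveId(null, RANDOM_ID));     // a freshly generated ID
    }
}
```

Passing the generator as a parameter keeps the sketch testable with a fixed supplier, which mirrors the idea of a configurable IdGenerator without claiming to reproduce its interface.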
 
getText
Returns the document's text content, if any.
Returns:
 the text content if isText() is true, null otherwise

isText
public boolean isText()
Determines whether this document contains text or media content.
Returns:
 true if this document contains text content (accessible via getText()), false if it contains media content (accessible via getMedia())
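
The "text or media, but not both" rule behind isText() can be illustrated with a small self-contained class. This is a sketch under stated assumptions, not the library's implementation: ContentSketch is a hypothetical name, a byte array stands in for a Media object, and rejecting the "neither" case is an assumption of the sketch.

```java
// Hypothetical stand-in for illustration; not Spring AI's Document class.
final class ContentSketch {

    private final String text;  // non-null only for text documents
    private final byte[] media; // non-null only for media documents (placeholder for Media)

    ContentSketch(String text, byte[] media) {
        // Exactly one must be set: equal null-ness means "both" or "neither".
        if ((text != null) == (media != null)) {
            throw new IllegalArgumentException("exactly one of text or media must be specified");
        }
        this.text = text;
        this.media = media;
    }

    /** True when the document holds text, false when it holds media. */
    boolean isText() {
        return text != null;
    }

    /** The text content, or null for a media document. */
    String getText() {
        return text;
    }

    /** The media content, or null for a text document. */
    byte[] getMedia() {
        return media;
    }
}
```

Because the constructor enforces the invariant once, callers can branch on isText() alone and treat the other accessor as null, as in the content-type example near the top of this page.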
 
getMedia
Returns the document's media content, if any.

getFormattedContent

getFormattedContent

getFormattedContent
Helper content extractor that uses an external ContentFormatter.

getMetadata
Returns the metadata associated with this document.
The metadata values are restricted to simple types (string, int, float, boolean) for compatibility with vector databases.
Returns:
 the metadata map
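
One way to respect the simple-type restriction on metadata values is to check a map before handing it to a Document. The helper below is a hypothetical sketch, not part of Spring AI: MetadataSketch, isSimple and requireSimple are illustrative names, and accepting Long and Double as the wider forms of int and float is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring AI.
final class MetadataSketch {

    /** True for string, integer, floating-point and boolean values. */
    static boolean isSimple(Object value) {
        // Assumption: Long and Double are accepted alongside Integer and Float.
        return value instanceof String
                || value instanceof Integer
                || value instanceof Long
                || value instanceof Float
                || value instanceof Double
                || value instanceof Boolean;
    }

    /** Returns an insertion-ordered copy of the map, or throws on the first unsupported value. */
    static Map<String, Object> requireSimple(Map<String, Object> metadata) {
        Map<String, Object> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : metadata.entrySet()) {
            if (!isSimple(entry.getValue())) {
                throw new IllegalArgumentException(
                        "unsupported metadata value for key: " + entry.getKey());
            }
            copy.put(entry.getKey(), entry.getValue());
        }
        return copy;
    }
}
```

Nested values such as lists or maps are rejected here; flattening them into several simple keys before building the document is one possible workaround.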
 
getScore

getContentFormatter
Returns the content formatter associated with this document.
Returns:
 the current ContentFormatter instance used for formatting the document content

setContentFormatter
Replace the document's ContentFormatter.
Parameters:
 contentFormatter - new formatter to use

mutate

equals

hashCode
public int hashCode()

toString