Class ExtractedTextFormatter.Builder

java.lang.Object
org.springframework.ai.reader.ExtractedTextFormatter.Builder
Enclosing class:
ExtractedTextFormatter

public static class ExtractedTextFormatter.Builder extends Object
Builder is a static nested class of ExtractedTextFormatter that creates and customizes ExtractedTextFormatter instances.

It supports step-by-step, fluent construction of an ExtractedTextFormatter through methods that configure left alignment of the text, the number of top and bottom lines to delete from each page, the number of leading pages exempt from that deletion, and the line separator. Each configuration method returns the builder itself, enabling method chaining.

By default, the builder sets:
  • Left alignment to false
  • Number of top pages to skip before deletion to 0
  • Number of top text lines to delete to 0
  • Number of bottom text lines to delete to 0
  • Line separator to the system line separator

After configuring the builder, calling the build() method will return a new instance of ExtractedTextFormatter with the specified configurations.
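
The chaining and defaults described above can be illustrated with a self-contained stand-in. The FormatterSettings class below is hypothetical and is not the Spring AI source; it only mirrors the method names and default values documented on this page (the line separator is left out for brevity), so the fluent style can be run without the library on the classpath.

```java
// Hypothetical stand-in: mirrors the documented builder methods and defaults,
// but is NOT the Spring AI implementation.
final class FormatterSettings {

    final boolean leftAlignment;
    final int numberOfTopPagesToSkipBeforeDelete;
    final int numberOfTopTextLinesToDelete;
    final int numberOfBottomTextLinesToDelete;

    private FormatterSettings(Builder b) {
        this.leftAlignment = b.leftAlignment;
        this.numberOfTopPagesToSkipBeforeDelete = b.numberOfTopPagesToSkipBeforeDelete;
        this.numberOfTopTextLinesToDelete = b.numberOfTopTextLinesToDelete;
        this.numberOfBottomTextLinesToDelete = b.numberOfBottomTextLinesToDelete;
    }

    static final class Builder {
        // Defaults as listed in the class description.
        private boolean leftAlignment = false;
        private int numberOfTopPagesToSkipBeforeDelete = 0;
        private int numberOfTopTextLinesToDelete = 0;
        private int numberOfBottomTextLinesToDelete = 0;

        // Each setter returns this builder so that calls can be chained.
        Builder withLeftAlignment(boolean leftAlignment) {
            this.leftAlignment = leftAlignment;
            return this;
        }

        Builder withNumberOfTopPagesToSkipBeforeDelete(int n) {
            this.numberOfTopPagesToSkipBeforeDelete = n;
            return this;
        }

        Builder withNumberOfTopTextLinesToDelete(int n) {
            this.numberOfTopTextLinesToDelete = n;
            return this;
        }

        Builder withNumberOfBottomTextLinesToDelete(int n) {
            this.numberOfBottomTextLinesToDelete = n;
            return this;
        }

        // Copies the current builder state into a new immutable settings object.
        FormatterSettings build() {
            return new FormatterSettings(this);
        }
    }

    public static void main(String[] args) {
        FormatterSettings defaults = new FormatterSettings.Builder().build();
        FormatterSettings custom = new FormatterSettings.Builder()
                .withLeftAlignment(true)
                .withNumberOfTopPagesToSkipBeforeDelete(1)
                .withNumberOfTopTextLinesToDelete(2)
                .withNumberOfBottomTextLinesToDelete(3)
                .build();
        System.out.println("default left alignment: " + defaults.leftAlignment);
        System.out.println("custom top lines to delete: " + custom.numberOfTopTextLinesToDelete);
    }
}
```

With the real class, the same chain would start from new ExtractedTextFormatter.Builder(), the public no-argument constructor listed under Constructor Details.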

  • Constructor Details

    • Builder

      public Builder()
  • Method Details

    • withLeftAlignment

      public ExtractedTextFormatter.Builder withLeftAlignment(boolean leftAlignment)
      Align the document text to the left. Defaults to false.
      Parameters:
      leftAlignment - Flag to align the text to the left.
      Returns:
      this builder
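
One plausible reading of left alignment is that leading whitespace is stripped from every line, so that indented extracted text starts at column zero. The sketch below assumes exactly that. The helper name alignLeft and the fixed "\n" separator are illustrative choices, and the library's actual algorithm may differ.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class LeftAlignSketch {

    // Assumption: "left alignment" means removing leading whitespace from each line.
    static String alignLeft(String pageText) {
        return Arrays.stream(pageText.split("\n", -1)) // -1 keeps trailing empty lines
                .map(String::stripLeading)
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(alignLeft("    Title\n  body text"));
    }
}
```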
    • withNumberOfTopPagesToSkipBeforeDelete

      public ExtractedTextFormatter.Builder withNumberOfTopPagesToSkipBeforeDelete(int numberOfTopPagesToSkipBeforeDelete)
      Exempt the first N pages from top/bottom line deletion. Defaults to 0.
      Parameters:
      numberOfTopPagesToSkipBeforeDelete - Number of leading pages to exclude from the top/bottom line deletion policy.
      Returns:
      this builder
    • withNumberOfTopTextLinesToDelete

      public ExtractedTextFormatter.Builder withNumberOfTopTextLinesToDelete(int numberOfTopTextLinesToDelete)
      Remove the top N lines from the page text. Defaults to 0.
      Parameters:
      numberOfTopTextLinesToDelete - Number of top text lines to delete.
      Returns:
      this builder
    • withNumberOfBottomTextLinesToDelete

      public ExtractedTextFormatter.Builder withNumberOfBottomTextLinesToDelete(int numberOfBottomTextLinesToDelete)
      Remove the bottom N lines from the page text. Defaults to 0.
      Parameters:
      numberOfBottomTextLinesToDelete - Number of bottom text lines to delete.
      Returns:
      this builder
    • overrideLineSeparator

      public ExtractedTextFormatter.Builder overrideLineSeparator(String lineSeparator)
      Set the line separator to use when formatting the text. Defaults to the system line separator.
      Parameters:
      lineSeparator - The line separator to use.
      Returns:
      this builder
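
How the page-skip count, the top and bottom line counts, and the line separator might interact can be sketched as a single function. This is an illustration under stated assumptions, not the library's code: pages are taken to be numbered from 0, the first pagesToSkip pages are returned unchanged, the counts are non-negative, and the names PageTrimSketch and trim are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PageTrimSketch {

    // Assumptions: zero-based page numbers, non-negative counts, and lines
    // delimited by the given lineSeparator.
    static String trim(String pageText, int pageNumber, int pagesToSkip,
                       int topLines, int bottomLines, String lineSeparator) {
        if (pageNumber < pagesToSkip) {
            return pageText; // this page is exempt from line deletion
        }
        // Pattern.quote treats the separator literally rather than as a regex.
        List<String> lines =
                Arrays.asList(pageText.split(Pattern.quote(lineSeparator), -1));
        // Clamp both bounds so that over-large counts yield an empty page.
        int from = Math.min(topLines, lines.size());
        int to = Math.max(from, lines.size() - bottomLines);
        return String.join(lineSeparator, lines.subList(from, to));
    }

    public static void main(String[] args) {
        String page = "HEADER\nline 1\nline 2\nFOOTER";
        // Page 0 is skipped, so it keeps its header and footer.
        System.out.println(trim(page, 0, 1, 1, 1, "\n"));
        System.out.println("---");
        // Page 1 is past the skipped range, so one line is removed at each end.
        System.out.println(trim(page, 1, 1, 1, 1, "\n"));
    }
}
```

Skipping leading pages is useful when, for example, a cover page carries no running header or footer and should be kept intact while later pages are trimmed.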
    • build

      public ExtractedTextFormatter build()
      Constructs and returns an instance of ExtractedTextFormatter using the configurations set on this builder.

      This method initializes the ExtractedTextFormatter instance from the values set on the builder. Any value that was not explicitly set falls back to the builder's default.

      It's recommended to use this method only once per builder instance to ensure that each ExtractedTextFormatter object is configured as intended.

      Returns:
      a new instance of ExtractedTextFormatter configured with the values set on this builder.