Class PagePdfDocumentReader

java.lang.Object
org.springframework.ai.reader.pdf.PagePdfDocumentReader
All Implemented Interfaces:
Supplier<List<Document>>, DocumentReader

public class PagePdfDocumentReader extends Object implements DocumentReader
Groups the parsed PDF pages into Documents. You can group one or more pages into a single output document. Use PdfDocumentReaderConfig for customization options. The default configuration is:
- pagesPerDocument = 1
- pageTopMargin = 0
- pageBottomMargin = 0
Author:
Christian Tzolov
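
The page grouping described above can be illustrated with a small standalone sketch. This is not the library's implementation and uses no Spring AI or PDFBox types: the class PageGroupingSketch, its nested Group type, and the group method are hypothetical names introduced here, and the 1-based page numbering is an assumption of the sketch. It only shows how a pagesPerDocument setting could partition a list of page texts into consecutive chunks that carry start and end page numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, not Spring AI's implementation.
final class PageGroupingSketch {

    // One output chunk: the joined text of its pages plus its page bounds.
    // Page numbers are 1-based here, which is an assumption of this sketch.
    static final class Group {
        final int startPage;
        final int endPage;
        final String text;

        Group(int startPage, int endPage, String text) {
            this.startPage = startPage;
            this.endPage = endPage;
            this.text = text;
        }
    }

    // Splits pageTexts into consecutive groups of at most pagesPerDocument pages.
    // The last group may hold fewer pages when the total is not a multiple.
    static List<Group> group(List<String> pageTexts, int pagesPerDocument) {
        if (pagesPerDocument < 1) {
            throw new IllegalArgumentException("pagesPerDocument must be >= 1");
        }
        List<Group> groups = new ArrayList<>();
        for (int start = 0; start < pageTexts.size(); start += pagesPerDocument) {
            int end = Math.min(start + pagesPerDocument, pageTexts.size());
            String text = String.join("\n", pageTexts.subList(start, end));
            groups.add(new Group(start + 1, end, text));
        }
        return groups;
    }
}
```

With three page texts and pagesPerDocument set to 2, the sketch yields two groups: one covering pages 1 to 2 and one covering page 3 alone. With the default of 1, each page becomes its own group.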
  • Field Details

    • METADATA_START_PAGE_NUMBER

      public static final String METADATA_START_PAGE_NUMBER
    • METADATA_END_PAGE_NUMBER

      public static final String METADATA_END_PAGE_NUMBER
    • METADATA_FILE_NAME

      public static final String METADATA_FILE_NAME
    • document

      protected final org.apache.pdfbox.pdmodel.PDDocument document
    • resourceFileName

      protected String resourceFileName
  • Constructor Details

    • PagePdfDocumentReader

      public PagePdfDocumentReader(String resourceUrl)
    • PagePdfDocumentReader

      public PagePdfDocumentReader(org.springframework.core.io.Resource pdfResource)
    • PagePdfDocumentReader

      public PagePdfDocumentReader(String resourceUrl, PdfDocumentReaderConfig config)
    • PagePdfDocumentReader

      public PagePdfDocumentReader(org.springframework.core.io.Resource pdfResource, PdfDocumentReaderConfig config)
  • Method Details