Class JsoupDocumentReader

java.lang.Object
org.springframework.ai.reader.jsoup.JsoupDocumentReader
All Implemented Interfaces:
Supplier<List<Document>>, DocumentReader

public class JsoupDocumentReader extends Object implements DocumentReader
Reads HTML documents and extracts their text content using JSoup. The reader can be configured to select specific HTML elements for extraction, to handle links, and to extract metadata.
Author:
Alexandros Pappas
  • Constructor Details

    • JsoupDocumentReader

      public JsoupDocumentReader(String htmlResource)
    • JsoupDocumentReader

      public JsoupDocumentReader(org.springframework.core.io.Resource htmlResource)
    • JsoupDocumentReader

      public JsoupDocumentReader(String htmlResource, JsoupDocumentReaderConfig config)
    • JsoupDocumentReader

      public JsoupDocumentReader(org.springframework.core.io.Resource htmlResource, JsoupDocumentReaderConfig config)
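A minimal usage sketch for the `Resource` plus config constructor, assuming Spring AI and JSoup are on the classpath. The class `JsoupReaderSketch` and the method `readParagraphs` are hypothetical names used only for illustration. The package of `JsoupDocumentReaderConfig` and its `builder()` / `selector(...)` methods are not shown on this page and are assumptions here; check them against the version in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.jsoup.JsoupDocumentReader;
// Assumption: the config class lives in the ".config" sub-package.
import org.springframework.ai.reader.jsoup.config.JsoupDocumentReaderConfig;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;

public class JsoupReaderSketch {

    // Hypothetical helper: wraps an in-memory HTML string as a Resource
    // and reads it with a config that selects only <p> elements.
    static List<Document> readParagraphs(String html) {
        Resource resource = new ByteArrayResource(html.getBytes(StandardCharsets.UTF_8));

        // Assumption: the config exposes a builder with a CSS selector option.
        JsoupDocumentReaderConfig config = JsoupDocumentReaderConfig.builder()
                .selector("p")
                .build();

        // DocumentReader is a Supplier<List<Document>>, so get() performs the read.
        return new JsoupDocumentReader(resource, config).get();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Title</h1><p>First.</p><p>Second.</p></body></html>";
        List<Document> documents = readParagraphs(html);
        System.out.println("Documents read: " + documents.size());
    }
}
```

The `String` overloads take a resource location instead of a `Resource` instance; how that location is resolved (for example as a `classpath:` or `file:` path) is not stated on this page, so treat any such usage as an assumption to verify.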
  • Method Details