Class JsoupDocumentReader
java.lang.Object
org.springframework.ai.reader.jsoup.JsoupDocumentReader
- All Implemented Interfaces:
Supplier<List<Document>>, DocumentReader
Reads HTML documents and extracts text content using JSoup.
This reader provides options for selecting specific HTML elements to extract, handling
links, and extracting metadata. It leverages the JSoup library for parsing HTML.
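As an illustration of the description above, here is a minimal usage sketch. It assumes the Spring AI jsoup reader module and Spring Core are on the classpath. The class name `JsoupReaderSketch`, the helper `readHtml`, and the sample HTML are hypothetical and exist only for this example. It uses the single-argument `Resource` constructor and `get()` from `Supplier<List<Document>>`; no particular number of returned documents is guaranteed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.jsoup.JsoupDocumentReader;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;

// Hypothetical example class; not part of Spring AI.
public class JsoupReaderSketch {

    // Wraps an in-memory HTML string in a Spring Resource and reads it
    // with the default configuration (no JsoupDocumentReaderConfig supplied).
    static List<Document> readHtml(String html) {
        Resource resource = new ByteArrayResource(html.getBytes(StandardCharsets.UTF_8));
        JsoupDocumentReader reader = new JsoupDocumentReader(resource);
        // get() comes from Supplier<List<Document>>; read() is listed as
        // inherited from DocumentReader and could be used the same way.
        return reader.get();
    }

    public static void main(String[] args) {
        // Sample HTML is an assumption made up for this sketch.
        List<Document> documents =
                readHtml("<html><body><p>Hello, Spring AI.</p></body></html>");
        System.out.println("Documents read: " + documents.size());
    }
}
```

To control which elements are extracted or how links and metadata are handled, the two-argument constructors accept a `JsoupDocumentReaderConfig`, whose options are not documented in this section.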
- Author:
- Alexandros Pappas
Constructor Summary
Constructors
- JsoupDocumentReader(String htmlResource)
- JsoupDocumentReader(String htmlResource, JsoupDocumentReaderConfig config)
- JsoupDocumentReader(org.springframework.core.io.Resource htmlResource)
- JsoupDocumentReader(org.springframework.core.io.Resource htmlResource, JsoupDocumentReaderConfig config)
Method Summary
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.springframework.ai.document.DocumentReader
read
Constructor Details
- JsoupDocumentReader
public JsoupDocumentReader(String htmlResource)
- JsoupDocumentReader
public JsoupDocumentReader(org.springframework.core.io.Resource htmlResource)
- JsoupDocumentReader
public JsoupDocumentReader(String htmlResource, JsoupDocumentReaderConfig config)
- JsoupDocumentReader
public JsoupDocumentReader(org.springframework.core.io.Resource htmlResource, JsoupDocumentReaderConfig config)
Method Details