Common classes used across Spring AI providing document processing, text transformation, embedding utilities, observability support, and tokenization capabilities for AI application development
Document readers and writers provide I/O capabilities for loading documents from various sources and writing them to destinations.
The readers and writers layer consists of JsonReader, TextReader, and FileDocumentWriter, supported by the ExtractedTextFormatter utility.

JsonReader reads JSON documents and converts them to Document objects.
package org.springframework.ai.reader;
import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentReader;
import org.springframework.core.io.Resource;
import java.util.List;
class JsonReader implements DocumentReader {
/**
* Create reader for JSON resource.
* Converts entire JSON to documents.
* @param resource JSON resource to read
*/
JsonReader(Resource resource);
/**
* Create reader with specific JSON keys.
* Only specified keys are used for document content.
* @param resource JSON resource to read
* @param jsonKeysToUse keys to extract for content
*/
JsonReader(Resource resource, String... jsonKeysToUse);
/**
* Create reader with metadata generator.
* @param resource JSON resource to read
* @param jsonMetadataGenerator generator for metadata from JSON
* @param jsonKeysToUse keys to extract for content
*/
JsonReader(Resource resource, JsonMetadataGenerator jsonMetadataGenerator, String... jsonKeysToUse);
/**
* Read and parse JSON into documents.
* @return list of documents
*/
List<Document> get();
/**
* Read using JSON Pointer (RFC 6901).
* Allows navigation to specific parts of JSON structure.
* @param pointer JSON Pointer expression (e.g., "/data/items")
* @return list of documents from pointer location
*/
List<Document> get(String pointer);
}

package org.springframework.ai.reader;
import java.util.Map;
@FunctionalInterface
interface JsonMetadataGenerator {
/**
* Generate metadata from JSON document map.
* @param jsonMap JSON document as map
* @return metadata map
*/
Map<String, Object> generate(Map<String, Object> jsonMap);
}

package org.springframework.ai.reader;
import java.util.Map;
class EmptyJsonMetadataGenerator implements JsonMetadataGenerator {
/**
* Create empty metadata generator.
*/
EmptyJsonMetadataGenerator();
/**
* Generate empty metadata.
* @param jsonMap JSON document (ignored)
* @return empty map
*/
Map<String, Object> generate(Map<String, Object> jsonMap);
}

import org.springframework.ai.reader.JsonReader;
import org.springframework.ai.document.Document;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;
import java.util.List;
import java.util.Map;
// Example JSON file content:
// {
// "articles": [
// {
// "title": "Introduction to AI",
// "content": "AI is transforming...",
// "author": "Jane Doe",
// "category": "technology"
// },
// {
// "title": "Machine Learning Basics",
// "content": "ML involves...",
// "author": "John Smith",
// "category": "education"
// }
// ]
// }
// Basic usage - read entire JSON
JsonReader basicReader = new JsonReader(new ClassPathResource("articles.json"));
List<Document> allDocs = basicReader.get();
for (Document doc : allDocs) {
System.out.println("Content: " + doc.getText());
System.out.println("Metadata: " + doc.getMetadata());
}
// Read specific JSON keys for content
JsonReader specificReader = new JsonReader(
new ClassPathResource("articles.json"),
"title", "content" // Only use these keys
);
List<Document> specificDocs = specificReader.get();
// Use JSON Pointer to navigate structure
JsonReader pointerReader = new JsonReader(new ClassPathResource("articles.json"));
List<Document> articleDocs = pointerReader.get("/articles");
// Directly accesses the "articles" array
// Custom metadata generation
JsonReader customReader = new JsonReader(
new ClassPathResource("articles.json"),
jsonMap -> {
// Custom metadata from JSON
return Map.of(
"author", jsonMap.get("author"),
"category", jsonMap.get("category"),
"processed_at", System.currentTimeMillis()
);
},
"title", "content"
);
List<Document> customDocs = customReader.get();
for (Document doc : customDocs) {
System.out.println("Author: " + doc.getMetadata().get("author"));
System.out.println("Category: " + doc.getMetadata().get("category"));
}
// Read from file system
JsonReader fileReader = new JsonReader(
new FileSystemResource("/data/documents.json"),
"text", "summary"
);
List<Document> fileDocs = fileReader.get();

import org.springframework.ai.reader.JsonReader;
import org.springframework.ai.document.Document;
import org.springframework.core.io.ClassPathResource;
import java.util.List;
// Complex JSON structure:
// {
// "database": {
// "documents": [
// {"id": 1, "text": "First doc"},
// {"id": 2, "text": "Second doc"}
// ],
// "metadata": {
// "version": "1.0"
// }
// }
// }
JsonReader reader = new JsonReader(new ClassPathResource("complex.json"));
// Access nested array
List<Document> docs = reader.get("/database/documents");
// Access specific array element
List<Document> firstDoc = reader.get("/database/documents/0");
// Access nested object
List<Document> metadata = reader.get("/database/metadata");

TextReader reads plain text from Spring Resources.
package org.springframework.ai.reader;
import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentReader;
import org.springframework.core.io.Resource;
import java.nio.charset.Charset;
import java.util.List;
import java.util.Map;
class TextReader implements DocumentReader {
// Metadata constants
static final String CHARSET_METADATA = "charset";
static final String SOURCE_METADATA = "source";
/**
* Create reader from resource URL.
* @param resourceUrl URL to text resource
*/
TextReader(String resourceUrl);
/**
* Create reader from Resource.
* @param resource text resource to read
*/
TextReader(Resource resource);
/**
* Get charset for reading.
* Default: UTF-8
* @return charset
*/
Charset getCharset();
/**
* Set charset for reading.
* @param charset charset to use
*/
void setCharset(Charset charset);
/**
* Get custom metadata to include in documents.
* @return metadata map
*/
Map<String, Object> getCustomMetadata();
/**
* Read text resource into document.
* @return list with single document
*/
List<Document> get();
}

import org.springframework.ai.reader.TextReader;
import org.springframework.ai.document.Document;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.UrlResource;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.List;
// Read from classpath
TextReader classpathReader = new TextReader(new ClassPathResource("knowledge-base.txt"));
List<Document> docs = classpathReader.get();
Document doc = docs.get(0);
System.out.println("Content: " + doc.getText());
System.out.println("Source: " + doc.getMetadata().get(TextReader.SOURCE_METADATA));
System.out.println("Charset: " + doc.getMetadata().get(TextReader.CHARSET_METADATA));
// Read from file system
TextReader fileReader = new TextReader(new FileSystemResource("/data/document.txt"));
List<Document> fileDocs = fileReader.get();
// Read from URL
TextReader urlReader = new TextReader("https://example.com/document.txt");
List<Document> urlDocs = urlReader.get();
// Custom charset
TextReader customCharsetReader = new TextReader(new ClassPathResource("data-latin1.txt"));
customCharsetReader.setCharset(StandardCharsets.ISO_8859_1);
List<Document> latinDocs = customCharsetReader.get();
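The charset passed to setCharset must match the file's actual encoding. As a plain-Java illustration (independent of Spring AI; CharsetMismatchDemo is a hypothetical helper), decoding Latin-1 bytes as UTF-8 garbles non-ASCII characters:

```java
import java.nio.charset.StandardCharsets;

// "café" stored as ISO-8859-1 keeps 'é' as the single byte 0xE9, which is
// not a valid UTF-8 sequence; decoding with the wrong charset garbles it.
class CharsetMismatchDemo {

    // Encode as Latin-1, decode as Latin-1: round-trips cleanly.
    static String roundTrip(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Encode as Latin-1, decode as UTF-8: 0xE9 becomes U+FFFD.
    static String wrongDecode(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }
}
```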
// Add custom metadata
TextReader customMetadataReader = new TextReader(new ClassPathResource("manual.txt"));
customMetadataReader.getCustomMetadata().put("document_type", "user-manual");
customMetadataReader.getCustomMetadata().put("version", "2.1");
customMetadataReader.getCustomMetadata().put("author", "Documentation Team");
List<Document> customDocs = customMetadataReader.get();
Document customDoc = customDocs.get(0);
System.out.println("Type: " + customDoc.getMetadata().get("document_type"));
System.out.println("Version: " + customDoc.getMetadata().get("version"));
System.out.println("Author: " + customDoc.getMetadata().get("author"));

import org.springframework.ai.reader.TextReader;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.core.io.ClassPathResource;
import java.util.List;
// Read large text file
TextReader reader = new TextReader(new ClassPathResource("large-document.txt"));
reader.getCustomMetadata().put("source_file", "large-document.txt");
reader.getCustomMetadata().put("import_date", System.currentTimeMillis());
List<Document> documents = reader.get();
// Split into chunks
TokenTextSplitter splitter = TokenTextSplitter.builder()
.withChunkSize(500)
.build();
List<Document> chunks = splitter.apply(documents);
System.out.println("Read 1 document, created " + chunks.size() + " chunks");
// All chunks inherit the custom metadata
for (Document chunk : chunks) {
System.out.println("Chunk source: " + chunk.getMetadata().get("source_file"));
}

FileDocumentWriter writes documents to files.
package org.springframework.ai.writer;
import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentWriter;
import org.springframework.ai.document.MetadataMode;
import java.util.List;
class FileDocumentWriter implements DocumentWriter {
// Metadata constants for page numbers
static final String METADATA_START_PAGE_NUMBER = "page_number";
static final String METADATA_END_PAGE_NUMBER = "end_page_number";
/**
* Create writer for file.
* @param fileName output file path
*/
FileDocumentWriter(String fileName);
/**
* Create writer with document markers.
* @param fileName output file path
* @param withDocumentMarkers true to add markers between documents
*/
FileDocumentWriter(String fileName, boolean withDocumentMarkers);
/**
* Create writer with full configuration.
* @param fileName output file path
* @param withDocumentMarkers true to add markers between documents
* @param metadataMode metadata inclusion mode
* @param append true to append to existing file
*/
FileDocumentWriter(String fileName, boolean withDocumentMarkers,
MetadataMode metadataMode, boolean append);
/**
* Write documents to file.
* @param docs documents to write
*/
void accept(List<Document> docs);
}

import org.springframework.ai.writer.FileDocumentWriter;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.document.Document;
import org.springframework.ai.document.DocumentWriter;
import org.springframework.ai.document.MetadataMode;
import java.util.List;
import java.util.Map;
// Create documents to write
List<Document> docs = List.of(
Document.builder()
.text("First document content")
.metadata("author", "Alice")
.metadata("category", "intro")
.build(),
Document.builder()
.text("Second document content")
.metadata("author", "Bob")
.metadata("category", "advanced")
.build()
);
// Basic writing
DocumentWriter basicWriter = new FileDocumentWriter("output.txt");
basicWriter.write(docs);
// Write with document markers (separators between documents)
DocumentWriter markerWriter = new FileDocumentWriter(
"output-with-markers.txt",
true // with markers
);
markerWriter.write(docs);
// Write with metadata
DocumentWriter metadataWriter = new FileDocumentWriter(
"output-with-metadata.txt",
true, // with markers
MetadataMode.ALL, // include all metadata
false // overwrite file
);
metadataWriter.accept(docs);
// Append to existing file
DocumentWriter appendWriter = new FileDocumentWriter(
"output.txt",
true, // with markers
MetadataMode.INFERENCE, // inference metadata only
true // append mode
);
appendWriter.write(docs);
// Write chunks from splitting pipeline
TokenTextSplitter splitter = TokenTextSplitter.builder()
.withChunkSize(500)
.build();
List<Document> sourceDoc = List.of(new Document("Long content..."));
List<Document> chunks = splitter.apply(sourceDoc);
DocumentWriter chunkWriter = new FileDocumentWriter(
"chunks.txt",
true, // separate chunks with markers
MetadataMode.NONE, // content only
false
);
chunkWriter.write(chunks);

import org.springframework.ai.writer.FileDocumentWriter;
import org.springframework.ai.document.MetadataMode;
import org.springframework.ai.document.Document;
import java.util.List;
// Documents with page number metadata
List<Document> pdfPages = List.of(
Document.builder()
.text("Content from page 1")
.metadata(FileDocumentWriter.METADATA_START_PAGE_NUMBER, 1)
.metadata(FileDocumentWriter.METADATA_END_PAGE_NUMBER, 1)
.build(),
Document.builder()
.text("Content from pages 2-3")
.metadata(FileDocumentWriter.METADATA_START_PAGE_NUMBER, 2)
.metadata(FileDocumentWriter.METADATA_END_PAGE_NUMBER, 3)
.build()
);
FileDocumentWriter writer = new FileDocumentWriter(
"pdf-export.txt",
true, // with markers
MetadataMode.ALL,
false
);
writer.write(pdfPages);
// Output includes page number metadata for each document

ExtractedTextFormatter reformats extracted text by removing unwanted lines, aligning text, and consolidating blank lines.
package org.springframework.ai.reader;
class ExtractedTextFormatter {
/**
* Create builder for configuration.
* @return builder instance
*/
static Builder builder();
/**
* Get formatter with default settings.
* @return default formatter
*/
static ExtractedTextFormatter defaults();
/**
* Trim adjacent blank lines to single blank line.
* @param pageText text to process
* @return processed text
*/
static String trimAdjacentBlankLines(String pageText);
/**
* Align text to left margin (remove leading whitespace).
* @param pageText text to align
* @return aligned text
*/
static String alignToLeft(String pageText);
/**
* Delete bottom text lines.
* @param pageText text to process
* @param numberOfLines number of lines to delete
* @param lineSeparator line separator to use
* @return processed text
*/
static String deleteBottomTextLines(String pageText, int numberOfLines, String lineSeparator);
/**
* Delete top text lines.
* @param pageText text to process
* @param numberOfLines number of lines to delete
* @param lineSeparator line separator to use
* @return processed text
*/
static String deleteTopTextLines(String pageText, int numberOfLines, String lineSeparator);
/**
* Format page text.
* @param pageText text to format
* @return formatted text
*/
String format(String pageText);
/**
* Format page text with page number awareness.
* @param pageText text to format
* @param pageNumber page number (for conditional processing)
* @return formatted text
*/
String format(String pageText, int pageNumber);
}

class ExtractedTextFormatter.Builder {
/**
* Enable/disable left alignment.
* Default: true
* @param leftAlignment true to align left
* @return this builder
*/
Builder withLeftAlignment(boolean leftAlignment);
/**
* Set number of top pages to skip before deleting lines.
* Useful for preserving title pages.
* Default: 0
* @param numberOfPages number of pages to skip
* @return this builder
*/
Builder withNumberOfTopPagesToSkipBeforeDelete(int numberOfPages);
/**
* Set number of top text lines to delete.
* Removes headers from each page.
* Default: 0
* @param numberOfLines number of lines to delete
* @return this builder
*/
Builder withNumberOfTopTextLinesToDelete(int numberOfLines);
/**
* Set number of bottom text lines to delete.
* Removes footers from each page.
* Default: 0
* @param numberOfLines number of lines to delete
* @return this builder
*/
Builder withNumberOfBottomTextLinesToDelete(int numberOfLines);
/**
* Override line separator.
* @param lineSeparator separator to use
* @return this builder
*/
Builder overrideLineSeparator(String lineSeparator);
/**
* Build the formatter.
* @return configured formatter
*/
ExtractedTextFormatter build();
}

import org.springframework.ai.reader.ExtractedTextFormatter;
// Text extracted from PDF with issues:
String extractedText = """
Header: Company Name - Page 1
This is the actual content
that we want to keep.
It has extra whitespace issues.
Footer: Page 1 of 10
""";
// Use defaults (left align + trim blank lines)
ExtractedTextFormatter defaultFormatter = ExtractedTextFormatter.defaults();
String cleaned = defaultFormatter.format(extractedText);
// Custom formatting - remove headers and footers
ExtractedTextFormatter customFormatter = ExtractedTextFormatter.builder()
.withLeftAlignment(true) // Align to left
.withNumberOfTopTextLinesToDelete(1) // Remove 1 line from top (header)
.withNumberOfBottomTextLinesToDelete(1) // Remove 1 line from bottom (footer)
.build();
String formatted = customFormatter.format(extractedText);
// Process PDF with title page preservation
ExtractedTextFormatter pdfFormatter = ExtractedTextFormatter.builder()
.withNumberOfTopPagesToSkipBeforeDelete(1) // Don't process first page
.withNumberOfTopTextLinesToDelete(2) // Remove 2-line headers from other pages
.withNumberOfBottomTextLinesToDelete(1) // Remove 1-line footers
.build();
// Format with page number (page1Text and page2Text hold raw extracted page strings)
String page1 = pdfFormatter.format(page1Text, 1); // Title page - headers preserved
String page2 = pdfFormatter.format(page2Text, 2); // Regular page - headers removed
// Static utility methods
String textWithBlanks = "Line 1\n\n\n\nLine 2";
String trimmed = ExtractedTextFormatter.trimAdjacentBlankLines(textWithBlanks);
// Result: "Line 1\n\nLine 2"
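For intuition, the blank-line consolidation can be approximated in plain Java. BlankLineTrimmer below is an illustrative sketch assuming "\n" separators, not the Spring AI implementation:

```java
// A plain-Java approximation of trimAdjacentBlankLines' documented
// behavior: runs of adjacent blank lines collapse to one blank line.
class BlankLineTrimmer {

    static String trim(String text) {
        // Three or more consecutive newlines mean two or more blank
        // lines; collapse to exactly two newlines (one blank line).
        return text.replaceAll("\n{3,}", "\n\n");
    }
}
```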
String indentedText = " Indented line\n More indent";
String aligned = ExtractedTextFormatter.alignToLeft(indentedText);
// Result: "Indented line\nMore indent"
String withHeader = "Header Line\nContent Line\nMore Content";
String noHeader = ExtractedTextFormatter.deleteTopTextLines(withHeader, 1, "\n");
// Result: "Content Line\nMore Content"
String withFooter = "Content Line\nMore Content\nFooter Line";
String noFooter = ExtractedTextFormatter.deleteBottomTextLines(withFooter, 1, "\n");
// Result: "Content Line\nMore Content"

import org.springframework.ai.reader.ExtractedTextFormatter;
import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import java.util.ArrayList;
import java.util.List;
/**
* Complete pipeline for processing extracted PDF text.
*/
class PdfProcessor {
private final ExtractedTextFormatter formatter;
private final TokenTextSplitter splitter;
public PdfProcessor() {
// Configure formatter to clean up PDF artifacts
this.formatter = ExtractedTextFormatter.builder()
.withLeftAlignment(true)
.withNumberOfTopPagesToSkipBeforeDelete(1) // Preserve title page
.withNumberOfTopTextLinesToDelete(2) // Remove headers
.withNumberOfBottomTextLinesToDelete(1) // Remove footers
.build();
// Configure splitter for chunks
this.splitter = TokenTextSplitter.builder()
.withChunkSize(500)
.build();
}
public List<Document> processPdf(List<String> pdfPages) {
List<Document> cleanedPages = new ArrayList<>();
for (int i = 0; i < pdfPages.size(); i++) {
String rawText = pdfPages.get(i);
String cleanText = formatter.format(rawText, i + 1);
Document doc = Document.builder()
.text(cleanText)
.metadata("page_number", i + 1)
.metadata("total_pages", pdfPages.size())
.build();
cleanedPages.add(doc);
}
// Split into chunks
return splitter.apply(cleanedPages);
}
}
// Usage
List<String> pdfPages = List.of(
"Title Page Content",
"Header\nPage 2 content\nFooter",
"Header\nPage 3 content\nFooter"
);
PdfProcessor processor = new PdfProcessor();
List<Document> processedChunks = processor.processPdf(pdfPages);

import org.springframework.ai.reader.JsonReader;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.document.Document;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ClassPathResource;
import java.util.ArrayList;
import java.util.List;
/**
* Load documents from multiple formats.
*/
class MultiFormatLoader {
public List<Document> loadAll() {
List<Document> allDocs = new ArrayList<>();
// Load JSON documents
JsonReader jsonReader = new JsonReader(
new ClassPathResource("data.json"),
"title", "content"
);
allDocs.addAll(jsonReader.get());
// Load text files
TextReader textReader = new TextReader(new ClassPathResource("readme.txt"));
allDocs.addAll(textReader.get());
// Load more text files
TextReader manualReader = new TextReader(new ClassPathResource("manual.txt"));
manualReader.getCustomMetadata().put("type", "manual");
allDocs.addAll(manualReader.get());
return allDocs;
}
}

import org.springframework.ai.reader.TextReader;
import org.springframework.ai.document.MetadataMode;
import org.springframework.ai.writer.FileDocumentWriter;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.document.Document;
import org.springframework.core.io.ClassPathResource;
import java.util.List;
// Extract
TextReader reader = new TextReader(new ClassPathResource("input.txt"));
List<Document> documents = reader.get();
// Transform
TokenTextSplitter splitter = TokenTextSplitter.builder()
.withChunkSize(500)
.build();
List<Document> chunks = splitter.apply(documents);
// Load
FileDocumentWriter writer = new FileDocumentWriter(
"output.txt",
true, // with markers
MetadataMode.ALL,
false
);
writer.write(chunks);

Thread Safety:
JsonReader: Thread-safe for reading (stateless)
TextReader: Thread-safe for reading (stateless)
FileDocumentWriter: NOT thread-safe for concurrent writes to the same file
ExtractedTextFormatter: Thread-safe (stateless)

Performance:
JsonReader: Memory-based parsing, O(n) where n is JSON size
TextReader: Streaming read, efficient for large files
FileDocumentWriter: I/O bound, sequential writes
ExtractedTextFormatter: O(n) where n is text length

Memory Considerations:
Common Exceptions:
IOException: File not found, permission denied, network errors
JsonProcessingException: Invalid JSON format (JsonReader)
IllegalArgumentException: Invalid parameters (null resources, empty paths)
RuntimeException: Unexpected errors (encoding issues, resource access failures)

Edge Cases:
// Empty file
TextReader reader = new TextReader(new ClassPathResource("empty.txt"));
List<Document> docs = reader.get(); // Returns list with single document containing empty string
// Invalid JSON
JsonReader reader = new JsonReader(new ClassPathResource("invalid.json"));
try {
List<Document> docs = reader.get(); // Throws JsonProcessingException wrapped in RuntimeException
} catch (RuntimeException e) {
// Handle invalid JSON
}
// Missing JSON pointer path
JsonReader reader = new JsonReader(new ClassPathResource("data.json"));
List<Document> docs = reader.get("/nonexistent/path"); // Returns empty list
// Concurrent writes (unsafe)
FileDocumentWriter writer = new FileDocumentWriter("output.txt");
// DON'T: Multiple threads writing concurrently
// DO: Synchronize or use separate writers
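One way to follow the "synchronize" advice is to funnel every write through a single lock. The sketch below uses a plain java.io.Writer to show the pattern; SynchronizedSink is illustrative and not part of Spring AI:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// All threads share one sink; the synchronized method serializes writes
// so concurrent callers cannot interleave their output.
class SynchronizedSink {

    private final Writer out;

    SynchronizedSink(Writer out) {
        this.out = out;
    }

    // One lock per sink: only one thread writes at a time.
    synchronized void write(List<String> docs) {
        try {
            for (String doc : docs) {
                out.write(doc);
                out.write('\n');
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```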
// Charset issues
TextReader reader = new TextReader(new ClassPathResource("utf16.txt"));
reader.setCharset(StandardCharsets.UTF_8); // Wrong charset
List<Document> docs = reader.get(); // May produce garbled text

Install with Tessl CLI:
npx tessl i tessl/maven-org-springframework-ai--spring-ai-commons