tessl/maven-dev-langchain4j--langchain4j

Build LLM-powered applications in Java with support for chatbots, agents, RAG, tools, and much more


docs/classification.md

Text Classification

Text classification using embedding-based similarity with labeled examples. Input text is classified by comparing its embedding against the embeddings of pre-labeled example texts.

Capabilities

TextClassifier Interface

Base interface for text classification.

package dev.langchain4j.classification;

/**
 * Interface for classifying text based on a set of labels
 * Can return zero, one, or multiple labels for each classification
 */
public interface TextClassifier<L> {
    /**
     * Classify text and return labels
     * @param text Text to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(String text);

    /**
     * Classify text segment and return labels
     * @param textSegment Text segment to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(TextSegment textSegment);

    /**
     * Classify document and return labels
     * @param document Document to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(Document document);

    /**
     * Classify text and return results with scores
     * @param text Text to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(String text);

    /**
     * Classify text segment and return results with scores
     * @param textSegment Text segment to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(TextSegment textSegment);

    /**
     * Classify document and return results with scores
     * @param document Document to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(Document document);
}

Thread Safety: The TextClassifier interface itself does not specify thread safety guarantees. Thread safety depends on the implementation. For EmbeddingModelTextClassifier, thread safety is determined by the underlying EmbeddingModel - if the embedding model is thread-safe, the classifier can be safely shared across threads. The classifier itself is immutable after construction.

Common Pitfalls:

  • Calling classify() with null text will throw NullPointerException
  • Empty string classification may produce unpredictable results depending on embedding model behavior
  • TextSegment and Document overloads extract text content before classification - metadata is ignored

Edge Cases:

  • Empty input text: May return empty list or unpredictable label depending on embedding model
  • No labels meet threshold: Returns empty list (not null)
  • All labels tied with same score: Order is implementation-dependent
  • Very long text: May be truncated by embedding model's token limits

Performance Notes:

  • Each classify() call requires one embedding API call for the input text
  • classifyWithScores() has the same performance as classify() - use it for debugging without incurring extra overhead
  • TextSegment/Document variants extract text first, adding minimal overhead

Cost Considerations:

  • One embedding API call per classification (input text)
  • Example embeddings are computed once during classifier construction
  • For high-volume classification, reuse classifier instances to avoid re-embedding examples

Exception Handling:

  • NullPointerException if text is null
  • Underlying embedding model may throw exceptions (network errors, rate limits, etc.)
  • No explicit classification-specific exceptions defined in interface
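Since failures from the embedding layer simply propagate, callers often add a defensive wrapper. The sketch below is a minimal pure-Java illustration of that pattern; the classifier is stubbed as a plain function rather than the real langchain4j interface:

```java
import java.util.List;
import java.util.function.Function;

public class SafeClassify {

    // Wrap any classify call so null input and transient failures degrade to an empty result
    static List<String> classifyOrEmpty(Function<String, List<String>> classifier, String text) {
        if (text == null) {
            return List.of();              // avoid the NullPointerException noted above
        }
        try {
            return classifier.apply(text);
        } catch (RuntimeException e) {     // network errors, rate limits, etc.
            return List.of();              // degrade gracefully; log the cause in real code
        }
    }

    public static void main(String[] args) {
        Function<String, List<String>> flaky = t -> { throw new RuntimeException("rate limit"); };
        System.out.println(classifyOrEmpty(flaky, "hello")); // prints []
    }
}
```

An empty result is indistinguishable from "no label met the threshold", so real code should also log or report the failure.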

EmbeddingModelTextClassifier

Implementation using embedding model and example-based classification.

package dev.langchain4j.classification;

/**
 * TextClassifier implementation using EmbeddingModel and predefined examples
 * Classification is performed by computing similarity between input text's embedding
 * and embeddings of labeled example texts
 *
 * Works by:
 * 1. Embedding all example texts for each label
 * 2. Embedding the input text
 * 3. Computing cosine similarity between input and all examples
 * 4. Aggregating scores per label
 * 5. Returning labels that meet score thresholds
 */
public class EmbeddingModelTextClassifier<L> implements TextClassifier<L> {
    /**
     * Constructor with default values
     * maxResults=1, minScore=0, meanToMaxScoreRatio=0.5
     * @param embeddingModel Embedding model to use
     * @param examplesByLabel Map of labels to their example texts
     */
    public EmbeddingModelTextClassifier(
        EmbeddingModel embeddingModel,
        Map<L, ? extends Collection<String>> examplesByLabel
    );

    /**
     * Full constructor with all configuration options
     * @param embeddingModel Embedding model to use
     * @param examplesByLabel Map of labels to their example texts
     * @param maxResults Maximum number of labels to return
     * @param minScore Minimum score threshold (0.0 to 1.0)
     * @param meanToMaxScoreRatio Weight of the mean vs. the max example similarity when scoring a label (0.0 to 1.0)
     */
    public EmbeddingModelTextClassifier(
        EmbeddingModel embeddingModel,
        Map<L, ? extends Collection<String>> examplesByLabel,
        int maxResults,
        double minScore,
        double meanToMaxScoreRatio
    );

    /**
     * Classify text and return labels
     * @param text Text to classify
     * @return List of labels
     */
    public List<L> classify(String text);

    /**
     * Classify text and return results with scores
     * @param text Text to classify
     * @return Classification result with scored labels
     */
    public ClassificationResult<L> classifyWithScores(String text);
}
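The five steps in the Javadoc above can be sketched with plain vectors. This is a toy illustration with hand-made stand-in embeddings, not langchain4j's actual implementation (which additionally applies maxResults, minScore, and meanToMaxScoreRatio):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SimilarityClassifierSketch {

    // Step 3: cosine similarity between two vectors
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Steps 4-5: aggregate the best similarity per label, keep labels above minScore
    static List<String> classify(double[] input,
                                 Map<String, double[][]> examplesByLabel,
                                 double minScore) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, double[][]> e : examplesByLabel.entrySet()) {
            double max = 0;
            for (double[] example : e.getValue()) {
                max = Math.max(max, cosine(input, example));
            }
            if (max >= minScore) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, double[][]> examples = Map.of(
            "tech",   new double[][] {{1, 0, 0}, {0.9, 0.1, 0}},
            "sports", new double[][] {{0, 1, 0}, {0, 0.9, 0.1}});
        // An input vector close to the "tech" examples
        System.out.println(classify(new double[] {0.95, 0.05, 0}, examples, 0.8)); // prints [tech]
    }
}
```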

Thread Safety: EmbeddingModelTextClassifier is thread-safe if the underlying EmbeddingModel is thread-safe. The classifier is immutable after construction - all examples are embedded during initialization. Multiple threads can safely call classify() concurrently if the embedding model supports concurrent requests.

Common Pitfalls:

  • Too few examples per label: Need at least 2-3 representative examples per label for reliable classification
  • Imbalanced examples: Labels with many examples may dominate over labels with few examples
  • Similar examples within a label: Redundant examples don't improve accuracy; diverse examples are better
  • Examples too similar across labels: Makes classification difficult - ensure clear semantic separation
  • Default parameters: Default maxResults=1, minScore=0, meanToMaxScoreRatio=0.5 may not suit all use cases

Edge Cases:

  • Empty input text: Behavior depends on embedding model (may throw exception or return zero vector)
  • Label with empty example collection: Label will never be returned (zero examples means zero score)
  • Single example per label: Works but accuracy suffers - use multiple diverse examples
  • maxResults exceeds number of labels: Returns all qualifying labels up to actual label count
  • minScore=1.0: Only perfect/near-perfect matches returned; may return empty list frequently
  • meanToMaxScoreRatio=0.0: Each label is scored by its single best-matching example only

Performance Notes:

  • Construction time: All examples embedded during initialization - O(total_examples) embedding calls
  • Classification time: One embedding call per classify() invocation for input text
  • Similarity computation: Computed against all example embeddings - O(total_examples) comparisons
  • Optimization: Reuse classifier instances; creating new classifiers is expensive
  • Batch considerations: Classifier processes one input at a time; no native batching support
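Although there is no native batching, independent classify() calls can be fanned out by the caller when the classifier is thread-safe. A sketch with a stubbed classifier function (not a langchain4j API):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ParallelClassifySketch {

    // Classify many inputs concurrently; safe only if the underlying classifier is thread-safe
    static Map<String, List<String>> classifyAll(Function<String, List<String>> classifier,
                                                 List<String> texts) {
        return texts.parallelStream()
                .collect(Collectors.toMap(t -> t, classifier::apply));
    }

    public static void main(String[] args) {
        Function<String, List<String>> stub =
                t -> t.contains("game") ? List.of("sports") : List.of("tech");
        System.out.println(classifyAll(stub, List.of("new game released", "compiler update")));
    }
}
```

Note that parallel calls multiply concurrent requests to the embedding provider, so rate limits still apply.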

Cost Considerations:

  • Initialization cost: N embedding API calls where N = total number of example strings across all labels
  • Per-classification cost: 1 embedding API call per classify() invocation
  • Total cost for 1000 classifications: N + 1000 embedding calls
  • Cost reduction strategies:
    • Minimize number of examples (quality over quantity)
    • Reuse classifier instances across requests
    • Cache classifier instances if examples don't change
    • Consider example text length (longer text = higher embedding cost)
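The call-count arithmetic above is easy to make concrete with a small helper (hypothetical, plain arithmetic only):

```java
public class EmbeddingCallCost {

    // Total embedding calls: one per example at construction, then one per classification
    static long totalCalls(long totalExamples, long classifications) {
        return totalExamples + classifications;
    }

    public static void main(String[] args) {
        // 4 labels x 3 examples each, then 1000 classifications
        System.out.println(totalCalls(4 * 3, 1000)); // prints 1012
    }
}
```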

Exception Handling:

  • NullPointerException: If embeddingModel or examplesByLabel is null
  • IllegalArgumentException: If maxResults < 1, minScore < 0.0, minScore > 1.0, meanToMaxScoreRatio < 0.0, meanToMaxScoreRatio > 1.0
  • EmbeddingModel exceptions: Network errors, rate limits, authentication failures propagate from embedding calls
  • RuntimeException: May occur if embedding model returns null or invalid embeddings
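Callers can mirror the documented argument constraints before construction to fail fast with clearer context. A standalone sketch (hypothetical helper, not part of langchain4j):

```java
public class ClassifierConfigCheck {

    // Mirrors the constructor's documented argument constraints
    static void validate(int maxResults, double minScore, double meanToMaxScoreRatio) {
        if (maxResults < 1) {
            throw new IllegalArgumentException("maxResults must be >= 1");
        }
        if (minScore < 0.0 || minScore > 1.0) {
            throw new IllegalArgumentException("minScore must be in [0, 1]");
        }
        if (meanToMaxScoreRatio < 0.0 || meanToMaxScoreRatio > 1.0) {
            throw new IllegalArgumentException("meanToMaxScoreRatio must be in [0, 1]");
        }
    }

    public static void main(String[] args) {
        validate(1, 0.0, 0.5); // the defaults pass
        try {
            validate(0, 0.5, 0.5);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints maxResults must be >= 1
        }
    }
}
```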

Classification Results

Result types for classification.

package dev.langchain4j.classification;

/**
 * Represents the result of classification with scored labels
 */
public class ClassificationResult<L> {
    /**
     * Constructor
     * @param scoredLabels List of scored labels
     */
    public ClassificationResult(List<ScoredLabel<L>> scoredLabels);

    /**
     * Get scored labels
     * Sorted by score in descending order (highest first)
     * @return List of scored labels
     */
    public List<ScoredLabel<L>> scoredLabels();

    /**
     * Equality check
     * @param obj Object to compare
     * @return true if equal
     */
    public boolean equals(Object obj);

    /**
     * Hash code
     * @return Hash code
     */
    public int hashCode();

    /**
     * String representation
     * @return String representation
     */
    public String toString();
}

/**
 * Represents a classification label with associated score
 */
public class ScoredLabel<L> {
    /**
     * Constructor
     * @param label The label
     * @param score The score (0.0 to 1.0)
     */
    public ScoredLabel(L label, double score);

    /**
     * Get the label
     * @return The label
     */
    public L label();

    /**
     * Get the score
     * Represents confidence/similarity (0.0 to 1.0)
     * @return The score
     */
    public double score();

    /**
     * Equality check
     * @param obj Object to compare
     * @return true if equal
     */
    public boolean equals(Object obj);

    /**
     * Hash code
     * @return Hash code
     */
    public int hashCode();

    /**
     * String representation
     * @return String representation
     */
    public String toString();
}

Thread Safety: ClassificationResult and ScoredLabel are immutable value objects and fully thread-safe. They can be safely shared across threads without synchronization.

Common Pitfalls:

  • Assuming scoredLabels() returns a modifiable list (it may be immutable)
  • Comparing scores with == instead of using appropriate threshold checks
  • Ignoring that empty scoredLabels() list means no labels met criteria
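To avoid the `==` pitfall, compare scores against a threshold rather than for equality. A standalone sketch using a stand-in record in place of ScoredLabel:

```java
import java.util.List;

public class ScoreComparison {

    // Stand-in for dev.langchain4j.classification.ScoredLabel
    record Scored(String label, double score) {}

    // Keep labels whose score clears a threshold, rather than testing score equality
    static List<String> aboveThreshold(List<Scored> scored, double threshold) {
        return scored.stream()
                .filter(s -> s.score() >= threshold)
                .map(Scored::label)
                .toList();
    }

    public static void main(String[] args) {
        List<Scored> scored = List.of(
                new Scored("SPORTS", 0.87),
                new Scored("ENTERTAINMENT", 0.42));
        System.out.println(aboveThreshold(scored, 0.5)); // prints [SPORTS]
    }
}
```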

Edge Cases:

  • Empty scoredLabels() list: Valid result when no labels meet thresholds
  • Multiple labels with identical scores: All returned, order is implementation-dependent
  • Score values: Always in range [0.0, 1.0] but practical range often [0.3, 0.95]

Performance Notes:

  • Immutable objects created once per classification
  • No allocation overhead when accessing labels/scores
  • Minimal memory footprint

Cost Considerations:

  • Negligible - simple value objects
  • No API calls or expensive operations

Exception Handling:

  • NullPointerException: If constructor receives null scoredLabels list or list contains null elements
  • No exceptions thrown by accessor methods

Usage Examples

Basic Classification with Enum Labels

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import dev.langchain4j.model.embedding.EmbeddingModel;
import java.util.List;
import java.util.Map;

enum Category {
    TECHNOLOGY,
    SPORTS,
    POLITICS,
    ENTERTAINMENT
}

// Define examples for each category
Map<Category, List<String>> examples = Map.of(
    Category.TECHNOLOGY, List.of(
        "New smartphone released with advanced AI features",
        "Software update improves system performance",
        "Tech company announces breakthrough in quantum computing"
    ),
    Category.SPORTS, List.of(
        "Team wins championship in overtime",
        "Athlete breaks world record",
        "Coach announces retirement after successful season"
    ),
    Category.POLITICS, List.of(
        "Government passes new legislation",
        "Election results announced",
        "Politicians debate policy changes"
    ),
    Category.ENTERTAINMENT, List.of(
        "New movie breaks box office records",
        "Actor wins prestigious award",
        "Music festival announces lineup"
    )
);

// Create classifier
TextClassifier<Category> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify text
String text = "Scientists develop new artificial intelligence algorithm";
List<Category> categories = classifier.classify(text);
System.out.println("Categories: " + categories); // [TECHNOLOGY]

Classification with Scores

import dev.langchain4j.classification.ClassificationResult;
import dev.langchain4j.classification.ScoredLabel;

// Classify with scores
String text = "Basketball team advances to playoffs";
ClassificationResult<Category> result = classifier.classifyWithScores(text);

for (ScoredLabel<Category> scoredLabel : result.scoredLabels()) {
    System.out.printf("%s: %.2f%n",
        scoredLabel.label(),
        scoredLabel.score()
    );
}
// Output:
// SPORTS: 0.87
// ENTERTAINMENT: 0.42

Custom Configuration

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

// Create classifier with custom thresholds
TextClassifier<Category> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    3,      // maxResults - return up to 3 labels
    0.6,    // minScore - only labels with score >= 0.6
    0.7     // meanToMaxScoreRatio - weight of mean vs max example similarity
);

String text = "Technology news about sports analytics software";
List<Category> categories = classifier.classify(text);
// May return both TECHNOLOGY and SPORTS if both score high enough

String Labels

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import java.util.List;
import java.util.Map;

// Use string labels instead of enums
Map<String, List<String>> examples = Map.of(
    "positive", List.of(
        "This product is amazing!",
        "I love it, works perfectly",
        "Excellent quality and fast shipping"
    ),
    "negative", List.of(
        "Terrible experience, very disappointed",
        "Poor quality, stopped working after a week",
        "Would not recommend to anyone"
    ),
    "neutral", List.of(
        "Product arrived on time",
        "It's okay, nothing special",
        "Average quality for the price"
    )
);

TextClassifier<String> sentimentClassifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

String review = "The product works well but shipping was slow";
List<String> sentiment = sentimentClassifier.classify(review);
System.out.println("Sentiment: " + sentiment); // [neutral]

Multi-Label Classification

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Tag {
    URGENT,
    CUSTOMER_SERVICE,
    BILLING,
    TECHNICAL,
    FEEDBACK
}

Map<Tag, List<String>> examples = Map.of(
    Tag.URGENT, List.of(
        "Need immediate assistance",
        "Critical issue, please help ASAP",
        "Emergency situation"
    ),
    Tag.CUSTOMER_SERVICE, List.of(
        "Question about my order",
        "Need help with product",
        "Service inquiry"
    ),
    Tag.BILLING, List.of(
        "Charge on my credit card",
        "Invoice question",
        "Payment issue"
    ),
    Tag.TECHNICAL, List.of(
        "Software not working properly",
        "Error message when logging in",
        "System compatibility problem"
    ),
    Tag.FEEDBACK, List.of(
        "Suggestion for improvement",
        "Love your product",
        "Could be better if..."
    )
);

// Allow multiple tags
TextClassifier<Tag> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    5,      // Allow up to 5 tags
    0.5,    // Lower threshold to catch multiple tags
    0.6
);

String email = "Urgent: Having billing issue with my account, need immediate help";
List<Tag> tags = classifier.classify(email);
System.out.println("Tags: " + tags); // [URGENT, BILLING, CUSTOMER_SERVICE]

Document Classification

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import dev.langchain4j.classification.TextClassifier;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;

enum DocumentType {
    CONTRACT,
    INVOICE,
    REPORT,
    EMAIL,
    MEMO
}

Map<DocumentType, List<String>> examples = Map.of(
    DocumentType.CONTRACT, List.of(
        "This agreement is entered into...",
        "Terms and conditions...",
        "Parties agree to the following..."
    ),
    DocumentType.INVOICE, List.of(
        "Invoice #12345, Amount due...",
        "Payment terms: Net 30...",
        "Total: $1,234.56"
    ),
    DocumentType.REPORT, List.of(
        "Executive Summary...",
        "Analysis shows...",
        "Quarterly results indicate..."
    )
    // ... more examples
);

TextClassifier<DocumentType> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify document
Document document = FileSystemDocumentLoader.loadDocument(Path.of("document.txt"));
List<DocumentType> types = classifier.classify(document);
System.out.println("Document type: " + types);

Content Moderation

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum ContentFlag {
    SAFE,
    SPAM,
    INAPPROPRIATE,
    PROMOTIONAL
}

Map<ContentFlag, List<String>> examples = Map.of(
    ContentFlag.SAFE, List.of(
        "Thanks for the helpful information",
        "Great discussion about the topic",
        "I appreciate your response"
    ),
    ContentFlag.SPAM, List.of(
        "Buy now! Limited time offer!!!",
        "Click here to win $$$",
        "Free money, click this link"
    ),
    ContentFlag.INAPPROPRIATE, List.of(
        "Offensive language...",
        "Harassing message...",
        "Hate speech..."
    ),
    ContentFlag.PROMOTIONAL, List.of(
        "Check out our new product",
        "Visit our website for deals",
        "Sign up for our newsletter"
    )
);

TextClassifier<ContentFlag> moderator = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    1,      // Single most likely flag
    0.7,    // High threshold for confidence
    0.8
);

String userComment = "Thanks for sharing this information!";
List<ContentFlag> flags = moderator.classify(userComment);

if (flags.isEmpty() || flags.contains(ContentFlag.SAFE)) {
    System.out.println("Comment approved");
} else {
    System.out.println("Comment flagged: " + flags);
}

Intent Classification for Chatbots

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Intent {
    GREETING,
    QUESTION,
    COMPLAINT,
    REQUEST,
    GOODBYE
}

Map<Intent, List<String>> examples = Map.of(
    Intent.GREETING, List.of(
        "Hello",
        "Hi there",
        "Good morning"
    ),
    Intent.QUESTION, List.of(
        "How do I...",
        "What is...",
        "Can you explain..."
    ),
    Intent.COMPLAINT, List.of(
        "I'm not happy with...",
        "This is not working",
        "Very disappointed"
    ),
    Intent.REQUEST, List.of(
        "Please send me...",
        "I need help with...",
        "Can you provide..."
    ),
    Intent.GOODBYE, List.of(
        "Thanks, goodbye",
        "That's all, bye",
        "Thank you for your help"
    )
);

TextClassifier<Intent> intentClassifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify user intent
String userMessage = "Hi, can you help me with my order?";
List<Intent> intents = intentClassifier.classify(userMessage);

// Route to appropriate handler
if (intents.contains(Intent.GREETING)) {
    System.out.println("Bot: Hello! How can I help you today?");
}
if (intents.contains(Intent.QUESTION) || intents.contains(Intent.REQUEST)) {
    System.out.println("Bot: I'd be happy to help with your order...");
}

Language Detection

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Language {
    ENGLISH,
    SPANISH,
    FRENCH,
    GERMAN,
    ITALIAN
}

Map<Language, List<String>> examples = Map.of(
    Language.ENGLISH, List.of(
        "Hello, how are you today?",
        "The weather is nice",
        "I would like to order"
    ),
    Language.SPANISH, List.of(
        "Hola, ¿cómo estás hoy?",
        "El clima es agradable",
        "Me gustaría ordenar"
    ),
    Language.FRENCH, List.of(
        "Bonjour, comment allez-vous aujourd'hui?",
        "Le temps est agréable",
        "Je voudrais commander"
    )
    // ... more examples
);

TextClassifier<Language> languageDetector = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

String text = "Bonjour! Comment puis-je vous aider?";
List<Language> detected = languageDetector.classify(text);
System.out.println("Detected language: " + detected); // [FRENCH]

Configuration Parameters

maxResults

Maximum number of labels to return. Set to 1 for single-label classification or higher for multi-label classification.

Default: 1

Range: 1 to Integer.MAX_VALUE (practical maximum is number of labels)

Usage guidance:

  • Single-label classification: maxResults=1
  • Multi-label classification: maxResults=3 to 5 typical
  • Unlimited labels: Set to number of labels or higher
  • Combine with minScore to control quality vs quantity tradeoff

minScore

Minimum similarity score threshold (0.0 to 1.0). Only labels with scores above this threshold are returned.

Default: 0.0 (no filtering)

Range: 0.0 to 1.0

Usage guidance:

  • Permissive classification: 0.3 to 0.5
  • Balanced classification: 0.5 to 0.7
  • High-confidence only: 0.7 to 0.9
  • Very strict: 0.9+
  • Typical scores range from 0.3 to 0.95 in practice

meanToMaxScoreRatio

Blends each label's mean example similarity with its maximum single-example similarity when computing the label's aggregated score. For example, with ratio 0.5 (the default), a label whose examples score mean=0.6 and max=0.9 against the input receives an aggregated score of 0.75.

Default: 0.5

Range: 0.0 to 1.0

Calculation: labelScore = ratio * meanScore + (1 - ratio) * maxScore

Usage guidance:

  • 0.0: Only the best-matching example counts - useful when a single strong match should decide
  • 0.3 to 0.5: Favor the best match, smoothed by the mean
  • 0.5 (default): Equal weight to mean and max
  • 0.7 to 1.0: Favor consensus across all of a label's examples

Example scenarios:

  • meanScore=0.6, maxScore=0.9, ratio=0.5: labelScore = 0.75
  • meanScore=0.6, maxScore=0.9, ratio=0.0: labelScore = 0.90
  • meanScore=0.6, maxScore=0.9, ratio=1.0: labelScore = 0.60

Testing Patterns

Unit Testing with Known Examples

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

@Test
void testClassifierWithKnownExamples() {
    // Use simple examples that should clearly classify
    Map<String, List<String>> examples = Map.of(
        "positive", List.of("great", "excellent", "amazing"),
        "negative", List.of("terrible", "awful", "horrible")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples
    );

    // Test positive classification
    List<String> result = classifier.classify("This is great!");
    assertTrue(result.contains("positive"));

    // Test negative classification
    result = classifier.classify("This is terrible!");
    assertTrue(result.contains("negative"));
}

Testing with Score Thresholds

@Test
void testScoreThresholds() {
    Map<String, List<String>> examples = Map.of(
        "tech", List.of("computer", "software", "algorithm"),
        "sports", List.of("football", "basketball", "tennis")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        2,      // Return up to 2 labels
        0.7,    // Only high-confidence results
        0.8
    );

    ClassificationResult<String> result = classifier.classifyWithScores("computer");

    // Verify score threshold
    for (ScoredLabel<String> scored : result.scoredLabels()) {
        assertTrue(scored.score() >= 0.7);
    }
}

Testing Edge Cases

@Test
void testEmptyResult() {
    Map<String, List<String>> examples = Map.of(
        "tech", List.of("computer", "software"),
        "sports", List.of("football", "basketball")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        1,
        0.9,    // Very high threshold
        0.9
    );

    // Unrelated text may return empty list
    List<String> result = classifier.classify("The weather is nice today");
    assertNotNull(result);
    // May be empty if no label meets threshold
}

@Test
void testMultiLabelClassification() {
    Map<String, List<String>> examples = Map.of(
        "urgent", List.of("ASAP", "emergency", "critical"),
        "billing", List.of("invoice", "payment", "charge")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        5,      // Allow multiple labels
        0.5,
        0.6
    );

    List<String> result = classifier.classify("Urgent: billing issue");
    assertTrue(result.size() >= 1);
    // Should likely contain both labels
}

Mock Testing

@Test
void testWithMockEmbeddingModel() {
    EmbeddingModel mockModel = mock(EmbeddingModel.class);

    // Mock returns fixed embeddings
    when(mockModel.embed(anyString())).thenReturn(
        Response.from(Embedding.from(new float[]{0.1f, 0.2f, 0.3f}))
    );

    Map<String, List<String>> examples = Map.of(
        "label1", List.of("example1")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        mockModel,
        examples
    );

    List<String> result = classifier.classify("test");

    // Verify embedding was called
    verify(mockModel, atLeast(1)).embed(anyString());
}

Integration Testing

@Test
void testWithRealEmbeddingModel() {
    // Use actual embedding model for integration test
    EmbeddingModel realModel = OpenAiEmbeddingModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("text-embedding-3-small")
        .build();

    Map<String, List<String>> examples = Map.of(
        "technology", List.of(
            "artificial intelligence and machine learning",
            "software development and programming",
            "computer hardware and systems"
        )
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        realModel,
        examples
    );

    List<String> result = classifier.classify("deep learning neural networks");
    assertEquals(1, result.size());
    assertEquals("technology", result.get(0));
}

Example Selection Guide

Minimum Examples per Label

Recommendation: 3-5 examples per label

  • 2 examples: Minimum viable, but accuracy suffers
  • 3-5 examples: Good balance for most use cases
  • 6-10 examples: Better accuracy, diminishing returns after 10
  • 10+ examples: Only if examples are highly diverse

Example Diversity

Good diversity (recommended):

Map.of(
    "tech", List.of(
        "AI and machine learning algorithms",           // AI focus
        "Software development best practices",          // Development focus
        "Computer hardware specifications",             // Hardware focus
        "Cloud computing infrastructure design"         // Cloud focus
    )
)

Poor diversity (avoid):

Map.of(
    "tech", List.of(
        "AI machine learning",                          // Too similar
        "Machine learning AI",                          // Redundant
        "Artificial intelligence ML",                   // Redundant
        "ML and AI algorithms"                          // Redundant
    )
)

Example Length

Recommendation: Match expected input length

  • Short inputs (1-10 words): Use short examples
  • Medium inputs (10-50 words): Use medium examples
  • Long inputs (50+ words): Use longer examples or extract key sentences

// For short inputs like tags or keywords
Map.of(
    "urgent", List.of("ASAP", "emergency", "critical")
)

// For medium inputs like emails or comments
Map.of(
    "complaint", List.of(
        "I'm very disappointed with the service I received",
        "The product quality is unacceptable and I want a refund",
        "This is the third time I've had issues with your company"
    )
)

Balanced vs Imbalanced Examples

Balanced (recommended for equal importance):

Map.of(
    "positive", List.of("great", "excellent", "amazing"),      // 3 examples
    "negative", List.of("terrible", "awful", "horrible"),      // 3 examples
    "neutral", List.of("okay", "average", "fine")              // 3 examples
)

Imbalanced (acceptable if reflects real-world distribution):

Map.of(
    "spam", List.of("win money", "click here", "free offer"),  // 3 examples
    "safe", List.of(                                           // 6 examples
        "thanks for info",
        "great discussion",
        "helpful response",
        "interesting article",
        "good point",
        "agreed"
    )
)

Cross-Label Similarity

Good separation (recommended):

Map.of(
    "tech", List.of("computer software", "programming code"),
    "sports", List.of("football game", "basketball match")
)

Poor separation (avoid):

Map.of(
    "tech_news", List.of("technology announcement", "tech release"),
    "tech_tutorial", List.of("technology guide", "tech instructions")
    // These are too similar - consider combining into one label
)

Accuracy Improvement Patterns

Pattern 1: Iterative Example Refinement

// Start with simple examples
Map<String, List<String>> v1_examples = Map.of(
    "positive", List.of("good", "great", "excellent")
);

// Test and identify misclassifications
TextClassifier<String> v1 = new EmbeddingModelTextClassifier<>(model, v1_examples);
// If "fantastic" misclassifies, add it as example

// Refine with more diverse examples
Map<String, List<String>> v2_examples = Map.of(
    "positive", List.of(
        "good",                                 // Simple positive
        "great experience",                     // With noun
        "excellent quality",                    // With adjective
        "highly recommend",                     // Different phrasing
        "very satisfied with purchase"          // Complete sentence
    )
);

Pattern 2: Score Analysis and Threshold Tuning

// Step 1: Use classifyWithScores to analyze confidence levels
ClassificationResult<String> result = classifier.classifyWithScores(testText);
for (ScoredLabel<String> scored : result.scoredLabels()) {
    System.out.printf("%s: %.4f%n", scored.label(), scored.score());
}

// Step 2: Analyze score distribution
// If scores are typically 0.4-0.6: Lower minScore to 0.3-0.4
// If scores are typically 0.7-0.9: Raise minScore to 0.6-0.7

// Step 3: Adjust parameters based on observations
TextClassifier<String> tuned = new EmbeddingModelTextClassifier<>(
    model,
    examples,
    1,
    0.5,    // Adjusted based on score analysis
    0.65    // Adjusted based on multi-label needs
);

Pattern 3: Negative Examples Strategy

// When certain texts consistently misclassify, add them as examples to correct label
Map<String, List<String>> improved = Map.of(
    "tech", List.of(
        "computer software",
        "programming code",
        "digital technology",                   // Was misclassifying as "business"
        "software engineering"                  // Was misclassifying as "education"
    ),
    "business", List.of(
        "company strategy",
        "market analysis",
        "corporate management"
        // Note: Don't add tech terms here even if they appear in business context
    )
);

Pattern 4: Hierarchical Classification

// For complex label sets, use a two-stage approach

// Declare both label enums up front (local enum declarations inside a
// block require Java 16+; nested or top-level declarations work everywhere)
enum BroadCategory { TECHNICAL, NON_TECHNICAL }
enum TechSubcategory { SOFTWARE, HARDWARE, NETWORKING }

// Stage 1: Broad categorization
TextClassifier<BroadCategory> stage1 = new EmbeddingModelTextClassifier<>(
    model,
    broadExamples
);

// Stage 2: Fine-grained classification, run only when stage 1 matches
List<BroadCategory> broad = stage1.classify(text);
if (broad.contains(BroadCategory.TECHNICAL)) {
    TextClassifier<TechSubcategory> stage2 = new EmbeddingModelTextClassifier<>(
        model,
        techExamples
    );
    List<TechSubcategory> specific = stage2.classify(text);
}

Pattern 5: Example Augmentation

// Original examples
List<String> original = List.of(
    "great product",
    "excellent service",
    "amazing quality"
);

// Augment with variations (maintain semantic meaning)
List<String> augmented = new ArrayList<>(original);
augmented.addAll(List.of(
    "product is great",                         // Word order variation
    "service was excellent",                    // Tense variation
    "truly amazing quality",                    // Adverb addition
    "great product, highly recommend",          // Context addition
    "excellent customer service"                // Specificity variation
));

Map<String, List<String>> examples = Map.of("positive", augmented);

Pattern 6: Ensemble Classification

// Create multiple classifiers with different configurations
TextClassifier<String> conservative = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.8, 0.8
);
TextClassifier<String> moderate = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.6, 0.7
);
TextClassifier<String> permissive = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.4, 0.5
);

// Combine results (voting or intersection)
List<String> result1 = conservative.classify(text);
List<String> result2 = moderate.classify(text);
List<String> result3 = permissive.classify(text);

// Intersection: Only labels agreed by all
Set<String> intersection = new HashSet<>(result1);
intersection.retainAll(result2);
intersection.retainAll(result3);

// Union: Any label suggested by any classifier
Set<String> union = new HashSet<>(result1);
union.addAll(result2);
union.addAll(result3);

// Voting: Majority wins
Map<String, Integer> votes = new HashMap<>();
for (String label : union) {
    int count = 0;
    if (result1.contains(label)) count++;
    if (result2.contains(label)) count++;
    if (result3.contains(label)) count++;
    votes.put(label, count);
}
List<String> finalResult = votes.entrySet().stream()
    .filter(e -> e.getValue() >= 2)  // At least 2 classifiers agree
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());
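The voting logic above can be factored into a reusable helper so it works for any number of classifiers. A minimal, framework-free sketch (using `List<List<String>>` for the classifier results; class and method names are illustrative, not part of langchain4j):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class MajorityVote {

    /**
     * Returns the labels that appear in at least {@code minVotes}
     * of the given classifier results.
     */
    public static List<String> vote(List<List<String>> results, int minVotes) {
        Map<String, Integer> votes = new HashMap<>();
        for (List<String> result : results) {
            // Deduplicate within one result so a classifier cannot vote twice
            for (String label : new HashSet<>(result)) {
                votes.merge(label, 1, Integer::sum);
            }
        }
        List<String> winners = new ArrayList<>();
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() >= minVotes) {
                winners.add(e.getKey());
            }
        }
        return winners;
    }

    public static void main(String[] args) {
        List<String> winners = vote(List.of(
                List.of("tech", "business"),
                List.of("tech"),
                List.of("business", "sports")
        ), 2);
        System.out.println(winners); // "tech" and "business" each got 2 votes
    }
}
```

With three classifiers and `minVotes = 2` this reproduces the majority rule above; raising `minVotes` to the number of classifiers gives the intersection behavior instead.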

Pattern 7: A/B Testing Classification Strategies

// Track accuracy metrics for different configurations
class ClassificationMetrics {
    int totalClassifications;
    int correctClassifications;
    Map<String, Integer> labelCounts = new HashMap<>();

    double accuracy() {
        return totalClassifications == 0
                ? 0.0
                : (double) correctClassifications / totalClassifications;
    }
}

// Test configuration A
TextClassifier<String> configA = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.6, 0.7
);
ClassificationMetrics metricsA = evaluateClassifier(configA, testSet);

// Test configuration B
TextClassifier<String> configB = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.5, 0.6
);
ClassificationMetrics metricsB = evaluateClassifier(configB, testSet);

// Choose better configuration
TextClassifier<String> production =
    metricsA.accuracy() > metricsB.accuracy() ? configA : configB;
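The `evaluateClassifier` helper used above is not part of langchain4j. One way to sketch it, using `Function<String, List<String>>` as a stand-in for `TextClassifier` so the logic is easy to unit-test offline (`LabeledText`, `Metrics`, and the field names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class Evaluation {

    /** A test item pairing input text with its expected label. */
    record LabeledText(String text, String expectedLabel) {}

    /** Simple accuracy metrics, mirroring ClassificationMetrics above. */
    record Metrics(int total, int correct, Map<String, Integer> labelCounts) {
        double accuracy() {
            return total == 0 ? 0.0 : (double) correct / total;
        }
    }

    /**
     * Runs the classifier over a labeled test set and counts how often
     * the expected label appears among the predicted labels.
     */
    static Metrics evaluateClassifier(
            Function<String, List<String>> classifier,
            List<LabeledText> testSet) {
        int correct = 0;
        Map<String, Integer> counts = new HashMap<>();
        for (LabeledText item : testSet) {
            List<String> predicted = classifier.apply(item.text());
            for (String label : predicted) {
                counts.merge(label, 1, Integer::sum);
            }
            if (predicted.contains(item.expectedLabel())) {
                correct++;
            }
        }
        return new Metrics(testSet.size(), correct, counts);
    }

    public static void main(String[] args) {
        // Toy classifier: anything mentioning "code" is "tech"
        Function<String, List<String>> toy =
                text -> text.contains("code") ? List.of("tech") : List.of("other");
        Metrics m = evaluateClassifier(toy, List.of(
                new LabeledText("writing code", "tech"),
                new LabeledText("market report", "other"),
                new LabeledText("code review", "other")   // deliberately mislabeled
        ));
        System.out.printf("accuracy = %.2f%n", m.accuracy()); // 2 of 3 correct
    }
}
```

To evaluate a real `EmbeddingModelTextClassifier`, pass `configA::classify` as the function. Counting "correct" as "expected label among the predictions" is one choice; exact-match or precision/recall may suit multi-label setups better.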

Best Practices

  1. Provide diverse examples: Include varied examples for each label to improve accuracy
  2. Balance examples: Try to provide similar numbers of examples for each label
  3. Quality over quantity: A few high-quality, representative examples are better than many poor examples
  4. Tune thresholds: Adjust minScore and meanToMaxScoreRatio based on your accuracy requirements
  5. Monitor scores: Use classifyWithScores() to understand confidence levels
  6. Handle ambiguity: Consider multi-label classification for cases where multiple labels may apply
  7. Cache classifiers: Create once, reuse many times to avoid re-embedding examples
  8. Test thoroughly: Use known examples to validate classifier behavior before production
  9. Iterate on examples: Refine examples based on misclassifications and edge cases
  10. Consider costs: Balance accuracy needs with embedding API costs (examples are embedded once at construction; every classification embeds the input text)
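Practice 7 (cache classifiers) matters because constructing an `EmbeddingModelTextClassifier` embeds every example. A minimal lazy-holder sketch, with `Function<String, List<String>>` standing in for `TextClassifier` so the caching behavior can be demonstrated without an embedding model (names are illustrative):

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class ClassifierCache {

    /**
     * Wraps an expensive factory so the instance is built once,
     * on first use, and reused afterwards (thread-safe via synchronization).
     */
    static class Lazy<T> implements Supplier<T> {
        private final Supplier<T> factory;
        private T instance;

        Lazy(Supplier<T> factory) { this.factory = factory; }

        @Override
        public synchronized T get() {
            if (instance == null) {
                instance = factory.get();   // expensive step runs only here
            }
            return instance;
        }
    }

    static int buildCount = 0;

    public static void main(String[] args) {
        Lazy<Function<String, List<String>>> cached = new Lazy<>(() -> {
            buildCount++;                   // in real use: embeds all examples
            return text -> List.of("positive");
        });

        cached.get().apply("great product");
        cached.get().apply("excellent service");
        System.out.println(buildCount);     // built exactly once
    }
}
```

In an application, the same effect is usually achieved by making the classifier a singleton bean or a `static final` field rather than constructing it per request.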

Related APIs

Embedding APIs

  • EmbeddingModel: Core dependency for classification - see embedding documentation
  • EmbeddingStore: For vector database storage of embeddings - see embedding store documentation
  • EmbeddingStoreIngestor: For bulk document processing - see ingestion documentation

Alternative Classification Approaches

  • Few-shot prompting with ChatLanguageModel: Use LLM with examples in prompt instead of embeddings
  • Function calling: Define classification labels as function parameters
  • RAG-based classification: Retrieve similar examples from vector store, then classify with LLM

When to use EmbeddingModelTextClassifier vs alternatives:

  • Use embedding-based: Fast and cost-effective for high-volume workloads with a fixed set of labels
  • Use LLM-based: More flexible, handles nuanced cases, can explain reasoning, but slower and more expensive
  • Use hybrid: Embedding filter + LLM refinement for best balance
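One possible shape for the hybrid option: let the cheap embedding classifier narrow the candidates, and call the LLM only when more than one label survives. The sketch below uses `Function` stand-ins for the classifier and the chat model; the prompt wording and all names are illustrative, not a langchain4j API:

```java
import java.util.List;
import java.util.function.Function;

public class HybridClassifier {

    /**
     * Embedding-based pre-filter narrows the candidates cheaply;
     * the LLM only has to pick among the survivors.
     */
    static String classifyHybrid(
            Function<String, List<String>> embeddingClassifier, // stand-in for TextClassifier
            Function<String, String> llm,                       // stand-in for a chat model
            String text) {
        List<String> candidates = embeddingClassifier.apply(text);
        if (candidates.isEmpty()) {
            return "unknown";               // nothing passed the threshold
        }
        if (candidates.size() == 1) {
            return candidates.get(0);       // unambiguous: no LLM call needed
        }
        String prompt = "Classify the text into exactly one of " + candidates
                + ".\nText: " + text + "\nAnswer with the label only.";
        return llm.apply(prompt).trim();
    }

    public static void main(String[] args) {
        Function<String, List<String>> filter = t -> List.of("tech", "business");
        Function<String, String> fakeLlm = prompt -> "tech";
        System.out.println(classifyHybrid(filter, fakeLlm, "cloud software pricing"));
    }
}
```

The design keeps LLM cost proportional to the ambiguous fraction of traffic: clear-cut inputs are resolved by embeddings alone.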

Integration patterns:

// Pattern 1: Classification before RAG
List<Category> categories = classifier.classify(userQuery);
// Use category to filter document retrieval

// Pattern 2: Classification after RAG
String retrievedContext = retrieveDocuments(userQuery);
List<Intent> intents = classifier.classify(userQuery);
// Use intent to select response template

// Pattern 3: Classification for routing
List<Department> departments = classifier.classify(customerEmail);
// Route to appropriate handler based on classification
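Integration pattern 3 can be fleshed out as a small router with an explicit fallback for low-confidence inputs. A sketch with a `Function` stand-in for the classifier; the `Department` values and queue names are illustrative:

```java
import java.util.List;
import java.util.function.Function;

public class EmailRouter {

    enum Department { SALES, SUPPORT, BILLING }

    /** Routes to the first classified department, with a fallback queue. */
    static String route(Function<String, List<Department>> classifier, String email) {
        List<Department> departments = classifier.apply(email);
        if (departments.isEmpty()) {
            return "triage-queue";          // nothing confident enough: human triage
        }
        return switch (departments.get(0)) {
            case SALES   -> "sales-queue";
            case SUPPORT -> "support-queue";
            case BILLING -> "billing-queue";
        };
    }

    public static void main(String[] args) {
        // Toy classifier: mentions of "invoice" go to billing, else no match
        Function<String, List<Department>> toy =
                email -> email.contains("invoice")
                        ? List.of(Department.BILLING)
                        : List.of();
        System.out.println(route(toy, "Question about my invoice")); // billing-queue
        System.out.println(route(toy, "Hello there"));               // triage-queue
    }
}
```

Because `classify()` returns an empty list when no label clears `minScore`, the fallback branch gives a natural place to send uncertain messages to a human instead of misrouting them.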

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j
