tessl/maven-dev-langchain4j--langchain4j

Build LLM-powered applications in Java with support for chatbots, agents, RAG, tools, and much more


docs/classification.md

Text Classification

Text classification using embedding-based similarity with labeled examples. Input text is classified by comparing its embedding against the embeddings of pre-labeled example texts.

Capabilities

TextClassifier Interface

Base interface for text classification.

package dev.langchain4j.classification;

/**
 * Interface for classifying text based on a set of labels
 * Can return zero, one, or multiple labels for each classification
 */
public interface TextClassifier<L> {
    /**
     * Classify text and return labels
     * @param text Text to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(String text);

    /**
     * Classify text segment and return labels
     * @param textSegment Text segment to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(TextSegment textSegment);

    /**
     * Classify document and return labels
     * @param document Document to classify
     * @return List of labels (may be empty)
     */
    List<L> classify(Document document);

    /**
     * Classify text and return results with scores
     * @param text Text to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(String text);

    /**
     * Classify text segment and return results with scores
     * @param textSegment Text segment to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(TextSegment textSegment);

    /**
     * Classify document and return results with scores
     * @param document Document to classify
     * @return Classification result with scored labels
     */
    ClassificationResult<L> classifyWithScores(Document document);
}

Thread Safety: The TextClassifier interface itself does not specify thread safety guarantees. Thread safety depends on the implementation. For EmbeddingModelTextClassifier, thread safety is determined by the underlying EmbeddingModel - if the embedding model is thread-safe, the classifier can be safely shared across threads. The classifier itself is immutable after construction.

Common Pitfalls:

  • Calling classify() with null text will throw NullPointerException
  • Empty string classification may produce unpredictable results depending on embedding model behavior
  • TextSegment and Document overloads extract text content before classification - metadata is ignored

Edge Cases:

  • Empty input text: May return empty list or unpredictable label depending on embedding model
  • No labels meet threshold: Returns empty list (not null)
  • All labels tied with same score: Order is implementation-dependent
  • Very long text: May be truncated by embedding model's token limits

Performance Notes:

  • Each classify() call requires one embedding API call for the input text
  • classifyWithScores() has the same performance as classify() - use it for debugging without incurring extra overhead
  • TextSegment/Document variants extract text first, adding minimal overhead

Cost Considerations:

  • One embedding API call per classification (input text)
  • Example embeddings are computed once during classifier construction
  • For high-volume classification, reuse classifier instances to avoid re-embedding examples

Exception Handling:

  • NullPointerException if text is null
  • Underlying embedding model may throw exceptions (network errors, rate limits, etc.)
  • No explicit classification-specific exceptions defined in interface
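Since failures from the embedding layer simply propagate, callers often add a defensive wrapper. The sketch below is a minimal pure-Java illustration of that pattern; the classifier is stubbed as a plain function rather than the real langchain4j interface:

```java
import java.util.List;
import java.util.function.Function;

public class SafeClassify {

    // Wrap any classify call so null input and transient failures degrade to an empty result
    static List<String> classifyOrEmpty(Function<String, List<String>> classifier, String text) {
        if (text == null) {
            return List.of();              // avoid the NullPointerException noted above
        }
        try {
            return classifier.apply(text);
        } catch (RuntimeException e) {     // network errors, rate limits, etc.
            return List.of();              // degrade gracefully; log the cause in real code
        }
    }

    public static void main(String[] args) {
        Function<String, List<String>> flaky = t -> { throw new RuntimeException("rate limit"); };
        System.out.println(classifyOrEmpty(flaky, "hello")); // prints []
    }
}
```

An empty result is indistinguishable from "no label met the threshold", so real code should also log or report the failure.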

EmbeddingModelTextClassifier

Implementation using embedding model and example-based classification.

package dev.langchain4j.classification;

/**
 * TextClassifier implementation using EmbeddingModel and predefined examples
 * Classification is performed by computing similarity between input text's embedding
 * and embeddings of labeled example texts
 *
 * Works by:
 * 1. Embedding all example texts for each label
 * 2. Embedding the input text
 * 3. Computing cosine similarity between input and all examples
 * 4. Aggregating scores per label
 * 5. Returning labels that meet score thresholds
 */
public class EmbeddingModelTextClassifier<L> implements TextClassifier<L> {
    /**
     * Constructor with default values
     * maxResults=1, minScore=0, meanToMaxScoreRatio=0.5
     * @param embeddingModel Embedding model to use
     * @param examplesByLabel Map of labels to their example texts
     */
    public EmbeddingModelTextClassifier(
        EmbeddingModel embeddingModel,
        Map<L, ? extends Collection<String>> examplesByLabel
    );

    /**
     * Full constructor with all configuration options
     * @param embeddingModel Embedding model to use
     * @param examplesByLabel Map of labels to their example texts
     * @param maxResults Maximum number of labels to return
     * @param minScore Minimum score threshold (0.0 to 1.0)
     * @param meanToMaxScoreRatio Weight of the mean vs. the max example similarity when scoring a label (0.0 to 1.0)
     */
    public EmbeddingModelTextClassifier(
        EmbeddingModel embeddingModel,
        Map<L, ? extends Collection<String>> examplesByLabel,
        int maxResults,
        double minScore,
        double meanToMaxScoreRatio
    );

    /**
     * Classify text and return labels
     * @param text Text to classify
     * @return List of labels
     */
    public List<L> classify(String text);

    /**
     * Classify text and return results with scores
     * @param text Text to classify
     * @return Classification result with scored labels
     */
    public ClassificationResult<L> classifyWithScores(String text);
}
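The five steps in the Javadoc above can be sketched with plain vectors. This is a toy illustration with hand-made stand-in embeddings, not langchain4j's actual implementation (which additionally applies maxResults, minScore, and meanToMaxScoreRatio):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SimilarityClassifierSketch {

    // Step 3: cosine similarity between two vectors
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Steps 4-5: aggregate the best similarity per label, keep labels above minScore
    static List<String> classify(double[] input,
                                 Map<String, double[][]> examplesByLabel,
                                 double minScore) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, double[][]> e : examplesByLabel.entrySet()) {
            double max = 0;
            for (double[] example : e.getValue()) {
                max = Math.max(max, cosine(input, example));
            }
            if (max >= minScore) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, double[][]> examples = Map.of(
            "tech",   new double[][] {{1, 0, 0}, {0.9, 0.1, 0}},
            "sports", new double[][] {{0, 1, 0}, {0, 0.9, 0.1}});
        // An input vector close to the "tech" examples
        System.out.println(classify(new double[] {0.95, 0.05, 0}, examples, 0.8)); // prints [tech]
    }
}
```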

Thread Safety: EmbeddingModelTextClassifier is thread-safe if the underlying EmbeddingModel is thread-safe. The classifier is immutable after construction - all examples are embedded during initialization. Multiple threads can safely call classify() concurrently if the embedding model supports concurrent requests.

Common Pitfalls:

  • Too few examples per label: Need at least 2-3 representative examples per label for reliable classification
  • Imbalanced examples: Labels with many examples may dominate over labels with few examples
  • Similar examples within a label: Redundant examples don't improve accuracy; diverse examples are better
  • Examples too similar across labels: Makes classification difficult - ensure clear semantic separation
  • Default parameters: Default maxResults=1, minScore=0, meanToMaxScoreRatio=0.5 may not suit all use cases

Edge Cases:

  • Empty input text: Behavior depends on embedding model (may throw exception or return zero vector)
  • Label with empty example collection: Label will never be returned (zero examples means zero score)
  • Single example per label: Works but accuracy suffers - use multiple diverse examples
  • maxResults exceeds number of labels: Returns all qualifying labels up to actual label count
  • minScore=1.0: Only perfect/near-perfect matches returned; may return empty list frequently
  • meanToMaxScoreRatio=0.0: Each label is scored by its single best-matching example only

Performance Notes:

  • Construction time: All examples embedded during initialization - O(total_examples) embedding calls
  • Classification time: One embedding call per classify() invocation for input text
  • Similarity computation: Computed against all example embeddings - O(total_examples) comparisons
  • Optimization: Reuse classifier instances; creating new classifiers is expensive
  • Batch considerations: Classifier processes one input at a time; no native batching support
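Although there is no native batching, independent classify() calls can be fanned out by the caller when the classifier is thread-safe. A sketch with a stubbed classifier function (not a langchain4j API):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ParallelClassifySketch {

    // Classify many inputs concurrently; safe only if the underlying classifier is thread-safe
    static Map<String, List<String>> classifyAll(Function<String, List<String>> classifier,
                                                 List<String> texts) {
        return texts.parallelStream()
                .collect(Collectors.toMap(t -> t, classifier::apply));
    }

    public static void main(String[] args) {
        Function<String, List<String>> stub =
                t -> t.contains("game") ? List.of("sports") : List.of("tech");
        System.out.println(classifyAll(stub, List.of("new game released", "compiler update")));
    }
}
```

Note that parallel calls multiply concurrent requests to the embedding provider, so rate limits still apply.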

Cost Considerations:

  • Initialization cost: N embedding API calls where N = total number of example strings across all labels
  • Per-classification cost: 1 embedding API call per classify() invocation
  • Total cost for 1000 classifications: N + 1000 embedding calls
  • Cost reduction strategies:
    • Minimize number of examples (quality over quantity)
    • Reuse classifier instances across requests
    • Cache classifier instances if examples don't change
    • Consider example text length (longer text = higher embedding cost)
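The call-count arithmetic above is easy to make concrete with a small helper (hypothetical, plain arithmetic only):

```java
public class EmbeddingCallCost {

    // Total embedding calls: one per example at construction, then one per classification
    static long totalCalls(long totalExamples, long classifications) {
        return totalExamples + classifications;
    }

    public static void main(String[] args) {
        // 4 labels x 3 examples each, then 1000 classifications
        System.out.println(totalCalls(4 * 3, 1000)); // prints 1012
    }
}
```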

Exception Handling:

  • NullPointerException: If embeddingModel or examplesByLabel is null
  • IllegalArgumentException: If maxResults < 1, minScore < 0.0, minScore > 1.0, meanToMaxScoreRatio < 0.0, meanToMaxScoreRatio > 1.0
  • EmbeddingModel exceptions: Network errors, rate limits, authentication failures propagate from embedding calls
  • RuntimeException: May occur if embedding model returns null or invalid embeddings
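Callers can mirror the documented argument constraints before construction to fail fast with clearer context. A standalone sketch (hypothetical helper, not part of langchain4j):

```java
public class ClassifierConfigCheck {

    // Mirrors the constructor's documented argument constraints
    static void validate(int maxResults, double minScore, double meanToMaxScoreRatio) {
        if (maxResults < 1) {
            throw new IllegalArgumentException("maxResults must be >= 1");
        }
        if (minScore < 0.0 || minScore > 1.0) {
            throw new IllegalArgumentException("minScore must be in [0, 1]");
        }
        if (meanToMaxScoreRatio < 0.0 || meanToMaxScoreRatio > 1.0) {
            throw new IllegalArgumentException("meanToMaxScoreRatio must be in [0, 1]");
        }
    }

    public static void main(String[] args) {
        validate(1, 0.0, 0.5); // the defaults pass
        try {
            validate(0, 0.5, 0.5);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints maxResults must be >= 1
        }
    }
}
```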

Classification Results

Result types for classification.

package dev.langchain4j.classification;

/**
 * Represents the result of classification with scored labels
 */
public class ClassificationResult<L> {
    /**
     * Constructor
     * @param scoredLabels List of scored labels
     */
    public ClassificationResult(List<ScoredLabel<L>> scoredLabels);

    /**
     * Get scored labels
     * Sorted by score in descending order (highest first)
     * @return List of scored labels
     */
    public List<ScoredLabel<L>> scoredLabels();

    /**
     * Equality check
     * @param obj Object to compare
     * @return true if equal
     */
    public boolean equals(Object obj);

    /**
     * Hash code
     * @return Hash code
     */
    public int hashCode();

    /**
     * String representation
     * @return String representation
     */
    public String toString();
}

/**
 * Represents a classification label with associated score
 */
public class ScoredLabel<L> {
    /**
     * Constructor
     * @param label The label
     * @param score The score (0.0 to 1.0)
     */
    public ScoredLabel(L label, double score);

    /**
     * Get the label
     * @return The label
     */
    public L label();

    /**
     * Get the score
     * Represents confidence/similarity (0.0 to 1.0)
     * @return The score
     */
    public double score();

    /**
     * Equality check
     * @param obj Object to compare
     * @return true if equal
     */
    public boolean equals(Object obj);

    /**
     * Hash code
     * @return Hash code
     */
    public int hashCode();

    /**
     * String representation
     * @return String representation
     */
    public String toString();
}

Thread Safety: ClassificationResult and ScoredLabel are immutable value objects and fully thread-safe. They can be safely shared across threads without synchronization.

Common Pitfalls:

  • Assuming scoredLabels() returns a modifiable list (it may be immutable)
  • Comparing scores with == instead of using appropriate threshold checks
  • Ignoring that empty scoredLabels() list means no labels met criteria
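To avoid the `==` pitfall, compare scores against a threshold rather than for equality. A standalone sketch using a stand-in record in place of ScoredLabel:

```java
import java.util.List;

public class ScoreComparison {

    // Stand-in for dev.langchain4j.classification.ScoredLabel
    record Scored(String label, double score) {}

    // Keep labels whose score clears a threshold, rather than testing score equality
    static List<String> aboveThreshold(List<Scored> scored, double threshold) {
        return scored.stream()
                .filter(s -> s.score() >= threshold)
                .map(Scored::label)
                .toList();
    }

    public static void main(String[] args) {
        List<Scored> scored = List.of(
                new Scored("SPORTS", 0.87),
                new Scored("ENTERTAINMENT", 0.42));
        System.out.println(aboveThreshold(scored, 0.5)); // prints [SPORTS]
    }
}
```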

Edge Cases:

  • Empty scoredLabels() list: Valid result when no labels meet thresholds
  • Multiple labels with identical scores: All returned, order is implementation-dependent
  • Score values: Always in range [0.0, 1.0] but practical range often [0.3, 0.95]

Performance Notes:

  • Immutable objects created once per classification
  • No allocation overhead when accessing labels/scores
  • Minimal memory footprint

Cost Considerations:

  • Negligible - simple value objects
  • No API calls or expensive operations

Exception Handling:

  • NullPointerException: If constructor receives null scoredLabels list or list contains null elements
  • No exceptions thrown by accessor methods

Usage Examples

Basic Classification with Enum Labels

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import dev.langchain4j.model.embedding.EmbeddingModel;
import java.util.List;
import java.util.Map;

enum Category {
    TECHNOLOGY,
    SPORTS,
    POLITICS,
    ENTERTAINMENT
}

// Define examples for each category
Map<Category, List<String>> examples = Map.of(
    Category.TECHNOLOGY, List.of(
        "New smartphone released with advanced AI features",
        "Software update improves system performance",
        "Tech company announces breakthrough in quantum computing"
    ),
    Category.SPORTS, List.of(
        "Team wins championship in overtime",
        "Athlete breaks world record",
        "Coach announces retirement after successful season"
    ),
    Category.POLITICS, List.of(
        "Government passes new legislation",
        "Election results announced",
        "Politicians debate policy changes"
    ),
    Category.ENTERTAINMENT, List.of(
        "New movie breaks box office records",
        "Actor wins prestigious award",
        "Music festival announces lineup"
    )
);

// Create classifier
TextClassifier<Category> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify text
String text = "Scientists develop new artificial intelligence algorithm";
List<Category> categories = classifier.classify(text);
System.out.println("Categories: " + categories); // [TECHNOLOGY]

Classification with Scores

import dev.langchain4j.classification.ClassificationResult;
import dev.langchain4j.classification.ScoredLabel;

// Classify with scores
String text = "Basketball team advances to playoffs";
ClassificationResult<Category> result = classifier.classifyWithScores(text);

for (ScoredLabel<Category> scoredLabel : result.scoredLabels()) {
    System.out.printf("%s: %.2f%n",
        scoredLabel.label(),
        scoredLabel.score()
    );
}
// Output:
// SPORTS: 0.87
// ENTERTAINMENT: 0.42

Custom Configuration

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

// Create classifier with custom thresholds
TextClassifier<Category> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    3,      // maxResults - return up to 3 labels
    0.6,    // minScore - only labels with score >= 0.6
    0.7     // meanToMaxScoreRatio - weight of mean vs max example similarity
);

String text = "Technology news about sports analytics software";
List<Category> categories = classifier.classify(text);
// May return both TECHNOLOGY and SPORTS if both score high enough

String Labels

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import java.util.List;
import java.util.Map;

// Use string labels instead of enums
Map<String, List<String>> examples = Map.of(
    "positive", List.of(
        "This product is amazing!",
        "I love it, works perfectly",
        "Excellent quality and fast shipping"
    ),
    "negative", List.of(
        "Terrible experience, very disappointed",
        "Poor quality, stopped working after a week",
        "Would not recommend to anyone"
    ),
    "neutral", List.of(
        "Product arrived on time",
        "It's okay, nothing special",
        "Average quality for the price"
    )
);

TextClassifier<String> sentimentClassifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

String review = "The product works well but shipping was slow";
List<String> sentiment = sentimentClassifier.classify(review);
System.out.println("Sentiment: " + sentiment); // [neutral]

Multi-Label Classification

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Tag {
    URGENT,
    CUSTOMER_SERVICE,
    BILLING,
    TECHNICAL,
    FEEDBACK
}

Map<Tag, List<String>> examples = Map.of(
    Tag.URGENT, List.of(
        "Need immediate assistance",
        "Critical issue, please help ASAP",
        "Emergency situation"
    ),
    Tag.CUSTOMER_SERVICE, List.of(
        "Question about my order",
        "Need help with product",
        "Service inquiry"
    ),
    Tag.BILLING, List.of(
        "Charge on my credit card",
        "Invoice question",
        "Payment issue"
    ),
    Tag.TECHNICAL, List.of(
        "Software not working properly",
        "Error message when logging in",
        "System compatibility problem"
    ),
    Tag.FEEDBACK, List.of(
        "Suggestion for improvement",
        "Love your product",
        "Could be better if..."
    )
);

// Allow multiple tags
TextClassifier<Tag> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    5,      // Allow up to 5 tags
    0.5,    // Lower threshold to catch multiple tags
    0.6
);

String email = "Urgent: Having billing issue with my account, need immediate help";
List<Tag> tags = classifier.classify(email);
System.out.println("Tags: " + tags); // [URGENT, BILLING, CUSTOMER_SERVICE]

Document Classification

import dev.langchain4j.classification.EmbeddingModelTextClassifier;
import dev.langchain4j.classification.TextClassifier;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;

enum DocumentType {
    CONTRACT,
    INVOICE,
    REPORT,
    EMAIL,
    MEMO
}

Map<DocumentType, List<String>> examples = Map.of(
    DocumentType.CONTRACT, List.of(
        "This agreement is entered into...",
        "Terms and conditions...",
        "Parties agree to the following..."
    ),
    DocumentType.INVOICE, List.of(
        "Invoice #12345, Amount due...",
        "Payment terms: Net 30...",
        "Total: $1,234.56"
    ),
    DocumentType.REPORT, List.of(
        "Executive Summary...",
        "Analysis shows...",
        "Quarterly results indicate..."
    )
    // ... more examples
);

TextClassifier<DocumentType> classifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify document
Document document = FileSystemDocumentLoader.loadDocument(Path.of("document.txt"));
List<DocumentType> types = classifier.classify(document);
System.out.println("Document type: " + types);

Content Moderation

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum ContentFlag {
    SAFE,
    SPAM,
    INAPPROPRIATE,
    PROMOTIONAL
}

Map<ContentFlag, List<String>> examples = Map.of(
    ContentFlag.SAFE, List.of(
        "Thanks for the helpful information",
        "Great discussion about the topic",
        "I appreciate your response"
    ),
    ContentFlag.SPAM, List.of(
        "Buy now! Limited time offer!!!",
        "Click here to win $$$",
        "Free money, click this link"
    ),
    ContentFlag.INAPPROPRIATE, List.of(
        "Offensive language...",
        "Harassing message...",
        "Hate speech..."
    ),
    ContentFlag.PROMOTIONAL, List.of(
        "Check out our new product",
        "Visit our website for deals",
        "Sign up for our newsletter"
    )
);

TextClassifier<ContentFlag> moderator = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples,
    1,      // Single most likely flag
    0.7,    // High threshold for confidence
    0.8
);

String userComment = "Thanks for sharing this information!";
List<ContentFlag> flags = moderator.classify(userComment);

if (flags.isEmpty() || flags.contains(ContentFlag.SAFE)) {
    System.out.println("Comment approved");
} else {
    System.out.println("Comment flagged: " + flags);
}

Intent Classification for Chatbots

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Intent {
    GREETING,
    QUESTION,
    COMPLAINT,
    REQUEST,
    GOODBYE
}

Map<Intent, List<String>> examples = Map.of(
    Intent.GREETING, List.of(
        "Hello",
        "Hi there",
        "Good morning"
    ),
    Intent.QUESTION, List.of(
        "How do I...",
        "What is...",
        "Can you explain..."
    ),
    Intent.COMPLAINT, List.of(
        "I'm not happy with...",
        "This is not working",
        "Very disappointed"
    ),
    Intent.REQUEST, List.of(
        "Please send me...",
        "I need help with...",
        "Can you provide..."
    ),
    Intent.GOODBYE, List.of(
        "Thanks, goodbye",
        "That's all, bye",
        "Thank you for your help"
    )
);

TextClassifier<Intent> intentClassifier = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

// Classify user intent
String userMessage = "Hi, can you help me with my order?";
List<Intent> intents = intentClassifier.classify(userMessage);

// Route to appropriate handler
if (intents.contains(Intent.GREETING)) {
    System.out.println("Bot: Hello! How can I help you today?");
}
if (intents.contains(Intent.QUESTION) || intents.contains(Intent.REQUEST)) {
    System.out.println("Bot: I'd be happy to help with your order...");
}

Language Detection

import dev.langchain4j.classification.EmbeddingModelTextClassifier;

enum Language {
    ENGLISH,
    SPANISH,
    FRENCH,
    GERMAN,
    ITALIAN
}

Map<Language, List<String>> examples = Map.of(
    Language.ENGLISH, List.of(
        "Hello, how are you today?",
        "The weather is nice",
        "I would like to order"
    ),
    Language.SPANISH, List.of(
        "Hola, ¿cómo estás hoy?",
        "El clima es agradable",
        "Me gustaría ordenar"
    ),
    Language.FRENCH, List.of(
        "Bonjour, comment allez-vous aujourd'hui?",
        "Le temps est agréable",
        "Je voudrais commander"
    )
    // ... more examples
);

TextClassifier<Language> languageDetector = new EmbeddingModelTextClassifier<>(
    embeddingModel,
    examples
);

String text = "Bonjour! Comment puis-je vous aider?";
List<Language> detected = languageDetector.classify(text);
System.out.println("Detected language: " + detected); // [FRENCH]

Configuration Parameters

maxResults

Maximum number of labels to return. Set to 1 for single-label classification or higher for multi-label classification.

Default: 1

Range: 1 to Integer.MAX_VALUE (practical maximum is number of labels)

Usage guidance:

  • Single-label classification: maxResults=1
  • Multi-label classification: maxResults=3 to 5 typical
  • Unlimited labels: Set to number of labels or higher
  • Combine with minScore to control quality vs quantity tradeoff

minScore

Minimum similarity score threshold (0.0 to 1.0). Only labels with scores above this threshold are returned.

Default: 0.0 (no filtering)

Range: 0.0 to 1.0

Usage guidance:

  • Permissive classification: 0.3 to 0.5
  • Balanced classification: 0.5 to 0.7
  • High-confidence only: 0.7 to 0.9
  • Very strict: 0.9+
  • Typical scores range from 0.3 to 0.95 in practice

meanToMaxScoreRatio

Blends each label's mean example similarity with its maximum single-example similarity when computing the label's aggregated score. For example, with ratio 0.5 (the default), a label whose examples score mean=0.6 and max=0.9 against the input receives an aggregated score of 0.75.

Default: 0.5

Range: 0.0 to 1.0

Calculation: labelScore = ratio * meanScore + (1 - ratio) * maxScore

Usage guidance:

  • 0.0: Only the best-matching example counts - useful when a single strong match should decide
  • 0.3 to 0.5: Favor the best match, smoothed by the mean
  • 0.5 (default): Equal weight to mean and max
  • 0.7 to 1.0: Favor consensus across all of a label's examples

Example scenarios:

  • meanScore=0.6, maxScore=0.9, ratio=0.5: labelScore = 0.75
  • meanScore=0.6, maxScore=0.9, ratio=0.0: labelScore = 0.90
  • meanScore=0.6, maxScore=0.9, ratio=1.0: labelScore = 0.60

Testing Patterns

Unit Testing with Known Examples

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

@Test
void testClassifierWithKnownExamples() {
    // Use simple examples that should clearly classify
    Map<String, List<String>> examples = Map.of(
        "positive", List.of("great", "excellent", "amazing"),
        "negative", List.of("terrible", "awful", "horrible")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples
    );

    // Test positive classification
    List<String> result = classifier.classify("This is great!");
    assertTrue(result.contains("positive"));

    // Test negative classification
    result = classifier.classify("This is terrible!");
    assertTrue(result.contains("negative"));
}

Testing with Score Thresholds

@Test
void testScoreThresholds() {
    Map<String, List<String>> examples = Map.of(
        "tech", List.of("computer", "software", "algorithm"),
        "sports", List.of("football", "basketball", "tennis")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        2,      // Return up to 2 labels
        0.7,    // Only high-confidence results
        0.8
    );

    ClassificationResult<String> result = classifier.classifyWithScores("computer");

    // Verify score threshold
    for (ScoredLabel<String> scored : result.scoredLabels()) {
        assertTrue(scored.score() >= 0.7);
    }
}

Testing Edge Cases

@Test
void testEmptyResult() {
    Map<String, List<String>> examples = Map.of(
        "tech", List.of("computer", "software"),
        "sports", List.of("football", "basketball")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        1,
        0.9,    // Very high threshold
        0.9
    );

    // Unrelated text may return empty list
    List<String> result = classifier.classify("The weather is nice today");
    assertNotNull(result);
    // May be empty if no label meets threshold
}

@Test
void testMultiLabelClassification() {
    Map<String, List<String>> examples = Map.of(
        "urgent", List.of("ASAP", "emergency", "critical"),
        "billing", List.of("invoice", "payment", "charge")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        embeddingModel,
        examples,
        5,      // Allow multiple labels
        0.5,
        0.6
    );

    List<String> result = classifier.classify("Urgent: billing issue");
    assertTrue(result.size() >= 1);
    // Should likely contain both labels
}

Mock Testing

@Test
void testWithMockEmbeddingModel() {
    EmbeddingModel mockModel = mock(EmbeddingModel.class);

    // Mock returns fixed embeddings
    when(mockModel.embed(anyString())).thenReturn(
        Response.from(Embedding.from(new float[]{0.1f, 0.2f, 0.3f}))
    );

    Map<String, List<String>> examples = Map.of(
        "label1", List.of("example1")
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        mockModel,
        examples
    );

    List<String> result = classifier.classify("test");

    // Verify embedding was called
    verify(mockModel, atLeast(1)).embed(anyString());
}

Integration Testing

@Test
void testWithRealEmbeddingModel() {
    // Use actual embedding model for integration test
    EmbeddingModel realModel = OpenAiEmbeddingModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("text-embedding-3-small")
        .build();

    Map<String, List<String>> examples = Map.of(
        "technology", List.of(
            "artificial intelligence and machine learning",
            "software development and programming",
            "computer hardware and systems"
        )
    );

    TextClassifier<String> classifier = new EmbeddingModelTextClassifier<>(
        realModel,
        examples
    );

    List<String> result = classifier.classify("deep learning neural networks");
    assertEquals(1, result.size());
    assertEquals("technology", result.get(0));
}

Example Selection Guide

Minimum Examples per Label

Recommendation: 3-5 examples per label

  • 2 examples: Minimum viable, but accuracy suffers
  • 3-5 examples: Good balance for most use cases
  • 6-10 examples: Better accuracy, diminishing returns after 10
  • 10+ examples: Only if examples are highly diverse

Example Diversity

Good diversity (recommended):

Map.of(
    "tech", List.of(
        "AI and machine learning algorithms",           // AI focus
        "Software development best practices",          // Development focus
        "Computer hardware specifications",             // Hardware focus
        "Cloud computing infrastructure design"         // Cloud focus
    )
)

Poor diversity (avoid):

Map.of(
    "tech", List.of(
        "AI machine learning",                          // Too similar
        "Machine learning AI",                          // Redundant
        "Artificial intelligence ML",                   // Redundant
        "ML and AI algorithms"                          // Redundant
    )
)

Example Length

Recommendation: Match expected input length

  • Short inputs (1-10 words): Use short examples
  • Medium inputs (10-50 words): Use medium examples
  • Long inputs (50+ words): Use longer examples or extract key sentences

// For short inputs like tags or keywords
Map.of(
    "urgent", List.of("ASAP", "emergency", "critical")
)

// For medium inputs like emails or comments
Map.of(
    "complaint", List.of(
        "I'm very disappointed with the service I received",
        "The product quality is unacceptable and I want a refund",
        "This is the third time I've had issues with your company"
    )
)

Balanced vs Imbalanced Examples

Balanced (recommended for equal importance):

Map.of(
    "positive", List.of("great", "excellent", "amazing"),      // 3 examples
    "negative", List.of("terrible", "awful", "horrible"),      // 3 examples
    "neutral", List.of("okay", "average", "fine")              // 3 examples
)

Imbalanced (acceptable if reflects real-world distribution):

Map.of(
    "spam", List.of("win money", "click here", "free offer"),  // 3 examples
    "safe", List.of(                                           // 6 examples
        "thanks for info",
        "great discussion",
        "helpful response",
        "interesting article",
        "good point",
        "agreed"
    )
)

Cross-Label Similarity

Good separation (recommended):

Map.of(
    "tech", List.of("computer software", "programming code"),
    "sports", List.of("football game", "basketball match")
)

Poor separation (avoid):

Map.of(
    "tech_news", List.of("technology announcement", "tech release"),
    "tech_tutorial", List.of("technology guide", "tech instructions")
    // These are too similar - consider combining into one label
)

Accuracy Improvement Patterns

Pattern 1: Iterative Example Refinement

// Start with simple examples
Map<String, List<String>> v1_examples = Map.of(
    "positive", List.of("good", "great", "excellent")
);

// Test and identify misclassifications
TextClassifier<String> v1 = new EmbeddingModelTextClassifier<>(model, v1_examples);
// If "fantastic" misclassifies, add it as example

// Refine with more diverse examples
Map<String, List<String>> v2_examples = Map.of(
    "positive", List.of(
        "good",                                 // Simple positive
        "great experience",                     // With noun
        "excellent quality",                    // With adjective
        "highly recommend",                     // Different phrasing
        "very satisfied with purchase"          // Complete sentence
    )
);

Pattern 2: Score Analysis and Threshold Tuning

// Step 1: Use classifyWithScores to analyze confidence levels
ClassificationResult<String> result = classifier.classifyWithScores(testText);
for (ScoredLabel<String> scored : result.scoredLabels()) {
    System.out.printf("%s: %.4f%n", scored.label(), scored.score());
}

// Step 2: Analyze score distribution
// If scores are typically 0.4-0.6: Lower minScore to 0.3-0.4
// If scores are typically 0.7-0.9: Raise minScore to 0.6-0.7

// Step 3: Adjust parameters based on observations
TextClassifier<String> tuned = new EmbeddingModelTextClassifier<>(
    model,
    examples,
    1,
    0.5,    // Adjusted based on score analysis
    0.65    // Adjusted based on multi-label needs
);

Pattern 3: Negative Examples Strategy

// When certain texts consistently misclassify, add them as examples to correct label
Map<String, List<String>> improved = Map.of(
    "tech", List.of(
        "computer software",
        "programming code",
        "digital technology",                   // Was misclassifying as "business"
        "software engineering"                  // Was misclassifying as "education"
    ),
    "business", List.of(
        "company strategy",
        "market analysis",
        "corporate management"
        // Note: Don't add tech terms here even if they appear in business context
    )
);

Pattern 4: Hierarchical Classification

// For complex label sets, use a two-stage approach

// Declare both label enums up front (local enum declarations inside a
// block require Java 16+; nested or top-level declarations work everywhere)
enum BroadCategory { TECHNICAL, NON_TECHNICAL }
enum TechSubcategory { SOFTWARE, HARDWARE, NETWORKING }

// Stage 1: Broad categorization
TextClassifier<BroadCategory> stage1 = new EmbeddingModelTextClassifier<>(
    model,
    broadExamples
);

// Stage 2: Fine-grained classification, run only when stage 1 matches
List<BroadCategory> broad = stage1.classify(text);
if (broad.contains(BroadCategory.TECHNICAL)) {
    TextClassifier<TechSubcategory> stage2 = new EmbeddingModelTextClassifier<>(
        model,
        techExamples
    );
    List<TechSubcategory> specific = stage2.classify(text);
}

Pattern 5: Example Augmentation

// Original examples
List<String> original = List.of(
    "great product",
    "excellent service",
    "amazing quality"
);

// Augment with variations (maintain semantic meaning)
List<String> augmented = new ArrayList<>(original);
augmented.addAll(List.of(
    "product is great",                         // Word order variation
    "service was excellent",                    // Tense variation
    "truly amazing quality",                    // Adverb addition
    "great product, highly recommend",          // Context addition
    "excellent customer service"                // Specificity variation
));

Map<String, List<String>> examples = Map.of("positive", augmented);

Pattern 6: Ensemble Classification

// Create multiple classifiers with different configurations
TextClassifier<String> conservative = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.8, 0.8
);
TextClassifier<String> moderate = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.6, 0.7
);
TextClassifier<String> permissive = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.4, 0.5
);

// Combine results (voting or intersection)
List<String> result1 = conservative.classify(text);
List<String> result2 = moderate.classify(text);
List<String> result3 = permissive.classify(text);

// Intersection: Only labels agreed by all
Set<String> intersection = new HashSet<>(result1);
intersection.retainAll(result2);
intersection.retainAll(result3);

// Union: Any label suggested by any classifier
Set<String> union = new HashSet<>(result1);
union.addAll(result2);
union.addAll(result3);

// Voting: Majority wins
Map<String, Integer> votes = new HashMap<>();
for (String label : union) {
    int count = 0;
    if (result1.contains(label)) count++;
    if (result2.contains(label)) count++;
    if (result3.contains(label)) count++;
    votes.put(label, count);
}
List<String> finalResult = votes.entrySet().stream()
    .filter(e -> e.getValue() >= 2)  // At least 2 classifiers agree
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());
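The voting logic above can be factored into a reusable helper so it works for any number of classifiers. A minimal, framework-free sketch (using `List<List<String>>` for the classifier results; class and method names are illustrative, not part of langchain4j):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class MajorityVote {

    /**
     * Returns the labels that appear in at least {@code minVotes}
     * of the given classifier results.
     */
    public static List<String> vote(List<List<String>> results, int minVotes) {
        Map<String, Integer> votes = new HashMap<>();
        for (List<String> result : results) {
            // Deduplicate within one result so a classifier cannot vote twice
            for (String label : new HashSet<>(result)) {
                votes.merge(label, 1, Integer::sum);
            }
        }
        List<String> winners = new ArrayList<>();
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() >= minVotes) {
                winners.add(e.getKey());
            }
        }
        return winners;
    }

    public static void main(String[] args) {
        List<String> winners = vote(List.of(
                List.of("tech", "business"),
                List.of("tech"),
                List.of("business", "sports")
        ), 2);
        System.out.println(winners); // "tech" and "business" each got 2 votes
    }
}
```

With three classifiers and `minVotes = 2` this reproduces the majority rule above; raising `minVotes` to the number of classifiers gives the intersection behavior instead.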

Pattern 7: A/B Testing Classification Strategies

// Track accuracy metrics for different configurations
class ClassificationMetrics {
    int totalClassifications;
    int correctClassifications;
    Map<String, Integer> labelCounts = new HashMap<>();

    double accuracy() {
        return totalClassifications == 0
                ? 0.0
                : (double) correctClassifications / totalClassifications;
    }
}

// Test configuration A
TextClassifier<String> configA = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.6, 0.7
);
ClassificationMetrics metricsA = evaluateClassifier(configA, testSet);

// Test configuration B
TextClassifier<String> configB = new EmbeddingModelTextClassifier<>(
    model, examples, 1, 0.5, 0.6
);
ClassificationMetrics metricsB = evaluateClassifier(configB, testSet);

// Choose better configuration
TextClassifier<String> production =
    metricsA.accuracy() > metricsB.accuracy() ? configA : configB;
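The `evaluateClassifier` helper used above is not part of langchain4j. One way to sketch it, using `Function<String, List<String>>` as a stand-in for `TextClassifier` so the logic is easy to unit-test offline (`LabeledText`, `Metrics`, and the field names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class Evaluation {

    /** A test item pairing input text with its expected label. */
    record LabeledText(String text, String expectedLabel) {}

    /** Simple accuracy metrics, mirroring ClassificationMetrics above. */
    record Metrics(int total, int correct, Map<String, Integer> labelCounts) {
        double accuracy() {
            return total == 0 ? 0.0 : (double) correct / total;
        }
    }

    /**
     * Runs the classifier over a labeled test set and counts how often
     * the expected label appears among the predicted labels.
     */
    static Metrics evaluateClassifier(
            Function<String, List<String>> classifier,
            List<LabeledText> testSet) {
        int correct = 0;
        Map<String, Integer> counts = new HashMap<>();
        for (LabeledText item : testSet) {
            List<String> predicted = classifier.apply(item.text());
            for (String label : predicted) {
                counts.merge(label, 1, Integer::sum);
            }
            if (predicted.contains(item.expectedLabel())) {
                correct++;
            }
        }
        return new Metrics(testSet.size(), correct, counts);
    }

    public static void main(String[] args) {
        // Toy classifier: anything mentioning "code" is "tech"
        Function<String, List<String>> toy =
                text -> text.contains("code") ? List.of("tech") : List.of("other");
        Metrics m = evaluateClassifier(toy, List.of(
                new LabeledText("writing code", "tech"),
                new LabeledText("market report", "other"),
                new LabeledText("code review", "other")   // deliberately mislabeled
        ));
        System.out.printf("accuracy = %.2f%n", m.accuracy()); // 2 of 3 correct
    }
}
```

To evaluate a real `EmbeddingModelTextClassifier`, pass `configA::classify` as the function. Counting "correct" as "expected label among the predictions" is one choice; exact-match or precision/recall may suit multi-label setups better.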

Best Practices

  1. Provide diverse examples: Include varied examples for each label to improve accuracy
  2. Balance examples: Try to provide similar numbers of examples for each label
  3. Quality over quantity: A few high-quality, representative examples are better than many poor examples
  4. Tune thresholds: Adjust minScore and meanToMaxScoreRatio based on your accuracy requirements
  5. Monitor scores: Use classifyWithScores() to understand confidence levels
  6. Handle ambiguity: Consider multi-label classification for cases where multiple labels may apply
  7. Cache classifiers: Create once, reuse many times to avoid re-embedding examples
  8. Test thoroughly: Use known examples to validate classifier behavior before production
  9. Iterate on examples: Refine examples based on misclassifications and edge cases
  10. Consider costs: Balance accuracy needs with embedding API costs (examples are embedded once at construction; every classification embeds the input text)
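Practice 7 (cache classifiers) matters because constructing an `EmbeddingModelTextClassifier` embeds every example. A minimal lazy-holder sketch, with `Function<String, List<String>>` standing in for `TextClassifier` so the caching behavior can be demonstrated without an embedding model (names are illustrative):

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class ClassifierCache {

    /**
     * Wraps an expensive factory so the instance is built once,
     * on first use, and reused afterwards (thread-safe via synchronization).
     */
    static class Lazy<T> implements Supplier<T> {
        private final Supplier<T> factory;
        private T instance;

        Lazy(Supplier<T> factory) { this.factory = factory; }

        @Override
        public synchronized T get() {
            if (instance == null) {
                instance = factory.get();   // expensive step runs only here
            }
            return instance;
        }
    }

    static int buildCount = 0;

    public static void main(String[] args) {
        Lazy<Function<String, List<String>>> cached = new Lazy<>(() -> {
            buildCount++;                   // in real use: embeds all examples
            return text -> List.of("positive");
        });

        cached.get().apply("great product");
        cached.get().apply("excellent service");
        System.out.println(buildCount);     // built exactly once
    }
}
```

In an application, the same effect is usually achieved by making the classifier a singleton bean or a `static final` field rather than constructing it per request.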

Related APIs

Embedding APIs

  • EmbeddingModel: Core dependency for classification - see embedding documentation
  • EmbeddingStore: For vector database storage of embeddings - see embedding store documentation
  • EmbeddingStoreIngestor: For bulk document processing - see ingestion documentation

Alternative Classification Approaches

  • Few-shot prompting with ChatLanguageModel: Use LLM with examples in prompt instead of embeddings
  • Function calling: Define classification labels as function parameters
  • RAG-based classification: Retrieve similar examples from vector store, then classify with LLM

When to use EmbeddingModelTextClassifier vs alternatives:

  • Use embedding-based: Fast and cost-effective for high-volume workloads with a fixed set of labels
  • Use LLM-based: More flexible, handles nuanced cases, can explain reasoning, but slower and more expensive
  • Use hybrid: Embedding filter + LLM refinement for best balance
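One possible shape for the hybrid option: let the cheap embedding classifier narrow the candidates, and call the LLM only when more than one label survives. The sketch below uses `Function` stand-ins for the classifier and the chat model; the prompt wording and all names are illustrative, not a langchain4j API:

```java
import java.util.List;
import java.util.function.Function;

public class HybridClassifier {

    /**
     * Embedding-based pre-filter narrows the candidates cheaply;
     * the LLM only has to pick among the survivors.
     */
    static String classifyHybrid(
            Function<String, List<String>> embeddingClassifier, // stand-in for TextClassifier
            Function<String, String> llm,                       // stand-in for a chat model
            String text) {
        List<String> candidates = embeddingClassifier.apply(text);
        if (candidates.isEmpty()) {
            return "unknown";               // nothing passed the threshold
        }
        if (candidates.size() == 1) {
            return candidates.get(0);       // unambiguous: no LLM call needed
        }
        String prompt = "Classify the text into exactly one of " + candidates
                + ".\nText: " + text + "\nAnswer with the label only.";
        return llm.apply(prompt).trim();
    }

    public static void main(String[] args) {
        Function<String, List<String>> filter = t -> List.of("tech", "business");
        Function<String, String> fakeLlm = prompt -> "tech";
        System.out.println(classifyHybrid(filter, fakeLlm, "cloud software pricing"));
    }
}
```

The design keeps LLM cost proportional to the ambiguous fraction of traffic: clear-cut inputs are resolved by embeddings alone.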

Integration patterns:

// Pattern 1: Classification before RAG
List<Category> categories = classifier.classify(userQuery);
// Use category to filter document retrieval

// Pattern 2: Classification after RAG
String retrievedContext = retrieveDocuments(userQuery);
List<Intent> intents = classifier.classify(userQuery);
// Use intent to select response template

// Pattern 3: Classification for routing
List<Department> departments = classifier.classify(customerEmail);
// Route to appropriate handler based on classification
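Integration pattern 3 can be fleshed out as a small router with an explicit fallback for low-confidence inputs. A sketch with a `Function` stand-in for the classifier; the `Department` values and queue names are illustrative:

```java
import java.util.List;
import java.util.function.Function;

public class EmailRouter {

    enum Department { SALES, SUPPORT, BILLING }

    /** Routes to the first classified department, with a fallback queue. */
    static String route(Function<String, List<Department>> classifier, String email) {
        List<Department> departments = classifier.apply(email);
        if (departments.isEmpty()) {
            return "triage-queue";          // nothing confident enough: human triage
        }
        return switch (departments.get(0)) {
            case SALES   -> "sales-queue";
            case SUPPORT -> "support-queue";
            case BILLING -> "billing-queue";
        };
    }

    public static void main(String[] args) {
        // Toy classifier: mentions of "invoice" go to billing, else no match
        Function<String, List<Department>> toy =
                email -> email.contains("invoice")
                        ? List.of(Department.BILLING)
                        : List.of();
        System.out.println(route(toy, "Question about my invoice")); // billing-queue
        System.out.println(route(toy, "Hello there"));               // triage-queue
    }
}
```

Because `classify()` returns an empty list when no label clears `minScore`, the fallback branch gives a natural place to send uncertain messages to a human instead of misrouting them.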

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j
