tessl/maven-dev-langchain4j--langchain4j

Build LLM-powered applications in Java with support for chatbots, agents, RAG, tools, and much more

Output Parsing

Automatic conversion of LLM outputs to Java types including primitives, dates, enums, POJOs, and collections. Output parsers handle formatting instructions and parsing of LLM responses.

Capabilities

OutputParser Interface

Base interface for all output parsers.

package dev.langchain4j.service.output;

/**
 * Interface for parsing LLM output to desired types
 */
public interface OutputParser<T> {
    /**
     * Parse LLM output text to target type
     * @param text Output text from LLM
     * @return Parsed object of type T
     */
    T parse(String text);

    /**
     * Get format instructions to include in prompt
     * These instructions tell the LLM how to format its response
     * @return Format instructions string
     */
    String formatInstructions();
}

Thread Safety: All built-in OutputParser implementations are stateless and thread-safe. Multiple threads can safely call parse() and formatInstructions() concurrently on the same parser instance. Custom implementations should maintain this guarantee.

Common Pitfalls:

  • LLMs may include explanatory text before/after the actual output - parsers handle this by extracting relevant data
  • LLMs may not strictly follow format instructions, especially with complex schemas
  • Custom parser implementations must handle whitespace and newlines in LLM output
  • formatInstructions() should be clear and unambiguous to guide the LLM

Edge Cases:

  • Empty string input: Most parsers throw OutputParsingException for empty or blank strings
  • Null input: parse(null) throws NullPointerException - always validate input
  • Multiple values in output: Primitive parsers extract the first valid value they find
  • Mixed case in output: Case sensitivity depends on the parser type (e.g., enums are case-insensitive, JSON keys are case-sensitive)

Performance Notes:

  • parse() overhead is minimal for primitive types (regex-based extraction)
  • POJO parsers incur JSON parsing overhead using Jackson
  • formatInstructions() should be called once and cached when creating prompts
  • Avoid creating new parser instances repeatedly - reuse them where possible

Exception Handling:

  • parse() throws OutputParsingException (unchecked) when parsing fails
  • OutputParsingException may wrap underlying exceptions (e.g., JsonProcessingException, NumberFormatException)
  • Always catch OutputParsingException when parsing untrusted LLM output
  • Exception messages include the original text and parsing error details
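
The points above about custom implementations (statelessness, whitespace handling, clear format instructions, unchecked failures) can be illustrated with a minimal sketch. It uses only the JDK: the nested OutputParser interface is a local stand-in that mirrors the one shown above, and ParseFailure and DurationParser are hypothetical names that are not part of LangChain4j.

```java
import java.time.Duration;
import java.time.format.DateTimeParseException;
import java.util.Objects;

final class DurationParserSketch {

    // Local stand-in mirroring the OutputParser<T> interface shown above.
    interface OutputParser<T> {
        T parse(String text);
        String formatInstructions();
    }

    // Hypothetical unchecked exception standing in for OutputParsingException.
    static final class ParseFailure extends RuntimeException {
        ParseFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Holds no mutable state, so one instance can be shared across threads.
    static final class DurationParser implements OutputParser<Duration> {
        @Override
        public Duration parse(String text) {
            Objects.requireNonNull(text, "text"); // null input fails fast with NPE
            String trimmed = text.trim();         // drop stray whitespace and newlines
            if (trimmed.isEmpty()) {
                throw new ParseFailure("Empty output", null);
            }
            try {
                return Duration.parse(trimmed);
            } catch (DateTimeParseException e) {
                // Keep the underlying cause available via getCause().
                throw new ParseFailure("Not an ISO-8601 duration: " + text, e);
            }
        }

        @Override
        public String formatInstructions() {
            return "an ISO-8601 duration such as PT15M, with no other text";
        }
    }
}
```

With this sketch, new DurationParserSketch.DurationParser().parse("  PT15M\n") yields a 15-minute Duration, while free-form text such as "about fifteen minutes" raises ParseFailure with the DateTimeParseException as its cause.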

OutputParserFactory

Factory interface for creating output parsers.

package dev.langchain4j.service.output;

/**
 * Factory interface for creating output parsers
 */
public interface OutputParserFactory {
    /**
     * Create output parser for the given type
     * @param returnType Type to parse to
     * @return OutputParser instance
     */
    OutputParser<?> create(Type returnType);
}

/**
 * Default implementation of OutputParserFactory
 * Automatically selected based on return type
 */
public class DefaultOutputParserFactory implements OutputParserFactory {
    /**
     * Create output parser for the given type
     * @param returnType Type to parse to
     * @return OutputParser instance
     */
    public OutputParser<?> create(Type returnType);
}

Thread Safety: DefaultOutputParserFactory is thread-safe and can be shared across multiple threads. It creates new parser instances per invocation, which are themselves thread-safe.

Common Pitfalls:

  • Unknown types throw IllegalArgumentException - ensure the return type is supported
  • Generic type information is erased at runtime - use ParameterizedType for collections
  • Custom factories must handle all possible Type subclasses (Class, ParameterizedType, etc.)
  • Factory selection happens at AiServices creation time, not per method call

Edge Cases:

  • Array types are not supported - use List or Set instead
  • Wildcard generics (List<?>) default to List<String> behavior
  • Raw types (List without generic) are treated as List<String>
  • Multiple nested generics (Map<String, List<Person>>) are not supported

Performance Notes:

  • DefaultOutputParserFactory uses type introspection - this happens once during AiServices proxy creation
  • Parser creation is fast (no reflection on parse() calls)
  • Consider implementing custom OutputParserFactory for specialized types to avoid repeated type checking

Exception Handling:

  • create() throws IllegalArgumentException for unsupported types
  • No checked exceptions are thrown during parser creation
  • Validation happens eagerly at AiServices creation time, not during method invocation

Related APIs:

  • OutputParser - Created by this factory
  • AiServices - Uses this factory during proxy creation
  • PojoOutputParser - For custom object types
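
To make the point about Type subclasses concrete, here is a rough sketch of how a factory might dispatch on Class versus ParameterizedType. It is not the DefaultOutputParserFactory implementation: TypeDispatchSketch, Sample, parserKindFor, and returnTypeOf are hypothetical names, the method returns a label instead of a parser, and only a handful of types are covered.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

final class TypeDispatchSketch {

    // Hypothetical service interface, used only to obtain generic return types.
    interface Sample {
        List<String> keywords();
        int count();
        List<List<String>> nested();
    }

    // Returns a label for the kind of parser a factory might select.
    static String parserKindFor(Type returnType) {
        if (returnType instanceof Class) {
            Class<?> raw = (Class<?>) returnType;
            if (raw == int.class || raw == Integer.class) return "integer";
            if (raw == String.class) return "string";
            if (raw.isEnum()) return "enum";
            if (raw.isArray()) {
                throw new IllegalArgumentException("Arrays unsupported: " + raw);
            }
            return "pojo";
        }
        if (returnType instanceof ParameterizedType) {
            ParameterizedType parameterized = (ParameterizedType) returnType;
            Type element = parameterized.getActualTypeArguments()[0];
            // Only one level of generics is handled; nested generics are rejected below.
            if (parameterized.getRawType() == List.class && element instanceof Class) {
                return "list-of-" + parserKindFor(element);
            }
        }
        throw new IllegalArgumentException("Unsupported type: " + returnType);
    }

    // Reads the generic return type of a Sample method via reflection.
    static Type returnTypeOf(String methodName) {
        try {
            return Sample.class.getMethod(methodName).getGenericReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The element type of List<String> is only recoverable here because Method.getGenericReturnType() returns a ParameterizedType; a plain List.class token carries no element type.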

Primitive Type Parsers

Parsers for Java primitive types and their wrappers.

package dev.langchain4j.service.output;

/**
 * Parses output to boolean
 */
public class BooleanOutputParser implements OutputParser<Boolean> {
    public Boolean parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to byte
 */
public class ByteOutputParser implements OutputParser<Byte> {
    public Byte parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to short
 */
public class ShortOutputParser implements OutputParser<Short> {
    public Short parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to int
 */
public class IntegerOutputParser implements OutputParser<Integer> {
    public Integer parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to long
 */
public class LongOutputParser implements OutputParser<Long> {
    public Long parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to float
 */
public class FloatOutputParser implements OutputParser<Float> {
    public Float parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to double
 */
public class DoubleOutputParser implements OutputParser<Double> {
    public Double parse(String text);
    public String formatInstructions();
}

Thread Safety: All primitive type parsers are stateless and thread-safe. They use immutable regex patterns and can be safely shared across threads.

Common Pitfalls:

  • BooleanOutputParser accepts variations like "yes/no", "true/false", "1/0" - case-insensitive
  • Numeric parsers extract the first number found in text - "The answer is 42 or maybe 43" returns 42
  • LLMs may include thousands separators (1,000) which can cause NumberFormatException
  • Floating point parsers accept both dot (1.5) and comma (1,5) as decimal separators depending on locale
  • Byte and Short parsers validate range - values outside [-128,127] or [-32768,32767], respectively, cause an OutputParsingException

Edge Cases:

  • BooleanOutputParser: "maybe", "unknown", "N/A" throw OutputParsingException
  • Empty strings or only whitespace throw OutputParsingException
  • Scientific notation (1.5e10) is supported by Float/Double parsers but not Integer/Long
  • Negative numbers must have minus sign directly adjacent to digits (-42 works, "- 42" may fail)
  • Leading zeros (007) are parsed correctly as 7

Performance Notes:

  • Regex-based extraction is very fast (microseconds)
  • No object allocation except for the returned wrapper object
  • BooleanOutputParser is slightly slower due to multiple pattern checks
  • Consider primitive return types (int, boolean) instead of wrappers to avoid autoboxing overhead

Exception Handling:

  • NumberFormatException wrapped in OutputParsingException for invalid numeric formats
  • OutputParsingException with descriptive message when no valid value found
  • Range overflow for Byte/Short wrapped in OutputParsingException

Related APIs:

  • BigIntegerOutputParser, BigDecimalOutputParser - For arbitrary precision
  • EnumOutputParser - For constrained value sets
  • StringListOutputParser - For extracting multiple values
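
The "first number wins" behavior and the Byte range check described above can be approximated with plain JDK regex code. This is a sketch under assumptions, not the library's implementation: FirstIntegerSketch and its methods are hypothetical, and it throws IllegalArgumentException where the real parsers are documented to throw OutputParsingException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstIntegerSketch {

    // Optional minus sign directly followed by digits; compiled once and immutable.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    // Returns the first integer literal found in the text.
    static long firstInteger(String text) {
        Matcher matcher = INTEGER.matcher(text);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No integer found in: " + text);
        }
        return Long.parseLong(matcher.group());
    }

    // Narrows to byte, rejecting values outside [-128, 127].
    static byte firstByte(String text) {
        long value = firstInteger(text);
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("Out of byte range: " + value);
        }
        return (byte) value;
    }
}
```

In this sketch, "The answer is 42 or maybe 43" yields 42 and "007" yields 7. Note that a naive pattern like this would read "1,000" as 1, which is one reason to ask the LLM for digits without thousands separators.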

Number Type Parsers

Parsers for arbitrary precision numbers.

package dev.langchain4j.service.output;

/**
 * Parses output to BigInteger
 */
public class BigIntegerOutputParser implements OutputParser<BigInteger> {
    public BigInteger parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to BigDecimal
 */
public class BigDecimalOutputParser implements OutputParser<BigDecimal> {
    public BigDecimal parse(String text);
    public String formatInstructions();
}

Thread Safety: Both parsers are stateless and thread-safe. BigInteger and BigDecimal are immutable, making the entire parsing pipeline thread-safe.

Common Pitfalls:

  • BigDecimal preserves exact decimal precision - 1.0 and 1.00 are different values
  • LLMs may output very large numbers that exceed Long.MAX_VALUE - BigInteger handles this gracefully
  • Thousands separators and currency symbols must be stripped by the LLM or preprocessing
  • Scientific notation in BigDecimal preserves precision (1.5E+10 is exact)

Edge Cases:

  • Leading/trailing whitespace is trimmed automatically
  • Plus sign prefix (+123) is accepted
  • Exponential notation (1E10) works for BigDecimal, but not BigInteger
  • Infinity and NaN strings throw OutputParsingException
  • Hexadecimal (0x123) and octal (0123) formats are not supported by default

Performance Notes:

  • BigInteger parsing is slower than Long (allocates internal byte array)
  • BigDecimal parsing involves both mantissa and scale calculation
  • For numbers fitting in Long range, prefer LongOutputParser for better performance
  • Avoid BigDecimal for high-frequency parsing if standard floating point precision suffices

Exception Handling:

  • NumberFormatException wrapped in OutputParsingException for invalid formats
  • OutputParsingException when no numeric value is extractable from text
  • No overflow exceptions - BigInteger/BigDecimal handle arbitrarily large values

Related APIs:

  • IntegerOutputParser, LongOutputParser - For standard range integers
  • DoubleOutputParser - For floating point without exact precision requirements
  • PojoOutputParser - For structured numeric data
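
Several of the notes above follow directly from how java.math.BigDecimal and java.math.BigInteger behave. The sketch below exercises only those JDK classes; BigNumberSketch and its helper methods are hypothetical names introduced for illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class BigNumberSketch {

    // equals() compares value AND scale, so "1.0" and "1.00" differ.
    static boolean sameValueAndScale(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    // compareTo() ignores scale and compares numeric value only.
    static boolean sameNumericValue(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    // True if the trimmed text is a valid decimal BigInteger literal.
    static boolean parsesAsBigInteger(String text) {
        try {
            new BigInteger(text.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the trimmed text is a valid BigDecimal literal.
    static boolean parsesAsBigDecimal(String text) {
        try {
            new BigDecimal(text.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

When comparing parsed BigDecimal results in application code, compareTo() is usually the safer choice unless the scale itself is significant.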

Date and Time Parsers

Parsers for date and time types.

package dev.langchain4j.service.output;

/**
 * Parses output to Date
 */
public class DateOutputParser implements OutputParser<Date> {
    public Date parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to LocalDate
 */
public class LocalDateOutputParser implements OutputParser<LocalDate> {
    public LocalDate parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to LocalTime
 */
public class LocalTimeOutputParser implements OutputParser<LocalTime> {
    public LocalTime parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to LocalDateTime
 */
public class LocalDateTimeOutputParser implements OutputParser<LocalDateTime> {
    public LocalDateTime parse(String text);
    public String formatInstructions();
}

Thread Safety: All date/time parsers are thread-safe. They use DateTimeFormatter instances which are immutable and thread-safe. Legacy Date objects are mutable, so modify them carefully after parsing.

Common Pitfalls:

  • Date formats must match ISO-8601 standard (YYYY-MM-DD) - other formats may fail
  • DateOutputParser expects ISO-8601 with timezone info, but LLMs often omit it
  • LocalDate expects YYYY-MM-DD format - "March 15, 2024" or "15/03/2024" will fail
  • LocalTime expects HH:mm:ss format - "3:30 PM" requires custom parsing
  • LocalDateTime expects YYYY-MM-DDTHH:mm:ss format with 'T' separator
  • Ambiguous dates (01/02/2024) depend on LLM interpretation - use explicit format instructions

Edge Cases:

  • Dates with text month names ("March 15, 2024") fail with DateTimeParseException, which surfaces wrapped in OutputParsingException
  • Two-digit years (24 instead of 2024) are not supported
  • Timestamps with milliseconds/nanoseconds must match exact format (2024-01-15T10:30:45.123)
  • Timezone abbreviations (EST, PST) are not reliably parsed - use UTC offset (+05:00)
  • Leap seconds are not handled by standard Java date/time API
  • Invalid dates (February 30th) fail with DateTimeException, which surfaces wrapped in OutputParsingException

Performance Notes:

  • Date parsing is relatively expensive (involves complex format validation)
  • LocalDate/LocalTime/LocalDateTime are faster than legacy Date parsing
  • DateTimeFormatter.parse() is the main performance bottleneck
  • Cache formatInstructions() string to avoid repeated string concatenation
  • For high-frequency parsing, consider parsing timestamps as epoch milliseconds (Long)

Exception Handling:

  • DateTimeParseException wrapped in OutputParsingException for invalid formats
  • OutputParsingException when no valid date/time found in text
  • DateTimeException wrapped in OutputParsingException for invalid values (e.g., month 13)
  • Always catch OutputParsingException and handle gracefully with default values or retry logic

Related APIs:

  • PojoOutputParser - For complex temporal data with multiple date fields
  • StringListOutputParser - For extracting multiple dates from text
  • LongOutputParser - For epoch millisecond timestamps
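
The strictness of ISO-8601 parsing described above can be seen with java.time alone. The sketch below is illustrative rather than the parsers' actual code: IsoDateSketch and tryParseIsoDate are hypothetical names, and failures are returned as an empty Optional instead of an exception.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

final class IsoDateSketch {

    // Parses a strict ISO-8601 date (yyyy-MM-dd); returns empty on any mismatch.
    static Optional<LocalDate> tryParseIsoDate(String text) {
        try {
            // LocalDate.parse uses DateTimeFormatter.ISO_LOCAL_DATE by default.
            return Optional.of(LocalDate.parse(text.trim()));
        } catch (DateTimeParseException e) {
            // Covers wrong layouts ("15/03/2024") and impossible dates ("2024-02-30").
            return Optional.empty();
        }
    }
}
```

A caller can then fall back to a default or re-prompt the model, for example tryParseIsoDate(raw).orElse(LocalDate.now()).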

Enum Parsers

Parsers for enum types and collections of enums.

package dev.langchain4j.service.output;

/**
 * Parses output to enum value
 */
public class EnumOutputParser implements OutputParser<Enum<?>> {
    /**
     * Constructor
     * @param enumClass Enum class to parse to
     */
    public EnumOutputParser(Class<? extends Enum<?>> enumClass);

    public Enum<?> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to List of enums
 */
public class EnumListOutputParser implements OutputParser<List<? extends Enum<?>>> {
    /**
     * Constructor
     * @param enumClass Enum class for list elements
     */
    public EnumListOutputParser(Class<? extends Enum<?>> enumClass);

    public List<? extends Enum<?>> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Set of enums
 */
public class EnumSetOutputParser implements OutputParser<Set<? extends Enum<?>>> {
    /**
     * Constructor
     * @param enumClass Enum class for set elements
     */
    public EnumSetOutputParser(Class<? extends Enum<?>> enumClass);

    public Set<? extends Enum<?>> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Collection of enums
 */
public class EnumCollectionOutputParser implements OutputParser<Collection<? extends Enum<?>>> {
    /**
     * Constructor
     * @param enumClass Enum class for collection elements
     */
    public EnumCollectionOutputParser(Class<? extends Enum<?>> enumClass);

    public Collection<? extends Enum<?>> parse(String text);
    public String formatInstructions();
}

Thread Safety: All enum parsers are thread-safe. Enum classes are loaded once and cached. Multiple threads can safely parse enum values concurrently.

Common Pitfalls:

  • Enum matching is case-insensitive - "POSITIVE", "positive", "Positive" all match Sentiment.POSITIVE
  • LLMs may return enum-like values not in your enum definition - causes OutputParsingException
  • Enum names with underscores (USER_ADMIN) may be returned as "USER ADMIN" or "User Admin" by LLMs
  • Collection parsers expect comma or newline-separated values - other delimiters may fail
  • EnumSetOutputParser removes duplicates - use EnumListOutputParser to preserve them
  • formatInstructions() lists all enum constants - large enums create verbose prompts

Edge Cases:

  • Empty strings throw OutputParsingException
  • "null" string is not treated as null enum - throws OutputParsingException
  • When the output contains multiple enum values, the single-value parser extracts only the first match
  • Partial matches are not supported - "POS" won't match "POSITIVE"
  • Enum values with special characters or spaces work if defined correctly
  • Collection parsers handle JSON array format ["VALUE1", "VALUE2"] automatically

Performance Notes:

  • Enum.valueOf() is fast (uses cached HashMap internally)
  • Case-insensitive matching adds overhead of String.toUpperCase()
  • formatInstructions() uses reflection to get enum constants - cache the result
  • EnumSetOutputParser uses LinkedHashSet to preserve insertion order
  • For large enum sets (>50 values), consider using String parsing with validation layer

Exception Handling:

  • OutputParsingException when enum value doesn't match any constant
  • IllegalArgumentException if enumClass is null or not an enum type
  • OutputParsingException wraps any reflection errors during enum access
  • Collection parsers fail fast on first invalid value (not partial success)

Related APIs:

  • BooleanOutputParser - For binary enum-like states
  • StringListOutputParser - For open-ended categorical values
  • PojoOutputParser - For enums embedded in complex objects
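
Because LLMs may return "User Admin" for a constant named USER_ADMIN, some applications normalize the text before matching. The following is one possible sketch of such a step, not something the enum parsers above are documented to do; EnumMatchSketch, Role, and match are hypothetical names.

```java
import java.util.Locale;
import java.util.Optional;

final class EnumMatchSketch {

    // Hypothetical enum used for illustration.
    enum Role { USER_ADMIN, GUEST }

    // Case-insensitive match that also maps spaces and hyphens to underscores.
    static <E extends Enum<E>> Optional<E> match(Class<E> enumClass, String text) {
        String normalized = text.trim()
                .toUpperCase(Locale.ROOT)          // locale-independent upper-casing
                .replaceAll("[\\s\\-]+", "_");     // "USER ADMIN" -> "USER_ADMIN"
        for (E constant : enumClass.getEnumConstants()) {
            if (constant.name().toUpperCase(Locale.ROOT).equals(normalized)) {
                return Optional.of(constant);
            }
        }
        return Optional.empty();                   // no partial matches
    }
}
```

Using Locale.ROOT avoids surprises such as the Turkish dotless i when upper-casing model output on machines with a non-English default locale.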

String Collection Parsers

Parsers for collections of strings.

package dev.langchain4j.service.output;

/**
 * Parses output to List<String>
 */
public class StringListOutputParser implements OutputParser<List<String>> {
    public List<String> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Set<String>
 */
public class StringSetOutputParser implements OutputParser<Set<String>> {
    public Set<String> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Collection<String>
 */
public class StringCollectionOutputParser implements OutputParser<Collection<String>> {
    public Collection<String> parse(String text);
    public String formatInstructions();
}

Thread Safety: All string collection parsers are thread-safe. They create new mutable collection instances per parse() call, so returned collections are not shared between threads.

Common Pitfalls:

  • Parsers accept both JSON array format ["a", "b", "c"] and newline/comma-separated format
  • Empty strings in collections are preserved - ["", "value"] returns list with empty string
  • LLMs may include numbered lists (1. item, 2. item) - parsers strip the numbers
  • Bullet points (-, *, •) are stripped automatically
  • Quoted strings preserve internal commas - "value, with comma" is treated as single item
  • StringSetOutputParser removes duplicates; first-occurrence order is preserved (uses LinkedHashSet)

Edge Cases:

  • Empty input returns empty collection, not null
  • Single value without delimiters returns single-element collection
  • Nested JSON arrays are not flattened - ["a", ["b", "c"]] fails with JsonProcessingException (wrapped in OutputParsingException)
  • Strings containing only whitespace are trimmed to empty strings
  • Null strings in JSON arrays [null, "value"] are converted to "null" string
  • Mixed delimiters (commas and newlines) are both respected

Performance Notes:

  • JSON parsing (Jackson) is used when input looks like JSON array - faster than regex splitting
  • Fallback to regex splitting for plain text format - slower but more flexible
  • StringListOutputParser returns ArrayList - random access is O(1)
  • StringSetOutputParser uses LinkedHashSet - insertion order preserved, O(1) contains checks
  • Large collections (1000+ items) have significant JSON parsing overhead

Exception Handling:

  • JsonProcessingException wrapped in OutputParsingException for malformed JSON
  • OutputParsingException when input format is unrecognizable
  • No exception for empty collections - empty list/set returned
  • Null input throws NullPointerException

Related APIs:

  • EnumListOutputParser, EnumSetOutputParser - For constrained value sets
  • PojoListOutputParser - For structured objects instead of strings
  • IntegerOutputParser - For extracting single values
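
The plain-text fallback described above (splitting on commas and newlines, stripping list markers) might look roughly like the sketch below. It is an approximation with hypothetical names (ListSplitSketch, splitItems, uniqueItems). Unlike the documented JSON path, it drops empty items and does not honor quoted commas.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class ListSplitSketch {

    // Leading bullet (-, *, U+2022) or number ("1." / "1)") followed by whitespace.
    private static final Pattern LIST_MARKER =
            Pattern.compile("^(?:[-*\\u2022]|\\d+[.)])\\s+");

    // Splits on commas and newlines, strips list markers, skips empty items.
    static List<String> splitItems(String text) {
        List<String> items = new ArrayList<>();
        for (String part : text.split("[,\\n]")) {
            String item = LIST_MARKER.matcher(part.trim()).replaceFirst("").trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    // Deduplicates while keeping first-occurrence order.
    static Set<String> uniqueItems(String text) {
        return new LinkedHashSet<>(splitItems(text));
    }
}
```

For example, splitItems("1. Java\n2. Spring Boot\n- Kotlin") produces the three items Java, Spring Boot, and Kotlin, and uniqueItems("b, a, b") keeps b before a.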

POJO Parsers

Parsers for Plain Old Java Objects and collections of POJOs.

package dev.langchain4j.service.output;

/**
 * Parses output to POJO (Plain Old Java Object)
 * Uses JSON parsing with Jackson
 */
public class PojoOutputParser implements OutputParser<Object> {
    /**
     * Constructor
     * @param pojoClass POJO class to parse to
     */
    public PojoOutputParser(Class<?> pojoClass);

    public Object parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to List of POJOs
 */
public class PojoListOutputParser implements OutputParser<List<?>> {
    /**
     * Constructor
     * @param pojoClass POJO class for list elements
     */
    public PojoListOutputParser(Class<?> pojoClass);

    public List<?> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Set of POJOs
 */
public class PojoSetOutputParser implements OutputParser<Set<?>> {
    /**
     * Constructor
     * @param pojoClass POJO class for set elements
     */
    public PojoSetOutputParser(Class<?> pojoClass);

    public Set<?> parse(String text);
    public String formatInstructions();
}

/**
 * Parses output to Collection of POJOs
 */
public class PojoCollectionOutputParser implements OutputParser<Collection<?>> {
    /**
     * Constructor
     * @param pojoClass POJO class for collection elements
     */
    public PojoCollectionOutputParser(Class<?> pojoClass);

    public Collection<?> parse(String text);
    public String formatInstructions();
}

Thread Safety: All POJO parsers are thread-safe. Jackson ObjectMapper is thread-safe. Generated schema strings are immutable. Parsed POJOs are new instances per call.

Common Pitfalls:

  • LLMs frequently don't follow complex nested schemas exactly - expect parsing failures
  • JSON keys are case-sensitive - "Name" vs "name" matters, use @JsonProperty to control mapping
  • Missing required fields cause Jackson to assign null or default values silently (no exception by default)
  • Extra fields in LLM output are ignored by default - configure ObjectMapper to fail on unknown properties
  • Deeply nested POJOs (4+ levels) often confuse LLMs - keep structures flat when possible
  • Records and classes with constructors work, but records are preferred for immutability
  • Circular references are not supported and will cause infinite loops or StackOverflowError
  • POJO collection parsers require array output format - LLMs may return objects instead

Edge Cases:

  • Empty JSON object {} creates POJO with all null/default fields
  • Null JSON (text "null") returns Java null
  • JSON with wrong types (string for int field) may auto-coerce or throw JsonMappingException
  • Collections with null elements [{"name":"John"}, null, {"name":"Jane"}] preserve nulls
  • Enums in POJOs are matched case-insensitively by Jackson
  • Dates in POJOs must be ISO-8601 strings or epoch milliseconds unless custom deserializer configured
  • POJOs must implement equals/hashCode for PojoSetOutputParser to deduplicate correctly

Performance Notes:

  • JSON schema generation (formatInstructions) is expensive - happens once per parser instance
  • Jackson parsing overhead is significant (milliseconds for complex objects)
  • Large POJOs with many fields have linear parsing cost per field
  • Nested POJOs multiply parsing cost at each level
  • Collection parsers create new collection instances and parse each element independently
  • Consider streaming parsers for very large collections (not provided by default)
  • Avoid POJO parsing in tight loops - batch operations when possible

Exception Handling:

  • JsonProcessingException wrapped in OutputParsingException for malformed JSON
  • JsonMappingException wrapped in OutputParsingException for type mismatches
  • When JSON is embedded in non-JSON text, the parser extracts the JSON block first; OutputParsingException is thrown if no valid JSON can be extracted
  • InstantiationException if the POJO class has neither a default constructor nor an all-args constructor
  • Always catch OutputParsingException and log the original text for debugging
  • Consider retry logic with simplified schemas if complex POJO parsing fails

Related APIs:

  • JsonSchemas.jsonSchema() - Generates JSON schema for format instructions
  • StringListOutputParser - For simpler unstructured data
  • EnumOutputParser - For constrained categorical fields in POJOs
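
LLMs often wrap JSON in explanatory text, so a pre-processing step that isolates the JSON object can help when handling raw model output yourself. The sketch below is a deliberately naive illustration with hypothetical names (JsonBlockSketch, extractJsonObject). It is not the extraction logic used by the POJO parsers, and it does not validate that the extracted span is well-formed JSON.

```java
final class JsonBlockSketch {

    // Returns the span from the first '{' to the last '}' (inclusive).
    static String extractJsonObject(String text) {
        int start = text.indexOf('{');
        int end = text.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("No JSON object found in: " + text);
        }
        return text.substring(start, end + 1);
    }
}
```

The extracted string would then be handed to a JSON library for actual deserialization. Braces inside surrounding prose can still confuse this approach, which is why it is best treated as a fallback.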

JSON Schema Utilities

Utilities for generating JSON schemas for POJOs.

package dev.langchain4j.service.output;

/**
 * Utility class for JSON schema generation
 * Used by POJO parsers to generate format instructions
 */
public class JsonSchemas {
    /**
     * Generate JSON schema for a class
     * @param clazz Class to generate schema for
     * @return JSON schema string
     */
    public static String jsonSchema(Class<?> clazz);
}

Thread Safety: JsonSchemas.jsonSchema() is thread-safe. Schema generation uses Jackson's thread-safe schema generator. Generated schemas are immutable strings that can be cached safely.

Common Pitfalls:

  • Schema generation uses reflection - complex classes with many fields generate verbose schemas
  • Recursive/circular class structures cause infinite loops - avoid self-referencing fields
  • Generic type information is lost at runtime - List<String> appears as List<Object> in schema
  • Jackson annotations (@JsonProperty, @JsonIgnore) are respected in schema generation
  • Private fields are not included in schema unless annotated or getters present
  • Schema format follows JSON Schema Draft 7 - some LLMs work better with simpler instructions

Edge Cases:

  • Null classes throw NullPointerException
  • Interfaces and abstract classes generate schemas based on declared methods/fields
  • Enums generate schema with "enum" constraint listing all constants
  • Collections generate array schema with item type (if determinable)
  • Nested POJOs generate nested schema definitions
  • Schema size can exceed token limits for very complex classes

Performance Notes:

  • Schema generation involves reflection and Jackson schema module - relatively slow (milliseconds)
  • Always cache schema strings - call jsonSchema() once per class, not per parse operation
  • PojoOutputParser caches schema in formatInstructions() - no need to cache manually
  • Large schemas increase prompt token count and LLM processing time
  • Consider manual schema strings for frequently used types to optimize prompt size

Exception Handling:

  • IllegalArgumentException for invalid or unsupported class types
  • JsonMappingException wrapped in RuntimeException for Jackson schema errors
  • No checked exceptions thrown

Related APIs:

  • PojoOutputParser - Primary consumer of JSON schemas
  • OutputParser.formatInstructions() - Returns schema as format instructions
  • Jackson ObjectMapper - Underlying JSON processing engine
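
The advice to generate each schema once per class can be implemented with a small thread-safe cache. In the sketch below, SchemaCacheSketch is a hypothetical helper and the generator function is an assumption standing in for a call such as JsonSchemas.jsonSchema(clazz); only JDK classes are used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class SchemaCacheSketch {

    private final Map<Class<?>, String> cache = new ConcurrentHashMap<>();
    private final Function<Class<?>, String> generator;

    // The generator is whatever produces a schema string for a class.
    SchemaCacheSketch(Function<Class<?>, String> generator) {
        this.generator = generator;
    }

    // Invokes the generator at most once per class, even under concurrent access.
    String schemaFor(Class<?> type) {
        return cache.computeIfAbsent(type, generator);
    }
}
```

A usage along the lines of new SchemaCacheSketch(JsonSchemas::jsonSchema) would fit the signature shown above, assuming that method is accessible in your version of the library.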

Exceptions

package dev.langchain4j.service.output;

/**
 * Exception thrown when output parsing fails
 */
public class OutputParsingException extends LangChain4jException {
    /**
     * Constructor
     * @param message Error message
     */
    public OutputParsingException(String message);

    /**
     * Constructor with cause
     * @param message Error message
     * @param cause Underlying cause
     */
    public OutputParsingException(String message, Throwable cause);
}

Thread Safety: OutputParsingException follows standard exception thread safety. Exception instances are immutable after construction. Safe to catch and log from multiple threads.

Common Pitfalls:

  • OutputParsingException is unchecked (extends RuntimeException via LangChain4jException)
  • getMessage() may contain LLM output text - be careful logging sensitive data
  • getCause() returns underlying exception (JsonProcessingException, NumberFormatException, etc.)
  • The message is not guaranteed to contain the complete original text - log the text separately for debugging
  • Catching generic Exception instead of OutputParsingException hides which failures are parsing errors

Edge Cases:

  • Cause may be null if parsing failed without underlying exception
  • Exception may be thrown from nested parser calls (e.g., PojoListOutputParser parsing individual POJOs)
  • Stack traces can be deep when parsing complex nested structures
  • getMessage() format varies by parser implementation - don't parse exception messages

Performance Notes:

  • Exception creation involves stack trace capture - expensive operation
  • Frequent parsing failures indicate schema or prompt issues - fix root cause rather than catch exceptions
  • Don't use exceptions for control flow - validate input format before parsing when possible

Exception Handling:

  • Always catch OutputParsingException specifically when parsing untrusted LLM output
  • Log the original text along with exception for debugging - use separate logger call
  • Consider retry strategies with modified prompts if parsing fails
  • Wrap in application-specific exceptions for cleaner API boundaries
  • Use try-catch blocks around individual method calls, not batched operations

Related APIs:

  • LangChain4jException - Base exception class
  • All OutputParser implementations - Throw this exception on parse failures
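
The retry strategy suggested above can be sketched as a small helper that re-invokes a call when parsing fails. The names here are hypothetical: RetrySketch and withRetries are illustrative, and ParsingFailure is a local stand-in for OutputParsingException so that the example compiles with the JDK alone.

```java
import java.util.function.Supplier;

final class RetrySketch {

    // Local stand-in for OutputParsingException (unchecked).
    static final class ParsingFailure extends RuntimeException {
        ParsingFailure(String message) {
            super(message);
        }
    }

    // Runs the call up to maxAttempts times; rethrows the last parsing failure.
    static <T> T withRetries(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ParsingFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (ParsingFailure e) {
                last = e; // only parsing failures are retried; other exceptions propagate
            }
        }
        throw last;
    }
}
```

In a real application the supplier would wrap an AI service call, for example () -> parser.parseData(text), and the catch clause would name OutputParsingException. Keeping maxAttempts small limits the extra LLM calls and cost.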

Usage Examples

Primitive Types

import dev.langchain4j.service.AiServices;

interface Calculator {
    int add(int a, int b);
    double divide(double a, double b);
    boolean isEven(int number);
}

Calculator calc = AiServices.create(Calculator.class, chatModel);
int sum = calc.add(5, 3);           // Returns: 8
double result = calc.divide(10, 3); // Returns: 3.333...
boolean even = calc.isEven(4);      // Returns: true

Enum Types

import dev.langchain4j.service.AiServices;

enum Sentiment {
    POSITIVE, NEGATIVE, NEUTRAL
}

interface SentimentAnalyzer {
    Sentiment analyzeSentiment(String text);
}

SentimentAnalyzer analyzer = AiServices.create(SentimentAnalyzer.class, chatModel);
Sentiment sentiment = analyzer.analyzeSentiment("This product is amazing!");
// Returns: Sentiment.POSITIVE

Date and Time Types

import dev.langchain4j.service.AiServices;
import java.time.LocalDate;
import java.time.LocalDateTime;

interface DateExtractor {
    LocalDate extractDate(String text);
    LocalDateTime extractDateTime(String text);
}

DateExtractor extractor = AiServices.create(DateExtractor.class, chatModel);
LocalDate date = extractor.extractDate("The meeting is on March 15th, 2024");
// Returns: 2024-03-15

POJO Types

import dev.langchain4j.service.AiServices;

record Person(String name, int age, String occupation) {}

interface InformationExtractor {
    Person extractPerson(String biography);
}

InformationExtractor extractor = AiServices.create(InformationExtractor.class, chatModel);
Person person = extractor.extractPerson(
    "John Smith is a 35-year-old software engineer living in San Francisco."
);
// Returns: Person[name=John Smith, age=35, occupation=software engineer]

Collections

import dev.langchain4j.service.AiServices;
import java.util.List;
import java.util.Set;

interface TextAnalyzer {
    List<String> extractKeywords(String text);
    Set<String> extractUniqueTopics(String text);
}

TextAnalyzer analyzer = AiServices.create(TextAnalyzer.class, chatModel);
List<String> keywords = analyzer.extractKeywords("Java programming with Spring Boot");
// Returns: ["Java", "programming", "Spring Boot"]

Complex POJOs

import dev.langchain4j.service.AiServices;
import java.util.List;

record Address(String street, String city, String country) {}
record Contact(String email, String phone) {}
record Company(String name, Address address, List<Contact> contacts) {}

interface DataExtractor {
    Company extractCompany(String text);
}

DataExtractor extractor = AiServices.create(DataExtractor.class, chatModel);
Company company = extractor.extractCompany(
    "Acme Corp is located at 123 Main St, New York, USA. " +
    "Contact: info@acme.com or +1-555-0100"
);

List of POJOs

import dev.langchain4j.service.AiServices;
import java.util.List;

record Product(String name, double price, String category) {}

interface ShoppingAnalyzer {
    List<Product> extractProducts(String receipt);
}

ShoppingAnalyzer analyzer = AiServices.create(ShoppingAnalyzer.class, chatModel);
List<Product> products = analyzer.extractProducts(
    "Receipt: Laptop $999.99 (Electronics), " +
    "Mouse $29.99 (Electronics), " +
    "Desk $299.99 (Furniture)"
);
// Returns list of 3 Product objects

Enum Collections

import dev.langchain4j.service.AiServices;
import java.util.List;

enum Language {
    JAVA, PYTHON, JAVASCRIPT, GO, RUST
}

interface CodeAnalyzer {
    List<Language> detectLanguages(String codeSnippet);
}

CodeAnalyzer analyzer = AiServices.create(CodeAnalyzer.class, chatModel);
List<Language> languages = analyzer.detectLanguages(
    "This project uses Spring Boot for the backend and React for the frontend"
);
// Returns: [Language.JAVA, Language.JAVASCRIPT]

With Templates

import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;
import java.util.List;

record Recipe(String name, List<String> ingredients, List<String> steps, int prepTimeMinutes) {}

interface Chef {
    @UserMessage("Create a {{cuisine}} recipe for {{dish}} that serves {{servings}} people.")
    Recipe createRecipe(@V("cuisine") String cuisine,
                       @V("dish") String dish,
                       @V("servings") int servings);
}

Chef chef = AiServices.create(Chef.class, chatModel);
Recipe recipe = chef.createRecipe("Italian", "pasta", 4);
// Returns structured Recipe object with all fields populated

Handling Parsing Errors

import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.output.OutputParsingException;
import java.util.List;

record ComplexData(String field1, int field2, List<String> field3) {}

interface DataParser {
    ComplexData parseData(String text);
}

DataParser parser = AiServices.create(DataParser.class, chatModel);

try {
    ComplexData data = parser.parseData("Some text to parse");
    System.out.println("Parsed successfully: " + data);
} catch (OutputParsingException e) {
    System.err.println("Failed to parse: " + e.getMessage());
    // Handle parsing error
}
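Because LLM output varies from call to call, a parsing failure is often transient, and one common way to handle it is to repeat the call a bounded number of times. The following is a minimal sketch using only the standard library; `RetryingCall` and `callWithRetry` are hypothetical names, not part of LangChain4j, and the sketch catches `RuntimeException` only to stay self-contained.

```java
import java.util.function.Supplier;

final class RetryingCall {

    /**
     * Invokes the call up to maxAttempts times and returns the first successful result.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Assumption: in real code, catch only OutputParsingException here
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}
```

With the example above, usage would look like `RetryingCall.callWithRetry(() -> parser.parseData(text), 3)`. Each retry is a new model call, so keep the attempt count small.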

Testing Patterns

Mocking LLM Outputs for Testing

When testing code that uses output parsers, you can mock LLM responses to verify parsing behavior without calling actual LLM APIs.

import dev.langchain4j.service.output.*;
import java.util.List;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class OutputParserTest {

    @Test
    void testIntegerParsing() {
        IntegerOutputParser parser = new IntegerOutputParser();

        // Test clean output
        assertEquals(42, parser.parse("42"));

        // Test output with surrounding text
        assertEquals(42, parser.parse("The answer is 42"));

        // Test output with multiple numbers (extracts first)
        assertEquals(42, parser.parse("42 or maybe 43"));

        // Test negative numbers
        assertEquals(-42, parser.parse("-42"));

        // Test failure case
        assertThrows(OutputParsingException.class, () -> parser.parse("no number"));
    }

    @Test
    void testEnumParsing() {
        enum Status { ACTIVE, INACTIVE, PENDING }
        EnumOutputParser parser = new EnumOutputParser(Status.class);

        // Test exact match
        assertEquals(Status.ACTIVE, parser.parse("ACTIVE"));

        // Test case insensitive
        assertEquals(Status.ACTIVE, parser.parse("active"));

        // Test with surrounding text
        assertEquals(Status.ACTIVE, parser.parse("The status is ACTIVE"));

        // Test invalid value
        assertThrows(OutputParsingException.class, () -> parser.parse("UNKNOWN"));
    }

    @Test
    void testPojoParsing() {
        record Person(String name, int age) {}
        PojoOutputParser parser = new PojoOutputParser(Person.class);

        // Test valid JSON
        String validJson = "{\"name\":\"John\",\"age\":30}";
        Person person = (Person) parser.parse(validJson);
        assertEquals("John", person.name());
        assertEquals(30, person.age());

        // Test JSON with surrounding text
        String wrappedJson = "Here is the data: {\"name\":\"John\",\"age\":30}";
        person = (Person) parser.parse(wrappedJson);
        assertEquals("John", person.name());

        // Test malformed JSON
        assertThrows(OutputParsingException.class, () -> parser.parse("{invalid}"));
    }

    @Test
    void testCollectionParsing() {
        StringListOutputParser parser = new StringListOutputParser();

        // Test JSON array format
        List<String> result = parser.parse("[\"apple\", \"banana\", \"cherry\"]");
        assertEquals(List.of("apple", "banana", "cherry"), result);

        // Test newline-separated format
        result = parser.parse("apple\nbanana\ncherry");
        assertEquals(List.of("apple", "banana", "cherry"), result);

        // Test comma-separated format
        result = parser.parse("apple, banana, cherry");
        assertEquals(List.of("apple", "banana", "cherry"), result);

        // Test numbered list format
        result = parser.parse("1. apple\n2. banana\n3. cherry");
        assertEquals(List.of("apple", "banana", "cherry"), result);

        // Test empty input
        result = parser.parse("");
        assertTrue(result.isEmpty());
    }
}

Testing with Mock LLM Responses

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.output.OutputParsingException;
import org.junit.jupiter.api.Test;
import static org.mockito.Mockito.*;
import static org.junit.jupiter.api.Assertions.*;

class AiServiceOutputParsingTest {

    // Assumes the langchain4j 1.x API, where AI services call ChatModel.chat(ChatRequest)
    private static ChatModel mockModelReturning(String text) {
        ChatModel mockModel = mock(ChatModel.class);
        when(mockModel.chat(any(ChatRequest.class)))
            .thenReturn(ChatResponse.builder().aiMessage(AiMessage.from(text)).build());
        return mockModel;
    }

    @Test
    void testPersonExtraction() {
        // Create mock LLM that returns a valid JSON object
        ChatModel mockModel = mockModelReturning(
            "{\"name\":\"Alice\",\"age\":25,\"occupation\":\"Engineer\"}");

        // Create AI service
        record Person(String name, int age, String occupation) {}
        interface Extractor {
            Person extract(String text);
        }

        Extractor extractor = AiServices.create(Extractor.class, mockModel);

        // Test extraction
        Person person = extractor.extract("Some biography text");
        assertEquals("Alice", person.name());
        assertEquals(25, person.age());
        assertEquals("Engineer", person.occupation());
    }

    @Test
    void testParsingFailureHandling() {
        // Create mock LLM that returns invalid JSON
        ChatModel mockModel = mockModelReturning("This is not valid JSON");

        record Person(String name, int age) {}
        interface Extractor {
            Person extract(String text);
        }

        Extractor extractor = AiServices.create(Extractor.class, mockModel);

        // Should throw OutputParsingException
        assertThrows(OutputParsingException.class, () -> extractor.extract("text"));
    }
}

Testing Format Instructions

import dev.langchain4j.service.output.*;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class FormatInstructionsTest {

    @Test
    void testPrimitiveInstructions() {
        IntegerOutputParser intParser = new IntegerOutputParser();
        String instructions = intParser.formatInstructions();
        assertTrue(instructions.contains("integer") || instructions.contains("number"));

        BooleanOutputParser boolParser = new BooleanOutputParser();
        instructions = boolParser.formatInstructions();
        assertTrue(instructions.contains("true") || instructions.contains("false"));
    }

    @Test
    void testEnumInstructions() {
        enum Color { RED, GREEN, BLUE }
        EnumOutputParser parser = new EnumOutputParser(Color.class);
        String instructions = parser.formatInstructions();

        // Should list all enum values
        assertTrue(instructions.contains("RED"));
        assertTrue(instructions.contains("GREEN"));
        assertTrue(instructions.contains("BLUE"));
    }

    @Test
    void testPojoInstructions() {
        record Person(String name, int age) {}
        PojoOutputParser parser = new PojoOutputParser(Person.class);
        String instructions = parser.formatInstructions();

        // Should contain JSON schema
        assertTrue(instructions.contains("name"));
        assertTrue(instructions.contains("age"));
        assertTrue(instructions.contains("string") || instructions.contains("String"));
    }
}

Schema Validation Patterns

Validating POJO Schemas

Ensure your POJO classes generate valid JSON schemas that LLMs can follow:

import dev.langchain4j.service.output.JsonSchemas;
import dev.langchain4j.service.output.PojoOutputParser;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonPropertyDescription;

// Well-structured POJO with validation-friendly design
record ValidatedPerson(
    @JsonProperty(required = true)
    @JsonPropertyDescription("Full name of the person")
    String name,

    @JsonProperty(required = true)
    @JsonPropertyDescription("Age in years, must be positive")
    int age,

    @JsonPropertyDescription("Email address in format user@domain.com")
    String email
) {
    // Validation in compact constructor
    public ValidatedPerson {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("Name is required");
        }
        if (age < 0 || age > 150) {
            throw new IllegalArgumentException("Age must be between 0 and 150");
        }
        if (email != null && !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email format");
        }
    }
}

// Generate and inspect schema
String schema = JsonSchemas.jsonSchema(ValidatedPerson.class);
System.out.println("Generated schema:\n" + schema);

// Use in parser
PojoOutputParser parser = new PojoOutputParser(ValidatedPerson.class);
String formatInstructions = parser.formatInstructions();

Custom Validation After Parsing

Add a validation layer on top of output parsing:

import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.output.OutputParsingException;

record Product(String name, double price, String category) {
    // Validation in compact constructor
    public Product {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("Product name is required");
        }
        if (price < 0) {
            throw new IllegalArgumentException("Price cannot be negative");
        }
        if (category == null || category.isBlank()) {
            throw new IllegalArgumentException("Category is required");
        }
    }
}

interface ProductExtractor {
    Product extractProduct(String text);
}

// Usage with validation
ProductExtractor extractor = AiServices.create(ProductExtractor.class, chatModel);

try {
    Product product = extractor.extractProduct("Laptop costs $999.99 in Electronics");
    System.out.println("Valid product: " + product);
} catch (OutputParsingException e) {
    System.err.println("Parsing failed: " + e.getMessage());
} catch (IllegalArgumentException e) {
    System.err.println("Validation failed: " + e.getMessage());
}
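Throwing from the compact constructor stops at the first problem, and depending on the JSON library the exception may reach the caller wrapped by the deserializer. An alternative is to keep the record free of checks and validate after parsing, collecting every violation. The sketch below assumes that approach; `ProductValidation` and `ParsedProduct` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ProductValidation {

    // Hypothetical stand-in for a parsed result; no checks in the constructor
    record ParsedProduct(String name, double price, String category) {}

    /** Returns every rule violation found; an empty list means the product is valid. */
    static List<String> violations(ParsedProduct p) {
        List<String> problems = new ArrayList<>();
        if (p.name() == null || p.name().isBlank()) {
            problems.add("name is required");
        }
        if (p.price() < 0) {
            problems.add("price cannot be negative");
        }
        if (p.category() == null || p.category().isBlank()) {
            problems.add("category is required");
        }
        return problems;
    }
}
```

Reporting all violations at once makes it easier to feed them back into a follow-up prompt or a log entry than a single exception message would.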

Schema Size Optimization

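Large schemas can exceed token limits or confuse LLMs. Optimize by flattening nested structures, excluding internal fields from the schema, and splitting extraction into several smaller calls:

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.List;
import java.util.Map;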

// Bad: Overly complex nested structure
record BadCompany(
    String name,
    Address address,
    List<Employee> employees,
    List<Project> projects,
    Map<String, Department> departments,
    FinancialInfo financials
) {}

// Good: Flattened structure
record GoodCompany(
    @JsonProperty(required = true) String name,
    @JsonProperty(required = true) String city,
    @JsonProperty(required = true) String country,
    int employeeCount,
    @JsonIgnore Object internalData  // Excluded from schema (record components cannot be transient)
) {}

// For complex data, extract in multiple steps
interface CompanyExtractor {
    GoodCompany extractBasicInfo(String text);
    List<String> extractEmployeeNames(String text);
    FinancialSummary extractFinancials(String text);
}

Handling Malformed Output

Extracting JSON from Mixed Output

LLMs often include explanatory text before/after JSON. Parsers handle this automatically, but you can pre-process for better reliability:

import dev.langchain4j.service.output.PojoOutputParser;
import dev.langchain4j.service.output.OutputParsingException;

public class RobustJsonExtractor {

    /**
     * Pre-process LLM output to extract JSON, then parse
     */
    public static <T> T parseWithFallback(String llmOutput, Class<T> clazz) {
        PojoOutputParser parser = new PojoOutputParser(clazz);

        try {
            // Try direct parsing first (parser extracts JSON automatically)
            return (T) parser.parse(llmOutput);
        } catch (OutputParsingException e) {
            // If that fails, try aggressive JSON extraction
            String extractedJson = extractJsonBlock(llmOutput);
            if (extractedJson != null) {
                try {
                    return (T) parser.parse(extractedJson);
                } catch (OutputParsingException e2) {
                    // Log both attempts
                    System.err.println("Original parsing failed: " + e.getMessage());
                    System.err.println("Extracted JSON parsing failed: " + e2.getMessage());
                    throw e2;
                }
            }
            throw e;
        }
    }

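    /**
     * Extract JSON block from text using regex.
     * The patterns match at most one level of nesting. Objects are tried before arrays,
     * so an array of objects yields its first object.
     */
    private static String extractJsonBlock(String text) {
        // Match JSON objects: { ... }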
        java.util.regex.Pattern objectPattern =
            java.util.regex.Pattern.compile("\\{[^{}]*(?:\\{[^{}]*\\}[^{}]*)*\\}");
        java.util.regex.Matcher matcher = objectPattern.matcher(text);
        if (matcher.find()) {
            return matcher.group();
        }

        // Match JSON arrays: [ ... ]
        java.util.regex.Pattern arrayPattern =
            java.util.regex.Pattern.compile("\\[[^\\[\\]]*(?:\\[[^\\[\\]]*\\][^\\[\\]]*)*\\]");
        matcher = arrayPattern.matcher(text);
        if (matcher.find()) {
            return matcher.group();
        }

        return null;
    }
}

// Usage
record Person(String name, int age) {}
String messyOutput = "Here is the person data:\n{\"name\":\"John\",\"age\":30}\nHope this helps!";
Person person = RobustJsonExtractor.parseWithFallback(messyOutput, Person.class);
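The regular expressions in `extractJsonBlock` cover only one level of nesting. For deeper structures, a small scanner that tracks nesting depth and skips string literals is one possible alternative. This is a sketch under the assumption that the first `{` or `[` in the text starts the payload; `JsonBlockScanner` and `firstJsonBlock` are hypothetical names.

```java
final class JsonBlockScanner {

    /**
     * Returns the first balanced {...} or [...] block in the text,
     * or null if no complete block is found.
     */
    static String firstJsonBlock(String text) {
        // Locate the first opening delimiter
        int start = -1;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{' || c == '[') {
                start = i;
                break;
            }
        }
        if (start < 0) {
            return null;
        }

        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                // Inside a string literal, only an unescaped quote ends it
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
                if (depth == 0) {
                    return text.substring(start, i + 1);
                }
            }
        }
        return null; // block never closed
    }
}
```

The scanner counts depth only and does not check that each closing delimiter matches its opener, so the extracted block still needs to go through a real JSON parser.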

Retry Logic with Simplified Prompts

If parsing fails repeatedly, simplify the schema:

import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.output.OutputParsingException;

public class AdaptiveExtractor {

    interface ComplexExtractor {
        ComplexPerson extractPerson(String text);
    }

    interface SimpleExtractor {
        SimplePerson extractPerson(String text);
    }

    record ComplexPerson(String name, int age, String email, String phone, Address address) {}
    record SimplePerson(String name, int age) {}
    record Address(String city, String country) {}

    // ChatModel is the chat model interface in langchain4j 1.x (formerly ChatLanguageModel)
    public static ComplexPerson extractWithFallback(
            String text,
            ChatModel model) {

        // Try complex extraction first
        ComplexExtractor complexExtractor = AiServices.create(ComplexExtractor.class, model);
        try {
            return complexExtractor.extractPerson(text);
        } catch (OutputParsingException e) {
            System.err.println("Complex extraction failed, trying simplified...");

            // Fall back to simple extraction
            SimpleExtractor simpleExtractor = AiServices.create(SimpleExtractor.class, model);
            SimplePerson simple = simpleExtractor.extractPerson(text);

            // Convert to complex with partial data
            return new ComplexPerson(
                simple.name(),
                simple.age(),
                null,  // Missing email
                null,  // Missing phone
                null   // Missing address
            );
        }
    }
}

Handling Partial Parsing Failures

For lists of objects, handle partial failures gracefully:

import dev.langchain4j.service.output.PojoListOutputParser;
import dev.langchain4j.service.output.PojoOutputParser;
import dev.langchain4j.service.output.OutputParsingException;
import java.util.List;
import java.util.ArrayList;

public class PartialListParser {

    /**
     * Parse list with error recovery - returns successfully parsed items
     */
    public static <T> List<T> parseListWithRecovery(String text, Class<T> itemClass) {
        PojoListOutputParser listParser = new PojoListOutputParser(itemClass);

        try {
            // Try parsing entire list
            return (List<T>) listParser.parse(text);
        } catch (OutputParsingException e) {
            System.err.println("Full list parsing failed, attempting per-item parsing...");

            // Fall back to parsing individual items
            List<T> results = new ArrayList<>();
            String[] items = splitJsonArray(text);
            PojoOutputParser itemParser = new PojoOutputParser(itemClass);

            for (int i = 0; i < items.length; i++) {
                try {
                    results.add((T) itemParser.parse(items[i]));
                } catch (OutputParsingException itemError) {
                    System.err.println("Failed to parse item " + i + ": " + itemError.getMessage());
                    // Skip failed items and continue
                }
            }

            if (results.isEmpty()) {
                throw new OutputParsingException("All items failed to parse", e);
            }

            return results;
        }
    }

    /**
     * Split JSON array into individual item strings.
     * Splits on commas that sit between a closing and an opening brace and keeps the
     * braces on each item. Not safe for items that themselves contain "},{".
     */
    private static String[] splitJsonArray(String text) {
        String cleaned = text.trim();
        if (cleaned.startsWith("[")) {
            cleaned = cleaned.substring(1);
        }
        if (cleaned.endsWith("]")) {
            cleaned = cleaned.substring(0, cleaned.length() - 1);
        }
        // Lookbehind and lookahead leave the braces attached to each item
        return cleaned.split("(?<=\\})\\s*,\\s*(?=\\{)");
    }
}

// Usage
record Product(String name, double price) {}
String partiallyValidJson = "[{\"name\":\"Laptop\",\"price\":999.99},{invalid},{\"name\":\"Mouse\",\"price\":29.99}]";
List<Product> products = PartialListParser.parseListWithRecovery(partiallyValidJson, Product.class);
// Returns list with 2 products (skips invalid item)
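Splitting on a delimiter pattern is fragile when items contain nested arrays of objects or string values with braces and commas. A depth-tracking splitter avoids that by cutting only at commas on the top level of the array. The sketch below uses only the standard library; `TopLevelArraySplitter` and `splitElements` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class TopLevelArraySplitter {

    /** Splits the text of a JSON array into the raw text of its top-level elements. */
    static List<String> splitElements(String jsonArray) {
        String s = jsonArray.trim();
        if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            throw new IllegalArgumentException("Expected text enclosed in [ ]");
        }
        List<String> elements = new ArrayList<>();
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        int elementStart = 1;
        // Scan only the content between the outer brackets
        for (int i = 1; i < s.length() - 1; i++) {
            char c = s.charAt(i);
            if (inString) {
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A comma outside any nested structure separates two elements
                elements.add(s.substring(elementStart, i).trim());
                elementStart = i + 1;
            }
        }
        String last = s.substring(elementStart, s.length() - 1).trim();
        if (!last.isEmpty()) {
            elements.add(last);
        }
        return elements;
    }
}
```

Each returned string can then be handed to an item parser individually, so one malformed element does not discard the others.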

Validating Output Before Parsing

Pre-validate format before expensive parsing operations:

import dev.langchain4j.service.output.OutputParsingException;
import dev.langchain4j.service.output.PojoOutputParser;

public class OutputValidator {

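    /**
     * Cheap structural check before parsing: the text must start and end with
     * matching delimiters and have balanced braces or brackets.
     * This is not full JSON validation, and delimiters inside string values are counted too.
     */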
    public static boolean isValidJson(String text) {
        if (text == null || text.isBlank()) {
            return false;
        }

        String trimmed = text.trim();

        // Check for JSON object
        if (trimmed.startsWith("{") && trimmed.endsWith("}")) {
            return hasBalancedBraces(trimmed);
        }

        // Check for JSON array
        if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
            return hasBalancedBrackets(trimmed);
        }

        return false;
    }

    private static boolean hasBalancedBraces(String text) {
        int count = 0;
        for (char c : text.toCharArray()) {
            if (c == '{') count++;
            if (c == '}') count--;
            if (count < 0) return false;
        }
        return count == 0;
    }

    private static boolean hasBalancedBrackets(String text) {
        int count = 0;
        for (char c : text.toCharArray()) {
            if (c == '[') count++;
            if (c == ']') count--;
            if (count < 0) return false;
        }
        return count == 0;
    }

    /**
     * Validate and parse with early failure
     */
    public static <T> T validateAndParse(String text, PojoOutputParser parser) {
        if (!isValidJson(text)) {
            throw new OutputParsingException("Input is not valid JSON format");
        }
        return (T) parser.parse(text);
    }
}
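The counting helpers above treat braces inside string values as structural, so a valid object such as `{"a":"}"}` is rejected, and they check braces and brackets separately. A stack-based check that skips string literals and verifies that each closer matches its opener is one way to tighten this. The sketch below is still only a structural pre-check, not a JSON validator; `DelimiterCheck` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DelimiterCheck {

    /** Returns true if every { and [ outside string literals is closed by the matching } or ]. */
    static boolean hasBalancedDelimiters(String text) {
        Deque<Character> open = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                // Ignore everything inside a string literal except its closing quote
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
                continue;
            }
            switch (c) {
                case '"':
                    inString = true;
                    break;
                case '{':
                case '[':
                    open.push(c);
                    break;
                case '}':
                    if (open.isEmpty() || open.pop() != '{') return false;
                    break;
                case ']':
                    if (open.isEmpty() || open.pop() != '[') return false;
                    break;
                default:
                    break;
            }
        }
        // Everything must be closed, including any string literal
        return open.isEmpty() && !inString;
    }
}
```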

Supported Return Types Summary

| Return Type | Parser | Example |
|---|---|---|
| `String` | No parser (direct return) | `String chat(String msg)` |
| `boolean`, `Boolean` | `BooleanOutputParser` | `boolean isValid(String text)` |
| `int`, `Integer` | `IntegerOutputParser` | `int count(String text)` |
| `long`, `Long` | `LongOutputParser` | `long extractNumber(String text)` |
| `float`, `Float` | `FloatOutputParser` | `float getScore(String text)` |
| `double`, `Double` | `DoubleOutputParser` | `double calculate(String expr)` |
| `byte`, `Byte` | `ByteOutputParser` | `byte getValue(String text)` |
| `short`, `Short` | `ShortOutputParser` | `short getShort(String text)` |
| `BigInteger` | `BigIntegerOutputParser` | `BigInteger getBigInt(String text)` |
| `BigDecimal` | `BigDecimalOutputParser` | `BigDecimal getPrice(String text)` |
| `Date` | `DateOutputParser` | `Date getDate(String text)` |
| `LocalDate` | `LocalDateOutputParser` | `LocalDate extractDate(String text)` |
| `LocalTime` | `LocalTimeOutputParser` | `LocalTime extractTime(String text)` |
| `LocalDateTime` | `LocalDateTimeOutputParser` | `LocalDateTime getTimestamp(String text)` |
| `Enum<?>` | `EnumOutputParser` | `Sentiment analyze(String text)` |
| `List<Enum<?>>` | `EnumListOutputParser` | `List<Category> classify(String text)` |
| `Set<Enum<?>>` | `EnumSetOutputParser` | `Set<Tag> extractTags(String text)` |
| `List<String>` | `StringListOutputParser` | `List<String> extractWords(String text)` |
| `Set<String>` | `StringSetOutputParser` | `Set<String> uniqueWords(String text)` |
| Custom POJO | `PojoOutputParser` | `Person extractPerson(String bio)` |
| `List<POJO>` | `PojoListOutputParser` | `List<Product> parse(String text)` |
| `Set<POJO>` | `PojoSetOutputParser` | `Set<Contact> extract(String text)` |
| `Result<T>` | Wraps any of the above | `Result<Person> getPerson(String bio)` |
| `TokenStream` | Streaming response | `TokenStream chat(String msg)` |

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j@1.11.0
