Apply production-ready LangChain SDK patterns for structured output, fallbacks, batch processing, streaming, and caching. Trigger: "langchain SDK patterns", "langchain best practices", "idiomatic langchain", "langchain architecture", "withStructuredOutput", "withFallbacks", "abatch".
Production-grade patterns every LangChain application should use: type-safe structured output, provider fallbacks, async batch processing, streaming, caching, and retry logic.
Structured output with a Zod schema: the model's reply is validated and fully typed.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { z } from "zod";

const ExtractedData = z.object({
  entities: z.array(z.object({
    name: z.string(),
    type: z.enum(["person", "org", "location"]),
    confidence: z.number().min(0).max(1),
  })),
  language: z.string(),
  summary: z.string(),
});

const model = new ChatOpenAI({ model: "gpt-4o-mini" });
const structuredModel = model.withStructuredOutput(ExtractedData);

const prompt = ChatPromptTemplate.fromTemplate(
  "Extract entities from this text:\n\n{text}"
);
const chain = prompt.pipe(structuredModel);

const result = await chain.invoke({
  text: "Satya Nadella announced Microsoft's new AI lab in Seattle.",
});
// result is fully typed: { entities: [...], language: "en", summary: "..." }
```

Provider fallbacks: route to a second model when the primary errors.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const primary = new ChatOpenAI({
  model: "gpt-4o",
  maxRetries: 2,
  timeout: 10000,
});
const fallback = new ChatAnthropic({
  model: "claude-sonnet-4-20250514",
});

// Automatically falls back on any error (rate limit, timeout, 500)
const robustModel = primary.withFallbacks({
  fallbacks: [fallback],
});

// Works identically to a normal model (prompt here is an example; any
// ChatPromptTemplate works)
const prompt = ChatPromptTemplate.fromTemplate("Answer concisely: {question}");
const chain = prompt.pipe(robustModel).pipe(new StringOutputParser());
```

Async batch processing with bounded concurrency:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const chain = ChatPromptTemplate.fromTemplate("Summarize: {text}")
  .pipe(new ChatOpenAI({ model: "gpt-4o-mini" }))
  .pipe(new StringOutputParser());

const texts = ["Article 1...", "Article 2...", "Article 3..."];
const inputs = texts.map((text) => ({ text }));

// Process all inputs with controlled concurrency
const results = await chain.batch(inputs, {
  maxConcurrency: 5, // max 5 parallel API calls
});
// results: string[] — one summary per input
```

Streaming for user-facing output:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const chain = ChatPromptTemplate.fromTemplate("{input}")
  .pipe(new ChatOpenAI({ model: "gpt-4o-mini", streaming: true }))
  .pipe(new StringOutputParser());

// Stream string chunks
const stream = await chain.stream({ input: "Tell me a story" });
for await (const chunk of stream) {
  process.stdout.write(chunk); // each chunk is a string fragment
}
```

Retry logic, both built-in and manual:

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Built-in retry handles transient failures
const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 3, // retries with exponential backoff
  timeout: 30000, // 30s timeout per request
});

// Manual retry wrapper for custom logic
async function invokeWithRetry<T>(
  chain: { invoke: (input: any) => Promise<T> },
  input: any,
  maxRetries = 3,
): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await chain.invoke(input);
    } catch (error: any) {
      if (attempt === maxRetries - 1) throw error;
      const delay = Math.min(1000 * 2 ** attempt, 30000);
      console.warn(`Retry ${attempt + 1}/${maxRetries} after ${delay}ms`);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}
```

Persistent caching (Python API):

```python
from langchain_openai import ChatOpenAI
from langchain_core.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

# Enable persistent caching — identical inputs skip the API
set_llm_cache(SQLiteCache(database_path=".langchain_cache.db"))

llm = ChatOpenAI(model="gpt-4o-mini")

# First call: hits API (~500ms)
r1 = llm.invoke("What is 2+2?")
# Second identical call: cache hit (~0ms, no cost)
r2 = llm.invoke("What is 2+2?")
```

Custom functions as Runnables via RunnableLambda:

```typescript
import { RunnableLambda } from "@langchain/core/runnables";
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// Example prompt and model; any chain components work here
const prompt = ChatPromptTemplate.fromTemplate("Answer briefly: {text}");
const model = new ChatOpenAI({ model: "gpt-4o-mini" });

// Wrap any function as a Runnable to use in chains
const cleanInput = RunnableLambda.from((input: { text: string }) => ({
  text: input.text.trim().toLowerCase(),
}));
const addMetadata = RunnableLambda.from((result: string) => ({
  answer: result,
  timestamp: new Date().toISOString(),
  model: "gpt-4o-mini",
}));

const chain = cleanInput
  .pipe(prompt)
  .pipe(model)
  .pipe(new StringOutputParser())
  .pipe(addMetadata);
```

| Anti-Pattern | Why | Better |
|---|---|---|
| Hardcoded API keys | Security risk | Use env vars + dotenv |
| No error handling | Silent failures | Use .withFallbacks() + try/catch |
| Sequential when parallel works | Slow | Use RunnableParallel or .batch() |
| Parsing raw LLM text | Fragile | Use .withStructuredOutput(zodSchema) |
| No timeout | Hanging requests | Set timeout on model constructor |
| No streaming in UIs | Bad UX | Use .stream() for user-facing output |
| Error | Cause | Fix |
|---|---|---|
| ZodError | LLM output doesn't match schema | Improve the prompt or relax the schema with .optional() |
| RateLimitError | API quota exceeded | Add maxRetries, use .withFallbacks() |
| TimeoutError | Slow response | Increase timeout, try a smaller model |
| OutputParserException | Unparseable output | Switch to .withStructuredOutput() |
Proceed to langchain-data-handling for data privacy patterns.