Production readiness checklist for LangChain applications. Use when preparing for launch, validating deployment readiness, or auditing existing production LangChain systems. Trigger: "langchain production", "langchain prod ready", "deploy langchain", "langchain launch checklist", "go-live langchain".
Comprehensive go-live checklist for deploying LangChain applications to production. Covers configuration, resilience, observability, performance, security, testing, deployment, and cost management.
## Configuration

- [ ] Secrets loaded from the environment (no `.env` file in production)
- [ ] `.env` files listed in `.gitignore`
- [ ] Startup validation that fails fast on missing or malformed config:

```typescript
// Startup validation: exit immediately if required env vars are invalid
import { z } from "zod";

const ProdConfig = z.object({
  OPENAI_API_KEY: z.string().startsWith("sk-"),
  LANGSMITH_API_KEY: z.string().startsWith("lsv2_"),
  NODE_ENV: z.literal("production"),
});

try {
  ProdConfig.parse(process.env);
} catch (e) {
  console.error("Invalid production config:", e);
  process.exit(1);
}
```

## Resilience

- [ ] `maxRetries` configured on all models (3-5)
- [ ] `timeout` set on all models (30-60s)
- [ ] Fallback models configured with `.withFallbacks()`:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 5,
  timeout: 30000, // 30s, in milliseconds
}).withFallbacks({
  fallbacks: [new ChatAnthropic({ model: "claude-sonnet-4-20250514" })],
});
```

## Observability, performance, and testing

- [ ] LangSmith tracing enabled (`LANGSMITH_TRACING=true`)
- [ ] `LANGCHAIN_CALLBACKS_BACKGROUND=true` (non-serverless deployments only)
- [ ] `maxConcurrency` set on batch operations
- [ ] Unit tests run against a mock model (`FakeListChatModel`, no API calls)
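The last two checklist items can be exercised together. The sketch below hand-rolls a minimal fake model so it runs with nothing installed; in a real test suite you would use `FakeListChatModel` from `@langchain/core/utils/testing` instead of this stand-in, and the `batch(..., { maxConcurrency })` shape here mirrors LangChain's runnable API:

```typescript
// Dependency-free stand-in for FakeListChatModel: returns canned responses
// in call order, so tests exercise chain logic without any API calls.
class FakeChatModel {
  private i = 0;
  constructor(private responses: string[]) {}

  async invoke(_prompt: string): Promise<string> {
    const r = this.responses[this.i % this.responses.length];
    this.i++;
    return r;
  }

  // Minimal batch() with a concurrency cap, mirroring the real API's
  // { maxConcurrency } option: at most N invocations run at once.
  async batch(prompts: string[], opts: { maxConcurrency: number }): Promise<string[]> {
    const results: string[] = new Array(prompts.length);
    let next = 0;
    const worker = async () => {
      while (next < prompts.length) {
        const idx = next++;
        results[idx] = await this.invoke(prompts[idx]);
      }
    };
    await Promise.all(
      Array.from({ length: Math.min(opts.maxConcurrency, prompts.length) }, worker),
    );
    return results;
  }
}

const fake = new FakeChatModel(["a", "b", "c"]);
const out = await fake.batch(["q1", "q2", "q3"], { maxConcurrency: 2 });
console.log(out); // ["a", "b", "c"]
```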
## Health checks and graceful shutdown

```typescript
// Health check endpoint: reports degraded status if the LLM is unreachable
app.get("/health", async (_req, res) => {
  const checks: Record<string, string> = { server: "ok" };
  try {
    await model.invoke("ping");
    checks.llm = "ok";
  } catch (e: any) {
    checks.llm = `error: ${e.message.slice(0, 100)}`;
  }
  const healthy = Object.values(checks).every((v) => v === "ok");
  res
    .status(healthy ? 200 : 503)
    .json({ status: healthy ? "healthy" : "degraded", checks });
});
```
```typescript
// Graceful shutdown: stop accepting connections, then force-exit after 10s
process.on("SIGTERM", () => {
  console.log("Shutting down gracefully...");
  server.close(() => process.exit(0));
  setTimeout(() => process.exit(1), 10000); // force exit after 10s
});
```

## Launch validation script

Run once against the deployed environment before go-live:

```typescript
async function validateProduction() {
  const results: Record<string, string> = {};

  // 1. Config
  try {
    ProdConfig.parse(process.env);
    results["Config"] = "PASS";
  } catch {
    results["Config"] = "FAIL: missing env vars";
  }

  // 2. LLM connectivity
  try {
    await model.invoke("ping");
    results["LLM"] = "PASS";
  } catch (e: any) {
    results["LLM"] = `FAIL: ${e.message.slice(0, 50)}`;
  }

  // 3. Fallback
  try {
    const fallbackModel = model.withFallbacks({
      fallbacks: [new ChatAnthropic({ model: "claude-sonnet-4-20250514" })],
    });
    await fallbackModel.invoke("ping");
    results["Fallback"] = "PASS";
  } catch {
    results["Fallback"] = "FAIL";
  }

  // 4. LangSmith
  results["LangSmith"] =
    process.env.LANGSMITH_TRACING === "true" ? "PASS" : "WARN: disabled";

  // 5. Health endpoint
  try {
    const res = await fetch("http://localhost:8000/health");
    results["Health"] = res.ok ? "PASS" : "FAIL";
  } catch {
    results["Health"] = "FAIL: not reachable";
  }

  console.table(results);
  const allPass = Object.values(results).every((v) => v === "PASS");
  console.log(allPass ? "READY FOR PRODUCTION" : "ISSUES FOUND - FIX BEFORE LAUNCH");
  return allPass;
}
```

## Common issues

| Issue | Cause | Fix |
|---|---|---|
| API key missing at startup | Secrets not mounted | Check deployment config |
| No fallback on outage | .withFallbacks() not configured | Add fallback model |
| LangSmith trace gaps | Background callbacks in serverless | Set LANGCHAIN_CALLBACKS_BACKGROUND=false |
| Cache miss storm | Redis down | Implement graceful degradation |
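The last row deserves a sketch. Assuming a generic async cache client (the `cacheGet`/`cacheSet` callbacks below are hypothetical stand-ins for your Redis calls), graceful degradation means a cache outage costs latency, not availability:

```typescript
// Sketch: graceful degradation when the cache backend is down.
// Any cache error falls through to the live model call instead of failing.
async function cachedInvoke(
  prompt: string,
  cacheGet: (k: string) => Promise<string | null>,
  cacheSet: (k: string, v: string) => Promise<void>,
  callModel: (p: string) => Promise<string>,
): Promise<string> {
  try {
    const hit = await cacheGet(prompt);
    if (hit !== null) return hit;
  } catch {
    // Cache unreachable: degrade to a direct model call, don't throw.
  }
  const result = await callModel(prompt);
  try {
    await cacheSet(prompt, result);
  } catch {
    // Best-effort write; ignore cache failures.
  }
  return result;
}
```

The design choice here is that cache errors are swallowed in both directions (read and write), so Redis going down turns every request into a cache miss rather than a 500.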
After launch, use langchain-observability for monitoring and langchain-incident-runbook for incident response.