A practical guide to getting started with LangSmith, covering installation, environment setup, first trace, and first evaluation.
Install the LangSmith SDK using npm or yarn:
npm install langsmith
# or with yarn
yarn add langsmith

For use with specific frameworks:
# With LangChain
npm install langsmith @langchain/core @langchain/openai
# With OpenAI SDK
npm install langsmith openai
# With Anthropic SDK
npm install langsmith @anthropic-ai/sdk
# With Vercel AI SDK
npm install langsmith ai @ai-sdk/openai

Set the following environment variables:
export LANGCHAIN_API_KEY="your-api-key-here"
export LANGCHAIN_PROJECT="my-first-project"
# Optional: Custom endpoint (defaults to https://api.smith.langchain.com)
export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"

For development, use a .env file:
# .env
LANGCHAIN_API_KEY=lsv2_pt_...
LANGCHAIN_PROJECT=my-first-project

Load environment variables in your application:
import * as dotenv from "dotenv";
dotenv.config();

Check that your environment is configured correctly:
import { Client } from "langsmith";
const client = new Client();
// Test API connectivity
try {
const config = Client.getDefaultClientConfig();
console.log("API URL:", config.apiUrl);
console.log("API Key configured:", !!config.apiKey);
// Try creating a simple project
const project = await client.createProject({
projectName: "test-connection",
description: "Testing LangSmith connection"
});
console.log("Connection successful! Project ID:", project.id);
} catch (error) {
console.error("Configuration error:", error.message);
}

The simplest way to trace an LLM call is to wrap a function with traceable().
import { traceable } from "langsmith/traceable";
// Wrap any function with traceable
const greet = traceable(
async (name: string) => {
return `Hello, ${name}!`;
},
{ name: "greet-user", run_type: "chain" }
);
// Call the function - automatically traced to LangSmith
const greeting = await greet("Alice");
console.log(greeting); // "Hello, Alice!"

Next, trace a real LLM call with the OpenAI SDK:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";
const openai = new OpenAI();
const generateAnswer = traceable(
async (question: string) => {
const completion = await openai.chat.completions.create({
model: "gpt-4",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: question }
],
temperature: 0.7,
});
return completion.choices[0].message.content;
},
{ name: "generate-answer", run_type: "llm" }
);
// Execute and trace
const answer = await generateAnswer("What is the capital of France?");
console.log("Answer:", answer);
// View your trace at: https://smith.langchain.com
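If you'd rather not wrap your own function, the SDK also ships an OpenAI wrapper (listed in the imports reference at the end of this guide). A minimal sketch, assuming the langsmith/wrappers/openai entry point shown there:

import { wrapOpenAI } from "langsmith/wrappers/openai";
import OpenAI from "openai";

// Wrap the OpenAI client once; subsequent chat.completions.create calls are traced automatically
const openai = wrapOpenAI(new OpenAI());

const completion = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "What is the capital of France?" }],
});
console.log(completion.choices[0].message.content);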
Create hierarchical traces automatically:

import { traceable } from "langsmith/traceable";
// Child function
const retrieveDocs = traceable(
async (query: string) => {
// Simulate document retrieval
return ["Doc 1 about Paris", "Doc 2 about France"];
},
{ name: "retrieve-docs", run_type: "retriever" }
);
// Parent function that calls child
const ragPipeline = traceable(
async (question: string) => {
// This call is automatically traced as a child
const docs = await retrieveDocs(question);
const context = docs.join("\n");
const answer = `Based on: ${context}\nAnswer: Paris is the capital.`;
return answer;
},
{ name: "rag-pipeline", run_type: "chain" }
);
// Execute - creates a parent trace with child traces
const result = await ragPipeline("What is the capital of France?");

After running traced functions, you can view them in the LangSmith UI at https://smith.langchain.com.
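If you are tracing from a short-lived script, the process can exit before batched traces finish uploading. One way to make sure everything is flushed before you open the UI is the client's awaitPendingTraceBatches method (covered again in the troubleshooting section below); a small sketch:

import { Client } from "langsmith";

const client = new Client();

// ... run your traced functions ...

// Wait for any batched traces to finish uploading before the process exits
await client.awaitPendingTraceBatches();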
Evaluation helps you systematically test your LLM application against a dataset of examples.
First, create a dataset with test examples:
import { Client } from "langsmith";
const client = new Client();
// Create a dataset
const dataset = await client.createDataset({
datasetName: "capital-cities-qa",
description: "Questions about capital cities",
dataType: "kv",
});
// Add test examples
await client.createExamples({
datasetId: dataset.id,
inputs: [
{ question: "What is the capital of France?" },
{ question: "What is the capital of Japan?" },
{ question: "What is the capital of Brazil?" },
],
outputs: [
{ answer: "Paris" },
{ answer: "Tokyo" },
{ answer: "Brasília" },
],
});
console.log("Dataset created:", dataset.id);This is the function you want to evaluate:
This is the function you want to evaluate:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";
const openai = new OpenAI();
const answerQuestion = traceable(
async (inputs: { question: string }) => {
const completion = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
messages: [
{ role: "system", content: "Answer questions concisely." },
{ role: "user", content: inputs.question }
],
temperature: 0,
});
return {
answer: completion.choices[0].message.content
};
},
{ name: "answer-question", run_type: "chain" }
);

Define how to score your results:
// Check if the answer matches the expected output
const correctnessEvaluator = ({ run, example }) => {
const predicted = run.outputs?.answer || "";
const expected = example?.outputs?.answer || "";
// Simple exact match
const isCorrect = predicted.toLowerCase().includes(expected.toLowerCase());
return {
key: "correctness",
score: isCorrect ? 1 : 0,
comment: isCorrect ? "Correct answer" : "Incorrect answer"
};
};
// Check response length
const lengthEvaluator = ({ run }) => {
const answer = run.outputs?.answer || "";
const wordCount = answer.split(" ").length;
// Prefer concise answers (under 50 words)
const score = wordCount <= 50 ? 1 : 0;
return {
key: "conciseness",
score: score,
value: wordCount,
comment: `${wordCount} words`
};
};
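Heuristics like these only go so far; you can also have a model grade the output. The following is a sketch of an LLM-as-judge evaluator (custom code, not a built-in) that reuses the OpenAI client from the target function above and returns the same { key, score } shape:

// Sketch of an LLM-as-judge evaluator (custom code, not part of the SDK)
const llmJudgeEvaluator = async ({ run, example }) => {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "user",
        content:
          `Question: ${example?.inputs?.question}\n` +
          `Reference answer: ${example?.outputs?.answer}\n` +
          `Submitted answer: ${run.outputs?.answer}\n` +
          `Reply with exactly one word: CORRECT or INCORRECT.`,
      },
    ],
    temperature: 0,
  });
  const verdict = completion.choices[0].message.content ?? "";
  return {
    key: "llm_judge",
    score: verdict.trim().toUpperCase().startsWith("CORRECT") ? 1 : 0,
    comment: verdict,
  };
};

If you use it, add it alongside the other evaluators in the evaluators array in the next step.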
Run your target function against the dataset:

import { evaluate } from "langsmith/evaluation";
// Run evaluation
const results = await evaluate(answerQuestion, {
data: "capital-cities-qa", // Dataset name
evaluators: [correctnessEvaluator, lengthEvaluator],
experimentPrefix: "capital-cities-eval",
metadata: {
model: "gpt-3.5-turbo",
temperature: 0,
},
});
// View results
console.log("Evaluation complete!");
console.log("Results:", results.results.length);
// Calculate aggregate scores
let correctCount = 0;
let totalConcise = 0;
for (const row of results.results) {
const correctness = row.evaluation_results.find(e => e.key === "correctness");
const conciseness = row.evaluation_results.find(e => e.key === "conciseness");
if (correctness?.score === 1) correctCount++;
if (conciseness?.score === 1) totalConcise++;
}
const accuracy = correctCount / results.results.length;
const concisenessRate = totalConcise / results.results.length;
console.log(`Accuracy: ${(accuracy * 100).toFixed(1)}%`);
console.log(`Conciseness: ${(concisenessRate * 100).toFixed(1)}%`);

Use tracing to debug and improve your application:
import { traceable } from "langsmith/traceable";
const pipeline = traceable(
async (input: string) => {
// Step 1: Preprocess
const cleaned = input.trim().toLowerCase();
// Step 2: Process with LLM
const result = await callLLM(cleaned);
// Step 3: Post-process
const final = result.toUpperCase();
return final;
},
{
name: "my-pipeline",
run_type: "chain",
// Add metadata for debugging
metadata: { version: "1.0" }
}
);
// Run and check traces in UI
await pipeline(" Hello World ");Evaluate different models on the same dataset:
import { evaluate } from "langsmith/evaluation";
// Evaluator
const qualityEvaluator = ({ run, example }) => ({
key: "quality",
score: run.outputs?.answer === example?.outputs?.answer ? 1 : 0
});
// Evaluate GPT-3.5
const gpt35Results = await evaluate(
(input) => answerWithModel(input, "gpt-3.5-turbo"),
{
data: "capital-cities-qa",
evaluators: [qualityEvaluator],
experimentPrefix: "gpt-3.5",
}
);
// Evaluate GPT-4
const gpt4Results = await evaluate(
(input) => answerWithModel(input, "gpt-4"),
{
data: "capital-cities-qa",
evaluators: [qualityEvaluator],
experimentPrefix: "gpt-4",
}
);
// Compare results in LangSmith UI
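The answerWithModel helper used above isn't defined in this guide; one possible sketch of such a hypothetical helper, following the same pattern as the earlier traced functions:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical helper: answers a dataset input with the requested model
const answerWithModel = traceable(
  async (inputs: { question: string }, model: string) => {
    const completion = await openai.chat.completions.create({
      model,
      messages: [{ role: "user", content: inputs.question }],
      temperature: 0,
    });
    return { answer: completion.choices[0].message.content };
  },
  { name: "answer-with-model", run_type: "chain" }
);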
Gather feedback from users on production traces:

import { Client } from "langsmith";
import { traceable } from "langsmith/traceable";
const client = new Client();
const chatbot = traceable(
async (message: string) => {
// Your chatbot logic
const response = await generateResponse(message);
return response;
},
{ name: "chatbot", run_type: "chain" }
);
// Execute chatbot and get run ID
const response = await chatbot("Hello!");
// Later, collect user feedback
await client.createFeedback({
run_id: response.runId, // Run ID captured from the trace context (one way to capture it is sketched below)
key: "user-rating",
score: 1, // thumbs up
comment: "Great response!"
});
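The example above assumes the chatbot's response carries a runId, but as written the traced function only returns a string, so the run ID has to be captured explicitly. One way to do that is with getCurrentRunTree from langsmith/traceable; a sketch, assuming the function runs inside a traced context:

import { traceable, getCurrentRunTree } from "langsmith/traceable";

const chatbotWithRunId = traceable(
  async (message: string) => {
    // Read the current run's ID from inside the traced function
    const runTree = getCurrentRunTree();
    const response = await generateResponse(message); // your chatbot logic
    return { response, runId: runTree.id };
  },
  { name: "chatbot", run_type: "chain" }
);

const { response, runId } = await chatbotWithRunId("Hello!");
// runId can now be passed as run_id to client.createFeedback as shown above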
Use tracing to monitor production LLM applications:

import { traceable } from "langsmith/traceable";
import { Client } from "langsmith";
const client = new Client({
projectName: "production-chatbot",
// Sampling: only trace 10% of requests in production
tracingSamplingRate: 0.1,
});
const productionBot = traceable(
async (userInput: string) => {
try {
const response = await processInput(userInput);
return { success: true, response };
} catch (error) {
// The failure is recorded in the trace output; rethrow if you want the run itself marked as errored
return { success: false, error: error.message };
}
},
{
name: "production-bot",
run_type: "chain",
client: client,
metadata: {
environment: "production",
version: "2.1.0"
}
}
);
// Runs are sampled and traced
await productionBot("User query");Now that you have tracing and evaluation working, explore more features:
Issue: Traces not appearing in LangSmith UI
Verify that:
- LANGCHAIN_API_KEY is set correctly
- await client.awaitPendingTraceBatches() is called before app shutdown

Issue: Import errors with traceable
// ✓ Correct
import { traceable } from "langsmith/traceable";
// ✗ Incorrect
import { traceable } from "langsmith"; // Won't workIssue: Traces are batched and delayed
LangSmith batches traces for performance. To ensure immediate upload:
import { Client } from "langsmith";
const client = new Client({
autoBatchTracing: false, // Disable batching for debugging
});
// Or wait for pending batches
await client.awaitPendingTraceBatches();

Issue: Missing environment variables in production
Make sure environment variables are set in your deployment platform:
// Verify at runtime
if (!process.env.LANGCHAIN_API_KEY) {
console.warn("LANGCHAIN_API_KEY not set - tracing disabled");
}

Here's a complete working example combining tracing and evaluation:
import { traceable } from "langsmith/traceable";
import { Client } from "langsmith";
import { evaluate } from "langsmith/evaluation";
import OpenAI from "openai";
import * as dotenv from "dotenv";
// Load environment
dotenv.config();
const client = new Client();
const openai = new OpenAI();
// 1. Define your application
const qaBot = traceable(
async (inputs: { question: string }) => {
const completion = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
messages: [{ role: "user", content: inputs.question }],
temperature: 0,
});
return {
answer: completion.choices[0].message.content
};
},
{ name: "qa-bot", run_type: "chain" }
);
// 2. Create test dataset
const dataset = await client.createDataset({
datasetName: "qa-test-set",
description: "Test questions for QA bot",
});
await client.createExamples({
datasetId: dataset.id,
inputs: [
{ question: "What is 2+2?" },
{ question: "What is the capital of Spain?" },
],
outputs: [
{ answer: "4" },
{ answer: "Madrid" },
],
});
// 3. Create evaluator
const correctnessEval = ({ run, example }) => {
const predicted = run.outputs?.answer || "";
const expected = example?.outputs?.answer || "";
return {
key: "correctness",
score: predicted.includes(expected) ? 1 : 0
};
};
// 4. Run evaluation
const results = await evaluate(qaBot, {
data: "qa-test-set",
evaluators: [correctnessEval],
experimentPrefix: "qa-bot-v1",
});
// 5. View results
console.log(`Evaluated ${results.results.length} examples`);
const accuracy = results.results.filter(r =>
r.evaluation_results.find(e => e.key === "correctness")?.score === 1
).length / results.results.length;
console.log(`Accuracy: ${(accuracy * 100).toFixed(1)}%`);
console.log("View detailed results at: https://smith.langchain.com");
// 6. Cleanup - ensure traces are uploaded
await client.awaitPendingTraceBatches();

Run this example:
node --loader ts-node/esm example.ts

// Core
import { Client } from "langsmith";
import { traceable } from "langsmith/traceable";
import { RunTree } from "langsmith";
// Evaluation
import { evaluate } from "langsmith/evaluation";
// Wrappers
import { wrapOpenAI } from "langsmith/wrappers/openai";
import { wrapAnthropic } from "langsmith/wrappers/anthropic";
// LangChain
import { getLangchainCallbacks, RunnableTraceable } from "langsmith/langchain";
// Testing
import { test, expect, wrapEvaluator } from "langsmith/vitest";
import { test, expect, wrapEvaluator } from "langsmith/jest";LANGCHAIN_API_KEY=lsv2_pt_... # Required: Your API key
LANGCHAIN_PROJECT=my-project # Optional: Default project name
LANGCHAIN_ENDPOINT=https://... # Optional: API endpoint
LANGCHAIN_TRACING=true # Optional: Enable/disable tracing

import { Client } from "langsmith";
// Use environment variables
const client = new Client();
// Or explicit configuration
const client = new Client({
apiUrl: "https://api.smith.langchain.com",
apiKey: process.env.LANGCHAIN_API_KEY,
timeout_ms: 10000,
});

import { traceable } from "langsmith/traceable";
const myFunction = traceable(
async (input) => {
// Your logic here
return output;
},
{
name: "my-function", // Run name
run_type: "chain", // Run type: llm, chain, tool, retriever, etc.
metadata: { version: "1.0" }, // Optional metadata
tags: ["production"], // Optional tags
}
);

LangSmith provides several utility functions for common tasks like ID generation, environment configuration, custom fetch handling, and prompt caching.
Override the fetch implementation used by the client for proxies, mocking, or custom HTTP handling.
/**
* Override the fetch implementation used by the client
* @param fetch - Custom fetch function (e.g., for proxies or mocking)
*/
function overrideFetchImplementation(fetch: typeof globalThis.fetch): void;

Usage Examples:
import { overrideFetchImplementation } from "langsmith";
// Use custom fetch (e.g., for proxy or testing)
const customFetch = (url: string, init?: RequestInit) => {
console.log("Fetching:", url);
return fetch(url, init);
};
overrideFetchImplementation(customFetch);
// With a proxy (proxyAgent below is whatever agent your fetch implementation supports; it is not defined in this snippet)
const proxyFetch = (url: string, init?: RequestInit) => {
return fetch(url, {
...init,
agent: proxyAgent,
});
};
overrideFetchImplementation(proxyFetch);

Get the default project name from environment variables.
/**
* Get the default project name from environment variables
* @returns Project name from LANGCHAIN_PROJECT or LANGCHAIN_SESSION env vars
*/
function getDefaultProjectName(): string;

Usage Examples:
import { getDefaultProjectName } from "langsmith";
// Get default project name from environment
const projectName = getDefaultProjectName();
console.log("Using project:", projectName);
// Use in client configuration
const client = new Client({
projectName: getDefaultProjectName(),
});

Generate UUID v7 identifiers for runs and other entities.
/**
* Generate a random UUID v7 string
* @returns A UUID v7 string
*/
function uuid7(): string;
/**
* Generate a UUID v7 from a timestamp
* @param timestamp - The timestamp in milliseconds or ISO string
* @returns A UUID v7 string
*/
function uuid7FromTime(timestamp: number | string): string;

Usage Examples:
import { uuid7, uuid7FromTime } from "langsmith";
// Generate UUID v7
const runId = uuid7();
console.log("Run ID:", runId);
// Generate UUID v7 from timestamp
const timestampId = uuid7FromTime(Date.now());
const dateId = uuid7FromTime("2024-01-01T00:00:00Z");
// Use for manual run creation
await client.createRun({
id: uuid7(),
name: "my-run",
run_type: "chain",
// ...
});

Built-in caching mechanism for prompts to reduce latency and API calls.
/**
* Cache class for storing and retrieving prompts with TTL and refresh capabilities
*/
class Cache {
constructor(config?: CacheConfig);
/** Get cached value or fetch if missing/stale */
get(key: string): Promise<PromptCommit | undefined>;
/** Store value in cache */
set(key: string, value: PromptCommit): void;
/** Clear all cached entries */
clear(): void;
/** Stop background refresh timers */
stop(): void;
}
interface CacheConfig {
/** Maximum entries in cache (LRU eviction when exceeded). Default: 100 */
maxSize?: number;
/** Time in seconds before entry is stale. null = infinite TTL. Default: 3600 */
ttlSeconds?: number | null;
/** How often to check for stale entries in seconds. Default: 60 */
refreshIntervalSeconds?: number;
/** Function to fetch fresh data when cache miss or stale */
fetchFunc?: (key: string) => Promise<PromptCommit>;
}

Usage Examples:
import { Cache } from "langsmith";
// Use prompt cache
const cache = new Cache({
maxSize: 100,
ttlSeconds: 3600,
fetchFunc: async (key) => {
// Fetch prompt from LangSmith
return await client.pullPromptCommit(key);
},
});
const prompt = await cache.get("my-prompt:latest");
// Cleanup when done
cache.stop();

Access the package version constant for debugging and compatibility checks.
/**
* Package version constant
*/
const __version__: string;

Usage Examples:
import { __version__ } from "langsmith";
console.log("LangSmith SDK version:", __version__);
// Include in metadata for debugging
const client = new Client({
metadata: {
sdkVersion: __version__,
},
});