A practical guide to getting started with LangSmith, covering installation, environment setup, first trace, and first evaluation.
Install the LangSmith SDK using npm or yarn:
npm install langsmith
# or with yarn
yarn add langsmith

For use with specific frameworks:
# With LangChain
npm install langsmith @langchain/core @langchain/openai
# With OpenAI SDK
npm install langsmith openai
# With Anthropic SDK
npm install langsmith @anthropic-ai/sdk
# With Vercel AI SDK
npm install langsmith ai @ai-sdk/openai

Set the following environment variables:
export LANGCHAIN_API_KEY="your-api-key-here"
export LANGCHAIN_PROJECT="my-first-project"
# Optional: Custom endpoint (defaults to https://api.smith.langchain.com)
export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"

For development, use a .env file:
# .env
LANGCHAIN_API_KEY=lsv2_pt_...
LANGCHAIN_PROJECT=my-first-project

Load environment variables in your application:
import * as dotenv from "dotenv";
dotenv.config();

Check that your environment is configured correctly:
import { Client } from "langsmith";
const client = new Client();
// Test API connectivity
try {
const config = Client.getDefaultClientConfig();
console.log("API URL:", config.apiUrl);
console.log("API Key configured:", !!config.apiKey);
// Try creating a simple project
const project = await client.createProject({
projectName: "test-connection",
description: "Testing LangSmith connection"
});
console.log("Connection successful! Project ID:", project.id);
} catch (error) {
console.error("Configuration error:", error.message);
}

The simplest way to trace an LLM call is to wrap a function with traceable().
import { traceable } from "langsmith/traceable";
// Wrap any function with traceable
const greet = traceable(
async (name: string) => {
return `Hello, ${name}!`;
},
{ name: "greet-user", run_type: "chain" }
);
// Call the function - automatically traced to LangSmith
const greeting = await greet("Alice");
console.log(greeting); // "Hello, Alice!"

Next, trace a real LLM call with the OpenAI SDK:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";
const openai = new OpenAI();
const generateAnswer = traceable(
async (question: string) => {
const completion = await openai.chat.completions.create({
model: "gpt-4",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: question }
],
temperature: 0.7,
});
return completion.choices[0].message.content;
},
{ name: "generate-answer", run_type: "llm" }
);
// Execute and trace
const answer = await generateAnswer("What is the capital of France?");
console.log("Answer:", answer);
// View your trace at: https://smith.langchain.com
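If you'd rather not wrap your own function, the SDK also ships an OpenAI wrapper (listed in the imports reference at the end of this guide). A minimal sketch, assuming the langsmith/wrappers/openai entry point shown there:

import { wrapOpenAI } from "langsmith/wrappers/openai";
import OpenAI from "openai";

// Wrap the OpenAI client once; subsequent chat.completions.create calls are traced automatically
const openai = wrapOpenAI(new OpenAI());

const completion = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "What is the capital of France?" }],
});
console.log(completion.choices[0].message.content);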
Create hierarchical traces automatically:

import { traceable } from "langsmith/traceable";
// Child function
const retrieveDocs = traceable(
async (query: string) => {
// Simulate document retrieval
return ["Doc 1 about Paris", "Doc 2 about France"];
},
{ name: "retrieve-docs", run_type: "retriever" }
);
// Parent function that calls child
const ragPipeline = traceable(
async (question: string) => {
// This call is automatically traced as a child
const docs = await retrieveDocs(question);
const context = docs.join("\n");
const answer = `Based on: ${context}\nAnswer: Paris is the capital.`;
return answer;
},
{ name: "rag-pipeline", run_type: "chain" }
);
// Execute - creates a parent trace with child traces
const result = await ragPipeline("What is the capital of France?");

After running traced functions, you can view them in the LangSmith UI at https://smith.langchain.com.
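If you are tracing from a short-lived script, the process can exit before batched traces finish uploading. One way to make sure everything is flushed before you open the UI is the client's awaitPendingTraceBatches method (covered again in the troubleshooting section below); a small sketch:

import { Client } from "langsmith";

const client = new Client();

// ... run your traced functions ...

// Wait for any batched traces to finish uploading before the process exits
await client.awaitPendingTraceBatches();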
Evaluation helps you systematically test your LLM application against a dataset of examples.
First, create a dataset with test examples:
import { Client } from "langsmith";
const client = new Client();
// Create a dataset
const dataset = await client.createDataset({
datasetName: "capital-cities-qa",
description: "Questions about capital cities",
dataType: "kv",
});
// Add test examples
await client.createExamples({
datasetId: dataset.id,
inputs: [
{ question: "What is the capital of France?" },
{ question: "What is the capital of Japan?" },
{ question: "What is the capital of Brazil?" },
],
outputs: [
{ answer: "Paris" },
{ answer: "Tokyo" },
{ answer: "Brasília" },
],
});
console.log("Dataset created:", dataset.id);This is the function you want to evaluate:
This is the function you want to evaluate:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";
const openai = new OpenAI();
const answerQuestion = traceable(
async (inputs: { question: string }) => {
const completion = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
messages: [
{ role: "system", content: "Answer questions concisely." },
{ role: "user", content: inputs.question }
],
temperature: 0,
});
return {
answer: completion.choices[0].message.content
};
},
{ name: "answer-question", run_type: "chain" }
);

Define how to score your results:
// Check if the answer matches the expected output
const correctnessEvaluator = ({ run, example }) => {
const predicted = run.outputs?.answer || "";
const expected = example?.outputs?.answer || "";
// Simple exact match
const isCorrect = predicted.toLowerCase().includes(expected.toLowerCase());
return {
key: "correctness",
score: isCorrect ? 1 : 0,
comment: isCorrect ? "Correct answer" : "Incorrect answer"
};
};
// Check response length
const lengthEvaluator = ({ run }) => {
const answer = run.outputs?.answer || "";
const wordCount = answer.split(" ").length;
// Prefer concise answers (under 50 words)
const score = wordCount <= 50 ? 1 : 0;
return {
key: "conciseness",
score: score,
value: wordCount,
comment: `${wordCount} words`
};
};
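Heuristics like these only go so far; you can also have a model grade the output. The following is a sketch of an LLM-as-judge evaluator (custom code, not a built-in) that reuses the OpenAI client from the target function above and returns the same { key, score } shape:

// Sketch of an LLM-as-judge evaluator (custom code, not part of the SDK)
const llmJudgeEvaluator = async ({ run, example }) => {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "user",
        content:
          `Question: ${example?.inputs?.question}\n` +
          `Reference answer: ${example?.outputs?.answer}\n` +
          `Submitted answer: ${run.outputs?.answer}\n` +
          `Reply with exactly one word: CORRECT or INCORRECT.`,
      },
    ],
    temperature: 0,
  });
  const verdict = completion.choices[0].message.content ?? "";
  return {
    key: "llm_judge",
    score: verdict.trim().toUpperCase().startsWith("CORRECT") ? 1 : 0,
    comment: verdict,
  };
};

If you use it, add it alongside the other evaluators in the evaluators array in the next step.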
Run your target function against the dataset:

import { evaluate } from "langsmith/evaluation";
// Run evaluation
const results = await evaluate(answerQuestion, {
data: "capital-cities-qa", // Dataset name
evaluators: [correctnessEvaluator, lengthEvaluator],
experimentPrefix: "capital-cities-eval",
metadata: {
model: "gpt-3.5-turbo",
temperature: 0,
},
});
// View results
console.log("Evaluation complete!");
console.log("Results:", results.results.length);
// Calculate aggregate scores
let correctCount = 0;
let totalConcise = 0;
for (const row of results.results) {
const correctness = row.evaluation_results.find(e => e.key === "correctness");
const conciseness = row.evaluation_results.find(e => e.key === "conciseness");
if (correctness?.score === 1) correctCount++;
if (conciseness?.score === 1) totalConcise++;
}
const accuracy = correctCount / results.results.length;
const concisenessRate = totalConcise / results.results.length;
console.log(`Accuracy: ${(accuracy * 100).toFixed(1)}%`);
console.log(`Conciseness: ${(concisenessRate * 100).toFixed(1)}%`);

Use tracing to debug and improve your application:
import { traceable } from "langsmith/traceable";
const pipeline = traceable(
async (input: string) => {
// Step 1: Preprocess
const cleaned = input.trim().toLowerCase();
// Step 2: Process with LLM
const result = await callLLM(cleaned);
// Step 3: Post-process
const final = result.toUpperCase();
return final;
},
{
name: "my-pipeline",
run_type: "chain",
// Add metadata for debugging
metadata: { version: "1.0" }
}
);
// Run and check traces in UI
await pipeline(" Hello World ");Evaluate different models on the same dataset:
import { evaluate } from "langsmith/evaluation";
// Evaluator
const qualityEvaluator = ({ run, example }) => ({
key: "quality",
score: run.outputs?.answer === example?.outputs?.answer ? 1 : 0
});
// Evaluate GPT-3.5
const gpt35Results = await evaluate(
(input) => answerWithModel(input, "gpt-3.5-turbo"),
{
data: "capital-cities-qa",
evaluators: [qualityEvaluator],
experimentPrefix: "gpt-3.5",
}
);
// Evaluate GPT-4
const gpt4Results = await evaluate(
(input) => answerWithModel(input, "gpt-4"),
{
data: "capital-cities-qa",
evaluators: [qualityEvaluator],
experimentPrefix: "gpt-4",
}
);
// Compare results in LangSmith UI
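The answerWithModel helper used above isn't defined in this guide; one possible sketch of such a hypothetical helper, following the same pattern as the earlier traced functions:

import { traceable } from "langsmith/traceable";
import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical helper: answers a dataset input with the requested model
const answerWithModel = traceable(
  async (inputs: { question: string }, model: string) => {
    const completion = await openai.chat.completions.create({
      model,
      messages: [{ role: "user", content: inputs.question }],
      temperature: 0,
    });
    return { answer: completion.choices[0].message.content };
  },
  { name: "answer-with-model", run_type: "chain" }
);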
Gather feedback from users on production traces:

import { Client } from "langsmith";
import { traceable } from "langsmith/traceable";
const client = new Client();
const chatbot = traceable(
async (message: string) => {
// Your chatbot logic
const response = await generateResponse(message);
return response;
},
{ name: "chatbot", run_type: "chain" }
);
// Execute chatbot and get run ID
const response = await chatbot("Hello!");
// Later, collect user feedback
await client.createFeedback({
run_id: response.runId, // Run ID captured from the trace context (one way to capture it is sketched below)
key: "user-rating",
score: 1, // thumbs up
comment: "Great response!"
});
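The example above assumes the chatbot's response carries a runId, but as written the traced function only returns a string, so the run ID has to be captured explicitly. One way to do that is with getCurrentRunTree from langsmith/traceable; a sketch, assuming the function runs inside a traced context:

import { traceable, getCurrentRunTree } from "langsmith/traceable";

const chatbotWithRunId = traceable(
  async (message: string) => {
    // Read the current run's ID from inside the traced function
    const runTree = getCurrentRunTree();
    const response = await generateResponse(message); // your chatbot logic
    return { response, runId: runTree.id };
  },
  { name: "chatbot", run_type: "chain" }
);

const { response, runId } = await chatbotWithRunId("Hello!");
// runId can now be passed as run_id to client.createFeedback as shown above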
Use tracing to monitor production LLM applications:

import { traceable } from "langsmith/traceable";
import { Client } from "langsmith";
const client = new Client({
projectName: "production-chatbot",
// Sampling: only trace 10% of requests in production
tracingSamplingRate: 0.1,
});
const productionBot = traceable(
async (userInput: string) => {
try {
const response = await processInput(userInput);
return { success: true, response };
} catch (error) {
// The failure is recorded in the trace output; rethrow if you want the run itself marked as errored
return { success: false, error: error.message };
}
},
{
name: "production-bot",
run_type: "chain",
client: client,
metadata: {
environment: "production",
version: "2.1.0"
}
}
);
// Runs are sampled and traced
await productionBot("User query");Now that you have tracing and evaluation working, explore more features:
Issue: Traces not appearing in LangSmith UI
Verify that:
- LANGCHAIN_API_KEY is set correctly
- await client.awaitPendingTraceBatches() is called before app shutdown

Issue: Import errors with traceable
// ✓ Correct
import { traceable } from "langsmith/traceable";
// ✗ Incorrect
import { traceable } from "langsmith"; // Won't workIssue: Traces are batched and delayed
LangSmith batches traces for performance. To ensure immediate upload:
import { Client } from "langsmith";
const client = new Client({
autoBatchTracing: false, // Disable batching for debugging
});
// Or wait for pending batches
await client.awaitPendingTraceBatches();

Issue: Missing environment variables in production
Make sure environment variables are set in your deployment platform:
// Verify at runtime
if (!process.env.LANGCHAIN_API_KEY) {
console.warn("LANGCHAIN_API_KEY not set - tracing disabled");
}

Here's a complete working example combining tracing and evaluation:
import { traceable } from "langsmith/traceable";
import { Client } from "langsmith";
import { evaluate } from "langsmith/evaluation";
import OpenAI from "openai";
import * as dotenv from "dotenv";
// Load environment
dotenv.config();
const client = new Client();
const openai = new OpenAI();
// 1. Define your application
const qaBot = traceable(
async (inputs: { question: string }) => {
const completion = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
messages: [{ role: "user", content: inputs.question }],
temperature: 0,
});
return {
answer: completion.choices[0].message.content
};
},
{ name: "qa-bot", run_type: "chain" }
);
// 2. Create test dataset
const dataset = await client.createDataset({
datasetName: "qa-test-set",
description: "Test questions for QA bot",
});
await client.createExamples({
datasetId: dataset.id,
inputs: [
{ question: "What is 2+2?" },
{ question: "What is the capital of Spain?" },
],
outputs: [
{ answer: "4" },
{ answer: "Madrid" },
],
});
// 3. Create evaluator
const correctnessEval = ({ run, example }) => {
const predicted = run.outputs?.answer || "";
const expected = example?.outputs?.answer || "";
return {
key: "correctness",
score: predicted.includes(expected) ? 1 : 0
};
};
// 4. Run evaluation
const results = await evaluate(qaBot, {
data: "qa-test-set",
evaluators: [correctnessEval],
experimentPrefix: "qa-bot-v1",
});
// 5. View results
console.log(`Evaluated ${results.results.length} examples`);
const accuracy = results.results.filter(r =>
r.evaluation_results.find(e => e.key === "correctness")?.score === 1
).length / results.results.length;
console.log(`Accuracy: ${(accuracy * 100).toFixed(1)}%`);
console.log("View detailed results at: https://smith.langchain.com");
// 6. Cleanup - ensure traces are uploaded
await client.awaitPendingTraceBatches();

Run this example:
node --loader ts-node/esm example.ts

// Core
import { Client } from "langsmith";
import { traceable } from "langsmith/traceable";
import { RunTree } from "langsmith";
// Evaluation
import { evaluate } from "langsmith/evaluation";
// Wrappers
import { wrapOpenAI } from "langsmith/wrappers/openai";
import { wrapAnthropic } from "langsmith/wrappers/anthropic";
// LangChain
import { getLangchainCallbacks, RunnableTraceable } from "langsmith/langchain";
// Testing
import { test, expect, wrapEvaluator } from "langsmith/vitest";
import { test, expect, wrapEvaluator } from "langsmith/jest";LANGCHAIN_API_KEY=lsv2_pt_... # Required: Your API key
LANGCHAIN_PROJECT=my-project # Optional: Default project name
LANGCHAIN_ENDPOINT=https://... # Optional: API endpoint
LANGCHAIN_TRACING=true # Optional: Enable/disable tracing

import { Client } from "langsmith";
// Use environment variables
const client = new Client();
// Or explicit configuration
const client = new Client({
apiUrl: "https://api.smith.langchain.com",
apiKey: process.env.LANGCHAIN_API_KEY,
timeout_ms: 10000,
});

import { traceable } from "langsmith/traceable";
const myFunction = traceable(
async (input) => {
// Your logic here
return output;
},
{
name: "my-function", // Run name
run_type: "chain", // Run type: llm, chain, tool, retriever, etc.
metadata: { version: "1.0" }, // Optional metadata
tags: ["production"], // Optional tags
}
);

LangSmith provides several utility functions for common tasks like ID generation, environment configuration, custom fetch handling, and prompt caching.
Override the fetch implementation used by the client for proxies, mocking, or custom HTTP handling.
/**
* Override the fetch implementation used by the client
* @param fetch - Custom fetch function (e.g., for proxies or mocking)
*/
function overrideFetchImplementation(fetch: typeof globalThis.fetch): void;

Usage Examples:
import { overrideFetchImplementation } from "langsmith";
// Use custom fetch (e.g., for proxy or testing)
const customFetch = (url: string, init?: RequestInit) => {
console.log("Fetching:", url);
return fetch(url, init);
};
overrideFetchImplementation(customFetch);
// With a proxy (proxyAgent below is whatever agent your fetch implementation supports; it is not defined in this snippet)
const proxyFetch = (url: string, init?: RequestInit) => {
return fetch(url, {
...init,
agent: proxyAgent,
});
};
overrideFetchImplementation(proxyFetch);

Get the default project name from environment variables.
/**
* Get the default project name from environment variables
* @returns Project name from LANGCHAIN_PROJECT or LANGCHAIN_SESSION env vars
*/
function getDefaultProjectName(): string;

Usage Examples:
import { getDefaultProjectName } from "langsmith";
// Get default project name from environment
const projectName = getDefaultProjectName();
console.log("Using project:", projectName);
// Use in client configuration
const client = new Client({
projectName: getDefaultProjectName(),
});

Generate UUID v7 identifiers for runs and other entities.
/**
* Generate a random UUID v7 string
* @returns A UUID v7 string
*/
function uuid7(): string;
/**
* Generate a UUID v7 from a timestamp
* @param timestamp - The timestamp in milliseconds or ISO string
* @returns A UUID v7 string
*/
function uuid7FromTime(timestamp: number | string): string;

Usage Examples:
import { uuid7, uuid7FromTime } from "langsmith";
// Generate UUID v7
const runId = uuid7();
console.log("Run ID:", runId);
// Generate UUID v7 from timestamp
const timestampId = uuid7FromTime(Date.now());
const dateId = uuid7FromTime("2024-01-01T00:00:00Z");
// Use for manual run creation
await client.createRun({
id: uuid7(),
name: "my-run",
run_type: "chain",
// ...
});

Built-in caching mechanism for prompts to reduce latency and API calls.
/**
* Cache class for storing and retrieving prompts with TTL and refresh capabilities
*/
class Cache {
constructor(config?: CacheConfig);
/** Get cached value or fetch if missing/stale */
get(key: string): Promise<PromptCommit | undefined>;
/** Store value in cache */
set(key: string, value: PromptCommit): void;
/** Clear all cached entries */
clear(): void;
/** Stop background refresh timers */
stop(): void;
}
interface CacheConfig {
/** Maximum entries in cache (LRU eviction when exceeded). Default: 100 */
maxSize?: number;
/** Time in seconds before entry is stale. null = infinite TTL. Default: 3600 */
ttlSeconds?: number | null;
/** How often to check for stale entries in seconds. Default: 60 */
refreshIntervalSeconds?: number;
/** Function to fetch fresh data when cache miss or stale */
fetchFunc?: (key: string) => Promise<PromptCommit>;
}

Usage Examples:
import { Cache } from "langsmith";
// Use prompt cache
const cache = new Cache({
maxSize: 100,
ttlSeconds: 3600,
fetchFunc: async (key) => {
// Fetch prompt from LangSmith
return await client.pullPromptCommit(key);
},
});
const prompt = await cache.get("my-prompt:latest");
// Cleanup when done
cache.stop();

Access the package version constant for debugging and compatibility checks.
/**
* Package version constant
*/
const __version__: string;

Usage Examples:
import { __version__ } from "langsmith";
console.log("LangSmith SDK version:", __version__);
// Include in metadata for debugging
const client = new Client({
metadata: {
sdkVersion: __version__,
},
});