tessl/npm-langsmith

tessl install tessl/npm-langsmith@0.4.3

TypeScript client SDK for the LangSmith LLM tracing, evaluation, and monitoring platform.

docs/guides/anti-patterns.md

Anti-Patterns and Common Mistakes

What NOT to do when using LangSmith - a comprehensive guide to avoiding common pitfalls.

Overview

This guide documents anti-patterns, common mistakes, and their corrections. Following these guidelines prevents performance issues, data loss, security vulnerabilities, and incorrect behavior.

Tracing Anti-Patterns

❌ Creating New Wrappers on Every Call

Problem: Creates unnecessary overhead and loses trace context.

Don't:

async function makeOpenAICall(prompt: string) {
  // BAD: Creates new wrapper on every call
  const openai = wrapOpenAI(new OpenAI());
  return await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: prompt }]
  });
}

Do:

// GOOD: Create wrapper once at module level
const openai = wrapOpenAI(new OpenAI(), {
  projectName: "my-project"
});

async function makeOpenAICall(prompt: string) {
  return await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: prompt }]
  });
}
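
If module-level construction is awkward (for example, configuration only becomes available at runtime), a lazy singleton preserves the wrap-once property. A minimal sketch - the getOpenAI helper is hypothetical:

import OpenAI from "openai";
import { wrapOpenAI } from "langsmith/wrappers/openai";

// Sketch: create the wrapped client on first use, then reuse it
let cachedOpenAI: OpenAI | undefined;

function getOpenAI(): OpenAI {
  cachedOpenAI ??= wrapOpenAI(new OpenAI(), { projectName: "my-project" });
  return cachedOpenAI;
}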

❌ Wrapping Same Client Multiple Times

Problem: Multiple wrappers conflict and cause duplicate traces.

Don't:

const openai = new OpenAI();
const wrapped1 = wrapOpenAI(openai);
const wrapped2 = wrapOpenAI(wrapped1); // BAD: Double wrapping

Do:

const openai = new OpenAI();
const wrappedOpenAI = wrapOpenAI(openai); // GOOD: Wrap once

❌ Forgetting to Flush in Serverless

Problem: Traces may not upload before function terminates.

Don't:

export const handler = async (event) => {
  await tracedFunction(event);
  return { statusCode: 200 }; // BAD: Traces may be lost
};

Do:

import { Client } from "langsmith";

const client = new Client();

export const handler = async (event) => {
  try {
    const result = await tracedFunction(event);

    // GOOD: Ensure traces upload before return
    await client.awaitPendingTraceBatches();

    return { statusCode: 200, body: result };
  } catch (error) {
    await client.awaitPendingTraceBatches(); // Also flush on error
    throw error;
  }
};

❌ Using Wrong Import Paths

Problem: Subpath exports are required - importing from the main export fails.

Don't:

// BAD: Won't work - traceable not exported from main
import { traceable } from "langsmith";

// BAD: Won't work - evaluate not in main export
import { evaluate } from "langsmith";

Do:

// GOOD: Use subpath exports
import { traceable } from "langsmith/traceable";
import { evaluate } from "langsmith/evaluation";
import { wrapOpenAI } from "langsmith/wrappers/openai";

❌ Using Blocking Mode in Production

Problem: Blocking mode adds latency to every traced call.

Don't:

const client = new Client({
  blockOnRootRunFinalization: true // BAD: Blocks on every root run
});

Do:

const client = new Client({
  blockOnRootRunFinalization: false, // GOOD: Non-blocking async upload
  autoBatchTracing: true
});

// Flush only when needed (shutdown, critical operations)
await client.awaitPendingTraceBatches();

❌ Not Using Descriptive Run Names

Problem: Makes traces hard to understand and debug.

Don't:

const fn1 = traceable(async (x) => x, { name: "fn1" }); // BAD: Generic name
const process = traceable(async (x) => x, { name: "process" }); // BAD: Too vague
const func = traceable(async (x) => x); // BAD: No name option - an inline arrow is anonymous, so the run gets no useful name

Do:

const extractEntities = traceable(async (text) => {...}, {
  name: "extract-entities",
  run_type: "tool"
});

const generateSummary = traceable(async (doc) => {...}, {
  name: "generate-summary",
  run_type: "llm"
});

Configuration Anti-Patterns

❌ Hardcoding API Keys

Problem: Security risk, not portable across environments.

Don't:

const client = new Client({
  apiKey: "lsv2_pt_abc123..." // BAD: Hardcoded secret
});

Do:

// GOOD: Use environment variables
const client = new Client({
  apiKey: process.env.LANGCHAIN_API_KEY
});

// BEST: Use default env-based client
const client = new Client(); // Reads from LANGCHAIN_API_KEY automatically

❌ Tracing 100% in High-Volume Production

Problem: Unnecessary cost and potential performance impact.

Don't:

const client = new Client({
  tracingSamplingRate: 1.0 // BAD: Traces every single request
});

Do:

const client = new Client({
  tracingSamplingRate: process.env.NODE_ENV === "production" ? 0.1 : 1.0,
  // GOOD: Sample 10% in prod, 100% in dev
});

❌ Not Handling Missing API Key

Problem: Silent failures or unclear error messages.

Don't:

const client = new Client();
// BAD: May fail later with unclear error
await client.createRun({...});

Do:

if (!process.env.LANGCHAIN_API_KEY) {
  console.warn("LANGCHAIN_API_KEY not set - tracing disabled");
}

const client = new Client();

// Or: Verify at startup
try {
  const config = Client.getDefaultClientConfig();
  if (!config.apiKey) {
    throw new Error("LANGCHAIN_API_KEY environment variable is required");
  }
} catch (error) {
  console.error("LangSmith configuration error:", error);
  process.exit(1);
}

❌ Disabling Batching Without Reason

Problem: Increases network overhead and reduces performance.

Don't:

const client = new Client({
  autoBatchTracing: false // BAD: Unless debugging, keep enabled
});

Do:

// GOOD: Use batching in production
const client = new Client({
  autoBatchTracing: true,
  batchSizeBytesLimit: 20_000_000
});

// ACCEPTABLE: Disable only for debugging specific issues
const debugClient = new Client({
  autoBatchTracing: false, // OK for debugging
  debug: true
});

Dataset Anti-Patterns

❌ Creating Examples Without Dataset Reference

Problem: Examples must belong to a dataset.

Don't:

await client.createExample({
  inputs: { question: "What is AI?" },
  outputs: { answer: "AI is..." }
  // BAD: Missing dataset_id or datasetName
});

Do:

await client.createExample({
  dataset_id: datasetId, // GOOD: Always specify dataset
  inputs: { question: "What is AI?" },
  outputs: { answer: "AI is..." }
});

❌ Not Checking Dataset Existence

Problem: Creates duplicate datasets or fails with unclear errors.

Don't:

// BAD: May create duplicate
await client.createDataset({
  datasetName: "my-dataset"
});

Do:

// GOOD: Check existence first
const exists = await client.hasDataset({ datasetName: "my-dataset" });

if (!exists) {
  await client.createDataset({
    datasetName: "my-dataset",
    description: "My test dataset"
  });
}

// BETTER: Use upsert
await client.createDataset({
  datasetName: "my-dataset",
  description: "My test dataset",
  upsert: true // Creates or uses existing
});

❌ Mismatched Array Lengths in createExamples

Problem: Arrays must be parallel - same length.

Don't:

await client.createExamples({
  datasetName: "qa",
  inputs: [{ q: "A" }, { q: "B" }, { q: "C" }], // 3 items
  outputs: [{ a: "1" }, { a: "2" }] // BAD: Only 2 items
});

Do:

// GOOD: Ensure parallel arrays have same length
await client.createExamples({
  datasetName: "qa",
  inputs: [{ q: "A" }, { q: "B" }, { q: "C" }],
  outputs: [{ a: "1" }, { a: "2" }, { a: "3" }] // GOOD: 3 items
});

// BETTER: Use examples array to avoid this issue
await client.createExamples({
  datasetName: "qa",
  examples: [
    { inputs: { q: "A" }, outputs: { a: "1" } },
    { inputs: { q: "B" }, outputs: { a: "2" } },
    { inputs: { q: "C" }, outputs: { a: "3" } }
  ]
});
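
If parallel arrays are unavoidable (for example, inputs and outputs arrive from different sources), guard the lengths before calling the API. A minimal sketch - zipExamples is a hypothetical helper:

// Hypothetical helper: zip parallel arrays into the examples shape,
// failing fast on a length mismatch instead of at the API boundary
function zipExamples<I, O>(inputs: I[], outputs: O[]): { inputs: I; outputs: O }[] {
  if (inputs.length !== outputs.length) {
    throw new Error(`Length mismatch: ${inputs.length} inputs vs ${outputs.length} outputs`);
  }
  return inputs.map((inp, i) => ({ inputs: inp, outputs: outputs[i] }));
}

await client.createExamples({
  datasetName: "qa",
  examples: zipExamples(
    [{ q: "A" }, { q: "B" }, { q: "C" }],
    [{ a: "1" }, { a: "2" }, { a: "3" }]
  )
});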

Privacy Anti-Patterns

❌ Logging Credentials in Traces

Problem: API keys, passwords, tokens visible in LangSmith UI.

Don't:

const traced = traceable(
  async (apiKey: string, data: any) => {
    // BAD: API key will be in trace inputs
    return await callExternalAPI(apiKey, data);
  },
  { name: "api-call" }
);

Do:

const traced = traceable(
  async (apiKey: string, data: any) => {
    return await callExternalAPI(apiKey, data);
  },
  {
    name: "api-call",
    processInputs: (inputs) => ({
      data: inputs.data
      // GOOD: API key not included in logged inputs
    })
  }
);

❌ Not Anonymizing User PII

Problem: User emails, SSNs, phone numbers in traces.

Don't:

const processUser = traceable(
  async (user: { email: string; ssn: string }) => {
    // BAD: PII will be logged
    return await analyze(user);
  }
);

Do:

import { createAnonymizer } from "langsmith/anonymizer";

const anonymizer = createAnonymizer([
  { pattern: /\b[\w\.-]+@[\w\.-]+\.\w+\b/g, replace: "[EMAIL]" },
  { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replace: "[SSN]" }
]);

const processUser = traceable(
  async (user: { email: string; ssn: string }) => {
    return await analyze(user);
  },
  {
    processInputs: anonymizer,
    processOutputs: anonymizer
  }
);

❌ Storing Sensitive Data in Metadata

Problem: Metadata is visible and searchable.

Don't:

const traced = traceable(
  async (data) => {...},
  {
    metadata: {
      userId: "user-123",
      apiKey: "sk-...", // BAD: Secret in metadata
      creditCard: "4111..." // BAD: PII in metadata
    }
  }
);

Do:

const traced = traceable(
  async (data) => {...},
  {
    metadata: {
      userId: "user-123", // OK: Non-sensitive ID
      hasApiKey: true, // GOOD: Boolean instead of value
      paymentMethod: "credit" // GOOD: Category, not actual number
    }
  }
);

Evaluation Anti-Patterns

❌ Not Handling Evaluator Errors

Problem: One failing evaluator stops entire evaluation.

Don't:

const riskyEvaluator = async ({ run }) => {
  const score = run.outputs.value / run.inputs.divisor; // BAD: Throws if outputs is missing; division by zero yields NaN/Infinity
  return { key: "score", score };
};

await evaluate(target, {
  data: "dataset",
  evaluators: [riskyEvaluator] // BAD: Will crash on error
});

Do:

const safeEvaluator = async ({ run }) => {
  try {
    const score = run.outputs.value / run.inputs.divisor;
    return { key: "score", score };
  } catch (error) {
    // GOOD: Handle errors gracefully
    return {
      key: "score",
      score: null,
      comment: `Evaluation failed: ${error.message}`
    };
  }
};

await evaluate(target, {
  data: "dataset",
  evaluators: [safeEvaluator]
});

❌ Using Synchronous Blocking Operations

Problem: Blocks event loop during evaluation.

Don't:

const slowEvaluator = ({ run, example }) => {
  // BAD: Synchronous heavy computation
  for (let i = 0; i < 1000000000; i++) {
    // blocking work
  }
  return { key: "quality", score: 1 };
};

Do:

const asyncEvaluator = async ({ run, example }) => {
  // GOOD: Async operation
  const score = await computeQualityAsync(run.outputs);
  return { key: "quality", score };
};

❌ Not Specifying Dataset Correctly

Problem: Evaluation fails to find dataset.

Don't:

await evaluate(target, {
  data: "my dataset" // BAD: Spaces in name may cause issues
});

Do:

await evaluate(target, {
  data: "my-dataset" // GOOD: Use kebab-case or underscores
});

// BETTER: Use dataset ID for certainty
await evaluate(target, {
  data: datasetId // BEST: Unambiguous
});
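
To go from a human-readable name to an unambiguous ID, resolve the dataset once up front. A short sketch, assuming the client's readDataset lookup:

// Resolve the dataset name to its UUID once, then evaluate by ID
const dataset = await client.readDataset({ datasetName: "my-dataset" });

await evaluate(target, {
  data: dataset.id
});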

Client API Anti-Patterns

❌ Not Using Async Iteration for Large Results

Problem: Loading all runs into memory causes OOM errors.

Don't:

// BAD: Tries to load all runs into array
const allRuns = [];
for await (const run of client.listRuns({ projectName: "big-project" })) {
  allRuns.push(run); // BAD: May run out of memory
}

Do:

// GOOD: Process runs one at a time
let processedCount = 0;
for await (const run of client.listRuns({ projectName: "big-project" })) {
  await processRun(run); // GOOD: Stream processing
  processedCount++;

  // GOOD: Safety limit
  if (processedCount >= 10000) {
    console.warn("Reached safety limit");
    break;
  }
}

❌ Ignoring Pagination Limits

Problem: Tries to fetch unlimited results.

Don't:

// BAD: No limit - may timeout or OOM
for await (const run of client.listRuns({ projectName: "huge-project" })) {
  console.log(run.name);
}

Do:

// GOOD: Set reasonable limit
for await (const run of client.listRuns({
  projectName: "huge-project",
  limit: 1000 // GOOD: Explicit limit
})) {
  console.log(run.name);
}

❌ Not Handling API Errors

Problem: Unhandled rejections crash application.

Don't:

// BAD: No error handling
const run = await client.readRun(runId);
console.log(run.name);

Do:

// GOOD: Handle expected errors
try {
  const run = await client.readRun(runId);
  console.log(run.name);
} catch (error) {
  if (error.status === 404) {
    console.error("Run not found");
  } else if (error.status === 401) {
    console.error("Authentication failed - check LANGCHAIN_API_KEY");
  } else {
    console.error("API error:", error.message);
  }
}

Performance Anti-Patterns

❌ Creating RunTrees Without Posting

Problem: Memory leaks from unposted runs.

Don't:

for (let i = 0; i < 10000; i++) {
  const run = new RunTree({ name: `run-${i}`, run_type: "tool" });
  await run.end({ result: i });
  // BAD: Never posted - accumulates in memory
}

Do:

for (let i = 0; i < 10000; i++) {
  const run = new RunTree({ name: `run-${i}`, run_type: "tool" });
  await run.end({ result: i });
  await run.postRun(); // GOOD: Post immediately
}

❌ Excessive Metadata in High-Volume Traces

Problem: Large metadata increases trace size and cost.

Don't:

const traced = traceable(
  async (input) => {...},
  {
    metadata: {
      fullConfig: entireConfigObject, // BAD: Large object
      history: allPreviousRequests, // BAD: Huge array
      debug: complexDebugInfo // BAD: Verbose data
    }
  }
);

Do:

const traced = traceable(
  async (input) => {...},
  {
    metadata: {
      configVersion: "v1.2.3", // GOOD: Minimal reference
      requestCount: 42, // GOOD: Summary metric
      debugEnabled: true // GOOD: Boolean flag
    }
  }
);

❌ Not Cleaning Up Resources

Problem: Background timers and resources leak.

Don't:

async function main() {
  const client = new Client({ cache: true });
  await doWork();
  // BAD: Cache timers still running
  process.exit(0);
}

Do:

async function main() {
  const client = new Client({ cache: true });

  try {
    await doWork();
  } finally {
    await client.awaitPendingTraceBatches();
    client.cleanup(); // GOOD: Stop timers, cleanup resources
  }
}

Feedback Anti-Patterns

❌ Inconsistent Feedback Keys

Problem: Hard to query and analyze feedback.

Don't:

await client.createFeedback(run_id, "rating1", { score: 1 });
await client.createFeedback(run_id, "user-rating", { score: 1 });
await client.createFeedback(run_id, "UserRating", { score: 1 });
// BAD: Three different keys for same concept

Do:

// GOOD: Consistent naming convention
await client.createFeedback(run_id, "user_rating", { score: 1 });
await client.createFeedback(run_id, "user_rating", { score: 1 });
await client.createFeedback(run_id, "user_rating", { score: 1 });

// GOOD: Use constants
const FeedbackKeys = {
  USER_RATING: "user_rating",
  CORRECTNESS: "correctness",
  HELPFULNESS: "helpfulness"
} as const;

await client.createFeedback(run_id, FeedbackKeys.USER_RATING, { score: 1 });

❌ Unnormalized Scores

Problem: Inconsistent score ranges make analysis difficult.

Don't:

// BAD: Mixed score ranges
await client.createFeedback(run_id, "quality", { score: 4 }); // Out of 5
await client.createFeedback(run_id, "accuracy", { score: 0.8 }); // Out of 1.0
await client.createFeedback(run_id, "speed", { score: 100 }); // Out of 100

Do:

// GOOD: Normalize to 0.0-1.0
await client.createFeedback(run_id, "quality", {
  score: 4 / 5, // Normalized: 0.8
  value: 4, // Store the original in value
  comment: "4/5 stars"
});

await client.createFeedback(run_id, "accuracy", { score: 0.8 }); // Already normalized

await client.createFeedback(run_id, "speed", {
  score: 100 / 100, // Normalized: 1.0
  value: 100
});
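
A tiny helper keeps the normalization rule in one place. A sketch - normalizedFeedback is a hypothetical function:

// Hypothetical helper: map a raw score from a known range onto 0.0-1.0,
// keeping the original in `value` for reference
function normalizedFeedback(raw: number, max: number, min = 0) {
  return {
    score: (raw - min) / (max - min),
    value: raw
  };
}

await client.createFeedback(run_id, "quality", normalizedFeedback(4, 5)); // score: 0.8
await client.createFeedback(run_id, "speed", normalizedFeedback(100, 100)); // score: 1.0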

❌ Long-Lived Presigned Tokens

Problem: Security risk if tokens leak.

Don't:

const token = await client.createPresignedFeedbackToken(runId, "rating", {
  expiration: new Date(Date.now() + 365 * 24 * 60 * 60 * 1000) // BAD: 1 year
});

Do:

// GOOD: Short expiration for public tokens
const token = await client.createPresignedFeedbackToken(runId, "rating", {
  expiration: new Date(Date.now() + 24 * 60 * 60 * 1000) // GOOD: 24 hours
});

// ACCEPTABLE: Longer for email links
const emailToken = await client.createPresignedFeedbackToken(runId, "review", {
  expiration: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000) // OK: 7 days
});

Prompt Management Anti-Patterns

❌ Not Versioning Prompts

Problem: Lost history, can't rollback changes.

Don't:

// BAD: Overwrites without version history
await client.createPrompt("system-prompt");
// ... later, no versioning:
await client.updatePrompt("system-prompt", {
  content: "New prompt" // BAD: Old version lost
});

Do:

// GOOD: Use pushPrompt for versioning
await client.createPrompt("system-prompt");

// Push v1
await client.pushPrompt("system-prompt", {
  object: { type: "chat", messages: [...] },
  description: "Initial version"
});

// Push v2 (v1 preserved in history)
await client.pushPrompt("system-prompt", {
  object: { type: "chat", messages: [...] },
  description: "Improved tone"
});

// Can pull any version
const v1 = await client.pullPrompt({ promptName: "system-prompt:v1" });

❌ Not Caching Prompt Pulls

Problem: Repeated API calls for same prompt.

Don't:

async function generateResponse(input: string) {
  // BAD: Fetches prompt on every call
  const prompt = await client.pullPrompt({ promptName: "system-prompt" });
  return await llm.generate({ prompt, input });
}

Do:

import { Cache } from "langsmith";

// GOOD: Cache prompt pulls
const promptCache = new Cache({
  ttlSeconds: 3600,
  fetchFunc: async (key) => {
    return await client.pullPromptCommit(key);
  }
});

async function generateResponse(input: string) {
  const prompt = await promptCache.get("system-prompt:latest");
  return await llm.generate({ prompt, input });
}

// Cleanup when done
promptCache.stop();

Integration Anti-Patterns

❌ Using Generic Wrapper for Specialized SDKs

Problem: Loses SDK-specific optimizations.

Don't:

import { wrapSDK } from "langsmith/wrappers";
import OpenAI from "openai";

// BAD: Use specialized wrapper instead
const openai = wrapSDK(new OpenAI());

Do:

import { wrapOpenAI } from "langsmith/wrappers/openai";
import OpenAI from "openai";

// GOOD: Use specialized wrapper
const openai = wrapOpenAI(new OpenAI());
// Captures token usage, streaming, function calls properly

❌ Missing wrapLanguageModel for Vercel AI SDK

Problem: wrapAISDK requires wrapLanguageModel to function.

Don't:

import { wrapAISDK } from "langsmith/experimental/vercel";
import { generateText } from "ai";

// BAD: Missing wrapLanguageModel
const wrapped = wrapAISDK({ generateText });

Do:

import { wrapAISDK } from "langsmith/experimental/vercel";
import { wrapLanguageModel, generateText } from "ai";

// GOOD: Include wrapLanguageModel
const wrapped = wrapAISDK({ wrapLanguageModel, generateText });

❌ Not Passing Callbacks to LangChain

Problem: LangChain calls not traced as children.

Don't:

const myFunction = traceable(async (input) => {
  const model = new ChatOpenAI();
  // BAD: Missing callbacks - not traced as child
  const result = await model.invoke(input);
  return result;
});

Do:

import { getLangchainCallbacks } from "langsmith/langchain";

const myFunction = traceable(async (input) => {
  const callbacks = await getLangchainCallbacks(); // GOOD: Get callbacks (async)
  const model = new ChatOpenAI();
  const result = await model.invoke(input, { callbacks }); // GOOD: Pass callbacks
  return result;
});

Testing Framework Anti-Patterns

❌ Missing Vitest Reporter Configuration

Problem: Vitest tests don't create experiments.

Don't:

// vitest.config.ts
export default defineConfig({
  test: {
    // BAD: Missing LangSmith reporter
  }
});

Do:

// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    reporters: ["default", "langsmith/vitest/reporter"] // GOOD: Include reporter
  }
});

❌ Mixing Unit Tests with Evaluations

Problem: Conflates test types, unclear separation.

Don't:

// BAD: Mixed in same file
test("unit test: parser works", () => {
  expect(parse("data")).toBe("parsed");
});

test("eval test: chatbot quality", async () => {
  const result = await chatbot("query");
  expect(result).toContain("answer");
}, wrapEvaluator({ datasetName: "qa" }));

Do:

// tests/unit/parser.test.ts - GOOD: Separate unit tests
test("parser works", () => {
  expect(parse("data")).toBe("parsed");
});

// tests/eval/chatbot.test.ts - GOOD: Separate evaluations
import { test } from "langsmith/jest";

test("chatbot quality", async () => {
  const result = await chatbot("query");
  expect(result).toContain("answer");
}, wrapEvaluator({ datasetName: "qa" }));

Run Management Anti-Patterns

❌ Not Setting Run Type

Problem: Runs harder to filter and analyze.

Don't:

await client.createRun({
  name: "MyOperation",
  // BAD: Missing run_type
  inputs: {...}
});

Do:

await client.createRun({
  name: "MyOperation",
  run_type: "chain", // GOOD: Explicit type
  inputs: {...}
});

// Use appropriate types:
// - "llm" for LLM API calls
// - "chain" for workflows
// - "tool" for individual tools
// - "retriever" for document retrieval
// - "embedding" for embeddings

❌ Using session_id (Deprecated)

Problem: Deprecated field, use project_name instead.

Don't:

await client.createRun({
  name: "MyRun",
  run_type: "chain",
  session_id: "session-123", // BAD: Deprecated
  session_name: "MySession" // BAD: Deprecated
});

Do:

await client.createRun({
  name: "MyRun",
  run_type: "chain",
  project_name: "my-project" // GOOD: Use project_name
});

❌ Creating Runs with Future Timestamps

Problem: Breaks time-based queries and analytics.

Don't:

await client.createRun({
  name: "MyRun",
  run_type: "chain",
  start_time: Date.now() + 1000000 // BAD: Future timestamp
});

Do:

await client.createRun({
  name: "MyRun",
  run_type: "chain",
  start_time: Date.now() // GOOD: Current time
  // Or omit - auto-set to current time
});

General Best Practices (Violations are Anti-Patterns)

✅ Always Await Pending Batches Before Shutdown

// Application shutdown
process.on('SIGTERM', async () => {
  await client.awaitPendingTraceBatches();
  client.cleanup();
  process.exit(0);
});

// Lambda/serverless
export const handler = async (event) => {
  const result = await processEvent(event);
  await client.awaitPendingTraceBatches();
  return result;
};

✅ Use Environment Variables for Configuration

// GOOD: Environment-based
const client = new Client();

// GOOD: With fallbacks
const client = new Client({
  apiKey: process.env.LANGCHAIN_API_KEY,
  projectName: process.env.LANGCHAIN_PROJECT || "default"
});

✅ Implement Retry Logic for Transient Errors

async function robustAPICall() {
  const maxRetries = 3;
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await client.someMethod();
    } catch (error) {
      if (error.status === 429 || error.status >= 500) {
        if (i < maxRetries - 1) {
          await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
          continue;
        }
      }
      throw error;
    }
  }
}

✅ Tag Runs for Better Organization

// GOOD: Use tags for filtering
const traced = traceable(
  async (input) => {...},
  {
    tags: ["production", "customer-facing", "v2"],
    metadata: { version: "2.1.0", team: "ml-team" }
  }
);

✅ Use Appropriate Sampling in Production

const client = new Client({
  // GOOD: Environment-based sampling
  tracingSamplingRate: process.env.NODE_ENV === "production" ? 0.1 : 1.0
});

Quick Anti-Pattern Checklist

Before deploying code using LangSmith, run through this checklist (a consolidated setup sketch follows the list):

  • Not creating wrappers inside loops or hot paths
  • Using correct import paths (subpath exports)
  • Flushing traces before serverless function exits
  • Not hardcoding API keys
  • Using sampling in high-volume production
  • Handling API errors (404, 401, 429, 500)
  • Using consistent feedback key naming
  • Normalizing feedback scores to 0.0-1.0
  • Not logging sensitive data (PII, credentials)
  • Setting appropriate run_type for all runs
  • Not using deprecated fields (session_id)
  • Cleaning up resources (client.cleanup())
  • Using specialized wrappers (not generic) when available
  • Awaiting pending batches before process exit
  • Using appropriate data types for datasets ("kv", "llm", "chat")
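
Pulling several checklist items together - a rough sketch of a production entrypoint (processJobs is a placeholder for your traced workload):

import { Client } from "langsmith";

// Env-based key, sampled tracing, batching on - per the checklist above
const client = new Client({
  tracingSamplingRate: process.env.NODE_ENV === "production" ? 0.1 : 1.0,
  autoBatchTracing: true
});

process.on("SIGTERM", async () => {
  await client.awaitPendingTraceBatches(); // flush before exit
  client.cleanup(); // stop background timers
  process.exit(0);
});

async function main() {
  try {
    await processJobs();
  } finally {
    await client.awaitPendingTraceBatches();
    client.cleanup();
  }
}

main();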

Related Documentation

  • Decision Trees - Choosing the right API
  • Error Handling - Comprehensive error patterns
  • Client Configuration - Configuration best practices
  • Quick Reference - Common correct patterns
  • Tracing Guide - Tracing best practices
  • Evaluation Guide - Evaluation best practices