LangSmith integrates with the Jest and Vitest testing frameworks, enabling test-driven evaluation workflows in which tests automatically create datasets, run evaluations, and track results.
The testing integrations extend familiar testing APIs with LangSmith-specific features: custom matchers, evaluator wrappers, feedback and output logging, and automatic dataset creation.
When to use testing integration:
LangSmith provides identical testing APIs for both Jest and Vitest. Choose based on your existing test infrastructure:
Use Jest if you already use it or prefer its ecosystem.
import { test, expect } from "langsmith/jest";
test(
"greeting generation",
{
input: { name: "Alice" },
expected: { greeting: "Hello, Alice!" },
},
async (input) => {
return { greeting: `Hello, ${input.name}!` };
}
);

Use Vitest if you already use Vite or Vitest, or want faster test execution.
import { test, expect } from "langsmith/vitest";
test(
"summarize text correctly",
{
input: { text: "Long document..." },
expected: { summary: "Summary" }
},
async (input) => {
// summarizeText is the application function under test (defined elsewhere)
const result = await summarizeText(input.text);
return result;
}
);

Vitest requires reporter configuration:
// vitest.config.ts
import { defineConfig } from "vitest/config";
export default defineConfig({
test: {
reporters: ["default", "langsmith/vitest/reporter"]
}
});
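
Because evaluation tests typically call a model and run more slowly than unit tests, it can help to give them their own file pattern and a longer timeout. The following is a minimal sketch extending the reporter configuration shown above. The `*.eval.ts` naming convention and the 30-second timeout are illustrative assumptions, not LangSmith requirements; `include`, `reporters`, and `testTimeout` are standard Vitest options.

```typescript
// vitest.config.ts (sketch; the *.eval.ts naming convention is an assumption)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Only pick up evaluation files, keeping them separate from unit tests
    include: ["**/*.eval.ts"],
    // Reporter entries taken from the configuration shown above
    reporters: ["default", "langsmith/vitest/reporter"],
    // Model calls can be slow; raise the per-test timeout (milliseconds)
    testTimeout: 30_000,
  },
});
```

Note that setting `include` replaces Vitest's default test file patterns, so ordinary unit tests would need their own configuration or a broader pattern.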
| Feature | Jest | Vitest |
|---|---|---|
| Test API | Identical | Identical |
| Custom Matchers | ✓ | ✓ |
| Evaluators | ✓ | ✓ |
| Reporter Config | Not required | Required in vitest.config.ts |
| Performance | Good | Faster (Vite-based) |
| Watch Mode | Built-in | Built-in |
| Parallel Tests | ✓ | ✓ (better performance) |
| ES Modules | Requires configuration | Native support |
| TypeScript | Requires a transformer (e.g., ts-jest or Babel) | Native support |
Recommendation: Both provide identical LangSmith functionality, so choose based on your existing test infrastructure and tooling preferences.
Both frameworks provide custom matchers, logging helpers, and evaluator wrappers:
// Relative closeness (normalized edit distance)
expect(output).toBeRelativeCloseTo("Expected text", { threshold: 0.8 });
// Absolute closeness (raw edit distance)
expect(output).toBeAbsoluteCloseTo("Expected text", { threshold: 5 });
// Semantic similarity (embeddings-based)
expect(output).toBeSemanticCloseTo("Expected meaning", { threshold: 0.85 });
// Custom evaluators
expect(output).evaluatedBy(customEvaluator);

import { logFeedback, logOutputs } from "langsmith/jest"; // or "langsmith/vitest"
// Log evaluation feedback
logFeedback({
key: "accuracy",
score: 0.95,
comment: "High quality output"
});
// Log intermediate outputs
logOutputs({ step1: result1, step2: result2 });

import { wrapEvaluator } from "langsmith/jest"; // or "langsmith/vitest"
const customEvaluator = wrapEvaluator((args) => {
const { input, output, expected } = args;
return {
key: "custom_metric",
score: calculateScore(output, expected),
comment: "Evaluation comment"
};
});

// Jest
import { test } from "langsmith/jest";
// Vitest (use this import instead; do not combine both in one file)
// import { test } from "langsmith/vitest";
// The test API is identical in both frameworks
test(
"test name",
{
input: { data: "input" },
expected: { result: "output" }
},
async (input) => {
return { result: processData(input.data) };
}
);

import { test, expect, wrapEvaluator } from "langsmith/jest"; // or "langsmith/vitest"
const qualityEvaluator = wrapEvaluator((args) => ({
key: "quality",
score: args.output.score > 0.8 ? 1 : 0
}));
test(
"quality check",
{
input: { prompt: "Test" },
evaluators: [qualityEvaluator]
},
async (input) => {
const result = await generate(input.prompt);
expect(result).evaluatedBy(qualityEvaluator);
return result;
}
);

test(
"translation test",
{
input: { text: "Hello", lang: "es" },
expected: "Hola",
datasetName: "translation-tests", // Automatically creates dataset
projectName: "translation-eval"
},
async (input) => {
return await translate(input.text, input.lang);
}
);
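
To make the difference between the relative and absolute closeness matchers shown earlier more concrete, here is a minimal, self-contained sketch of raw and normalized edit distance. It is an illustration of the general technique (Levenshtein distance), not LangSmith's implementation: the function names `editDistance` and `relativeDistance` are hypothetical, and dividing by the longer string's length is only one common normalization. How each matcher compares its `threshold` against the computed value is not specified in this document.

```typescript
// Levenshtein distance: the minimum number of single-character insertions,
// deletions, or substitutions needed to turn `a` into `b`.
function editDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 characters of `a`
  // and the first j characters of `b`.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete a character from `a`
        curr[j - 1] + 1, // insert a character into `a`
        prev[j - 1] + cost // substitute, or keep a matching character
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Normalized distance in [0, 1]: raw distance divided by the longer length.
// Assumption: one common normalization, not necessarily the one LangSmith uses.
function relativeDistance(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 0 : editDistance(a, b) / longest;
}
```

For example, `editDistance("kitten", "sitting")` is 3, while `relativeDistance("kitten", "sitting")` is 3/7 (about 0.43). An absolute threshold therefore depends on text length, whereas a relative one stays comparable across short and long outputs. A function along these lines could also serve as the undefined `calculateScore` helper in the custom evaluator example, for instance by returning `1 - relativeDistance(output, expected)`.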