Langfuse API client for universal JavaScript environments, providing observability, prompt management, datasets, experiments, and scoring capabilities.
```bash
npx @tessl/cli install tessl/npm-langfuse--client@4.2.0
```

Langfuse Client (@langfuse/client) is a comprehensive API client for Langfuse, an observability platform for LLM applications. It provides core abstractions for prompt management, dataset operations, experiment execution, scoring, and media handling. The package is designed for universal JavaScript environments, including browsers, Node.js, Edge Functions, and other JavaScript runtimes.
```bash
npm install @langfuse/client @langfuse/tracing @opentelemetry/api
```

Dependencies:

- @langfuse/core (workspace)
- @langfuse/tracing (workspace)
- @opentelemetry/api (peer dependency)
- mustache

ESM:

```typescript
import { LangfuseClient } from '@langfuse/client';
```

CommonJS:
```typescript
const { LangfuseClient } = require('@langfuse/client');
```

Specific imports:

```typescript
import {
LangfuseClient,
TextPromptClient,
ChatPromptClient,
createEvaluatorFromAutoevals,
// Type imports
type ExperimentParams,
type ExperimentResult,
type ExperimentTask,
type ExperimentItem,
type ExperimentItemResult,
type ExperimentTaskParams,
type Evaluator,
type EvaluatorParams,
type Evaluation,
type RunEvaluator,
type RunEvaluatorParams,
type FetchedDataset,
type RunExperimentOnDataset,
type LinkDatasetItemFunction,
type ChatMessageOrPlaceholder,
type ChatMessageWithPlaceholders,
type LangchainMessagesPlaceholder,
type ChatMessageType
} from '@langfuse/client';
```

Basic usage:

```typescript
import { LangfuseClient } from '@langfuse/client';
// Initialize client with credentials
const langfuse = new LangfuseClient({
publicKey: 'pk_...',
secretKey: 'sk_...',
baseUrl: 'https://cloud.langfuse.com' // optional, default shown
});
// Or use environment variables (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL)
const langfuse = new LangfuseClient();
// Fetch and compile a prompt
const prompt = await langfuse.prompt.get('my-prompt');
const compiled = prompt.compile({ variable: 'value' });
// Get a dataset and run an experiment
const dataset = await langfuse.dataset.get('my-dataset');
const result = await dataset.runExperiment({
name: 'Model Evaluation',
task: async ({ input }) => myModel.generate(input),
evaluators: [myEvaluator]
});
// Create scores
langfuse.score.create({
name: 'quality',
value: 0.85,
traceId: 'trace-123'
});
// Flush pending data before exit
await langfuse.flush();
```

Langfuse Client is built around several key components:
The main LangfuseClient class provides centralized access to all Langfuse functionality and direct API access for advanced use cases.
```typescript
class LangfuseClient {
constructor(params?: LangfuseClientParams);
// Manager access
readonly api: LangfuseAPIClient;
readonly prompt: PromptManager;
readonly dataset: DatasetManager;
readonly score: ScoreManager;
readonly media: MediaManager;
readonly experiment: ExperimentManager;
// Utility methods
flush(): Promise<void>;
shutdown(): Promise<void>;
getTraceUrl(traceId: string): Promise<string>;
}
interface LangfuseClientParams {
publicKey?: string;
secretKey?: string;
baseUrl?: string;
timeout?: number;
additionalHeaders?: Record<string, string>;
}
```
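Beyond the managers, the api property exposes the generated API client for direct calls. A minimal sketch; the accessor names (api.trace.get, api.datasets.create) are taken from the backward-compatibility mapping at the end of this document and should be verified against the generated client:

```typescript
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Direct API access for operations not covered by the managers.
// Accessor names follow the deprecated-method mapping below and are assumptions here.
async function inspectTrace(traceId: string): Promise<void> {
  const trace = await langfuse.api.trace.get(traceId);
  console.log(trace);

  // Documented utility on the client itself: build a UI link to the trace.
  console.log(await langfuse.getTraceUrl(traceId));
}

async function createDataset(name: string) {
  return langfuse.api.datasets.create({ name });
}
```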
Fetch, create, and manage prompts with built-in caching, version control, and variable substitution. Supports both text and chat prompts with LangChain compatibility.

```typescript
class PromptManager {
get(name: string, options?: { version?: number; label?: string; cacheTtlSeconds?: number; fallback?: string | ChatMessage[]; maxRetries?: number; type?: "chat" | "text"; fetchTimeoutMs?: number }): Promise<TextPromptClient | ChatPromptClient>;
create(body: CreatePromptRequest): Promise<TextPromptClient | ChatPromptClient>;
update(params: { name: string; version: number; newLabels: string[] }): Promise<Prompt>;
}
class TextPromptClient {
readonly name: string;
readonly version: number;
readonly prompt: string;
readonly config: unknown;
readonly labels: string[];
readonly tags: string[];
readonly isFallback: boolean;
compile(variables?: Record<string, string>): string;
getLangchainPrompt(): string;
toJSON(): string;
}
class ChatPromptClient {
readonly name: string;
readonly version: number;
readonly prompt: ChatMessageWithPlaceholders[];
readonly config: unknown;
readonly labels: string[];
readonly tags: string[];
readonly isFallback: boolean;
compile(
variables?: Record<string, string>,
placeholders?: Record<string, any>
): (ChatMessageOrPlaceholder | any)[];
getLangchainPrompt(options?: { placeholders?: Record<string, any> }): (ChatMessage | LangchainMessagesPlaceholder | any)[];
toJSON(): string;
}
```
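A sketch of typical prompt usage with the clients above; the prompt names, label, variables, and placeholder contents are illustrative:

```typescript
import { LangfuseClient, ChatPromptClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Text prompt: fetch a labeled version and substitute {{variables}}.
const textPrompt = await langfuse.prompt.get('movie-critic', { label: 'production' });
console.log(textPrompt.compile({ movie: 'Dune: Part Two' }));

// Chat prompt: fetch, then fill variables and message placeholders.
const chatPrompt = await langfuse.prompt.get('support-agent', { type: 'chat' });
if (chatPrompt instanceof ChatPromptClient) {
  const messages = chatPrompt.compile(
    { product: 'Langfuse' },                        // {{variable}} substitution
    { history: [{ role: 'user', content: 'Hi!' }] } // fills the "history" placeholder
  );
  console.log(messages);
}
```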
Retrieve datasets with all items, link dataset items to traces for experiment tracking, and run experiments directly on datasets.

```typescript
class DatasetManager {
get(name: string, options?: { fetchItemsPageSize: number }): Promise<FetchedDataset>;
}
type FetchedDataset = Dataset & {
items: (DatasetItem & { link: LinkDatasetItemFunction })[];
runExperiment: RunExperimentOnDataset;
};
type LinkDatasetItemFunction = (
obj: { otelSpan: Span },
runName: string,
runArgs?: { description?: string; metadata?: any }
) => Promise<DatasetRunItem>;
type RunExperimentOnDataset = (
params: Omit<ExperimentParams<any, any, Record<string, any>>, "data">
) => Promise<ExperimentResult<any, any, Record<string, any>>>;
```
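A sketch of fetching a dataset and running an experiment over its items. The dataset name 'qa-pairs' and the answerQuestion task are illustrative stand-ins for your own data and model call:

```typescript
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Hypothetical model call used as the experiment task.
async function answerQuestion(question: string): Promise<string> {
  return `Answer to: ${question}`;
}

const dataset = await langfuse.dataset.get('qa-pairs');
console.log(`Fetched ${dataset.items.length} items`);

const result = await dataset.runExperiment({
  name: 'qa-baseline',
  description: 'Baseline run over the qa-pairs dataset',
  task: async ({ input }) => answerQuestion(String(input)),
  evaluators: [
    async ({ output, expectedOutput }) => ({
      name: 'exact-match',
      value: output === expectedOutput ? 1 : 0,
    }),
  ],
});

console.log(await result.format());
```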
Create and manage scores for traces and observations with automatic batching for efficient API usage.

```typescript
class ScoreManager {
create(data: ScoreBody): void;
observation(observation: { otelSpan: Span }, data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
trace(observation: { otelSpan: Span }, data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
activeObservation(data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
activeTrace(data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
flush(): Promise<void>;
shutdown(): Promise<void>;
}
```
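A sketch of the scoring methods; score names and values are illustrative, and the span-based variants assume an active OpenTelemetry span (for example from @langfuse/tracing instrumentation):

```typescript
import { trace } from '@opentelemetry/api';
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Score an arbitrary trace by id (batched, sent asynchronously).
langfuse.score.create({
  name: 'helpfulness',
  value: 0.9,
  traceId: 'trace-123',
  dataType: 'NUMERIC',
});

// Score the observation represented by the currently active OTel span, if any.
const activeSpan = trace.getActiveSpan();
if (activeSpan) {
  langfuse.score.observation(
    { otelSpan: activeSpan },
    { name: 'toxicity', value: 'low', dataType: 'CATEGORICAL' }
  );
}

// Inside instrumented code you can also score the active trace directly.
langfuse.score.activeTrace({ name: 'user-feedback', value: 1, dataType: 'BOOLEAN' });

// Scores are batched; flush before the process exits.
await langfuse.score.flush();
```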
Resolve media reference strings in objects by fetching media content and converting to base64 data URIs.

```typescript
class MediaManager {
resolveReferences<T>(params: LangfuseMediaResolveMediaReferencesParams<T>): Promise<T>;
static parseReferenceString(referenceString: string): ParsedMediaReference;
}
type LangfuseMediaResolveMediaReferencesParams<T> = {
obj: T;
resolveWith: "base64DataUri";
maxDepth?: number;
};
```
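A sketch of resolving media references; the object shape and the reference string below are illustrative:

```typescript
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Illustrative object containing a Langfuse media reference string, e.g. as found
// inside trace or observation input/output payloads.
const observationOutput = {
  answer: 'See the attached screenshot.',
  attachment: '@@@langfuseMedia:type=image/png|id=some-media-id|source=base64_data_uri@@@',
};

// Replace every media reference in the object with an inline base64 data URI.
const resolved = await langfuse.media.resolveReferences({
  obj: observationOutput,
  resolveWith: 'base64DataUri',
  maxDepth: 10,
});

console.log(resolved.attachment); // data:image/png;base64,...
```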
Run comprehensive experiments on datasets or custom data with automatic tracing, evaluation, and result formatting.

```typescript
class ExperimentManager {
run<Input, ExpectedOutput, Metadata extends Record<string, any>>(
config: ExperimentParams<Input, ExpectedOutput, Metadata>
): Promise<ExperimentResult<Input, ExpectedOutput, Metadata>>;
}
type ExperimentParams<Input, ExpectedOutput, Metadata> = {
name: string;
runName?: string;
description?: string;
metadata?: Record<string, any>;
data: ExperimentItem<Input, ExpectedOutput, Metadata>[];
task: ExperimentTask<Input, ExpectedOutput, Metadata>;
evaluators?: Evaluator<Input, ExpectedOutput, Metadata>[];
runEvaluators?: RunEvaluator<Input, ExpectedOutput, Metadata>[];
maxConcurrency?: number;
};
type ExperimentResult<Input, ExpectedOutput, Metadata> = {
runName: string;
datasetRunId?: string;
datasetRunUrl?: string;
itemResults: ExperimentItemResult<Input, ExpectedOutput, Metadata>[];
runEvaluations: Evaluation[];
format: (options?: { includeItemResults?: boolean }) => Promise<string>;
};
```
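A sketch of running an experiment on locally defined data (no dataset required), including a run-level evaluator that aggregates item-level scores; the data and evaluator names are illustrative:

```typescript
import { LangfuseClient, type Evaluator, type RunEvaluator } from '@langfuse/client';

const langfuse = new LangfuseClient();

const data = [
  { input: '2 + 2', expectedOutput: '4' },
  { input: 'capital of France', expectedOutput: 'Paris' },
];

// Item-level evaluator: one score per experiment item.
const exactMatch: Evaluator = async ({ output, expectedOutput }) => ({
  name: 'exact-match',
  value: output === expectedOutput ? 1 : 0,
});

// Run-level evaluator: aggregates over all item results.
const averageExactMatch: RunEvaluator = async ({ itemResults }) => {
  const scores = itemResults.flatMap((r) =>
    r.evaluations.filter((e) => e.name === 'exact-match').map((e) => Number(e.value))
  );
  const avg = scores.reduce((a, b) => a + b, 0) / Math.max(scores.length, 1);
  return { name: 'exact-match-avg', value: avg };
};

const result = await langfuse.experiment.run({
  name: 'toy-eval',
  data,
  task: async ({ input }) => String(input), // echo task; replace with your model call
  evaluators: [exactMatch],
  runEvaluators: [averageExactMatch],
  maxConcurrency: 5,
});

console.log(await result.format({ includeItemResults: true }));
```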
Convert AutoEvals library evaluators to Langfuse-compatible evaluators for seamless integration.

```typescript
function createEvaluatorFromAutoevals<E extends CallableFunction>(
autoevalEvaluator: E,
params?: Params<E>
): Evaluator;
```
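A sketch assuming the autoevals package is installed separately; Levenshtein is one of its exported scorers:

```typescript
import { Levenshtein } from 'autoevals';
import { LangfuseClient, createEvaluatorFromAutoevals } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Wrap an AutoEvals scorer so it can be used as a Langfuse experiment evaluator.
const levenshteinEvaluator = createEvaluatorFromAutoevals(Levenshtein);

const result = await langfuse.experiment.run({
  name: 'string-similarity',
  data: [{ input: 'Say hello', expectedOutput: 'hello' }],
  task: async () => 'hello world',
  evaluators: [levenshteinEvaluator],
});
```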
Types:

```typescript
// Dataset types
interface Dataset {
id: string;
name: string;
description?: string;
metadata?: any;
projectId: string;
createdAt: string;
updatedAt: string;
}
interface DatasetItem {
id: string;
datasetId: string;
input: any;
expectedOutput?: any;
metadata?: any;
sourceTraceId?: string;
sourceObservationId?: string;
status: string;
createdAt: string;
updatedAt: string;
}
interface DatasetRunItem {
id: string;
datasetRunId: string;
datasetRunName: string;
datasetItemId: string;
traceId: string;
observationId?: string;
createdAt: string;
updatedAt: string;
}
// Score type
interface ScoreBody {
id?: string;
name: string;
value: number | string;
traceId?: string;
observationId?: string;
sessionId?: string;
datasetRunId?: string;
comment?: string;
metadata?: any;
dataType?: 'NUMERIC' | 'CATEGORICAL' | 'BOOLEAN';
environment?: string;
}
// Chat message types
interface ChatMessage {
role: string;
content: string;
}
// Prompt types
type Prompt = Prompt.Text | Prompt.Chat;
namespace Prompt {
interface Text {
name: string;
version: number;
type: 'text';
prompt: string;
config: unknown;
labels: string[];
tags: string[];
commitMessage?: string | null;
}
interface Chat {
name: string;
version: number;
type: 'chat';
prompt: ChatMessageWithPlaceholders[];
config: unknown;
labels: string[];
tags: string[];
commitMessage?: string | null;
}
}
// Chat message types
enum ChatMessageType {
ChatMessage = "chatmessage",
Placeholder = "placeholder"
}
interface ChatMessageWithPlaceholders {
type: "chatmessage" | "placeholder";
role?: string;
content?: string;
name?: string;
}
type ChatMessageOrPlaceholder =
| ChatMessage
| { type: "placeholder"; name: string };
interface LangchainMessagesPlaceholder {
variableName: string;
optional?: boolean;
}
// Experiment types
type ExperimentItem<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> =
| {
input?: Input;
expectedOutput?: ExpectedOutput;
metadata?: Metadata;
}
| DatasetItem;
type ExperimentTaskParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> =
ExperimentItem<Input, ExpectedOutput, Metadata>;
type ExperimentTask<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
params: ExperimentTaskParams<Input, ExpectedOutput, Metadata>
) => Promise<any>;
type Evaluation = Pick<ScoreBody, "name" | "value" | "comment" | "metadata" | "dataType">;
interface EvaluatorParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> {
input: Input;
output: any;
expectedOutput?: ExpectedOutput;
metadata?: Metadata;
}
type Evaluator<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
params: EvaluatorParams<Input, ExpectedOutput, Metadata>
) => Promise<Evaluation[] | Evaluation>;
interface RunEvaluatorParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> {
itemResults: ExperimentItemResult<Input, ExpectedOutput, Metadata>[];
}
type RunEvaluator<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
params: RunEvaluatorParams<Input, ExpectedOutput, Metadata>
) => Promise<Evaluation[] | Evaluation>;
type ExperimentItemResult<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = {
item: ExperimentItem<Input, ExpectedOutput, Metadata>;
output: any;
evaluations: Evaluation[];
traceId?: string;
datasetRunId?: string;
};
// Media types
interface ParsedMediaReference {
mediaId: string;
source: string;
contentType: string;
}
```

The client supports configuration via environment variables:
- LANGFUSE_PUBLIC_KEY: Public API key
- LANGFUSE_SECRET_KEY: Secret API key
- LANGFUSE_BASE_URL (or LANGFUSE_BASEURL): Langfuse instance URL
- LANGFUSE_TIMEOUT: Request timeout in seconds
- LANGFUSE_FLUSH_AT: Number of scores to batch before flushing (default: 10)
- LANGFUSE_FLUSH_INTERVAL: Flush interval in seconds (default: 1)
- LANGFUSE_TRACING_ENVIRONMENT: Default environment tag for traces

This package integrates with OpenTelemetry for distributed tracing. Score methods accept OpenTelemetry Span objects to automatically link scores to traces and observations. The experiment framework uses OpenTelemetry for automatic tracing of task executions.

```typescript
import { Span } from '@opentelemetry/api';
// Link dataset item to a span
await datasetItem.link({ otelSpan: span }, 'experiment-run-1');
// Score an observation using its span
langfuse.score.observation({ otelSpan: span }, {
name: 'accuracy',
value: 0.95
});
```

Methods that fetch data (like prompt.get(), dataset.get()) support fallback mechanisms:

```typescript
// Prompt with fallback
const prompt = await langfuse.prompt.get('my-prompt', {
type: 'text',
fallback: 'Default prompt text: {{variable}}'
});
// If fetch fails, fallback content is used
// prompt.isFallback will be true
```
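The same mechanism works for chat prompts, where the fallback is an array of chat messages. A sketch:

```typescript
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Chat prompt with a chat-message fallback (used if the API fetch fails).
const chatPrompt = await langfuse.prompt.get('support-agent', {
  type: 'chat',
  fallback: [
    { role: 'system', content: 'You are a helpful support agent for {{product}}.' },
  ],
});

const messages = chatPrompt.compile({ product: 'Langfuse' });
```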
Experiment evaluators handle failures gracefully; failed evaluators are logged but don't stop the experiment:

```typescript
const result = await langfuse.experiment.run({
name: 'Test',
data: items,
task: myTask,
evaluators: [
goodEvaluator, // Works fine
brokenEvaluator, // Fails but logged
anotherEvaluator // Still runs
]
});
// result.itemResults contains evaluations from successful evaluators
```

Always flush pending data before application exit:

```typescript
// Option 1: Manual flush
await langfuse.flush();
// Option 2: Graceful shutdown (flushes all managers)
await langfuse.shutdown();
```

Scores are batched automatically but can be flushed manually:

```typescript
langfuse.score.create({ name: 'quality', value: 0.8, traceId: 'abc' });
langfuse.score.create({ name: 'latency', value: 120, traceId: 'abc' });
// Force immediate send
await langfuse.score.flush();
```

The package maintains v3 compatibility with deprecated methods:
- getPrompt() → Use prompt.get()
- createPrompt() → Use prompt.create()
- updatePrompt() → Use prompt.update()
- getDataset() → Use dataset.get()
- fetchTrace() → Use api.trace.get()
- fetchTraces() → Use api.trace.list()
- fetchObservation() → Use api.observations.get()
- fetchObservations() → Use api.observations.getMany()
- fetchSessions() → Use api.sessions.get()
- getDatasetRun() → Use api.datasets.getRun()
- getDatasetRuns() → Use api.datasets.getRuns()
- createDataset() → Use api.datasets.create()
- getDatasetItem() → Use api.datasetItems.get()
- createDatasetItem() → Use api.datasetItems.create()
- fetchMedia() → Use api.media.get()
- resolveMediaReferences() → Use media.resolveReferences()

All deprecated methods are maintained for backward compatibility, but the new manager-based API is recommended for new code.
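For example, a v3-style call next to its manager-based equivalent:

```typescript
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// v3-style (deprecated, still works)
const legacyPrompt = await langfuse.getPrompt('my-prompt');
const legacyDataset = await langfuse.getDataset('my-dataset');

// v4 manager-based equivalents (recommended)
const prompt = await langfuse.prompt.get('my-prompt');
const dataset = await langfuse.dataset.get('my-dataset');
```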