tessl/npm-langfuse--client

Langfuse API client for universal JavaScript environments providing observability, prompt management, datasets, experiments, and scoring capabilities

Workspace: tessl
Visibility: Public
Describes: npm package pkg:npm/@langfuse/client@4.2.x

To install, run:

npx @tessl/cli install tessl/npm-langfuse--client@4.2.0

Langfuse Client

Langfuse Client (@langfuse/client) is a comprehensive API client for Langfuse, an observability platform for LLM applications. It provides core abstractions for prompt management, dataset operations, experiment execution, scoring, and media handling. The package is designed for universal JavaScript environments including browsers, Node.js, Edge Functions, and other JavaScript runtimes.

Package Information

  • Package Name: @langfuse/client
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install @langfuse/client @langfuse/tracing @opentelemetry/api
  • Version: 4.2.0
  • Dependencies:
    • @langfuse/core (workspace)
    • @langfuse/tracing (workspace)
    • @opentelemetry/api (peer dependency)
    • mustache

Core Imports

import { LangfuseClient } from '@langfuse/client';

CommonJS:

const { LangfuseClient } = require('@langfuse/client');

Specific imports:

import {
  LangfuseClient,
  TextPromptClient,
  ChatPromptClient,
  createEvaluatorFromAutoevals,
  // Type imports
  type ExperimentParams,
  type ExperimentResult,
  type ExperimentTask,
  type ExperimentItem,
  type ExperimentItemResult,
  type ExperimentTaskParams,
  type Evaluator,
  type EvaluatorParams,
  type Evaluation,
  type RunEvaluator,
  type RunEvaluatorParams,
  type FetchedDataset,
  type RunExperimentOnDataset,
  type LinkDatasetItemFunction,
  type ChatMessageOrPlaceholder,
  type ChatMessageWithPlaceholders,
  type LangchainMessagesPlaceholder,
  type ChatMessageType
} from '@langfuse/client';

Basic Usage

import { LangfuseClient } from '@langfuse/client';

// Initialize client with credentials
const langfuse = new LangfuseClient({
  publicKey: 'pk_...',
  secretKey: 'sk_...',
  baseUrl: 'https://cloud.langfuse.com' // optional, default shown
});

// Or rely on environment variables instead
// (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL):
// const langfuse = new LangfuseClient();

// Fetch and compile a prompt
const prompt = await langfuse.prompt.get('my-prompt');
const compiled = prompt.compile({ variable: 'value' });

// Get a dataset and run an experiment
const dataset = await langfuse.dataset.get('my-dataset');
const result = await dataset.runExperiment({
  name: 'Model Evaluation',
  task: async ({ input }) => myModel.generate(input),
  evaluators: [myEvaluator]
});

// Create scores
langfuse.score.create({
  name: 'quality',
  value: 0.85,
  traceId: 'trace-123'
});

// Flush pending data before exit
await langfuse.flush();

Architecture

Langfuse Client is built around several key components:

  • LangfuseClient: Main entry point providing access to all managers and API client
  • Managers: Specialized managers for different capabilities (PromptManager, DatasetManager, etc.)
  • Prompt Clients: Type-specific clients for text and chat prompts with compilation support
  • Experiment Framework: Comprehensive system for running experiments with evaluation
  • Batching & Caching: Automatic batching for scores and intelligent caching for prompts
  • OpenTelemetry Integration: Built-in support for distributed tracing via OTel spans

Capabilities

Client Initialization

The main LangfuseClient class provides centralized access to all Langfuse functionality and direct API access for advanced use cases.

class LangfuseClient {
  constructor(params?: LangfuseClientParams);

  // Manager access
  readonly api: LangfuseAPIClient;
  readonly prompt: PromptManager;
  readonly dataset: DatasetManager;
  readonly score: ScoreManager;
  readonly media: MediaManager;
  readonly experiment: ExperimentManager;

  // Utility methods
  flush(): Promise<void>;
  shutdown(): Promise<void>;
  getTraceUrl(traceId: string): Promise<string>;
}

interface LangfuseClientParams {
  publicKey?: string;
  secretKey?: string;
  baseUrl?: string;
  timeout?: number;
  additionalHeaders?: Record<string, string>;
}
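
Beyond the defaults, the constructor accepts a timeout and additional headers. A minimal sketch (the base URL and header are hypothetical; the timeout is assumed to be in seconds, matching LANGFUSE_TIMEOUT):

import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY,
  secretKey: process.env.LANGFUSE_SECRET_KEY,
  baseUrl: 'https://langfuse.internal.example.com', // hypothetical self-hosted instance
  timeout: 10, // assumed seconds, like LANGFUSE_TIMEOUT
  additionalHeaders: { 'X-Proxy-Token': 'example-token' } // arbitrary example header
});

// Build a link to a trace in the Langfuse UI
const traceUrl = await langfuse.getTraceUrl('trace-123');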

Prompt Management

Fetch, create, and manage prompts with built-in caching, version control, and variable substitution. Supports both text and chat prompts with LangChain compatibility.

class PromptManager {
  get(
    name: string,
    options?: {
      version?: number;
      label?: string;
      cacheTtlSeconds?: number;
      fallback?: string | ChatMessage[];
      maxRetries?: number;
      type?: "chat" | "text";
      fetchTimeoutMs?: number;
    }
  ): Promise<TextPromptClient | ChatPromptClient>;
  create(body: CreatePromptRequest): Promise<TextPromptClient | ChatPromptClient>;
  update(params: { name: string; version: number; newLabels: string[] }): Promise<Prompt>;
}

class TextPromptClient {
  readonly name: string;
  readonly version: number;
  readonly prompt: string;
  readonly config: unknown;
  readonly labels: string[];
  readonly tags: string[];
  readonly isFallback: boolean;

  compile(variables?: Record<string, string>): string;
  getLangchainPrompt(): string;
  toJSON(): string;
}

class ChatPromptClient {
  readonly name: string;
  readonly version: number;
  readonly prompt: ChatMessageWithPlaceholders[];
  readonly config: unknown;
  readonly labels: string[];
  readonly tags: string[];
  readonly isFallback: boolean;

  compile(
    variables?: Record<string, string>,
    placeholders?: Record<string, any>
  ): (ChatMessageOrPlaceholder | any)[];
  getLangchainPrompt(options?: { placeholders?: Record<string, any> }): (ChatMessage | LangchainMessagesPlaceholder | any)[];
  toJSON(): string;
}
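
A usage sketch based on the signatures above (the prompt name, label, variables, and placeholder key are hypothetical):

import { LangfuseClient, ChatPromptClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Fetch the production-labeled version, caching it for 60 seconds
const chatPrompt = (await langfuse.prompt.get('support-chat', {
  type: 'chat',
  label: 'production',
  cacheTtlSeconds: 60
})) as ChatPromptClient; // narrow the TextPromptClient | ChatPromptClient union

// Substitute {{variables}} and expand a placeholder message list
const messages = chatPrompt.compile(
  { product: 'Acme' },
  { history: [{ role: 'user', content: 'Hello!' }] }
);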

Dataset Operations

Retrieve datasets with all items, link dataset items to traces for experiment tracking, and run experiments directly on datasets.

class DatasetManager {
  get(name: string, options?: { fetchItemsPageSize: number }): Promise<FetchedDataset>;
}

type FetchedDataset = Dataset & {
  items: (DatasetItem & { link: LinkDatasetItemFunction })[];
  runExperiment: RunExperimentOnDataset;
};

type LinkDatasetItemFunction = (
  obj: { otelSpan: Span },
  runName: string,
  runArgs?: { description?: string; metadata?: any }
) => Promise<DatasetRunItem>;

type RunExperimentOnDataset = (
  params: Omit<ExperimentParams<any, any, Record<string, any>>, "data">
) => Promise<ExperimentResult<any, any, Record<string, any>>>;
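
A sketch combining these pieces (the dataset name and page size are hypothetical):

import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Fetch a dataset along with all of its items
const dataset = await langfuse.dataset.get('qa-eval-set', { fetchItemsPageSize: 50 });

for (const item of dataset.items) {
  console.log(item.id, item.input, item.expectedOutput);
  // item.link({ otelSpan }, runName) ties this item to a traced run
}

// Or run an experiment over the whole dataset (see Experiment Execution)
const result = await dataset.runExperiment({
  name: 'Baseline',
  task: async ({ input }) => String(input) // stand-in task
});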

Score Management

Create and manage scores for traces and observations with automatic batching for efficient API usage.

class ScoreManager {
  create(data: ScoreBody): void;
  observation(observation: { otelSpan: Span }, data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
  trace(observation: { otelSpan: Span }, data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
  activeObservation(data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
  activeTrace(data: Omit<ScoreBody, "traceId" | "sessionId" | "observationId" | "datasetRunId">): void;
  flush(): Promise<void>;
  shutdown(): Promise<void>;
}
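
A sketch of scoring from instrumented code, assuming an active OpenTelemetry context has been set up elsewhere:

import { trace } from '@opentelemetry/api';
import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Score the trace in the active OTel context
langfuse.score.activeTrace({ name: 'user-feedback', value: 1, dataType: 'BOOLEAN' });

// Or score a specific span you hold a reference to
const span = trace.getActiveSpan();
if (span) {
  langfuse.score.observation({ otelSpan: span }, { name: 'accuracy', value: 0.95 });
}

// Scores are batched; flush before the process exits
await langfuse.score.flush();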

Media Reference Resolution

Resolve media reference strings in objects by fetching media content and converting to base64 data URIs.

class MediaManager {
  resolveReferences<T>(params: LangfuseMediaResolveMediaReferencesParams<T>): Promise<T>;
  static parseReferenceString(referenceString: string): ParsedMediaReference;
}

type LangfuseMediaResolveMediaReferencesParams<T> = {
  obj: T;
  resolveWith: "base64DataUri";
  maxDepth?: number;
};
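
A sketch under the assumption that media reference strings arrive embedded in fetched objects; the object shape and token below are illustrative:

import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// An object fetched from the API that embeds a media reference string
const traceOutput = {
  attachment: '@@@langfuseMedia:type=image/png|id=media-123|source=bytes@@@' // illustrative token
};

// Walk the object (up to maxDepth) and inline media as base64 data URIs
const resolved = await langfuse.media.resolveReferences({
  obj: traceOutput,
  resolveWith: 'base64DataUri',
  maxDepth: 5
});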

Experiment Execution

Run comprehensive experiments on datasets or custom data with automatic tracing, evaluation, and result formatting.

class ExperimentManager {
  run<Input, ExpectedOutput, Metadata extends Record<string, any>>(
    config: ExperimentParams<Input, ExpectedOutput, Metadata>
  ): Promise<ExperimentResult<Input, ExpectedOutput, Metadata>>;
}

type ExperimentParams<Input, ExpectedOutput, Metadata> = {
  name: string;
  runName?: string;
  description?: string;
  metadata?: Record<string, any>;
  data: ExperimentItem<Input, ExpectedOutput, Metadata>[];
  task: ExperimentTask<Input, ExpectedOutput, Metadata>;
  evaluators?: Evaluator<Input, ExpectedOutput, Metadata>[];
  runEvaluators?: RunEvaluator<Input, ExpectedOutput, Metadata>[];
  maxConcurrency?: number;
};

type ExperimentResult<Input, ExpectedOutput, Metadata> = {
  runName: string;
  datasetRunId?: string;
  datasetRunUrl?: string;
  itemResults: ExperimentItemResult<Input, ExpectedOutput, Metadata>[];
  runEvaluations: Evaluation[];
  format: (options?: { includeItemResults?: boolean }) => Promise<string>;
};
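
An end-to-end sketch with inline data; the experiment name, task, and evaluator are stand-ins:

import { LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

const result = await langfuse.experiment.run({
  name: 'Capital Cities',
  data: [
    { input: 'France', expectedOutput: 'Paris' },
    { input: 'Japan', expectedOutput: 'Tokyo' }
  ],
  // Stand-in task; call your model here
  task: async ({ input }) => `Capital of ${input}`,
  evaluators: [
    async ({ output, expectedOutput }) => ({
      name: 'exact-match',
      value: output === expectedOutput ? 1 : 0
    })
  ],
  maxConcurrency: 5
});

// Human-readable summary of the run
console.log(await result.format());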

AutoEvals Integration

Convert AutoEvals library evaluators to Langfuse-compatible evaluators for seamless integration.

function createEvaluatorFromAutoevals<E extends CallableFunction>(
  autoevalEvaluator: E,
  params?: Params<E>
): Evaluator;
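
For instance, wrapping the Factuality scorer from the autoevals package (the experiment setup is a stand-in):

import { Factuality } from 'autoevals';
import { createEvaluatorFromAutoevals, LangfuseClient } from '@langfuse/client';

const langfuse = new LangfuseClient();

// Wrap an AutoEvals scorer as a Langfuse-compatible evaluator
const factuality = createEvaluatorFromAutoevals(Factuality);

const result = await langfuse.experiment.run({
  name: 'Factuality Check',
  data: [{ input: 'Who wrote Hamlet?', expectedOutput: 'William Shakespeare' }],
  task: async ({ input }) => `Answering: ${input}`, // stand-in task
  evaluators: [factuality]
});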

Core Types

Common Types from @langfuse/core

// Dataset types
interface Dataset {
  id: string;
  name: string;
  description?: string;
  metadata?: any;
  projectId: string;
  createdAt: string;
  updatedAt: string;
}

interface DatasetItem {
  id: string;
  datasetId: string;
  input: any;
  expectedOutput?: any;
  metadata?: any;
  sourceTraceId?: string;
  sourceObservationId?: string;
  status: string;
  createdAt: string;
  updatedAt: string;
}

interface DatasetRunItem {
  id: string;
  datasetRunId: string;
  datasetRunName: string;
  datasetItemId: string;
  traceId: string;
  observationId?: string;
  createdAt: string;
  updatedAt: string;
}

// Score type
interface ScoreBody {
  id?: string;
  name: string;
  value: number | string;
  traceId?: string;
  observationId?: string;
  sessionId?: string;
  datasetRunId?: string;
  comment?: string;
  metadata?: any;
  dataType?: 'NUMERIC' | 'CATEGORICAL' | 'BOOLEAN';
  environment?: string;
}

// Chat message types
interface ChatMessage {
  role: string;
  content: string;
}

// Prompt types
type Prompt = Prompt.Text | Prompt.Chat;

namespace Prompt {
  export interface Text {
    name: string;
    version: number;
    type: 'text';
    prompt: string;
    config: unknown;
    labels: string[];
    tags: string[];
    commitMessage?: string | null;
  }

  export interface Chat {
    name: string;
    version: number;
    type: 'chat';
    prompt: ChatMessageWithPlaceholders[];
    config: unknown;
    labels: string[];
    tags: string[];
    commitMessage?: string | null;
  }
}

// Chat message types
enum ChatMessageType {
  ChatMessage = "chatmessage",
  Placeholder = "placeholder"
}

interface ChatMessageWithPlaceholders {
  type: "chatmessage" | "placeholder";
  role?: string;
  content?: string;
  name?: string;
}

type ChatMessageOrPlaceholder =
  | ChatMessage
  | { type: "placeholder"; name: string };

interface LangchainMessagesPlaceholder {
  variableName: string;
  optional?: boolean;
}

// Experiment types
type ExperimentItem<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> =
  | {
      input?: Input;
      expectedOutput?: ExpectedOutput;
      metadata?: Metadata;
    }
  | DatasetItem;

type ExperimentTaskParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> =
  ExperimentItem<Input, ExpectedOutput, Metadata>;

type ExperimentTask<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
  params: ExperimentTaskParams<Input, ExpectedOutput, Metadata>
) => Promise<any>;

type Evaluation = Pick<ScoreBody, "name" | "value" | "comment" | "metadata" | "dataType">;

interface EvaluatorParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> {
  input: Input;
  output: any;
  expectedOutput?: ExpectedOutput;
  metadata?: Metadata;
}

type Evaluator<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
  params: EvaluatorParams<Input, ExpectedOutput, Metadata>
) => Promise<Evaluation[] | Evaluation>;

interface RunEvaluatorParams<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> {
  itemResults: ExperimentItemResult<Input, ExpectedOutput, Metadata>[];
}

type RunEvaluator<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = (
  params: RunEvaluatorParams<Input, ExpectedOutput, Metadata>
) => Promise<Evaluation[] | Evaluation>;

type ExperimentItemResult<Input = any, ExpectedOutput = any, Metadata extends Record<string, any> = Record<string, any>> = {
  item: ExperimentItem<Input, ExpectedOutput, Metadata>;
  output: any;
  evaluations: Evaluation[];
  traceId?: string;
  datasetRunId?: string;
};

// Media types
interface ParsedMediaReference {
  mediaId: string;
  source: string;
  contentType: string;
}
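
These experiment types compose naturally. A sketch of a custom run-level evaluator that aggregates item scores (the score names are illustrative):

import type { RunEvaluator } from '@langfuse/client';

// Average the item-level "exact-match" evaluations across the run
const averageExactMatch: RunEvaluator = async ({ itemResults }) => {
  const values = itemResults
    .flatMap((r) => r.evaluations)
    .filter((e) => e.name === 'exact-match' && typeof e.value === 'number')
    .map((e) => e.value as number);

  return {
    name: 'avg-exact-match',
    value: values.length > 0
      ? values.reduce((sum, v) => sum + v, 0) / values.length
      : 0
  };
};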

Environment Variables

The client supports configuration via environment variables:

  • LANGFUSE_PUBLIC_KEY: Public API key
  • LANGFUSE_SECRET_KEY: Secret API key
  • LANGFUSE_BASE_URL (or LANGFUSE_BASEURL): Langfuse instance URL
  • LANGFUSE_TIMEOUT: Request timeout in seconds
  • LANGFUSE_FLUSH_AT: Number of scores to batch before flushing (default: 10)
  • LANGFUSE_FLUSH_INTERVAL: Flush interval in seconds (default: 1)
  • LANGFUSE_TRACING_ENVIRONMENT: Default environment tag for traces

OpenTelemetry Integration

This package integrates with OpenTelemetry for distributed tracing. Score methods accept OpenTelemetry Span objects to automatically link scores to traces and observations. The experiment framework uses OpenTelemetry for automatic tracing of task executions.

import type { Span } from '@opentelemetry/api';

// Link dataset item to a span
await datasetItem.link({ otelSpan: span }, 'experiment-run-1');

// Score an observation using its span
langfuse.score.observation({ otelSpan: span }, {
  name: 'accuracy',
  value: 0.95
});

Error Handling

prompt.get() supports a fallback mechanism so applications keep working when the fetch fails:

// Prompt with fallback
const prompt = await langfuse.prompt.get('my-prompt', {
  type: 'text',
  fallback: 'Default prompt text: {{variable}}'
});

// If fetch fails, fallback content is used
// prompt.isFallback will be true

Experiment evaluators fail gracefully: a failing evaluator is logged and skipped, and the remaining evaluators still run:

const result = await langfuse.experiment.run({
  name: 'Test',
  data: items,
  task: myTask,
  evaluators: [
    goodEvaluator,      // Works fine
    brokenEvaluator,    // Fails but logged
    anotherEvaluator    // Still runs
  ]
});
// result.itemResults contains evaluations from successful evaluators

Lifecycle Management

Always flush pending data before application exit:

// Option 1: Manual flush
await langfuse.flush();

// Option 2: Graceful shutdown (flushes all managers)
await langfuse.shutdown();

Scores are batched automatically but can be flushed manually:

langfuse.score.create({ name: 'quality', value: 0.8, traceId: 'abc' });
langfuse.score.create({ name: 'latency', value: 120, traceId: 'abc' });

// Force immediate send
await langfuse.score.flush();

Deprecated APIs

For backward compatibility with v3, the package retains deprecated methods:

  • getPrompt() → Use prompt.get()
  • createPrompt() → Use prompt.create()
  • updatePrompt() → Use prompt.update()
  • getDataset() → Use dataset.get()
  • fetchTrace() → Use api.trace.get()
  • fetchTraces() → Use api.trace.list()
  • fetchObservation() → Use api.observations.get()
  • fetchObservations() → Use api.observations.getMany()
  • fetchSessions() → Use api.sessions.get()
  • getDatasetRun() → Use api.datasets.getRun()
  • getDatasetRuns() → Use api.datasets.getRuns()
  • createDataset() → Use api.datasets.create()
  • getDatasetItem() → Use api.datasetItems.get()
  • createDatasetItem() → Use api.datasetItems.create()
  • fetchMedia() → Use api.media.get()
  • resolveMediaReferences() → Use media.resolveReferences()

All deprecated methods are kept for backward compatibility, but the manager-based API is recommended for new code.
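
For example, migrating a v3 prompt fetch:

// v3 (deprecated, still works)
// const prompt = await langfuse.getPrompt('my-prompt');

// v4 manager-based equivalent
const prompt = await langfuse.prompt.get('my-prompt');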