
# Observation Types

Specialized span types for different AI application components, each optimized for specific use cases with appropriate metadata and visualization in the Langfuse UI.

## Capabilities

### Base Observation Wrapper

All observation types inherit common functionality from the base wrapper class.

```python { .api }
class LangfuseObservationWrapper:
    def end(self, *, end_time: int = None) -> "LangfuseObservationWrapper":
        """End the observation."""

    def update(self, *, name: str = None, input: Any = None, output: Any = None,
               metadata: Any = None, level: SpanLevel = None,
               status_message: str = None, **kwargs) -> "LangfuseObservationWrapper":
        """Update observation attributes."""

    def update_trace(self, *, name: str = None, user_id: str = None,
                     session_id: str = None, tags: List[str] = None,
                     **kwargs) -> "LangfuseObservationWrapper":
        """Update trace-level attributes."""

    def score(self, *, name: str, value: Union[float, str],
              data_type: ScoreDataType = None, comment: str = None) -> None:
        """Add score to this observation."""

    def score_trace(self, *, name: str, value: Union[float, str],
                    data_type: ScoreDataType = None, comment: str = None) -> None:
        """Add score to the entire trace."""

    # Attributes
    trace_id: str
    id: str
```
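As a sketch of how these shared methods compose in one handler (assumes the v3 `get_client` entry point; `handle_request` and its payload are hypothetical names, not part of the SDK):

```python
def handle_request(payload: dict) -> dict:
    """Hypothetical handler exercising the shared base-wrapper methods."""
    from langfuse import get_client  # deferred so the sketch imports without the SDK on path

    langfuse = get_client()
    with langfuse.start_as_current_observation(name="handle-request", as_type="span") as span:
        result = {"status": "ok", "echo": payload}
        # Attach output/metadata, then score both the observation and the whole trace
        span.update(output=result, metadata={"stage": "demo"})
        span.score(name="confidence", value=0.9, comment="demo score")
        span.score_trace(name="overall-quality", value="good", data_type="CATEGORICAL")
    return result
```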

### LangfuseSpan

General-purpose span for tracing any operation. Use when no other specialized type fits your use case.

```python { .api }
class LangfuseSpan(LangfuseObservationWrapper):
    def start_span(self, name: str, *, input: Any = None, output: Any = None,
                   metadata: Any = None, **kwargs) -> "LangfuseSpan":
        """Create child span."""

    def start_as_current_span(self, *, name: str, **kwargs) -> ContextManager["LangfuseSpan"]:
        """Create child span as context manager (deprecated)."""

    def start_generation(self, *, name: str, **kwargs) -> "LangfuseGeneration":
        """Create child generation (deprecated)."""

    def create_event(self, *, name: str, **kwargs) -> "LangfuseEvent":
        """Create event observation."""
```

**Usage Example:**

```python
# General operations
with langfuse.start_as_current_observation(name="data-processing", as_type="span") as span:
    result = process_data()
    span.update(output=result)
```

### LangfuseGeneration

Specialized span for AI model generation operations with support for model metrics, token usage, and cost tracking.

```python { .api }
class LangfuseGeneration(LangfuseObservationWrapper):
    def update(self, *, completion_start_time: datetime = None, model: str = None,
               model_parameters: Dict[str, Any] = None, usage_details: Dict[str, int] = None,
               cost_details: Dict[str, float] = None, prompt: PromptClient = None,
               **kwargs) -> "LangfuseGeneration":
        """Update generation with model-specific attributes.

        Args:
            completion_start_time: When the model started generating the response
            model: Model name/identifier (e.g., "gpt-4", "claude-3")
            model_parameters: Model parameters (temperature, max_tokens, etc.)
            usage_details: Token usage (prompt_tokens, completion_tokens, etc.)
            cost_details: Cost breakdown (input_cost, output_cost, total_cost)
            prompt: Associated prompt template
        """
```

**Usage Example:**

```python
@observe(as_type="generation")
def call_openai(prompt):
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )

    # Automatically captured by the decorator, or update manually:
    # span.update(
    #     model="gpt-4",
    #     model_parameters={"temperature": 0.7},
    #     usage_details={
    #         "prompt_tokens": response.usage.prompt_tokens,
    #         "completion_tokens": response.usage.completion_tokens,
    #     },
    # )

    return response.choices[0].message.content
```
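Cost tracking usually means deriving `cost_details` from the token counts in `usage_details`. A minimal sketch (the per-token prices here are illustrative placeholders, not real model pricing):

```python
# Illustrative per-1K-token prices; look up current pricing for your model.
PRICES_PER_1K = {"gpt-4": {"input": 0.03, "output": 0.06}}

def compute_cost_details(model: str, usage_details: dict) -> dict:
    """Build a cost_details dict in the shape LangfuseGeneration.update() expects."""
    prices = PRICES_PER_1K[model]
    input_cost = usage_details["prompt_tokens"] / 1000 * prices["input"]
    output_cost = usage_details["completion_tokens"] / 1000 * prices["output"]
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }
```

The resulting dict can be passed directly as `cost_details=compute_cost_details(...)` in the manual-update path shown above.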

### LangfuseEvent

Point-in-time event observations for discrete occurrences. Events are ended automatically and cannot be updated.

```python { .api }
class LangfuseEvent(LangfuseObservationWrapper):
    def update(self, **kwargs) -> "LangfuseEvent":
        """Update is not allowed for events. Logs a warning and returns self."""
```

**Usage Example:**

```python
# Log discrete events
event = langfuse.create_event(
    name="user-login",
    input={"user_id": "123", "method": "oauth"},
    metadata={"ip": "192.168.1.1"},
)
```

### LangfuseAgent

Observation for agent reasoning blocks that act on tools using LLM guidance. Use for autonomous agents and AI assistants.

```python { .api }
class LangfuseAgent(LangfuseObservationWrapper):
    """Agent observation for reasoning blocks using LLM guidance."""
```

**Usage Example:**

```python
@observe(as_type="agent")
def autonomous_agent(task):
    # Agent reasoning with tool usage
    plan = create_plan(task)

    results = []
    for step in plan:
        with langfuse.start_as_current_observation(name="tool-call", as_type="tool") as tool:
            result = execute_tool(step)
            tool.update(output=result)
            results.append(result)

    return results
```

### LangfuseTool

Observation for external tool calls such as API requests, database queries, or file operations.

```python { .api }
class LangfuseTool(LangfuseObservationWrapper):
    """Tool observation for external tool calls."""
```

**Usage Example:**

```python
@observe(as_type="tool")
def call_weather_api(location):
    response = requests.get(f"https://api.weather.com/v1/{location}")
    return response.json()

@observe(as_type="tool")
def database_query(query):
    with database.connection() as conn:
        result = conn.execute(query)
        return result.fetchall()
```

### LangfuseChain

Observation for connecting LLM application steps, representing workflows or pipelines that pass context between stages.

```python { .api }
class LangfuseChain(LangfuseObservationWrapper):
    """Chain observation for connecting application steps."""
```

**Usage Example:**

```python
@observe(as_type="chain")
def rag_pipeline(question):
    # Multi-step RAG chain
    with langfuse.start_as_current_observation(name="retrieve", as_type="retriever") as retriever:
        documents = vector_search(question)
        retriever.update(output=documents)

    with langfuse.start_as_current_observation(name="generate", as_type="generation") as gen:
        context = format_context(documents)
        answer = llm.generate(f"Context: {context}\nQuestion: {question}")
        gen.update(output=answer)

    return answer
```

### LangfuseRetriever

Observation for data retrieval operations such as vector database searches, document lookups, or knowledge base queries.

```python { .api }
class LangfuseRetriever(LangfuseObservationWrapper):
    """Retriever observation for data retrieval operations."""
```

**Usage Example:**

```python
@observe(as_type="retriever")
def vector_search(query, top_k=5):
    embedding = embed_query(query)
    results = vector_db.search(embedding, top_k=top_k)
    return [{"content": r.content, "score": r.score} for r in results]

@observe(as_type="retriever")
def knowledge_lookup(entity):
    return knowledge_graph.get_facts(entity)
```

### LangfuseEmbedding

Specialized observation for embedding generation operations, with the same model-metric support as generation observations.

```python { .api }
class LangfuseEmbedding(LangfuseObservationWrapper):
    """Embedding observation for embedding generation operations."""
```

**Usage Example:**

```python
@observe(as_type="embedding")
def generate_embeddings(texts):
    response = openai.embeddings.create(
        model="text-embedding-ada-002",
        input=texts,
    )
    return [item.embedding for item in response.data]
```

### LangfuseEvaluator

Observation for evaluation and assessment operations that measure quality, correctness, or other metrics.

```python { .api }
class LangfuseEvaluator(LangfuseObservationWrapper):
    """Evaluator observation for assessment operations."""
```

**Usage Example:**

```python
@observe(as_type="evaluator")
def relevance_evaluator(query, response):
    # Evaluate response relevance
    relevance_score = calculate_relevance(query, response)
    return {"relevance": relevance_score, "threshold": 0.8}

@observe(as_type="evaluator")
def toxicity_checker(text):
    toxicity_score = toxicity_model.predict(text)
    return {"is_toxic": toxicity_score > 0.7, "score": toxicity_score}
```

### LangfuseGuardrail

Observation for safety and security checks such as jailbreak prevention, content filtering, or policy enforcement.

```python { .api }
class LangfuseGuardrail(LangfuseObservationWrapper):
    """Guardrail observation for safety/security checks."""
```

**Usage Example:**

```python
@observe(as_type="guardrail")
def content_filter(user_input):
    # Check for inappropriate content
    if contains_inappropriate_content(user_input):
        return {"allowed": False, "reason": "inappropriate_content"}
    return {"allowed": True}

@observe(as_type="guardrail")
def jailbreak_detector(prompt):
    jailbreak_score = jailbreak_model.predict(prompt)
    return {
        "is_jailbreak": jailbreak_score > 0.8,
        "score": jailbreak_score,
        "blocked": jailbreak_score > 0.8,
    }
```

## Common Patterns

### Nested Observations

```python
@observe(as_type="chain")
def complete_workflow():
    with langfuse.start_as_current_observation(name="safety-check", as_type="guardrail") as guard:
        safety_result = check_safety()
        guard.update(output=safety_result)

    if safety_result["allowed"]:
        with langfuse.start_as_current_observation(name="retrieve", as_type="retriever") as ret:
            context = retrieve_context()
            ret.update(output=context)

        with langfuse.start_as_current_observation(name="generate", as_type="generation") as gen:
            response = generate_response(context)
            gen.update(output=response)

        return response
    else:
        return "Request blocked by safety filter"
```

### Error Handling

```python
@observe(as_type="tool")
def external_api_call():
    try:
        result = make_api_request()
        return result
    except APIError:
        # The error is captured automatically with ERROR level
        raise
```

### Observation Type Selection Guide

- **span**: General operations, business logic
- **generation**: LLM/model calls (text, code, completions)
- **embedding**: Embedding generation operations
- **agent**: Autonomous agents, AI assistants
- **tool**: External APIs, databases, file operations
- **chain**: Workflows, pipelines, multi-step processes
- **retriever**: Search, lookup, data retrieval
- **evaluator**: Quality assessment, scoring, validation
- **guardrail**: Safety checks, content filtering, policy enforcement
- **event**: Discrete occurrences, logging, milestones
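The guide above can be encoded as a lookup so a codebase picks `as_type` values consistently. A hypothetical helper (the use-case keys are made up for illustration; only the `as_type` strings come from this document):

```python
# Hypothetical helper, not part of the Langfuse SDK: map use-case labels
# to the as_type strings accepted by @observe / start_as_current_observation.
AS_TYPE_GUIDE = {
    "business-logic": "span",
    "llm-call": "generation",
    "embedding": "embedding",
    "autonomous-agent": "agent",
    "external-api": "tool",
    "pipeline": "chain",
    "search": "retriever",
    "quality-check": "evaluator",
    "safety-check": "guardrail",
    "milestone": "event",
}

def as_type_for(use_case: str) -> str:
    """Return the suggested as_type, falling back to the general-purpose span."""
    return AS_TYPE_GUIDE.get(use_case, "span")
```

A caller would then write, for example, `@observe(as_type=as_type_for("llm-call"))`.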