
# Metrics and Monitoring


Advanced metrics collection system providing comprehensive observability for AWS Bedrock AI model invocations. Includes token usage tracking, request duration monitoring, error analytics, guardrail interaction metrics, and prompt caching statistics.


## Capabilities


### Metrics Parameters Container


Central container for all metrics instruments and runtime state, managing the complete lifecycle of metrics collection for Bedrock operations.


```python { .api }
class MetricParams:
    """
    Container for metrics configuration and runtime state.

    Manages all metrics instruments and maintains request-specific
    state throughout the lifecycle of Bedrock API calls.
    """

    def __init__(
        self,
        token_histogram: Histogram,
        choice_counter: Counter,
        duration_histogram: Histogram,
        exception_counter: Counter,
        guardrail_activation: Counter,
        guardrail_latency_histogram: Histogram,
        guardrail_coverage: Counter,
        guardrail_sensitive_info: Counter,
        guardrail_topic: Counter,
        guardrail_content: Counter,
        guardrail_words: Counter,
        prompt_caching: Counter,
    ):
        """
        Initialize metrics container with all required instruments.

        Parameters:
        - token_histogram: Token usage tracking (input/output tokens)
        - choice_counter: Number of completion choices generated
        - duration_histogram: Request/response latency tracking
        - exception_counter: Error and exception counting
        - guardrail_activation: Guardrail trigger frequency
        - guardrail_latency_histogram: Guardrail processing time
        - guardrail_coverage: Text coverage by guardrails
        - guardrail_sensitive_info: PII detection events
        - guardrail_topic: Topic policy violations
        - guardrail_content: Content policy violations
        - guardrail_words: Word filter violations
        - prompt_caching: Prompt caching utilization
        """

    # Runtime state attributes
    vendor: str
    """AI model vendor (e.g., 'anthropic', 'cohere', 'ai21')"""

    model: str
    """Specific model name (e.g., 'claude-3-sonnet-20240229-v1:0')"""

    is_stream: bool
    """Whether the current request uses streaming responses"""

    start_time: float
    """Request start timestamp for duration calculation"""
```


### Core Metrics Constants


Metric name and span attribute constants for Bedrock guardrail and prompt caching observability, following OpenTelemetry semantic convention patterns.


```python { .api }
class GuardrailMeters:
    """
    Metric name constants for Bedrock Guardrails observability.

    Provides standardized metric names for all guardrail-related
    measurements following semantic convention patterns.
    """

    LLM_BEDROCK_GUARDRAIL_ACTIVATION = "gen_ai.bedrock.guardrail.activation"
    """Counter for guardrail activation events"""

    LLM_BEDROCK_GUARDRAIL_LATENCY = "gen_ai.bedrock.guardrail.latency"
    """Histogram for guardrail processing latency in milliseconds"""

    LLM_BEDROCK_GUARDRAIL_COVERAGE = "gen_ai.bedrock.guardrail.coverage"
    """Counter for text coverage by guardrails in characters"""

    LLM_BEDROCK_GUARDRAIL_SENSITIVE = "gen_ai.bedrock.guardrail.sensitive_info"
    """Counter for sensitive information detection events"""

    LLM_BEDROCK_GUARDRAIL_TOPICS = "gen_ai.bedrock.guardrail.topics"
    """Counter for topic policy violation events"""

    LLM_BEDROCK_GUARDRAIL_CONTENT = "gen_ai.bedrock.guardrail.content"
    """Counter for content policy violation events"""

    LLM_BEDROCK_GUARDRAIL_WORDS = "gen_ai.bedrock.guardrail.words"
    """Counter for word filter violation events"""


class PromptCaching:
    """
    Metric name constants for prompt caching observability.

    Provides standardized metric names for prompt caching
    utilization and performance tracking.
    """

    LLM_BEDROCK_PROMPT_CACHING = "gen_ai.prompt.caching"
    """Counter for cached token utilization"""


class GuardrailAttributes:
    """
    Span attribute constants for guardrail information.

    Standardized attribute names for recording guardrail-related
    data in OpenTelemetry spans.
    """

    GUARDRAIL = "gen_ai.guardrail"
    """Base guardrail attribute namespace"""

    TYPE = "gen_ai.guardrail.type"
    """Guardrail processing type (input/output)"""

    PII = "gen_ai.guardrail.pii"
    """PII detection attribute"""

    PATTERN = "gen_ai.guardrail.pattern"
    """Pattern matching attribute"""

    TOPIC = "gen_ai.guardrail.topic"
    """Topic policy attribute"""

    CONTENT = "gen_ai.guardrail.content"
    """Content policy attribute"""

    CONFIDENCE = "gen_ai.guardrail.confidence"
    """Confidence score attribute"""

    MATCH = "gen_ai.guardrail.match"
    """Match result attribute"""


class Type(Enum):
    """
    Guardrail processing type enumeration.

    Defines whether guardrail processing applies to input
    or output content in AI model interactions.
    """

    INPUT = "input"
    """Input content processing"""

    OUTPUT = "output"
    """Output content processing"""
```


### Metrics Creation and Management


Functions for creating and managing the complete set of metrics instruments required for Bedrock observability.


```python { .api }
def _create_metrics(meter: Meter) -> tuple:
    """
    Create all metrics instruments for Bedrock observability.

    Initializes the complete set of histograms and counters needed
    for comprehensive monitoring of Bedrock AI model interactions.

    Parameters:
    - meter: OpenTelemetry Meter instance for creating instruments

    Returns:
    Tuple containing all metrics instruments:
    (token_histogram, choice_counter, duration_histogram, exception_counter,
     guardrail_activation, guardrail_latency_histogram, guardrail_coverage,
     guardrail_sensitive_info, guardrail_topic, guardrail_content,
     guardrail_words, prompt_caching)
    """


def is_metrics_enabled() -> bool:
    """
    Check if metrics collection is globally enabled.

    Returns:
    Boolean indicating if metrics should be collected based on the
    TRACELOOP_METRICS_ENABLED environment variable (default: true)
    """
```
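As a sketch, the documented environment-variable check could be implemented as follows. This is one plausible reading of the docstring (unset defaults to enabled), not necessarily the library's exact logic:

```python
import os

def is_metrics_enabled() -> bool:
    # Unset variable defaults to "true"; comparison is case-insensitive.
    # Assumption: any value other than "true" disables metrics.
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"
```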


### Guardrail Metrics Processing


Specialized functions for processing and recording guardrail-related metrics with detailed categorization and attribution.


```python { .api }
def is_guardrail_activated(response) -> bool:
    """
    Check if any guardrails were activated in the response.

    Examines response metadata to determine if Bedrock Guardrails
    processed the request and applied any filtering or monitoring.

    Parameters:
    - response: Bedrock API response containing guardrail metadata

    Returns:
    Boolean indicating guardrail activation status
    """


def guardrail_converse(span, response, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for converse API responses.

    Extracts and records guardrail-related metrics from converse API
    responses, including policy violations and processing latency.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response: Converse API response with guardrail metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


def guardrail_handling(span, response_body, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for invoke_model API responses.

    Handles guardrail metric extraction and recording for traditional
    invoke_model API calls with comprehensive policy violation tracking.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response_body: Parsed response body with guardrail data
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


def handle_invoke_metrics(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics processing for guardrail invocations.

    Extracts and records guardrail processing latency and coverage
    metrics from guardrail invocation metadata.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data containing metrics
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_sensitive(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for sensitive information policy violations.

    Records metrics for PII detection and sensitive content
    filtering by Bedrock Guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with PII detection results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_topic(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for topic policy violations.

    Records metrics for topic policy enforcement including
    forbidden topics and conversation steering.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with topic policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_content(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for content policy violations.

    Records metrics for content filtering including harmful content
    detection and safety policy enforcement.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with content policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_words(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for word filter violations.

    Records metrics for word-level filtering including blocked
    words and phrases detected by guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with word filter results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """
```
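As a rough illustration of the activation check, a converse-style response can be probed for guardrail trace metadata. The `stopReason` value and `trace.guardrail` key below are assumptions based on the Bedrock Converse API response shape, not the library's exact implementation:

```python
def is_guardrail_activated(response: dict) -> bool:
    # "guardrail_intervened" is the stop reason Bedrock reports when a
    # guardrail blocks content (assumption based on the Converse API).
    if response.get("stopReason") == "guardrail_intervened":
        return True
    # Trace metadata under "guardrail" indicates a guardrail assessed
    # the request even when nothing was blocked.
    return bool(response.get("trace", {}).get("guardrail"))
```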


### Prompt Caching Metrics


Functions for tracking prompt caching utilization and performance metrics.


```python { .api }
def prompt_caching_handling(headers, vendor, model, metric_params) -> None:
    """
    Process prompt caching metrics from response headers.

    Extracts caching information from HTTP response headers and
    records metrics for cache hits, misses, and token savings.

    Parameters:
    - headers: HTTP response headers containing caching metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


class CachingHeaders:
    """
    HTTP header constants for prompt caching detection.

    Defines the standard headers used by Bedrock to communicate
    prompt caching status and token counts.
    """

    READ = "x-amzn-bedrock-cache-read-input-token-count"
    """Header indicating cached input tokens read"""

    WRITE = "x-amzn-bedrock-cache-write-input-token-count"
    """Header indicating input tokens written to cache"""


class CacheSpanAttrs:
    """
    Span attribute constants for prompt caching information.

    Standardized attribute names for recording caching data
    in OpenTelemetry spans.
    """

    TYPE = "gen_ai.cache.type"
    """Cache operation type (read/write/miss)"""

    CACHED = "gen_ai.prompt_caching"
    """Prompt caching utilization flag"""
```
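A hypothetical helper (`extract_cache_token_counts` is not part of the library) showing how the documented caching headers could be parsed before feeding the prompt-caching counter; missing or malformed headers are treated as zero tokens:

```python
# Header names as documented in CachingHeaders above.
READ = "x-amzn-bedrock-cache-read-input-token-count"
WRITE = "x-amzn-bedrock-cache-write-input-token-count"

def extract_cache_token_counts(headers: dict) -> dict:
    """Return cached-read and cache-write token counts from response headers."""
    def _to_int(value) -> int:
        # Headers arrive as strings; absent or non-numeric values count as 0.
        try:
            return int(value)
        except (TypeError, ValueError):
            return 0

    return {"read": _to_int(headers.get(READ)), "write": _to_int(headers.get(WRITE))}
```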


## Usage Examples


### Basic Metrics Collection


```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor, is_metrics_enabled

# Check if metrics are enabled
if is_metrics_enabled():
    print("Metrics collection is enabled")

    # Enable instrumentation with metrics
    BedrockInstrumentor().instrument()
else:
    print("Metrics collection is disabled")
```


### Custom Metrics Provider


```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Configure custom metrics provider
metric_reader = PeriodicExportingMetricReader(
    exporter=ConsoleMetricExporter(),
    export_interval_millis=30000,
)
meter_provider = MeterProvider(metric_readers=[metric_reader])

# Instrument with custom provider
BedrockInstrumentor().instrument(meter_provider=meter_provider)
```


### Metrics Analysis


Common metrics collected include:


```python
# Token usage metrics (Histogram.record takes the amount positionally)
token_histogram.record(
    150,  # token count
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "gen_ai.token.type": "input",
    },
)

# Guardrail activation metrics
guardrail_activation.add(
    1,
    attributes={
        "gen_ai.system": "bedrock",
        "guardrail.type": "input",
        "guardrail.policy": "sensitive_info",
    },
)

# Request duration metrics
duration_histogram.record(
    1.25,  # seconds
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.operation.name": "completion",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
    },
)
```


### Monitoring Dashboard Queries


Example queries for common monitoring scenarios:


#### Token Usage Monitoring


```promql
# Average tokens per request by model
rate(gen_ai_token_usage_sum[5m]) / rate(gen_ai_token_usage_count[5m])

# Token usage by type (input vs output)
sum by (gen_ai_token_type) (rate(gen_ai_token_usage_sum[5m]))
```


#### Error Rate Monitoring


```promql
# Error rate by model
rate(llm_bedrock_completions_exceptions_total[5m]) /
rate(gen_ai_operation_duration_count[5m])

# Error breakdown by type
sum by (error_type) (rate(llm_bedrock_completions_exceptions_total[5m]))
```


#### Guardrail Analytics


```promql
# Guardrail activation rate
rate(gen_ai_bedrock_guardrail_activation_total[5m])

# Guardrail policy violation breakdown
sum by (guardrail_policy) (rate(gen_ai_bedrock_guardrail_activation_total[5m]))

# Guardrail processing latency (95th percentile)
histogram_quantile(0.95, rate(gen_ai_bedrock_guardrail_latency_bucket[5m]))
```


#### Prompt Caching Effectiveness


```promql
# Cache hit rate
rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) /
(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) +
 rate(gen_ai_prompt_caching_total{cache_type="write"}[5m]))

# Token savings from caching
sum(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]))
```


## Metrics Schema


### Standard Attributes


All metrics include standard attributes for filtering and aggregation:


- **gen_ai.system**: "bedrock"
- **gen_ai.request.model**: Full model identifier
- **gen_ai.operation.name**: "completion" or "chat"
- **error.type**: Exception class name (for error metrics)
- **gen_ai.token.type**: "input" or "output" (for token metrics)
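For concreteness, the standard attribute set might appear on a token-usage measurement like this; the model identifier is an illustrative value, not a required one:

```python
# Illustrative attribute set combining the standard attributes listed above.
standard_attrs = {
    "gen_ai.system": "bedrock",
    "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",  # example model id
    "gen_ai.operation.name": "completion",
    "gen_ai.token.type": "input",
}
```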

### Guardrail-Specific Attributes


Guardrail metrics include additional categorization:


- **guardrail.type**: "input" or "output"
- **guardrail.policy**: Policy type (sensitive_info, topic, content, words)
- **guardrail.confidence**: Confidence score for detections
- **guardrail.action**: Action taken (block, warn, pass)

### Model-Specific Attributes


Model identification attributes for multi-model deployments:


- **gen_ai.model.vendor**: Vendor name (anthropic, cohere, ai21, etc.)
- **gen_ai.model.name**: Simplified model name
- **gen_ai.model.version**: Model version identifier
- **gen_ai.model.family**: Model family grouping

## Performance Considerations


### Metrics Overhead


Metrics collection adds minimal overhead:

- **Counter operations**: ~10-50 nanoseconds per increment
- **Histogram recordings**: ~100-500 nanoseconds per measurement
- **Attribute processing**: ~50-200 nanoseconds per attribute set

### Cardinality Management


Control metrics cardinality to prevent memory issues:

- Model identifiers are normalized to reduce unique combinations
- Request parameters are not included as attributes
- User-specific data is excluded from metrics labels

### Batching and Export


Configure appropriate export intervals:

- **Development**: 5-10 second intervals for immediate feedback
- **Production**: 30-60 second intervals to balance freshness and overhead
- **High-volume**: Use sampling or aggregation for cost optimization
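
The interval guidance can be encoded in configuration. This hypothetical helper (`export_interval_millis` is not part of the library) uses the lower bound of each recommended range:

```python
# Hypothetical stage-to-interval mapping based on the guidance above.
EXPORT_INTERVALS_MS = {
    "development": 5_000,   # 5-10 s for immediate feedback
    "production": 30_000,   # 30-60 s to balance freshness and overhead
}

def export_interval_millis(stage: str) -> int:
    # Unknown stages fall back to the conservative production interval.
    return EXPORT_INTERVALS_MS.get(stage, 30_000)
```

The result could then be passed as `export_interval_millis` to `PeriodicExportingMetricReader`, as in the Custom Metrics Provider example above.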