# Metrics and Monitoring

Advanced metrics collection system providing comprehensive observability for AWS Bedrock AI model invocations. Includes token usage tracking, request duration monitoring, error analytics, guardrail interaction metrics, and prompt caching statistics.

## Capabilities

### Metrics Parameters Container

Central container for all metrics instruments and runtime state, managing the complete lifecycle of metrics collection for Bedrock operations.
```python { .api }
class MetricParams:
    """
    Container for metrics configuration and runtime state.

    Manages all metrics instruments and maintains request-specific
    state throughout the lifecycle of Bedrock API calls.
    """

    def __init__(
        self,
        token_histogram: Histogram,
        choice_counter: Counter,
        duration_histogram: Histogram,
        exception_counter: Counter,
        guardrail_activation: Counter,
        guardrail_latency_histogram: Histogram,
        guardrail_coverage: Counter,
        guardrail_sensitive_info: Counter,
        guardrail_topic: Counter,
        guardrail_content: Counter,
        guardrail_words: Counter,
        prompt_caching: Counter,
    ):
        """
        Initialize metrics container with all required instruments.

        Parameters:
        - token_histogram: Token usage tracking (input/output tokens)
        - choice_counter: Number of completion choices generated
        - duration_histogram: Request/response latency tracking
        - exception_counter: Error and exception counting
        - guardrail_activation: Guardrail trigger frequency
        - guardrail_latency_histogram: Guardrail processing time
        - guardrail_coverage: Text coverage by guardrails
        - guardrail_sensitive_info: PII detection events
        - guardrail_topic: Topic policy violations
        - guardrail_content: Content policy violations
        - guardrail_words: Word filter violations
        - prompt_caching: Prompt caching utilization
        """

    # Runtime state attributes
    vendor: str
    """AI model vendor (e.g., 'anthropic', 'cohere', 'ai21')"""

    model: str
    """Specific model name (e.g., 'claude-3-sonnet-20240229-v1:0')"""

    is_stream: bool
    """Whether the current request uses streaming responses"""

    start_time: float
    """Request start timestamp for duration calculation"""
```
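The runtime-state fields drive the duration measurement: `start_time` is captured when a request begins and later subtracted from the completion time. A minimal stand-in (hypothetical class and method names, not the instrumentation's actual API) illustrating that lifecycle:

```python
import time


class MetricParamsState:
    """Hypothetical sketch of only the runtime-state fields of MetricParams."""

    def __init__(self):
        self.vendor = ""
        self.model = ""
        self.is_stream = False
        self.start_time = 0.0

    def begin_request(self, vendor: str, model: str, is_stream: bool = False):
        # Capture per-request state; start_time anchors the duration histogram.
        self.vendor, self.model, self.is_stream = vendor, model, is_stream
        self.start_time = time.time()

    def elapsed(self) -> float:
        # Duration recorded on completion, in seconds.
        return time.time() - self.start_time


state = MetricParamsState()
state.begin_request("anthropic", "claude-3-sonnet-20240229-v1:0")
duration = state.elapsed()
```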
### Core Metrics Constants

Metric name constants for standard AI model observability, following OpenTelemetry `gen_ai` semantic convention naming patterns.
```python { .api }
from enum import Enum


class GuardrailMeters:
    """
    Metric name constants for Bedrock Guardrails observability.

    Provides standardized metric names for all guardrail-related
    measurements following semantic convention patterns.
    """

    LLM_BEDROCK_GUARDRAIL_ACTIVATION = "gen_ai.bedrock.guardrail.activation"
    """Counter for guardrail activation events"""

    LLM_BEDROCK_GUARDRAIL_LATENCY = "gen_ai.bedrock.guardrail.latency"
    """Histogram for guardrail processing latency in milliseconds"""

    LLM_BEDROCK_GUARDRAIL_COVERAGE = "gen_ai.bedrock.guardrail.coverage"
    """Counter for text coverage by guardrails in characters"""

    LLM_BEDROCK_GUARDRAIL_SENSITIVE = "gen_ai.bedrock.guardrail.sensitive_info"
    """Counter for sensitive information detection events"""

    LLM_BEDROCK_GUARDRAIL_TOPICS = "gen_ai.bedrock.guardrail.topics"
    """Counter for topic policy violation events"""

    LLM_BEDROCK_GUARDRAIL_CONTENT = "gen_ai.bedrock.guardrail.content"
    """Counter for content policy violation events"""

    LLM_BEDROCK_GUARDRAIL_WORDS = "gen_ai.bedrock.guardrail.words"
    """Counter for word filter violation events"""


class PromptCaching:
    """
    Metric name constants for prompt caching observability.

    Provides standardized metric names for prompt caching
    utilization and performance tracking.
    """

    LLM_BEDROCK_PROMPT_CACHING = "gen_ai.prompt.caching"
    """Counter for cached token utilization"""


class GuardrailAttributes:
    """
    Span attribute constants for guardrail information.

    Standardized attribute names for recording guardrail-related
    data in OpenTelemetry spans.
    """

    GUARDRAIL = "gen_ai.guardrail"
    """Base guardrail attribute namespace"""

    TYPE = "gen_ai.guardrail.type"
    """Guardrail processing type (input/output)"""

    PII = "gen_ai.guardrail.pii"
    """PII detection attribute"""

    PATTERN = "gen_ai.guardrail.pattern"
    """Pattern matching attribute"""

    TOPIC = "gen_ai.guardrail.topic"
    """Topic policy attribute"""

    CONTENT = "gen_ai.guardrail.content"
    """Content policy attribute"""

    CONFIDENCE = "gen_ai.guardrail.confidence"
    """Confidence score attribute"""

    MATCH = "gen_ai.guardrail.match"
    """Match result attribute"""


class Type(Enum):
    """
    Guardrail processing type enumeration.

    Defines whether guardrail processing applies to input
    or output content in AI model interactions.
    """

    INPUT = "input"
    """Input content processing"""

    OUTPUT = "output"
    """Output content processing"""
```
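At recording time these constants are combined into attribute sets. A small sketch, with invented example values, of how the `GuardrailAttributes` names and the `Type` enum might be assembled into a metric attribute dictionary:

```python
from enum import Enum


class Type(Enum):
    INPUT = "input"
    OUTPUT = "output"


# Attribute names copied from GuardrailAttributes; the values here are
# invented for illustration (a topic-policy hit on the request side).
attrs = {
    "gen_ai.guardrail.type": Type.INPUT.value,
    "gen_ai.guardrail.topic": "restricted_topic",
    "gen_ai.guardrail.confidence": "HIGH",
}
```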
### Metrics Creation and Management

Functions for creating and managing the complete set of metrics instruments required for Bedrock observability.
```python { .api }
def _create_metrics(meter: Meter) -> tuple:
    """
    Create all metrics instruments for Bedrock observability.

    Initializes the complete set of histograms and counters needed
    for comprehensive monitoring of Bedrock AI model interactions.

    Parameters:
    - meter: OpenTelemetry Meter instance for creating instruments

    Returns:
    Tuple containing all metrics instruments:
    (token_histogram, choice_counter, duration_histogram, exception_counter,
     guardrail_activation, guardrail_latency_histogram, guardrail_coverage,
     guardrail_sensitive_info, guardrail_topic, guardrail_content,
     guardrail_words, prompt_caching)
    """


def is_metrics_enabled() -> bool:
    """
    Check if metrics collection is globally enabled.

    Returns:
    Boolean indicating if metrics should be collected based on the
    TRACELOOP_METRICS_ENABLED environment variable (default: true)
    """
```
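The documented behavior of `is_metrics_enabled` can be sketched directly from its description: metrics stay on unless the environment variable opts out. A minimal re-implementation of that contract (the function name here is illustrative):

```python
import os


def is_metrics_enabled_sketch() -> bool:
    # Mirrors the documented default: enabled unless TRACELOOP_METRICS_ENABLED
    # is set to something other than "true".
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"


os.environ["TRACELOOP_METRICS_ENABLED"] = "false"
disabled = is_metrics_enabled_sketch()

del os.environ["TRACELOOP_METRICS_ENABLED"]
enabled = is_metrics_enabled_sketch()
```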
### Guardrail Metrics Processing

Specialized functions for processing and recording guardrail-related metrics with detailed categorization and attribution.
```python { .api }
def is_guardrail_activated(response) -> bool:
    """
    Check if any guardrails were activated in the response.

    Examines response metadata to determine if Bedrock Guardrails
    processed the request and applied any filtering or monitoring.

    Parameters:
    - response: Bedrock API response containing guardrail metadata

    Returns:
    Boolean indicating guardrail activation status
    """


def guardrail_converse(span, response, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for converse API responses.

    Extracts and records guardrail-related metrics from converse API
    responses, including policy violations and processing latency.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response: Converse API response with guardrail metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


def guardrail_handling(span, response_body, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for invoke_model API responses.

    Handles guardrail metric extraction and recording for traditional
    invoke_model API calls with comprehensive policy violation tracking.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response_body: Parsed response body with guardrail data
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


def handle_invoke_metrics(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics processing for guardrail invocations.

    Extracts and records guardrail processing latency and coverage
    metrics from guardrail invocation metadata.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data containing metrics
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_sensitive(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for sensitive information policy violations.

    Records metrics for PII detection and sensitive content
    filtering by Bedrock Guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with PII detection results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_topic(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for topic policy violations.

    Records metrics for topic policy enforcement including
    forbidden topics and conversation steering.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with topic policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_content(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for content policy violations.

    Records metrics for content filtering including harmful content
    detection and safety policy enforcement.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with content policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """


def handle_words(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for word filter violations.

    Records metrics for word-level filtering including blocked
    words and phrases detected by guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with word filter results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """
```
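To make the activation check concrete, here is a hedged sketch of what `is_guardrail_activated` might look for in a converse-style response. The response shape (a `stopReason` of `guardrail_intervened`, or a `guardrail` entry in the `trace` metadata) is an assumption based on the Converse API, not a guaranteed contract of this library:

```python
def is_guardrail_activated_sketch(response: dict) -> bool:
    """Hypothetical check; assumes a converse-style response shape."""
    # A blocked request surfaces directly in stopReason...
    if response.get("stopReason") == "guardrail_intervened":
        return True
    # ...while monitored-but-allowed requests may still carry guardrail
    # trace data in the response metadata.
    return "guardrail" in response.get("trace", {})


blocked = is_guardrail_activated_sketch({"stopReason": "guardrail_intervened"})
monitored = is_guardrail_activated_sketch(
    {"stopReason": "end_turn", "trace": {"guardrail": {}}}
)
clean = is_guardrail_activated_sketch({"stopReason": "end_turn"})
```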
### Prompt Caching Metrics

Functions for tracking prompt caching utilization and performance metrics.
```python { .api }
def prompt_caching_handling(headers, vendor, model, metric_params) -> None:
    """
    Process prompt caching metrics from response headers.

    Extracts caching information from HTTP response headers and
    records metrics for cache hits, misses, and token savings.

    Parameters:
    - headers: HTTP response headers containing caching metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """


class CachingHeaders:
    """
    HTTP header constants for prompt caching detection.

    Defines the standard headers used by Bedrock to communicate
    prompt caching status and token counts.
    """

    READ = "x-amzn-bedrock-cache-read-input-token-count"
    """Header indicating cached input tokens read"""

    WRITE = "x-amzn-bedrock-cache-write-input-token-count"
    """Header indicating input tokens written to cache"""


class CacheSpanAttrs:
    """
    Span attribute constants for prompt caching information.

    Standardized attribute names for recording caching data
    in OpenTelemetry spans.
    """

    TYPE = "gen_ai.cache.type"
    """Cache operation type (read/write/miss)"""

    CACHED = "gen_ai.prompt_caching"
    """Prompt caching utilization flag"""
```
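The header-parsing step of `prompt_caching_handling` can be sketched with the `CachingHeaders` constants above. This is a simplified illustration that only extracts the token counts (the function name and returned dict are invented; the real function records metrics instead of returning values):

```python
# Header names taken from CachingHeaders above.
READ = "x-amzn-bedrock-cache-read-input-token-count"
WRITE = "x-amzn-bedrock-cache-write-input-token-count"


def cache_token_counts(headers: dict) -> dict:
    """Sketch of the header-parsing step (counts only, no metric recording)."""
    counts = {}
    if READ in headers:
        # Header values arrive as strings and are parsed into token counts.
        counts["read"] = int(headers[READ])
    if WRITE in headers:
        counts["write"] = int(headers[WRITE])
    return counts


cache_hit = cache_token_counts({READ: "1024"})
cache_write = cache_token_counts({WRITE: "2048"})
```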
## Usage Examples

### Basic Metrics Collection

```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor, is_metrics_enabled

# Check if metrics are enabled
if is_metrics_enabled():
    print("Metrics collection is enabled")

    # Enable instrumentation with metrics
    BedrockInstrumentor().instrument()
else:
    print("Metrics collection is disabled")
```
### Custom Metrics Provider

```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Configure custom metrics provider
metric_reader = PeriodicExportingMetricReader(
    exporter=ConsoleMetricExporter(),
    export_interval_millis=30000,
)
meter_provider = MeterProvider(metric_readers=[metric_reader])

# Instrument with custom provider
BedrockInstrumentor().instrument(meter_provider=meter_provider)
```
### Metrics Analysis

Common metrics collected include:

```python
# Token usage metrics
token_histogram.record(
    150,  # token count
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "gen_ai.token.type": "input",
    },
)

# Guardrail activation metrics
guardrail_activation.add(
    1,
    attributes={
        "gen_ai.system": "bedrock",
        "guardrail.type": "input",
        "guardrail.policy": "sensitive_info",
    },
)

# Request duration metrics
duration_histogram.record(
    1.25,  # seconds
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.operation.name": "completion",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
    },
)
```
### Monitoring Dashboard Queries

Example queries for common monitoring scenarios:

#### Token Usage Monitoring

```promql
# Average tokens per request by model
rate(gen_ai_token_usage_sum[5m]) / rate(gen_ai_token_usage_count[5m])

# Token usage by type (input vs output)
sum by (gen_ai_token_type) (rate(gen_ai_token_usage_sum[5m]))
```

#### Error Rate Monitoring

```promql
# Error rate by model
rate(llm_bedrock_completions_exceptions_total[5m]) /
rate(gen_ai_operation_duration_count[5m])

# Error breakdown by type
sum by (error_type) (rate(llm_bedrock_completions_exceptions_total[5m]))
```

#### Guardrail Analytics

```promql
# Guardrail activation rate
rate(gen_ai_bedrock_guardrail_activation_total[5m])

# Guardrail policy violation breakdown
sum by (guardrail_policy) (rate(gen_ai_bedrock_guardrail_activation_total[5m]))

# Guardrail processing latency
histogram_quantile(0.95, rate(gen_ai_bedrock_guardrail_latency_bucket[5m]))
```

#### Prompt Caching Effectiveness

```promql
# Cache hit rate
rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) /
(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) +
 rate(gen_ai_prompt_caching_total{cache_type="write"}[5m]))

# Token savings from caching
sum(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]))
```
## Metrics Schema

### Standard Attributes

All metrics include standard attributes for filtering and aggregation:

- **gen_ai.system**: "bedrock"
- **gen_ai.request.model**: Full model identifier
- **gen_ai.operation.name**: "completion" or "chat"
- **error.type**: Exception class name (for error metrics)
- **gen_ai.token.type**: "input" or "output" (for token metrics)

### Guardrail-Specific Attributes

Guardrail metrics include additional categorization:

- **guardrail.type**: "input" or "output"
- **guardrail.policy**: Policy type (sensitive_info, topic, content, words)
- **guardrail.confidence**: Confidence score for detections
- **guardrail.action**: Action taken (block, warn, pass)

### Model-Specific Attributes

Model identification attributes for multi-model deployments:

- **gen_ai.model.vendor**: Vendor name (anthropic, cohere, ai21, etc.)
- **gen_ai.model.name**: Simplified model name
- **gen_ai.model.version**: Model version identifier
- **gen_ai.model.family**: Model family grouping
## Performance Considerations

### Metrics Overhead

Metrics collection adds minimal overhead:

- **Counter operations**: ~10-50 nanoseconds per increment
- **Histogram recordings**: ~100-500 nanoseconds per measurement
- **Attribute processing**: ~50-200 nanoseconds per attribute set

### Cardinality Management

Control metrics cardinality to prevent memory issues:

- Model identifiers are normalized to reduce unique combinations
- Request parameters are not included as attributes
- User-specific data is excluded from metrics labels
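The model-identifier normalization mentioned above can be sketched as stripping the version and date suffix from the full Bedrock model ID, so that label cardinality stays bounded. This is a hypothetical normalizer, not the instrumentation's actual logic:

```python
import re


def normalize_model_id(model_id: str) -> str:
    """Hypothetical normalizer: drop version suffixes to cap label cardinality."""
    # "anthropic.claude-3-sonnet-20240229-v1:0" -> "anthropic.claude-3-sonnet"
    base = model_id.split(":")[0]          # drop the ":0" revision
    return re.sub(r"-\d{8}.*$", "", base)  # drop "-YYYYMMDD-vN" date/version


normalized = normalize_model_id("anthropic.claude-3-sonnet-20240229-v1:0")
```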
### Batching and Export

Configure appropriate export intervals:

- **Development**: 5-10 second intervals for immediate feedback
- **Production**: 30-60 second intervals to balance freshness and overhead
- **High-volume**: Use sampling or aggregation for cost optimization