OpenTelemetry instrumentation for AWS Bedrock runtime services providing automatic tracing, metrics, and event emission for AI model invocations
—
Advanced metrics collection system providing comprehensive observability for AWS Bedrock AI model invocations. Includes token usage tracking, request duration monitoring, error analytics, guardrail interaction metrics, and prompt caching statistics.
Central container for all metrics instruments and runtime state, managing the complete lifecycle of metrics collection for Bedrock operations.
class MetricParams:
    """
    Container for metrics configuration and runtime state.

    Manages all metrics instruments and maintains request-specific
    state throughout the lifecycle of Bedrock API calls.
    """

    def __init__(
        self,
        token_histogram: Histogram,
        choice_counter: Counter,
        duration_histogram: Histogram,
        exception_counter: Counter,
        guardrail_activation: Counter,
        guardrail_latency_histogram: Histogram,
        guardrail_coverage: Counter,
        guardrail_sensitive_info: Counter,
        guardrail_topic: Counter,
        guardrail_content: Counter,
        guardrail_words: Counter,
        prompt_caching: Counter,
    ):
        """
        Initialize metrics container with all required instruments.

        Parameters:
        - token_histogram: Token usage tracking (input/output tokens)
        - choice_counter: Number of completion choices generated
        - duration_histogram: Request/response latency tracking
        - exception_counter: Error and exception counting
        - guardrail_activation: Guardrail trigger frequency
        - guardrail_latency_histogram: Guardrail processing time
        - guardrail_coverage: Text coverage by guardrails
        - guardrail_sensitive_info: PII detection events
        - guardrail_topic: Topic policy violations
        - guardrail_content: Content policy violations
        - guardrail_words: Word filter violations
        - prompt_caching: Prompt caching utilization
        """

    # Runtime state attributes
    vendor: str
    """AI model vendor (e.g., 'anthropic', 'cohere', 'ai21')"""
    model: str
    """Specific model name (e.g., 'claude-3-sonnet-20240229-v1:0')"""
    is_stream: bool
    """Whether the current request uses streaming responses"""
    start_time: float
    """Request start timestamp for duration calculation"""

Metric name constants for standard AI model observability, following OpenTelemetry semantic conventions.
class GuardrailMeters:
    """
    Metric name constants for Bedrock Guardrails observability.

    Provides standardized metric names for all guardrail-related
    measurements following semantic convention patterns.
    """

    LLM_BEDROCK_GUARDRAIL_ACTIVATION = "gen_ai.bedrock.guardrail.activation"
    """Counter for guardrail activation events"""
    LLM_BEDROCK_GUARDRAIL_LATENCY = "gen_ai.bedrock.guardrail.latency"
    """Histogram for guardrail processing latency in milliseconds"""
    LLM_BEDROCK_GUARDRAIL_COVERAGE = "gen_ai.bedrock.guardrail.coverage"
    """Counter for text coverage by guardrails in characters"""
    LLM_BEDROCK_GUARDRAIL_SENSITIVE = "gen_ai.bedrock.guardrail.sensitive_info"
    """Counter for sensitive information detection events"""
    LLM_BEDROCK_GUARDRAIL_TOPICS = "gen_ai.bedrock.guardrail.topics"
    """Counter for topic policy violation events"""
    LLM_BEDROCK_GUARDRAIL_CONTENT = "gen_ai.bedrock.guardrail.content"
    """Counter for content policy violation events"""
    LLM_BEDROCK_GUARDRAIL_WORDS = "gen_ai.bedrock.guardrail.words"
    """Counter for word filter violation events"""
class PromptCaching:
    """
    Metric name constants for prompt caching observability.

    Provides standardized metric names for prompt caching
    utilization and performance tracking.
    """

    LLM_BEDROCK_PROMPT_CACHING = "gen_ai.prompt.caching"
    """Counter for cached token utilization"""
class GuardrailAttributes:
    """
    Span attribute constants for guardrail information.

    Standardized attribute names for recording guardrail-related
    data in OpenTelemetry spans.
    """

    GUARDRAIL = "gen_ai.guardrail"
    """Base guardrail attribute namespace"""
    TYPE = "gen_ai.guardrail.type"
    """Guardrail processing type (input/output)"""
    PII = "gen_ai.guardrail.pii"
    """PII detection attribute"""
    PATTERN = "gen_ai.guardrail.pattern"
    """Pattern matching attribute"""
    TOPIC = "gen_ai.guardrail.topic"
    """Topic policy attribute"""
    CONTENT = "gen_ai.guardrail.content"
    """Content policy attribute"""
    CONFIDENCE = "gen_ai.guardrail.confidence"
    """Confidence score attribute"""
    MATCH = "gen_ai.guardrail.match"
    """Match result attribute"""

class Type(Enum):
    """
    Guardrail processing type enumeration.

    Defines whether guardrail processing applies to input
    or output content in AI model interactions.
    """

    INPUT = "input"
    """Input content processing"""
    OUTPUT = "output"
    """Output content processing"""

Functions for creating and managing the complete set of metrics instruments required for Bedrock observability.
def _create_metrics(meter: Meter) -> tuple:
    """
    Create all metrics instruments for Bedrock observability.

    Initializes the complete set of histograms and counters needed
    for comprehensive monitoring of Bedrock AI model interactions.

    Parameters:
    - meter: OpenTelemetry Meter instance for creating instruments

    Returns:
    Tuple containing all metrics instruments:
    (token_histogram, choice_counter, duration_histogram, exception_counter,
    guardrail_activation, guardrail_latency_histogram, guardrail_coverage,
    guardrail_sensitive_info, guardrail_topic, guardrail_content,
    guardrail_words, prompt_caching)
    """
def is_metrics_enabled() -> bool:
    """
    Check if metrics collection is globally enabled.

    Returns:
    Boolean indicating if metrics should be collected, based on the
    TRACELOOP_METRICS_ENABLED environment variable (default: true)
    """

Specialized functions for processing and recording guardrail-related metrics with detailed categorization and attribution.
def is_guardrail_activated(response) -> bool:
    """
    Check if any guardrails were activated in the response.

    Examines response metadata to determine if Bedrock Guardrails
    processed the request and applied any filtering or monitoring.

    Parameters:
    - response: Bedrock API response containing guardrail metadata

    Returns:
    Boolean indicating guardrail activation status
    """
def guardrail_converse(span, response, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for converse API responses.

    Extracts and records guardrail-related metrics from converse API
    responses, including policy violations and processing latency.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response: Converse API response with guardrail metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """

def guardrail_handling(span, response_body, vendor, model, metric_params) -> None:
    """
    Process guardrail metrics for invoke_model API responses.

    Handles guardrail metric extraction and recording for traditional
    invoke_model API calls with comprehensive policy violation tracking.

    Parameters:
    - span: OpenTelemetry span for attribute setting
    - response_body: Parsed response body with guardrail data
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """
def handle_invoke_metrics(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics processing for guardrail invocations.

    Extracts and records guardrail processing latency and coverage
    metrics from guardrail invocation metadata.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data containing metrics
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """
def handle_sensitive(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for sensitive information policy violations.

    Records metrics for PII detection and sensitive content
    filtering by Bedrock Guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with PII detection results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """

def handle_topic(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for topic policy violations.

    Records metrics for topic policy enforcement including
    forbidden topics and conversation steering.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with topic policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """

def handle_content(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for content policy violations.

    Records metrics for content filtering including harmful content
    detection and safety policy enforcement.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with content policy results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """

def handle_words(t: Type, guardrail, attrs, metric_params) -> None:
    """
    Handle metrics for word filter violations.

    Records metrics for word-level filtering including blocked
    words and phrases detected by guardrails.

    Parameters:
    - t: Guardrail processing type (INPUT or OUTPUT)
    - guardrail: Guardrail response data with word filter results
    - attrs: Base metric attributes for categorization
    - metric_params: MetricParams instance for recording metrics
    """

Functions for tracking prompt caching utilization and performance metrics.
def prompt_caching_handling(headers, vendor, model, metric_params) -> None:
    """
    Process prompt caching metrics from response headers.

    Extracts caching information from HTTP response headers and
    records metrics for cache hits, misses, and token savings.

    Parameters:
    - headers: HTTP response headers containing caching metadata
    - vendor: AI model vendor identifier
    - model: Specific model name
    - metric_params: MetricParams instance for recording metrics
    """
class CachingHeaders:
    """
    HTTP header constants for prompt caching detection.

    Defines the standard headers used by Bedrock to communicate
    prompt caching status and token counts.
    """

    READ = "x-amzn-bedrock-cache-read-input-token-count"
    """Header indicating cached input tokens read"""
    WRITE = "x-amzn-bedrock-cache-write-input-token-count"
    """Header indicating input tokens written to cache"""

class CacheSpanAttrs:
    """
    Span attribute constants for prompt caching information.

    Standardized attribute names for recording caching data
    in OpenTelemetry spans.
    """

    TYPE = "gen_ai.cache.type"
    """Cache operation type (read/write/miss)"""
    CACHED = "gen_ai.prompt_caching"
    """Prompt caching utilization flag"""

from opentelemetry import metrics
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor, is_metrics_enabled

# Check if metrics are enabled
if is_metrics_enabled():
    print("Metrics collection is enabled")
    # Enable instrumentation with metrics
    BedrockInstrumentor().instrument()
else:
    print("Metrics collection is disabled")

from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader

# Configure custom metrics provider
metric_reader = PeriodicExportingMetricReader(
    exporter=ConsoleMetricExporter(),
    export_interval_millis=30000,
)
meter_provider = MeterProvider(metric_readers=[metric_reader])

# Instrument with custom provider
BedrockInstrumentor().instrument(meter_provider=meter_provider)

Common metrics collected include:
# Token usage metrics
token_histogram.record(
    150,  # token count
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "gen_ai.token.type": "input",
    },
)

# Guardrail activation metrics
guardrail_activation.add(
    1,
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.guardrail.type": "input",
        "guardrail.policy": "sensitive_info",
    },
)

# Request duration metrics
duration_histogram.record(
    1.25,  # seconds
    attributes={
        "gen_ai.system": "bedrock",
        "gen_ai.operation.name": "completion",
        "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
    },
)

Example queries for common monitoring scenarios:
# Average tokens per request by model
rate(gen_ai_token_usage_sum[5m]) / rate(gen_ai_token_usage_count[5m])

# Token usage by type (input vs output)
sum by (gen_ai_token_type) (rate(gen_ai_token_usage_sum[5m]))

# Error rate by model
rate(llm_bedrock_completions_exceptions_total[5m]) /
rate(gen_ai_operation_duration_count[5m])

# Error breakdown by type
sum by (error_type) (rate(llm_bedrock_completions_exceptions_total[5m]))

# Guardrail activation rate
rate(gen_ai_bedrock_guardrail_activation_total[5m])

# Guardrail policy violation breakdown
sum by (guardrail_policy) (rate(gen_ai_bedrock_guardrail_activation_total[5m]))

# Guardrail processing latency (p95)
histogram_quantile(0.95, rate(gen_ai_bedrock_guardrail_latency_bucket[5m]))

# Cache hit rate
rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) /
(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]) +
rate(gen_ai_prompt_caching_total{cache_type="write"}[5m]))

# Token savings from caching
sum(rate(gen_ai_prompt_caching_total{cache_type="read"}[5m]))

All metrics include standard attributes for filtering and aggregation:
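For concreteness, the recording examples above share a common attribute core; the values below are illustrative and the full attribute set emitted by the instrumentation may differ:

```python
# Common attribute core from the recording examples; values are illustrative.
STANDARD_ATTRIBUTES = {
    "gen_ai.system": "bedrock",          # always present
    "gen_ai.request.model": "anthropic.claude-3-sonnet-20240229-v1:0",
    "gen_ai.token.type": "input",        # token metrics only
    "gen_ai.operation.name": "completion",  # duration metrics only
}
```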
Guardrail metrics include additional categorization:
Model identification attributes for multi-model deployments:
Metrics collection adds minimal overhead:
Control metrics cardinality to prevent memory issues:
Configure appropriate export intervals:
Install with Tessl CLI
npx tessl i tessl/pypi-opentelemetry-instrumentation-bedrock