# Monitoring and Metrics

Monitoring operations cover percentile metrics, usage statistics, and performance metrics across regions, databases, containers, and partitions. Together they provide detailed insight into Cosmos DB performance, consumption, and health.

## Capabilities

### Percentile Metrics

Get percentile-based performance metrics for detailed performance analysis and SLA monitoring.

```python { .api }
def list_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    filter: str
) -> ItemPaged[PercentileMetric]:
    """
    Get percentile metrics for a database account.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - filter: OData filter for metrics (timespan, metric names, aggregation)

    Returns:
    ItemPaged[PercentileMetric]: Paginated list of percentile metrics
    """

def list_source_target_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    source_region: str,
    target_region: str,
    filter: str
) -> ItemPaged[PercentileMetric]:
    """
    Get source-target percentile metrics for replication monitoring.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - source_region: Source region for replication metrics
    - target_region: Target region for replication metrics
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PercentileMetric]: Paginated list of replication percentile metrics
    """

def list_target_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    target_region: str,
    filter: str
) -> ItemPaged[PercentileMetric]:
    """
    Get target percentile metrics for a specific region.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - target_region: Target region for metrics
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PercentileMetric]: Paginated list of target region percentile metrics
    """
```
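
Every metrics operation above takes an OData `filter` string. The following is a minimal sketch of a helper that assembles one, assuming the filter shape used in the usage examples on this page (metric names, `timeGrain`, `startTime`, `endTime`). The function name `build_metric_filter` is hypothetical and not part of the SDK.

```python
from datetime import datetime, timedelta
from typing import Sequence


def build_metric_filter(
    metric_names: Sequence[str],
    time_grain: str,
    start_time: datetime,
    end_time: datetime,
) -> str:
    """Assemble an OData filter string (hypothetical helper, not an SDK call).

    Assumes naive datetimes that already represent UTC, as in this page's examples.
    """
    if not metric_names:
        raise ValueError("at least one metric name is required")
    if start_time >= end_time:
        raise ValueError("start_time must be earlier than end_time")
    # Several metric names are combined with "or" inside one parenthesized group.
    names = " or ".join(f"name.value eq '{name}'" for name in metric_names)
    return (
        f"({names}) and timeGrain eq duration'{time_grain}' "
        f"and startTime eq {start_time.isoformat()}Z "
        f"and endTime eq {end_time.isoformat()}Z"
    )


end = datetime(2024, 1, 2, 0, 0, 0)
start = end - timedelta(hours=24)
example_filter = build_metric_filter(["ServerSideLatency"], "PT1M", start, end)
print(example_filter)
```

Building the string in one place keeps the quoting and the trailing `Z` consistent across calls; whether a given metric accepts a given time grain still depends on the service.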

### Database-Level Monitoring

Monitor database-level metrics, usage, and performance characteristics.

```python { .api }
def list_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    filter: str
) -> ItemPaged[Metric]:
    """
    Get metrics for a specific database.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - filter: OData filter for metrics (timespan, metric names)

    Returns:
    ItemPaged[Metric]: Paginated list of database metrics
    """

def list_usages(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    filter: Optional[str] = None
) -> ItemPaged[Usage]:
    """
    Get usage statistics for a database.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - filter: Optional OData filter for usage data

    Returns:
    ItemPaged[Usage]: Paginated list of database usage statistics
    """

def list_metric_definitions(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str
) -> ItemPaged[MetricDefinition]:
    """
    Get available metric definitions for a database.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database

    Returns:
    ItemPaged[MetricDefinition]: Available metric definitions for the database
    """
```

### Container-Level Monitoring

Monitor container-level metrics, usage, and partition-specific performance data.

```python { .api }
def list_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    collection_rid: str,
    filter: str
) -> ItemPaged[Metric]:
    """
    Get metrics for a specific container.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - filter: OData filter for metrics

    Returns:
    ItemPaged[Metric]: Paginated list of container metrics
    """

def list_usages(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    collection_rid: str,
    filter: Optional[str] = None
) -> ItemPaged[Usage]:
    """
    Get usage statistics for a container.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - filter: Optional OData filter for usage data

    Returns:
    ItemPaged[Usage]: Paginated list of container usage statistics
    """

def list_partition_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    collection_rid: str,
    filter: str
) -> ItemPaged[PartitionMetric]:
    """
    Get partition-level metrics for a container.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PartitionMetric]: Paginated list of partition-level metrics
    """
```

### Regional Monitoring

Monitor region-specific metrics and cross-region performance data.

```python { .api }
def list_region_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    region: str,
    database_rid: str,
    collection_rid: str,
    filter: str
) -> ItemPaged[Metric]:
    """
    Get region-specific metrics for a container.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - region: Azure region name
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - filter: OData filter for metrics

    Returns:
    ItemPaged[Metric]: Paginated list of region-specific container metrics
    """

def list_partition_region_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    region: str,
    database_rid: str,
    collection_rid: str,
    filter: str
) -> ItemPaged[PartitionMetric]:
    """
    Get region-specific partition metrics for a container.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - region: Azure region name
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PartitionMetric]: Paginated list of region-specific partition metrics
    """

def list_database_account_region_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    region: str,
    filter: str
) -> ItemPaged[Metric]:
    """
    Get region-specific metrics for a database account.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - region: Azure region name
    - filter: OData filter for metrics

    Returns:
    ItemPaged[Metric]: Paginated list of region-specific account metrics
    """
```

### Partition Key Range Monitoring

Monitor partition key range performance and distribution metrics.

```python { .api }
def list_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    database_rid: str,
    collection_rid: str,
    partition_key_range_id: str,
    filter: str
) -> ItemPaged[PartitionMetric]:
    """
    Get metrics for a specific partition key range.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - partition_key_range_id: Identifier of the partition key range
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PartitionMetric]: Paginated list of partition key range metrics
    """

def list_region_metrics(
    self,
    resource_group_name: str,
    account_name: str,
    region: str,
    database_rid: str,
    collection_rid: str,
    partition_key_range_id: str,
    filter: str
) -> ItemPaged[PartitionMetric]:
    """
    Get region-specific metrics for a partition key range.

    Parameters:
    - resource_group_name: Name of the resource group
    - account_name: Name of the Cosmos DB account
    - region: Azure region name
    - database_rid: Resource identifier of the database
    - collection_rid: Resource identifier of the container
    - partition_key_range_id: Identifier of the partition key range
    - filter: OData filter for metrics

    Returns:
    ItemPaged[PartitionMetric]: Paginated list of region-specific partition key range metrics
    """
```

## Usage Examples

### Getting Account-Level Percentile Metrics

```python
from azure.mgmt.cosmosdb import CosmosDBManagementClient
from azure.identity import DefaultAzureCredential
from datetime import datetime, timedelta

client = CosmosDBManagementClient(DefaultAzureCredential(), "subscription-id")

# Get percentile metrics for the last 24 hours
end_time = datetime.utcnow()
start_time = end_time - timedelta(hours=24)

# Format time filter for OData
time_filter = f"(name.value eq 'ServerSideLatency') and timeGrain eq duration'PT1M' and startTime eq {start_time.isoformat()}Z and endTime eq {end_time.isoformat()}Z"

percentile_metrics = client.percentile.list_metrics(
    "my-resource-group",
    "my-cosmos-account",
    time_filter
)

for metric in percentile_metrics:
    print(f"Metric: {metric.name.value}")
    print(f"Time Grain: {metric.time_grain}")
    for data_point in metric.data:
        print(f" Time: {data_point.time_stamp}")
        print(f" P50: {data_point.p50} ms")
        print(f" P95: {data_point.p95} ms")
        print(f" P99: {data_point.p99} ms")
```
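
For SLA monitoring, the percentile data points can be checked against a latency target. This is a rough sketch only: `LatencyPoint` is a hypothetical stand-in that carries just the `time_stamp` and `p99` fields used above, and the 10 ms threshold is an arbitrary assumption, not a documented SLA.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class LatencyPoint:
    """Hypothetical stand-in for a percentile data point (not an SDK class)."""
    time_stamp: str
    p99: Optional[float]


def p99_breaches(points: Iterable[LatencyPoint], threshold_ms: float) -> List[LatencyPoint]:
    """Return the data points whose P99 latency exceeds the threshold."""
    # Points without a P99 value are skipped rather than counted as breaches.
    return [p for p in points if p.p99 is not None and p.p99 > threshold_ms]


sample = [
    LatencyPoint("2024-01-01T00:00:00Z", 4.2),
    LatencyPoint("2024-01-01T00:01:00Z", 12.5),
    LatencyPoint("2024-01-01T00:02:00Z", None),
    LatencyPoint("2024-01-01T00:03:00Z", 9.9),
]
breaches = p99_breaches(sample, threshold_ms=10.0)
breach_ratio = len(breaches) / len(sample)
print(f"{len(breaches)} of {len(sample)} intervals exceeded the target")
```

In practice the same function could be fed `metric.data` from the loop above, provided the returned objects expose a `p99` attribute.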

### Monitoring Database Usage and Metrics

```python
# Get database resource ID (typically in format "dbs/{database-name}")
database_rid = "dbs/products"

# Get database metrics for request units consumed
ru_filter = f"(name.value eq 'TotalRequestUnits') and timeGrain eq duration'PT1H' and startTime eq {start_time.isoformat()}Z and endTime eq {end_time.isoformat()}Z"

database_metrics = client.database.list_metrics(
    "my-resource-group",
    "my-cosmos-account",
    database_rid,
    ru_filter
)

for metric in database_metrics:
    print(f"Database Metric: {metric.name.value}")
    for data_point in metric.data:
        print(f" Time: {data_point.time_stamp}")
        print(f" Total RU: {data_point.total}")
        print(f" Average RU/s: {data_point.average}")

# Get database usage statistics
usage_stats = client.database.list_usages(
    "my-resource-group",
    "my-cosmos-account",
    database_rid
)

for usage in usage_stats:
    print(f"Usage Metric: {usage.name.value}")
    print(f"Current Value: {usage.current_value}")
    print(f"Limit: {usage.limit}")
    print(f"Unit: {usage.unit}")
```
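
The `current_value` and `limit` fields printed above can be turned into a utilization percentage. The sketch below assumes the convention noted under Key Types that a limit of `-1` means unlimited; `usage_percent` is a hypothetical helper name.

```python
from typing import Optional


def usage_percent(current_value: int, limit: int) -> Optional[float]:
    """Return usage as a percentage of the limit, or None when no limit applies.

    Assumption: a non-positive limit (for example -1) means "unlimited".
    """
    if limit is None or limit <= 0:
        return None
    return 100.0 * current_value / limit


print(usage_percent(50, 200))   # bounded quota
print(usage_percent(5, -1))     # unlimited quota yields None
```

Returning `None` for unlimited quotas avoids a division by a sentinel value and lets callers decide how to display that case.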

### Container Partition Monitoring

```python
# Monitor partition-level metrics for hot partition detection
container_rid = "dbs/products/colls/items"

partition_filter = f"(name.value eq 'NormalizedRUConsumption') and timeGrain eq duration'PT1M' and startTime eq {start_time.isoformat()}Z and endTime eq {end_time.isoformat()}Z"

partition_metrics = client.collection_partition.list_metrics(
    "my-resource-group",
    "my-cosmos-account",
    database_rid,
    container_rid,
    partition_filter
)

for metric in partition_metrics:
    print(f"Partition Metric: {metric.name.value}")
    for data_point in metric.data:
        print(f" Partition ID: {data_point.partition_id}")
        print(f" Time: {data_point.time_stamp}")
        print(f" Normalized RU%: {data_point.maximum}%")
```
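
Hot partition detection amounts to finding partitions whose peak normalized RU consumption stays near 100%. A minimal sketch follows, using a hypothetical `PartitionPoint` stand-in with the `partition_id` and `maximum` fields printed above; the 90% threshold is an assumption chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PartitionPoint:
    """Hypothetical stand-in for a partition data point (not an SDK class)."""
    partition_id: str
    maximum: float  # normalized RU consumption, in percent


def hot_partitions(points: Iterable[PartitionPoint], threshold: float = 90.0) -> Dict[str, float]:
    """Map each partition whose peak consumption reaches the threshold to that peak."""
    peaks: Dict[str, float] = defaultdict(float)
    for point in points:
        # Track the highest value seen per partition across all time stamps.
        peaks[point.partition_id] = max(peaks[point.partition_id], point.maximum)
    return {pid: peak for pid, peak in peaks.items() if peak >= threshold}


sample = [
    PartitionPoint("p0", 35.0),
    PartitionPoint("p1", 97.0),
    PartitionPoint("p0", 42.0),
    PartitionPoint("p1", 100.0),
    PartitionPoint("p2", 89.9),
]
hot = hot_partitions(sample)
print(hot)
```

A single spike may be harmless; a stricter variant could require the threshold to be exceeded in several consecutive intervals before flagging a partition.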

### Cross-Region Replication Monitoring

```python
# Monitor replication latency between regions
source_region = "East US"
target_region = "West Europe"

replication_filter = f"(name.value eq 'ReplicationLatency') and timeGrain eq duration'PT5M' and startTime eq {start_time.isoformat()}Z and endTime eq {end_time.isoformat()}Z"

replication_metrics = client.percentile_source_target.list_metrics(
    "my-resource-group",
    "my-cosmos-account",
    source_region,
    target_region,
    replication_filter
)

for metric in replication_metrics:
    print(f"Replication Metric: {metric.name.value}")
    print(f"Source: {source_region} -> Target: {target_region}")
    for data_point in metric.data:
        print(f" Time: {data_point.time_stamp}")
        print(f" P50 Latency: {data_point.p50} ms")
        print(f" P99 Latency: {data_point.p99} ms")
```
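
Replication metrics are directional, so an account with several regions has one source-target pair per ordered combination. The sketch below enumerates those pairs with the standard library; `region_pairs` is a hypothetical helper, and the region list is an assumed example rather than something read from the account.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def region_pairs(regions: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every ordered (source, target) pair of distinct regions."""
    return list(permutations(regions, 2))


pairs = region_pairs(["East US", "West Europe", "Southeast Asia"])
for source, target in pairs:
    # Each pair could be passed as source_region and target_region
    # to the replication metrics call shown above.
    print(f"{source} -> {target}")
```

Each resulting pair corresponds to one metrics request, so the number of calls grows quadratically with the number of regions.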

### Available Metrics Discovery

```python
# Discover what metrics are available for a container
metric_definitions = client.collection.list_metric_definitions(
    "my-resource-group",
    "my-cosmos-account",
    database_rid,
    container_rid
)

print("Available Container Metrics:")
for definition in metric_definitions:
    print(f" Name: {definition.name.value}")
    print(f" Display Name: {definition.name.localized_value}")
    print(f" Unit: {definition.unit}")
    print(f" Primary Aggregation: {definition.primary_aggregation_type}")
    print(f" Supported Aggregations: {definition.supported_aggregation_types}")
    print(" ---")
```

## Key Types

```python { .api }
class PercentileMetric:
    """Percentile-based performance metric data."""
    name: MetricName  # Metric name and display name
    unit: str  # Metric unit (e.g. "Milliseconds", "Count")
    time_grain: str  # Time grain for aggregation (e.g. "PT1M", "PT1H")
    start_time: str  # Metric period start time
    end_time: str  # Metric period end time
    data: List[PercentileMetricValue]  # Time series data points

class PercentileMetricValue:
    """Individual percentile metric data point."""
    time_stamp: str  # Timestamp for this data point
    p10: float  # 10th percentile value
    p25: float  # 25th percentile value
    p50: float  # 50th percentile value (median)
    p75: float  # 75th percentile value
    p90: float  # 90th percentile value
    p95: float  # 95th percentile value
    p99: float  # 99th percentile value

class Metric:
    """Standard metric data."""
    name: MetricName  # Metric name and display name
    unit: str  # Metric unit
    time_grain: str  # Time grain for aggregation
    start_time: str  # Metric period start time
    end_time: str  # Metric period end time
    data: List[MetricValue]  # Time series data points

class MetricValue:
    """Individual metric data point."""
    time_stamp: str  # Timestamp for this data point
    total: float  # Total/sum value
    count: float  # Count of samples
    average: float  # Average value
    minimum: float  # Minimum value
    maximum: float  # Maximum value

class PartitionMetric:
    """Partition-specific metric data."""
    name: MetricName  # Metric name and display name
    unit: str  # Metric unit
    time_grain: str  # Time grain for aggregation
    start_time: str  # Metric period start time
    end_time: str  # Metric period end time
    data: List[PartitionMetricValue]  # Time series data points

class PartitionMetricValue:
    """Individual partition metric data point."""
    time_stamp: str  # Timestamp for this data point
    partition_id: str  # Partition identifier
    partition_key_range_id: str  # Partition key range identifier
    total: float  # Total/sum value
    count: float  # Count of samples
    average: float  # Average value
    minimum: float  # Minimum value
    maximum: float  # Maximum value

class Usage:
    """Resource usage statistics."""
    name: MetricName  # Usage metric name
    unit: str  # Usage unit (e.g. "Bytes", "Count")
    current_value: int  # Current usage value
    limit: int  # Maximum allowed value (-1 for unlimited)
    quota_period: str  # Period for quota enforcement

class MetricDefinition:
    """Definition of an available metric."""
    name: MetricName  # Metric name and display name
    unit: str  # Metric unit
    primary_aggregation_type: str  # Primary aggregation method
    supported_aggregation_types: List[str]  # All supported aggregation methods
    metric_availabilities: List[MetricAvailability]  # Available time grains and retention
    fill_gap_with_zero: bool  # Whether to fill gaps with zero values

class MetricName:
    """Metric name information."""
    value: str  # Programmatic metric name
    localized_value: str  # Human-readable display name

class MetricAvailability:
    """Metric availability and retention information."""
    time_grain: str  # Time grain (e.g. "PT1M", "PT1H", "P1D")
    retention: str  # Data retention period (e.g. "P30D", "P90D")
```
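
The `time_grain` and `retention` fields are ISO 8601 duration strings such as `"PT1M"` or `"P30D"`. As an illustration, the sketch below parses the subset of that syntax appearing on this page (days, hours, minutes, seconds) and estimates how many data points a retention window can hold. `parse_duration` is a hypothetical helper, and it deliberately does not handle years, months, or weeks.

```python
import re
from datetime import timedelta

# Matches durations like "PT1M", "PT1H", "P1D", "P30D", or "P1DT12H".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(value: str) -> timedelta:
    """Convert a simple ISO 8601 duration string to a timedelta."""
    match = _DURATION.match(value)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None} if match else {}
    if not parts:
        # Covers non-matching input as well as bare "P" or "PT".
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(**parts)


def max_data_points(time_grain: str, retention: str) -> int:
    """Number of whole time-grain intervals that fit in the retention period."""
    return parse_duration(retention) // parse_duration(time_grain)


print(parse_duration("PT1M"))
print(max_data_points("PT1H", "P30D"))
```

This kind of estimate can help choose a time grain before requesting a long time span, since a fine grain over a long window produces many pages of results.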