CDAP Watchdog provides comprehensive metrics collection, querying, and logging services for the CDAP platform
npx @tessl/cli install tessl/maven-io-cdap-cdap--cdap-watchdog@6.11.00
# CDAP Watchdog
1
2
CDAP Watchdog provides comprehensive metrics collection, querying, and logging services for the CDAP (Cask Data Application Platform) ecosystem. It implements a distributed monitoring and observability system that gathers performance data from CDAP components, processes and aggregates this data for efficient storage and retrieval, and exposes REST APIs for real-time metrics queries and centralized log management.
3
4
## Package Information
5
6
- **Package Name**: cdap-watchdog
7
- **Package Type**: maven
8
- **Language**: Java
9
- **Group ID**: io.cdap.cdap
10
- **Artifact ID**: cdap-watchdog
11
- **Version**: 6.11.0
12
- **Installation**: Include in your Maven `pom.xml`:
13
14
```xml
15
<dependency>
16
<groupId>io.cdap.cdap</groupId>
17
<artifactId>cdap-watchdog</artifactId>
18
<version>6.11.0</version>
19
</dependency>
20
```
21
22
## Core Imports
23
24
```java
25
// Metrics query functionality
26
import io.cdap.cdap.metrics.query.MetricsQueryService;
27
import io.cdap.cdap.metrics.query.MetricsHandler;
28
29
// Metrics collection and emission
30
import io.cdap.cdap.metrics.collect.MetricsEmitter;
31
import io.cdap.cdap.metrics.collect.AggregatedMetricsEmitter;
32
33
// Metrics processing services
34
import io.cdap.cdap.metrics.process.MetricsProcessorStatusService;
35
import io.cdap.cdap.metrics.process.MessagingMetricsProcessorService;
36
37
// Logging services
38
import io.cdap.cdap.logging.service.LogQueryService;
39
import io.cdap.cdap.logging.gateway.handlers.LogHttpHandler;
40
import io.cdap.cdap.logging.gateway.handlers.ErrorClassificationHttpHandler;
41
import io.cdap.cdap.logging.read.LogReader;
42
43
// Log buffer system
44
import io.cdap.cdap.logging.logbuffer.LogBufferService;
45
import io.cdap.cdap.logging.logbuffer.LogBufferWriter;
46
47
// Logging contexts
48
import io.cdap.cdap.logging.context.LoggingContextHelper;
49
import io.cdap.cdap.logging.context.ApplicationLoggingContext;
50
51
// Error classification
52
import io.cdap.cdap.logging.ErrorLogsClassifier;
53
```
54
55
## Basic Usage
56
57
### Metrics Collection
58
59
```java
60
import io.cdap.cdap.metrics.collect.AggregatedMetricsEmitter;
61
import io.cdap.cdap.api.metrics.MetricValue;
62
63
// Create metrics emitter for collecting metrics
64
AggregatedMetricsEmitter emitter = new AggregatedMetricsEmitter("my.metric.name");
65
66
// Emit different types of metrics
67
emitter.increment(5); // Counter metric
68
emitter.gauge(100); // Gauge metric
69
emitter.event(250); // Event for distribution
70
71
// Emit the collected metrics
72
MetricValue metricValue = emitter.emit();
73
```
74
75
### Log Management
76
77
```java
78
import io.cdap.cdap.logging.context.LoggingContextHelper;
79
import io.cdap.cdap.logging.context.ApplicationLoggingContext;
80
import io.cdap.cdap.logging.read.LogReader;
81
import io.cdap.cdap.logging.read.LogEvent;
82
83
// Create logging context for an application
84
ApplicationLoggingContext context = LoggingContextHelper.getLoggingContext(
85
"myNamespace", "myApp", "myProgram", ProgramType.SERVICE
86
);
87
88
// Read logs using LogReader
89
LogReader logReader = // ... obtain LogReader instance
90
logReader.getLog(context, startTime, endTime, Filter.EMPTY);
91
```
92
93
## Architecture
94
95
CDAP Watchdog is built around several key components:
96
97
- **Metrics System**: Distributed metrics collection with `MetricsEmitter` implementations, centralized processing through `MetricsProcessorService`, and query capabilities via `MetricsQueryService` with REST API endpoints
98
- **Logging Framework**: Comprehensive log collection using specialized appenders, centralized log storage and indexing, and flexible querying through `LogQueryService` with context-aware filtering
99
- **Service Discovery Integration**: Both metrics and logging services integrate with CDAP's service discovery for distributed deployment
100
- **Context System**: Rich logging context hierarchy supporting different CDAP program types (services, workflows, MapReduce, Spark, workers)
101
- **Administrative Messaging**: Admin operations for metrics deletion and system maintenance through `MetricsAdminMessage`
102
103
## Capabilities
104
105
### Metrics Query API
106
107
REST endpoints for querying metrics data, including time series queries, aggregate queries, tag searches, and batch query processing.
108
109
```java { .api }
110
public class MetricsQueryService extends AbstractIdleService {
111
protected void startUp() throws Exception;
112
protected void shutDown() throws Exception;
113
}
114
115
public class MetricsHandler extends AbstractHttpHandler {
116
// POST /v3/metrics/search - Search for tags or metrics
117
// POST /v3/metrics/query - Query metrics data (batch and single queries)
118
// GET /v3/metrics/processor/status - Get metrics processor status
119
}
120
```
121
122
[Metrics Query API](./metrics-query.md)
123
124
### Metrics Collection API
125
126
Interfaces and implementations for collecting and emitting metrics data, including counter, gauge, and distribution metrics with aggregation capabilities.
127
128
```java { .api }
129
public interface MetricsEmitter {
130
MetricValue emit();
131
}
132
133
public final class AggregatedMetricsEmitter implements MetricsEmitter {
134
public void increment(long incrementValue);
135
public void gauge(long value);
136
public void event(long value);
137
public MetricValue emit();
138
}
139
```
140
141
[Metrics Collection API](./metrics-collection.md)
142
143
### Metrics Processing Services
144
145
Backend metrics processing infrastructure for consuming metrics data from message queues, processing and persisting metrics to storage systems, and providing status monitoring.
146
147
```java { .api }
148
public class MetricsProcessorStatusService extends AbstractIdleService {
149
protected void startUp() throws Exception;
150
protected void shutDown() throws Exception;
151
}
152
153
public class MessagingMetricsProcessorService extends AbstractExecutionThreadService {
154
protected void run() throws Exception;
155
protected void shutDown() throws Exception;
156
}
157
```
158
159
[Metrics Processing Services](./metrics-processing.md)
160
161
### Logging Service API
162
163
Core logging services for centralized log collection, querying, and management with REST endpoints for log retrieval and error analysis.
164
165
```java { .api }
166
public class LogQueryService extends AbstractIdleService {
167
protected void startUp() throws Exception;
168
protected void shutDown() throws Exception;
169
}
170
171
public interface LogReader {
172
void getLogNext(LoggingContext loggingContext, ReadRange readRange, int maxEvents, Filter filter, Callback callback) throws Exception;
173
void getLogPrev(LoggingContext loggingContext, ReadRange readRange, int maxEvents, Filter filter, Callback callback) throws Exception;
174
}
175
```
176
177
[Logging Service API](./logging-service.md)
178
179
### Logging Context System
180
181
Context classes and utilities for organizing logs by CDAP program types, providing structured logging with consistent tagging and filtering capabilities.
182
183
```java { .api }
184
public final class LoggingContextHelper {
185
public static LoggingContext getLoggingContext(String namespaceId, String appId, String entityId, ProgramType programType);
186
public static LoggingContext getLoggingContextWithRunId(String namespaceId, String appId, String entityId, ProgramType programType, String runId);
187
}
188
189
public class ApplicationLoggingContext extends AbstractLoggingContext {
190
public String getLogPartition();
191
}
192
```
193
194
[Logging Context System](./logging-context.md)
195
196
### Log Buffer System
197
198
High-throughput log buffering infrastructure for temporary log storage, pipeline processing, automatic recovery, and cleanup operations with file-based buffering and concurrent writer support.
199
200
```java { .api }
201
public class LogBufferService extends AbstractIdleService {
202
protected void startUp() throws Exception;
203
protected void shutDown() throws Exception;
204
}
205
206
public class LogBufferWriter implements Flushable, Closeable {
207
public void append(LogBufferEvent logEvent) throws IOException;
208
public void flush() throws IOException;
209
public void close() throws IOException;
210
}
211
```
212
213
[Log Buffer System](./log-buffer.md)
214
215
## Common Data Types
216
217
```java { .api }
218
// Logging data models
219
public class LogEvent {
220
public ILoggingEvent getLoggingEvent();
221
public LogOffset getOffset();
222
}
223
224
// Administrative messages
225
public final class MetricsAdminMessage {
226
public Type getType();
227
public <T> T getPayload(Gson gson, Type type);
228
229
public enum Type {
230
DELETE
231
}
232
}
233
234
// Entity categorization
235
public enum MetricsEntityType {
236
CONTEXT("c"),
237
RUN("r"),
238
METRIC("m"),
239
TAG("t");
240
}
241
```