or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

index.mdlog-buffer.mdlogging-context.mdlogging-service.mdmetrics-collection.mdmetrics-processing.mdmetrics-query.md

index.mddocs/

0

# CDAP Watchdog

1

2

CDAP Watchdog provides comprehensive metrics collection, querying, and logging services for the CDAP (Cask Data Application Platform) ecosystem. It implements a distributed monitoring and observability system that gathers performance data from CDAP components, processes and aggregates this data for efficient storage and retrieval, and exposes REST APIs for real-time metrics queries and centralized log management.

3

4

## Package Information

5

6

- **Package Name**: cdap-watchdog

7

- **Package Type**: maven

8

- **Language**: Java

9

- **Group ID**: io.cdap.cdap

10

- **Artifact ID**: cdap-watchdog

11

- **Version**: 6.11.0

12

- **Installation**: Include in your Maven `pom.xml`:

13

14

```xml

15

<dependency>

16

<groupId>io.cdap.cdap</groupId>

17

<artifactId>cdap-watchdog</artifactId>

18

<version>6.11.0</version>

19

</dependency>

20

```

21

22

## Core Imports

23

24

```java

25

// Metrics query functionality

26

import io.cdap.cdap.metrics.query.MetricsQueryService;

27

import io.cdap.cdap.metrics.query.MetricsHandler;

28

29

// Metrics collection and emission

30

import io.cdap.cdap.metrics.collect.MetricsEmitter;

31

import io.cdap.cdap.metrics.collect.AggregatedMetricsEmitter;

32

33

// Metrics processing services

34

import io.cdap.cdap.metrics.process.MetricsProcessorStatusService;

35

import io.cdap.cdap.metrics.process.MessagingMetricsProcessorService;

36

37

// Logging services

38

import io.cdap.cdap.logging.service.LogQueryService;

39

import io.cdap.cdap.logging.gateway.handlers.LogHttpHandler;

40

import io.cdap.cdap.logging.gateway.handlers.ErrorClassificationHttpHandler;

41

import io.cdap.cdap.logging.read.LogReader;

42

43

// Log buffer system

44

import io.cdap.cdap.logging.logbuffer.LogBufferService;

45

import io.cdap.cdap.logging.logbuffer.LogBufferWriter;

46

47

// Logging contexts

48

import io.cdap.cdap.logging.context.LoggingContextHelper;

49

import io.cdap.cdap.logging.context.ApplicationLoggingContext;

50

51

// Error classification

52

import io.cdap.cdap.logging.ErrorLogsClassifier;

53

```

54

55

## Basic Usage

56

57

### Metrics Collection

58

59

```java

60

import io.cdap.cdap.metrics.collect.AggregatedMetricsEmitter;

61

import io.cdap.cdap.api.metrics.MetricValue;

62

63

// Create metrics emitter for collecting metrics

64

AggregatedMetricsEmitter emitter = new AggregatedMetricsEmitter("my.metric.name");

65

66

// Emit different types of metrics

67

emitter.increment(5); // Counter metric

68

emitter.gauge(100); // Gauge metric

69

emitter.event(250); // Event for distribution

70

71

// Emit the collected metrics

72

MetricValue metricValue = emitter.emit();

73

```

74

75

### Log Management

76

77

```java

78

import io.cdap.cdap.logging.context.LoggingContextHelper;

79

import io.cdap.cdap.logging.context.ApplicationLoggingContext;

80

import io.cdap.cdap.logging.read.LogReader;

81

import io.cdap.cdap.logging.read.LogEvent;

82

83

// Create logging context for an application

84

ApplicationLoggingContext context = LoggingContextHelper.getLoggingContext(

85

"myNamespace", "myApp", "myProgram", ProgramType.SERVICE

86

);

87

88

// Read logs using LogReader

89

LogReader logReader = // ... obtain LogReader instance

90

logReader.getLog(context, startTime, endTime, Filter.EMPTY);

91

```

92

93

## Architecture

94

95

CDAP Watchdog is built around several key components:

96

97

- **Metrics System**: Distributed metrics collection with `MetricsEmitter` implementations, centralized processing through `MetricsProcessorService`, and query capabilities via `MetricsQueryService` with REST API endpoints

98

- **Logging Framework**: Comprehensive log collection using specialized appenders, centralized log storage and indexing, and flexible querying through `LogQueryService` with context-aware filtering

99

- **Service Discovery Integration**: Both metrics and logging services integrate with CDAP's service discovery for distributed deployment

100

- **Context System**: Rich logging context hierarchy supporting different CDAP program types (services, workflows, MapReduce, Spark, workers)

101

- **Administrative Messaging**: Admin operations for metrics deletion and system maintenance through `MetricsAdminMessage`

102

103

## Capabilities

104

105

### Metrics Query API

106

107

REST endpoints for querying metrics data, including time series queries, aggregate queries, tag searches, and batch query processing.

108

109

```java { .api }

110

public class MetricsQueryService extends AbstractIdleService {

111

protected void startUp() throws Exception;

112

protected void shutDown() throws Exception;

113

}

114

115

public class MetricsHandler extends AbstractHttpHandler {

116

// POST /v3/metrics/search - Search for tags or metrics

117

// POST /v3/metrics/query - Query metrics data (batch and single queries)

118

// GET /v3/metrics/processor/status - Get metrics processor status

119

}

120

```

121

122

[Metrics Query API](./metrics-query.md)

123

124

### Metrics Collection API

125

126

Interfaces and implementations for collecting and emitting metrics data, including counter, gauge, and distribution metrics with aggregation capabilities.

127

128

```java { .api }

129

public interface MetricsEmitter {

130

MetricValue emit();

131

}

132

133

public final class AggregatedMetricsEmitter implements MetricsEmitter {

134

public void increment(long incrementValue);

135

public void gauge(long value);

136

public void event(long value);

137

public MetricValue emit();

138

}

139

```

140

141

[Metrics Collection API](./metrics-collection.md)

142

143

### Metrics Processing Services

144

145

Backend metrics processing infrastructure for consuming metrics data from message queues, processing and persisting metrics to storage systems, and providing status monitoring.

146

147

```java { .api }

148

public class MetricsProcessorStatusService extends AbstractIdleService {

149

protected void startUp() throws Exception;

150

protected void shutDown() throws Exception;

151

}

152

153

public class MessagingMetricsProcessorService extends AbstractExecutionThreadService {

154

protected void run() throws Exception;

155

protected void shutDown() throws Exception;

156

}

157

```

158

159

[Metrics Processing Services](./metrics-processing.md)

160

161

### Logging Service API

162

163

Core logging services for centralized log collection, querying, and management with REST endpoints for log retrieval and error analysis.

164

165

```java { .api }

166

public class LogQueryService extends AbstractIdleService {

167

protected void startUp() throws Exception;

168

protected void shutDown() throws Exception;

169

}

170

171

public interface LogReader {

172

void getLogNext(LoggingContext loggingContext, ReadRange readRange, int maxEvents, Filter filter, Callback callback) throws Exception;

173

void getLogPrev(LoggingContext loggingContext, ReadRange readRange, int maxEvents, Filter filter, Callback callback) throws Exception;

174

}

175

```

176

177

[Logging Service API](./logging-service.md)

178

179

### Logging Context System

180

181

Context classes and utilities for organizing logs by CDAP program types, providing structured logging with consistent tagging and filtering capabilities.

182

183

```java { .api }

184

public final class LoggingContextHelper {

185

public static LoggingContext getLoggingContext(String namespaceId, String appId, String entityId, ProgramType programType);

186

public static LoggingContext getLoggingContextWithRunId(String namespaceId, String appId, String entityId, ProgramType programType, String runId);

187

}

188

189

public class ApplicationLoggingContext extends AbstractLoggingContext {

190

public String getLogPartition();

191

}

192

```

193

194

[Logging Context System](./logging-context.md)

195

196

### Log Buffer System

197

198

High-throughput log buffering infrastructure for temporary log storage, pipeline processing, automatic recovery, and cleanup operations with file-based buffering and concurrent writer support.

199

200

```java { .api }

201

public class LogBufferService extends AbstractIdleService {

202

protected void startUp() throws Exception;

203

protected void shutDown() throws Exception;

204

}

205

206

public class LogBufferWriter implements Flushable, Closeable {

207

public void append(LogBufferEvent logEvent) throws IOException;

208

public void flush() throws IOException;

209

public void close() throws IOException;

210

}

211

```

212

213

[Log Buffer System](./log-buffer.md)

214

215

## Common Data Types

216

217

```java { .api }

218

// Logging data models

219

public class LogEvent {

220

public ILoggingEvent getLoggingEvent();

221

public LogOffset getOffset();

222

}

223

224

// Administrative messages

225

public final class MetricsAdminMessage {

226

public Type getType();

227

public <T> T getPayload(Gson gson, Type type);

228

229

public enum Type {

230

DELETE

231

}

232

}

233

234

// Entity categorization

235

public enum MetricsEntityType {

236

CONTEXT("c"),

237

RUN("r"),

238

METRIC("m"),

239

TAG("t");

240

}

241

```