tessl/maven-co-cask-cdap--cdap-api

Core application programming interface for the Cask Data Application Platform enabling development of scalable data processing applications on Hadoop ecosystems.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: mavenpkg:maven/co.cask.cdap/cdap-api@5.1.x

To install, run

npx @tessl/cli install tessl/maven-co-cask-cdap--cdap-api@5.1.0

# CDAP API

The CDAP API provides a comprehensive set of Java interfaces and abstractions for developing applications on the Cask Data Application Platform (CDAP). CDAP is a unified data platform built on Apache Hadoop that enables developers to create scalable data applications, workflows, services, and batch/real-time processing programs without dealing directly with the complexity of the underlying Hadoop infrastructure.

## Package Information

- **Package Name**: cdap-api
- **Package Type**: maven
- **Language**: Java
- **Maven Coordinates**: `co.cask.cdap:cdap-api:5.1.2`
- **Installation**: Add to your Maven `pom.xml`:

```xml
<dependency>
  <groupId>co.cask.cdap</groupId>
  <artifactId>cdap-api</artifactId>
  <version>5.1.2</version>
</dependency>
```

## Core Imports

```java
import co.cask.cdap.api.app.Application;
import co.cask.cdap.api.app.AbstractApplication;
import co.cask.cdap.api.app.ApplicationConfigurer;
import co.cask.cdap.api.Config;
import co.cask.cdap.api.annotation.UseDataSet;
import co.cask.cdap.api.dataset.Dataset;
```

## Basic Usage

```java
import co.cask.cdap.api.Config;
import co.cask.cdap.api.app.AbstractApplication;

public class MyApplication extends AbstractApplication<Config> {

  // The two-argument configure(...) in AbstractApplication is final,
  // so subclasses override the no-argument configure() hook instead.
  @Override
  public void configure() {
    setName("MyDataApp");
    setDescription("A sample CDAP application");

    // Add datasets, programs, services, etc.
    // MyMapReduceJob and MyService are placeholders for your own program classes.
    addMapReduce(new MyMapReduceJob());
    addService(new MyService());
  }
}
```

## Architecture

The CDAP API is organized around several key architectural concepts:

- **Applications**: Top-level containers that define the complete data processing solution
- **Programs**: Executable components within applications (MapReduce, Spark, Workflows, Services, Workers)
- **Datasets**: Abstraction layer for data storage and access
- **Plugins**: Extensible components for custom functionality
- **Scheduling**: Time-based and event-driven program execution
- **Services**: HTTP-based APIs and long-running services

## Capabilities

### Application Framework

Core interfaces and classes for building CDAP applications with configuration, lifecycle management, and program organization.

```java { .api }
public interface Application<T extends Config> {
  void configure(ApplicationConfigurer configurer, ApplicationContext<T> context);
}

public abstract class AbstractApplication<T extends Config> implements Application<T> {
  public final void configure(ApplicationConfigurer configurer, ApplicationContext<T> context);
  protected abstract void configure();
  protected final void setName(String name);
  protected final void setDescription(String description);
}

public interface ApplicationConfigurer extends DatasetConfigurer, PluginConfigurer {
  void setName(String name);
  void setDescription(String description);
  void addMapReduce(MapReduce mapReduce);
  void addSpark(Spark spark);
  void addWorkflow(Workflow workflow);
  void addService(Service service);
  void addWorker(Worker worker);
  ScheduleBuilder buildSchedule(String scheduleName, ProgramType programType, String programName);
  TriggerFactory getTriggerFactory();
}
```

[Application Framework](./application-framework.md)
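
The listing above pairs a final two-argument `configure` with an abstract no-argument `configure()`, which reads as a template-method arrangement: the platform calls the former, which stores the configurer and delegates to the latter. The sketch below illustrates that flow with minimal stand-in types so it compiles without the CDAP jar. `SketchConfigurer`, `SketchApplication`, `RecordingConfigurer`, and `ConfigureSketch` are hypothetical names, and the delegation shown is an assumption about how the pieces fit together, not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the subset of ApplicationConfigurer used here (hypothetical, not the CDAP type).
interface SketchConfigurer {
  void setName(String name);
  void setDescription(String description);
}

// Stand-in for AbstractApplication: the final method captures the configurer,
// then delegates to the no-argument configure() that subclasses implement.
abstract class SketchApplication {
  private SketchConfigurer configurer;

  public final void configure(SketchConfigurer configurer) {
    this.configurer = configurer;
    configure();
  }

  protected abstract void configure();

  protected final void setName(String name) {
    configurer.setName(name);
  }

  protected final void setDescription(String description) {
    configurer.setDescription(description);
  }
}

// Records every call so the delegation can be inspected.
class RecordingConfigurer implements SketchConfigurer {
  final List<String> calls = new ArrayList<>();

  @Override
  public void setName(String name) {
    calls.add("name=" + name);
  }

  @Override
  public void setDescription(String description) {
    calls.add("description=" + description);
  }
}

class MyDataApp extends SketchApplication {
  @Override
  protected void configure() {
    setName("MyDataApp");
    setDescription("A sample CDAP application");
  }
}

class ConfigureSketch {
  public static void main(String[] args) {
    RecordingConfigurer recorder = new RecordingConfigurer();
    new MyDataApp().configure(recorder);
    System.out.println(recorder.calls);
  }
}
```

Making the outer method final keeps the bookkeeping (capturing the configurer) out of subclass hands, so application authors only write the declarative body.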

### Program Types

Support for various program types including MapReduce, Spark, Workflow orchestration, HTTP services, and background workers.

```java { .api }
public interface MapReduce {
  void configure(MapReduceConfigurer configurer);
}

public interface Spark {
  void configure(SparkConfigurer configurer);
}

public interface Workflow {
  void configure(WorkflowConfigurer configurer);
}
```

[MapReduce Programs](./mapreduce-programs.md)

[Spark Programs](./spark-programs.md)

[Workflow Programs](./workflow-programs.md)

[Service Programs](./service-programs.md)

[Worker Programs](./worker-programs.md)

### Dataset Management

Comprehensive dataset APIs with built-in types (key-value, indexed tables, file sets) and support for custom dataset implementations.

```java { .api }
public interface Dataset extends Closeable {
  // Base dataset interface
}

public interface DatasetDefinition<D extends Dataset, A extends DatasetAdmin> {
  String getName();
  D getDataset(DatasetContext datasetContext, DatasetSpecification spec,
               Map<String, String> arguments, ClassLoader classLoader);
}
```

[Dataset Management](./dataset-management.md)
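
Because `Dataset` extends `Closeable`, a dataset instance can be managed with try-with-resources. The following is a minimal sketch of that idea using an in-memory key-value store. `SketchDataset`, `InMemoryKeyValueDataset`, and `DatasetSketch` are hypothetical stand-ins rather than CDAP types, and a real dataset would be backed by platform storage instead of a `HashMap`.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

// Stand-in for the Dataset interface shown above (hypothetical, not the CDAP type).
interface SketchDataset extends Closeable {
}

// Hypothetical in-memory key-value dataset used only for illustration.
class InMemoryKeyValueDataset implements SketchDataset {
  private final Map<String, String> rows = new HashMap<>();
  private boolean closed;

  void write(String key, String value) {
    ensureOpen();
    rows.put(key, value);
  }

  String read(String key) {
    ensureOpen();
    return rows.get(key);
  }

  boolean isClosed() {
    return closed;
  }

  private void ensureOpen() {
    if (closed) {
      throw new IllegalStateException("dataset is closed");
    }
  }

  // Narrows Closeable.close() so callers need not handle IOException.
  @Override
  public void close() {
    closed = true;
  }
}

class DatasetSketch {
  public static void main(String[] args) {
    InMemoryKeyValueDataset dataset = new InMemoryKeyValueDataset();
    // try-with-resources calls close() when the block exits.
    try (InMemoryKeyValueDataset ds = dataset) {
      ds.write("user-1", "alice");
      System.out.println(ds.read("user-1"));
    }
    System.out.println(dataset.isClosed());
  }
}
```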

### Plugin Framework

Extensible plugin architecture for adding custom processing logic, data sources, sinks, and transformations.

```java { .api }
public class PluginConfig {
  // Base plugin configuration
}

public interface PluginContext {
  <T> T newPluginInstance(String pluginId);
  <T> Class<T> loadPluginClass(String pluginId);
}

@Plugin(type = "source")
public class MySourcePlugin extends PluginConfig {
  // Custom plugin implementation
}
```

[Plugin Framework](./plugin-framework.md)

### Scheduling and Triggers

Flexible scheduling system with time-based triggers, program status triggers, and partition-based triggers for automated program execution.

```java { .api }
public class ScheduleBuilder {
  public static ScheduleBuilder create(String name, Trigger trigger);
  public ScheduleBuilder setDescription(String description);
  public ScheduleBuilder setProperties(Map<String, String> properties);
}

public interface Trigger {
  // Base trigger interface
}
```

[Scheduling and Triggers](./scheduling.md)
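
The `ScheduleBuilder` signatures above return the builder from each setter, which suggests a fluent, chainable style. The sketch below shows that style with stand-in classes. `CronTriggerSketch`, `ScheduleBuilderSketch`, and the `describe()` helper are hypothetical and exist only for illustration; the cron-string trigger is an assumed example of a time-based trigger, not a CDAP class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-in for a time-based Trigger (hypothetical): just holds a cron expression.
class CronTriggerSketch {
  final String cron;

  CronTriggerSketch(String cron) {
    this.cron = cron;
  }
}

// Stand-in fluent builder modelled on the ScheduleBuilder signatures above.
class ScheduleBuilderSketch {
  private final String name;
  private final CronTriggerSketch trigger;
  private String description = "";
  private Map<String, String> properties = Collections.emptyMap();

  private ScheduleBuilderSketch(String name, CronTriggerSketch trigger) {
    this.name = name;
    this.trigger = trigger;
  }

  static ScheduleBuilderSketch create(String name, CronTriggerSketch trigger) {
    return new ScheduleBuilderSketch(name, trigger);
  }

  ScheduleBuilderSketch setDescription(String description) {
    this.description = description;
    return this; // returning this enables chaining
  }

  ScheduleBuilderSketch setProperties(Map<String, String> properties) {
    this.properties = new HashMap<>(properties); // defensive copy
    return this;
  }

  // Hypothetical helper that renders the collected settings.
  String describe() {
    return name + " [" + trigger.cron + "] " + description + " " + properties;
  }
}

class ScheduleSketch {
  public static void main(String[] args) {
    String summary = ScheduleBuilderSketch
        .create("nightly", new CronTriggerSketch("0 2 * * *"))
        .setDescription("Nightly run")
        .setProperties(Collections.singletonMap("env", "prod"))
        .describe();
    System.out.println(summary);
  }
}
```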

### Transaction Management

Built-in support for ACID transactions across datasets with declarative transaction control and programmatic transaction management.

```java { .api }
public interface Transactional {
  void execute(TxRunnable runnable);
  <T> T execute(Callable<T> callable);
}

@TransactionPolicy(TransactionControl.EXPLICIT)
public class MyProgram {
  // Explicit transaction control
}
```

[Transaction Management](./transactions.md)
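
To make the all-or-nothing behaviour of `Transactional.execute` concrete, here is a minimal in-memory sketch: the body works on a private copy, and the copy replaces the committed state only if the body finishes without throwing. `SketchTxRunnable`, `InMemoryTransactional`, and `TransactionSketch` are hypothetical stand-ins. Passing a plain `Map` to the body is a simplifying assumption, and the copy-then-swap strategy is an illustration of atomicity, not how CDAP implements transactions.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for TxRunnable (hypothetical): the body sees a private working copy of the data.
interface SketchTxRunnable {
  void run(Map<String, Integer> workingCopy);
}

// Stand-in for Transactional: changes become visible only if the body completes normally.
class InMemoryTransactional {
  private Map<String, Integer> committed = new HashMap<>();

  void execute(SketchTxRunnable runnable) {
    Map<String, Integer> workingCopy = new HashMap<>(committed);
    runnable.run(workingCopy); // an exception here skips the commit below
    committed = workingCopy;   // commit
  }

  Integer get(String key) {
    return committed.get(key);
  }
}

class TransactionSketch {
  public static void main(String[] args) {
    InMemoryTransactional tx = new InMemoryTransactional();
    tx.execute(data -> data.put("balance", 100));

    try {
      tx.execute(data -> {
        data.put("balance", 0);
        throw new IllegalStateException("simulated failure");
      });
    } catch (IllegalStateException e) {
      // The failed transaction left the committed state untouched.
    }

    System.out.println(tx.get("balance"));
  }
}
```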

### Annotations and Configuration

Rich annotation-based configuration system for dependency injection, transaction control, data access patterns, and plugin metadata.

```java { .api }
// Flowlet dataset injection
@UseDataSet("myDataset")
private ObjectStore<Data> dataStore; // In Flowlet context

@Property
@Description("Configuration property description")
private String configValue;

@TransactionPolicy(TransactionControl.IMPLICIT)
public class MyTransactionalProgram {
  // Implicit transaction handling
}
```

[Annotations and Configuration](./annotations.md)

### System Services

Integration with CDAP system services including metrics collection, service discovery, administrative operations, and artifact management.

```java { .api }
public interface Metrics {
  void count(String metricName, int delta);
  void gauge(String metricName, long value);
}

public interface ServiceDiscoverer {
  Discoverable discover(String serviceName);
}
```

[System Services](./system-services.md)
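
The two `Metrics` methods differ in how repeated calls combine: a counter accumulates deltas, while a gauge keeps only the most recent value. The sketch below shows that difference with an in-memory implementation, which can also be handy as a test double. `SketchMetrics`, `InMemoryMetrics`, `MetricsSketch`, and the `counter`/`gaugeValue` accessors are hypothetical names that are not part of CDAP.

```java
import java.util.HashMap;
import java.util.Map;

// Mirrors the two Metrics methods listed above (stand-in, not the CDAP interface).
interface SketchMetrics {
  void count(String metricName, int delta);
  void gauge(String metricName, long value);
}

class InMemoryMetrics implements SketchMetrics {
  private final Map<String, Long> counters = new HashMap<>();
  private final Map<String, Long> gauges = new HashMap<>();

  @Override
  public void count(String metricName, int delta) {
    counters.merge(metricName, (long) delta, Long::sum); // counters accumulate
  }

  @Override
  public void gauge(String metricName, long value) {
    gauges.put(metricName, value); // gauges keep only the latest value
  }

  // Hypothetical accessors for inspecting recorded values; unknown names read as 0.
  long counter(String metricName) {
    return counters.getOrDefault(metricName, 0L);
  }

  long gaugeValue(String metricName) {
    return gauges.getOrDefault(metricName, 0L);
  }
}

class MetricsSketch {
  public static void main(String[] args) {
    InMemoryMetrics metrics = new InMemoryMetrics();
    metrics.count("records.processed", 2);
    metrics.count("records.processed", 3);
    metrics.gauge("queue.depth", 10L);
    metrics.gauge("queue.depth", 4L);
    System.out.println(metrics.counter("records.processed"));
    System.out.println(metrics.gaugeValue("queue.depth"));
  }
}
```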

## Types

```java { .api }
public class Config {
  // Base configuration class for all configurable components
}

public enum ProgramType {
  FLOW, MAPREDUCE, WORKFLOW, SERVICE, SPARK, WORKER
}

public interface RuntimeContext {
  String getNamespace();
  String getApplicationName();
  ProgramType getProgramType();
  String getProgramName();
}

public interface ProgramLifecycle<T extends RuntimeContext> {
  void initialize(T context);
  void destroy();
}

public class Resources {
  private final int virtualCores;
  private final int memoryMB;

  public Resources(int memoryMB);
  public Resources(int memoryMB, int virtualCores);
}
```
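
The `ProgramLifecycle` pair of `initialize` and `destroy` implies an ordering around the program body. The sketch below shows one plausible ordering with stand-in types: initialize first, run the body, and call `destroy` in a `finally` block so cleanup happens even if the body throws. `SketchContext`, `SketchLifecycle`, `LoggingProgram`, and the `drive` helper are hypothetical, and the `finally` behaviour is an assumption about the intended contract rather than documented platform behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for RuntimeContext and ProgramLifecycle, trimmed to what the sketch needs.
interface SketchContext {
  String getProgramName();
}

interface SketchLifecycle<T extends SketchContext> {
  void initialize(T context);
  void destroy();
}

// Hypothetical program that records each lifecycle event.
class LoggingProgram implements SketchLifecycle<SketchContext> {
  final List<String> events = new ArrayList<>();

  @Override
  public void initialize(SketchContext context) {
    events.add("initialize:" + context.getProgramName());
  }

  void run() {
    events.add("run");
  }

  @Override
  public void destroy() {
    events.add("destroy");
  }
}

class LifecycleSketch {
  // Hypothetical driver: destroy() runs even when the program body throws.
  static List<String> drive(LoggingProgram program, SketchContext context) {
    program.initialize(context);
    try {
      program.run();
    } finally {
      program.destroy();
    }
    return program.events;
  }

  public static void main(String[] args) {
    System.out.println(drive(new LoggingProgram(), () -> "MyWorker"));
  }
}
```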