
# Transaction Management

CDAP provides built-in support for ACID transactions across datasets, offering both declarative transaction control through annotations and programmatic transaction management through transactional interfaces.

## Capabilities

### Transactional Interface

Executes operations within transactions, ensuring ACID properties across dataset operations.

```java { .api }
/**
 * Interface for executing operations within transactions
 */
public interface Transactional {
    /**
     * Executes a TxRunnable within a transaction
     * @param runnable the operations to execute in the transaction
     * @throws TransactionFailureException if transaction fails
     */
    void execute(TxRunnable runnable) throws TransactionFailureException;

    /**
     * Executes a TxRunnable within a transaction with timeout
     * @param timeoutInSeconds transaction timeout in seconds
     * @param runnable the operations to execute in the transaction
     * @throws TransactionFailureException if transaction fails
     */
    void execute(int timeoutInSeconds, TxRunnable runnable) throws TransactionFailureException;
}
```

### TxRunnable Interface

Functional interface for defining transactional operations with dataset access.

```java { .api }
/**
 * Runnable that provides DatasetContext for transactional dataset operations
 */
@FunctionalInterface
public interface TxRunnable {
    /**
     * Execute operations with access to datasets within a transaction
     * @param context provides access to datasets
     * @throws Exception if operation fails
     */
    void run(DatasetContext context) throws Exception;
}
```

### Transaction Control Annotations

Declarative transaction control using annotations for method-level transaction management.

```java { .api }
/**
 * Annotation to control transaction behavior for program lifecycle methods
 */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface TransactionPolicy {
    TransactionControl value();
}

/**
 * Enum defining transaction control modes
 */
public enum TransactionControl {
    EXPLICIT, // Method controls its own transactions
    IMPLICIT  // Platform manages transactions automatically
}
```

**Usage Examples:**

```java
import java.util.List;

import co.cask.cdap.api.Transactional;
import co.cask.cdap.api.TxRunnable;
import co.cask.cdap.api.annotation.TransactionControl;
import co.cask.cdap.api.annotation.TransactionPolicy;
import co.cask.cdap.api.common.Bytes;
import co.cask.cdap.api.dataset.lib.KeyValueTable;
import co.cask.cdap.api.worker.AbstractWorker;
import co.cask.cdap.api.worker.WorkerContext;
// The package of TransactionFailureException depends on the CDAP version;
// org.apache.tephra is assumed here.
import org.apache.tephra.TransactionFailureException;

// DataRecord and the helper methods called below (processRecord,
// performComplexProcessing, setupExternalConnections, loadConfiguration,
// processWorkItem, closeExternalConnections) are application-specific placeholders.

// Programmatic transaction management
public class TransactionalService {

    public void processData(Transactional transactional, List<DataRecord> records)
            throws TransactionFailureException {
        transactional.execute(context -> {
            KeyValueTable table = context.getDataset("processed-data");

            for (DataRecord record : records) {
                // All operations within this block are transactional
                String key = record.getId();
                String value = processRecord(record);
                table.write(key, value);
            }
            // Transaction commits automatically on successful completion
        });
    }

    // Transaction with timeout
    public void processLargeDataset(Transactional transactional, List<DataRecord> largeDataset)
            throws TransactionFailureException {
        transactional.execute(300, context -> { // 5-minute timeout
            KeyValueTable table = context.getDataset("large-processed-data");

            // Long-running operation with extended timeout
            for (DataRecord record : largeDataset) {
                performComplexProcessing(record, table);
            }
        });
    }
}

// Declarative transaction control with annotations
public class MyWorker extends AbstractWorker {

    @Override
    @TransactionPolicy(TransactionControl.EXPLICIT)
    public void initialize(WorkerContext context) throws Exception {
        // Let AbstractWorker store the context so getContext() works later
        super.initialize(context);

        // This method runs without automatic transaction management
        // Useful for initialization that may exceed transaction timeouts
        setupExternalConnections();

        // Can still use programmatic transactions when needed
        context.execute(datasetContext -> {
            // This block runs in a transaction
            KeyValueTable config = datasetContext.getDataset("worker-config");
            loadConfiguration(config);
        });
    }

    @Override
    public void run() {
        try {
            // run() accesses datasets through an explicit transaction
            getContext().execute(datasetContext -> {
                KeyValueTable workTable = datasetContext.getDataset("work-items");

                // KeyValueTable.read(String) returns byte[] (null if the key is absent)
                byte[] raw = workTable.read("next-item");
                if (raw == null) {
                    return;
                }
                String workItem = Bytes.toString(raw);
                processWorkItem(workItem);
                workTable.write("processed-item", workItem);
                // Transaction commits when this block completes successfully
            });
        } catch (TransactionFailureException e) {
            // run() cannot throw checked exceptions
            throw new RuntimeException(e);
        }
    }

    @Override
    @TransactionPolicy(TransactionControl.EXPLICIT)
    public void destroy() {
        // Cleanup method without transaction overhead
        closeExternalConnections();
    }
}
```

## Transaction Behavior

### Implicit Transaction Control

- **Default behavior**: Program lifecycle methods run within transactions automatically
- **Automatic commit**: Successful method completion commits the transaction
- **Automatic rollback**: Exceptions cause transaction rollback
- **Timeout handling**: Uses default transaction timeout settings

### Explicit Transaction Control

- **Manual control**: Methods annotated with `@TransactionPolicy(TransactionControl.EXPLICIT)` manage their own transactions
- **No automatic wrapping**: The platform does not start a transaction around the method
- **Programmatic access**: Use `Transactional.execute()` to run a block of work in a transaction
- **Timeout flexibility**: Pass `timeoutInSeconds` to `execute()` to set a custom timeout for long-running operations

### Transaction Scope

- **Dataset operations**: All dataset writes within a transaction are applied atomically, either all or none
- **Cross-dataset consistency**: Multiple datasets can participate in a single transaction
- **Isolation**: Each transaction sees a consistent state of the datasets, unaffected by uncommitted changes from concurrent transactions
- **Durability**: Committed changes persist across system failures
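
The all-or-nothing behavior across several datasets can be illustrated with a toy model. This is only a sketch of the semantics, not how CDAP implements transactions: `AtomicScopeSketch`, `TxBody`, and the two in-memory maps are hypothetical stand-ins for `Transactional`, `TxRunnable`, and real datasets.

```java
import java.util.HashMap;
import java.util.Map;

public class AtomicScopeSketch {

    /** Hypothetical stand-in for a TxRunnable that touches two datasets. */
    @FunctionalInterface
    interface TxBody {
        void run(Map<String, String> orders, Map<String, String> audit) throws Exception;
    }

    // Two in-memory "datasets"; real CDAP datasets are not plain maps.
    final Map<String, String> orders = new HashMap<>();
    final Map<String, String> audit = new HashMap<>();

    /** Runs the body on working copies; publishes both on success, neither on failure. */
    boolean execute(TxBody body) {
        Map<String, String> ordersCopy = new HashMap<>(orders);
        Map<String, String> auditCopy = new HashMap<>(audit);
        try {
            body.run(ordersCopy, auditCopy);
        } catch (Exception e) {
            return false; // "rollback": the working copies are simply dropped
        }
        // "commit": both datasets change together
        orders.clear();
        orders.putAll(ordersCopy);
        audit.clear();
        audit.putAll(auditCopy);
        return true;
    }

    public static void main(String[] args) {
        AtomicScopeSketch tx = new AtomicScopeSketch();

        boolean first = tx.execute((orders, audit) -> {
            orders.put("order-1", "shipped");
            audit.put("order-1", "status changed");
        });
        // prints: true {order-1=shipped} {order-1=status changed}
        System.out.println(first + " " + tx.orders + " " + tx.audit);

        boolean second = tx.execute((orders, audit) -> {
            orders.put("order-2", "shipped");
            throw new IllegalStateException("simulated failure before the audit write");
        });
        // prints: false false
        System.out.println(second + " " + tx.orders.containsKey("order-2"));
    }
}
```

The second call fails after writing to only one of the two maps, and neither map keeps the partial change. That is the property the cross-dataset guarantee describes.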

### Error Handling

- **TransactionFailureException**: Thrown by `Transactional.execute()` when the transaction cannot be completed
- **Rollback scenarios**: An exception thrown from the `TxRunnable` triggers automatic rollback
- **Retry logic**: Applications can implement retry patterns for transient failures
- **Conflict resolution**: The transaction system detects write conflicts between concurrent transactions; a transaction that loses a conflict fails and is rolled back
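
One possible retry pattern for transient failures is a bounded loop with a growing delay between attempts. The sketch below is self-contained and makes several assumptions: `TxFailure` stands in for `TransactionFailureException`, `TxAttempt` stands in for a single call to `Transactional.execute(...)`, and the attempt limit and delays are arbitrary illustrative values, not CDAP defaults.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TxRetrySketch {

    /** Hypothetical stand-in for TransactionFailureException. */
    static class TxFailure extends Exception {
        TxFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for one call to Transactional.execute(...). */
    @FunctionalInterface
    interface TxAttempt {
        void run() throws TxFailure;
    }

    /**
     * Runs the attempt up to maxAttempts times, sleeping between failures
     * with a delay that doubles each time. Returns the number of attempts used.
     */
    static int executeWithRetry(TxAttempt attempt, int maxAttempts, long initialDelayMillis)
            throws TxFailure, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int i = 1; ; i++) {
            try {
                attempt.run();
                return i;
            } catch (TxFailure e) {
                if (i >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        int used = executeWithRetry(() -> {
            // Fail the first two attempts to simulate a transient conflict
            if (calls.incrementAndGet() < 3) {
                throw new TxFailure("simulated conflict");
            }
        }, 5, 10);
        System.out.println("succeeded after " + used + " attempts"); // prints: succeeded after 3 attempts
    }
}
```

Because a failed transaction is rolled back, each retry starts from a clean state. The retried block should still avoid side effects outside the datasets, such as calls to external systems, since those are not rolled back.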

### Best Practices

**When to use EXPLICIT control:**

- Long-running initialization or cleanup operations
- Methods that may exceed default transaction timeouts
- Complex transaction patterns requiring custom retry logic
- Operations that don't require dataset access

**When to use IMPLICIT control (default):**

- Standard dataset operations within reasonable time limits
- Simple CRUD operations on datasets
- Operations that benefit from automatic transaction management
- Most program lifecycle methods