# Transaction Management

CDAP provides built-in support for ACID transactions across datasets. Transactions can be controlled declaratively through annotations or programmatically through the `Transactional` interface.

## Capabilities

### Transactional Interface

Executes operations within a transaction, ensuring ACID properties across dataset operations.

```java { .api }
/**
 * Interface for executing operations within transactions
 */
public interface Transactional {
  /**
   * Executes a TxRunnable within a transaction
   * @param runnable the operations to execute in the transaction
   * @throws TransactionFailureException if transaction fails
   */
  void execute(TxRunnable runnable) throws TransactionFailureException;

  /**
   * Executes a TxRunnable within a transaction with timeout
   * @param timeoutInSeconds transaction timeout in seconds
   * @param runnable the operations to execute in the transaction
   * @throws TransactionFailureException if transaction fails
   */
  void execute(int timeoutInSeconds, TxRunnable runnable) throws TransactionFailureException;
}
```

### TxRunnable Interface

Functional interface for defining transactional operations with dataset access.

```java { .api }
/**
 * Runnable that provides DatasetContext for transactional dataset operations
 */
@FunctionalInterface
public interface TxRunnable {
  /**
   * Execute operations with access to datasets within a transaction
   * @param context provides access to datasets
   * @throws Exception if operation fails
   */
  void run(DatasetContext context) throws Exception;
}
```

### Transaction Control Annotations

Declarative transaction control using annotations for method-level transaction management.

```java { .api }
/**
 * Annotation to control transaction behavior for program lifecycle methods
 */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface TransactionPolicy {
  TransactionControl value();
}

/**
 * Enum defining transaction control modes
 */
public enum TransactionControl {
  EXPLICIT, // Method controls its own transactions
  IMPLICIT  // Platform manages transactions automatically
}
```

**Usage Examples:**

```java
import co.cask.cdap.api.Transactional;
import co.cask.cdap.api.TxRunnable;
import co.cask.cdap.api.annotation.TransactionControl;
import co.cask.cdap.api.annotation.TransactionPolicy;
import co.cask.cdap.api.dataset.lib.KeyValueTable;
import co.cask.cdap.api.worker.AbstractWorker;
import co.cask.cdap.api.worker.WorkerContext;
// Package name may differ between CDAP/Tephra versions
import org.apache.tephra.TransactionFailureException;

import java.util.List;

// DataRecord and the helper methods called below (processRecord,
// performComplexProcessing, setupExternalConnections, loadConfiguration,
// processWorkItem, closeExternalConnections) are application-specific
// placeholders and are not shown.

// Programmatic transaction management
public class TransactionalService {

  public void processData(Transactional transactional, List<DataRecord> records)
      throws TransactionFailureException {
    transactional.execute(context -> {
      KeyValueTable table = context.getDataset("processed-data");

      for (DataRecord record : records) {
        // All operations within this block are transactional
        String key = record.getId();
        String value = processRecord(record);
        table.write(key, value);
      }
      // Transaction commits automatically on successful completion
    });
  }

  // Transaction with timeout
  public void processLargeDataset(Transactional transactional, List<DataRecord> largeDataset)
      throws TransactionFailureException {
    transactional.execute(300, context -> { // 5-minute timeout
      KeyValueTable table = context.getDataset("large-processed-data");

      // Long-running operation with extended timeout
      for (DataRecord record : largeDataset) {
        performComplexProcessing(record, table);
      }
    });
  }
}

// Declarative transaction control with annotations
public class MyWorker extends AbstractWorker {

  @Override
  @TransactionPolicy(TransactionControl.EXPLICIT)
  public void initialize(WorkerContext context) throws Exception {
    // Let the base class store the context so getContext() works later
    super.initialize(context);

    // This method runs without automatic transaction management
    // Useful for initialization that may exceed transaction timeouts
    setupExternalConnections();

    // Can still use programmatic transactions when needed
    context.execute(datasetContext -> {
      // This block runs in a transaction
      KeyValueTable config = datasetContext.getDataset("worker-config");
      loadConfiguration(config);
    });
  }

  @Override
  public void run() {
    // A worker's run() is not wrapped in a transaction by the platform,
    // so dataset access goes through execute()
    try {
      getContext().execute(datasetContext -> {
        KeyValueTable workTable = datasetContext.getDataset("work-items");
        byte[] workItem = workTable.read("next-item"); // read(String) returns byte[]
        processWorkItem(workItem);
        workTable.write("processed-item", workItem);
        // Transaction commits when this block completes successfully
      });
    } catch (TransactionFailureException e) {
      // The transaction has been rolled back at this point
      throw new RuntimeException("Failed to process work item", e);
    }
  }

  @Override
  @TransactionPolicy(TransactionControl.EXPLICIT)
  public void destroy() {
    // Cleanup method without transaction overhead
    closeExternalConnections();
  }
}
```

## Transaction Behavior

### Implicit Transaction Control
- **Default behavior**: Program lifecycle methods such as `initialize()` and `destroy()` run within a transaction automatically
- **Automatic commit**: Successful method completion commits the transaction
- **Automatic rollback**: An exception thrown from the method rolls the transaction back
- **Timeout handling**: Uses the default transaction timeout settings

### Explicit Transaction Control
- **Manual control**: Methods annotated with `@TransactionPolicy(TransactionControl.EXPLICIT)` manage their own transactions
- **No automatic wrapping**: The platform does not create a transaction around the method
- **Programmatic access**: Use `Transactional.execute()` to run a block of code in a transaction
- **Timeout flexibility**: Pass `timeoutInSeconds` to `execute()` to set a custom timeout for long-running operations

### Transaction Scope
- **Dataset operations**: All dataset reads and writes within a transaction are applied atomically
- **Cross-dataset consistency**: Multiple datasets can participate in a single transaction
- **Isolation**: Each transaction sees a consistent dataset state, unaffected by uncommitted changes from concurrent transactions
- **Durability**: Committed transactions persist across system failures

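
The all-or-nothing behavior across several datasets can be illustrated with a toy in-memory model. This is only a sketch of the semantics described above, not how CDAP implements transactions: `ToyTable`, `ToyTxBody`, and `ToyTransactions` are hypothetical names invented for this example and are not part of the CDAP API.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a transactional key-value dataset (hypothetical, not a CDAP class). */
class ToyTable {
    final Map<String, String> committed = new HashMap<>();
    final Map<String, String> pending = new HashMap<>();

    /** Buffers a write until the surrounding transaction commits. */
    void write(String key, String value) {
        pending.put(key, value);
    }

    /** Reads the transaction's own buffered write first, then committed data. */
    String read(String key) {
        return pending.containsKey(key) ? pending.get(key) : committed.get(key);
    }

    void commit() {
        committed.putAll(pending);
        pending.clear();
    }

    void rollback() {
        pending.clear();
    }
}

/** Hypothetical transaction body, loosely mirroring TxRunnable. */
interface ToyTxBody {
    void run(Map<String, ToyTable> datasets) throws Exception;
}

class ToyTransactions {
    /**
     * Runs the body against all tables. On success every table commits;
     * on any exception every table discards its buffered writes.
     * Returns true if the changes were committed.
     */
    static boolean execute(Map<String, ToyTable> datasets, ToyTxBody body) {
        try {
            body.run(datasets);
        } catch (Exception e) {
            datasets.values().forEach(ToyTable::rollback);
            return false;
        }
        datasets.values().forEach(ToyTable::commit);
        return true;
    }
}
```

In this model a failure after the first table has been written still leaves both tables unchanged, which is the property the **Cross-dataset consistency** bullet describes.
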
### Error Handling
- **TransactionFailureException**: Thrown by `execute()` when the transaction fails
- **Rollback scenarios**: An exception thrown during execution triggers automatic rollback
- **Retry logic**: Applications can implement retry patterns for transient failures
- **Conflict handling**: The system detects conflicts between concurrent transactions; a transaction that cannot commit is rolled back and reported as a failure

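
The retry pattern mentioned above might look like the following sketch. To keep it compilable without CDAP on the classpath, it uses two hypothetical stand-ins: `TxFailedException` in place of `TransactionFailureException`, and `TxAttempt` in place of a call such as `transactional.execute(runnable)`. The helper name `TxRetry.runWithRetry` and the doubling backoff are assumptions made for illustration, not part of the CDAP API.

```java
/** Stand-in for TransactionFailureException (hypothetical, for illustration only). */
class TxFailedException extends Exception {
    TxFailedException(String message) {
        super(message);
    }
}

/** Stand-in for one transactional attempt, e.g. a call to transactional.execute(...). */
interface TxAttempt {
    void run() throws TxFailedException;
}

class TxRetry {
    /**
     * Runs the attempt up to maxAttempts times, doubling the delay after each failure.
     * Returns the number of attempts used; rethrows the last failure if every attempt fails.
     */
    static int runWithRetry(TxAttempt attempt, int maxAttempts, long initialDelayMillis)
            throws TxFailedException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int i = 1; ; i++) {
            try {
                attempt.run();
                return i; // success on attempt i
            } catch (TxFailedException e) {
                if (i >= maxAttempts) {
                    throw e; // out of attempts, surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

Because a failed transaction is rolled back, each retry starts from a clean state, but the transaction body still needs to be safe to run more than once (for example, it should not send external notifications before the commit succeeds).
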
### Best Practices

**When to use EXPLICIT control:**
- Long-running initialization or cleanup operations
- Methods that may exceed default transaction timeouts
- Complex transaction patterns requiring custom retry logic
- Operations that don't require dataset access

**When to use IMPLICIT control (default):**
- Standard dataset operations within reasonable time limits
- Simple CRUD operations on datasets
- Operations that benefit from automatic transaction management
- Most program lifecycle methods
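
For work that may exceed the transaction timeout, one option besides raising the timeout is to split the input into smaller batches and run each batch in its own transaction. The sketch below shows only the batching logic; `BatchTx` and `BatchedWork.processInBatches` are hypothetical names, and `BatchTx.process` stands in for a call such as `transactional.execute(...)` in a method under `EXPLICIT` control.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for one transactional unit of work (hypothetical, not a CDAP interface). */
interface BatchTx<T> {
    void process(List<T> batch) throws Exception;
}

class BatchedWork {
    /**
     * Splits items into consecutive batches of at most batchSize elements and
     * hands each batch to tx in order. Returns the number of batches processed.
     */
    static <T> int processInBatches(List<T> items, int batchSize, BatchTx<T> tx) throws Exception {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the callee cannot modify the source list through it
            tx.process(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }
}
```

The trade-off is that batches commit independently: if a later batch fails, earlier batches stay committed, so this approach only fits work that does not need to be atomic as a whole.
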