# Predefined Options

Pre-configured RocksDB options optimized for different hardware profiles and use cases, providing easy setup for common deployment scenarios.

## Capabilities

### PredefinedOptions Enum

Enumeration of pre-configured RocksDB option sets optimized for different hardware and workload characteristics.

```java { .api }
/**
 * Predefined RocksDB options optimized for different hardware profiles.
 * Each option set provides tuned database and column family configurations.
 */
enum PredefinedOptions {

    /** Default configuration with basic optimizations */
    DEFAULT,

    /** Optimized for spinning disk storage (HDDs) */
    SPINNING_DISK_OPTIMIZED,

    /** Optimized for spinning disks with higher memory usage */
    SPINNING_DISK_OPTIMIZED_HIGH_MEM,

    /** Optimized for flash SSD storage */
    FLASH_SSD_OPTIMIZED;

    /**
     * Creates database options for this predefined configuration.
     * @param handlesToClose collection to register objects that need cleanup
     * @return configured DBOptions instance
     */
    abstract DBOptions createDBOptions(Collection<AutoCloseable> handlesToClose);

    /**
     * Creates column family options for this predefined configuration.
     * @param handlesToClose collection to register objects that need cleanup
     * @return configured ColumnFamilyOptions instance
     */
    abstract ColumnFamilyOptions createColumnOptions(Collection<AutoCloseable> handlesToClose);
}
```
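
The `handlesToClose` parameter suggests an ownership pattern: the factory method registers every closeable resource it allocates, and the caller closes them all later. The following is a minimal, dependency-free sketch of that pattern. `HandleRegistrySketch`, `FakeNativeHandle`, `createOptions`, and `closeAll` are hypothetical stand-ins introduced for illustration; they are not Flink or RocksDB types, and the real cleanup logic may differ.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class HandleRegistrySketch {

    /** Hypothetical stand-in for a native-backed object; not a RocksDB type. */
    static final class FakeNativeHandle implements AutoCloseable {
        private final String name;
        private boolean closed;

        FakeNativeHandle(String name) {
            this.name = name;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }

        @Override
        public String toString() {
            return name + (closed ? " (closed)" : " (open)");
        }
    }

    /** Mirrors the shape of createDBOptions: allocate, register for cleanup, return. */
    static FakeNativeHandle createOptions(Collection<AutoCloseable> handlesToClose) {
        FakeNativeHandle helper = new FakeNativeHandle("helper");
        handlesToClose.add(helper);
        FakeNativeHandle options = new FakeNativeHandle("options");
        handlesToClose.add(options);
        return options;
    }

    /** Closes every registered handle, then empties the collection. */
    static void closeAll(Collection<AutoCloseable> handles) {
        for (AutoCloseable handle : handles) {
            try {
                handle.close();
            } catch (Exception e) {
                throw new IllegalStateException("Failed to close " + handle, e);
            }
        }
        handles.clear();
    }

    public static void main(String[] args) {
        List<AutoCloseable> handles = new ArrayList<>();
        FakeNativeHandle options = createOptions(handles);
        System.out.println(options);
        closeAll(handles);
        System.out.println(options);
    }
}
```

Passing one shared collection keeps cleanup in a single place, even when several factory methods allocate resources.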

### DEFAULT Configuration

Basic configuration suitable for general-purpose workloads with minimal tuning.

**Characteristics:**
- Sets `use_fsync` to false, so RocksDB syncs files with `fdatasync` instead of `fsync` (faster, because non-essential file metadata is not flushed on every sync)
- Sets the info log level to header-only to reduce log verbosity
- Disables the periodic statistics dump to reduce overhead
- Uses RocksDB defaults for most other settings

**Usage:**

```java
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend();
stateBackend.setPredefinedOptions(PredefinedOptions.DEFAULT);
```

**Configuration Details:**
- `setUseFsync(false)` - Syncs with `fdatasync` instead of `fsync`
- `setInfoLogLevel(InfoLogLevel.HEADER_LEVEL)` - Minimal logging
- `setStatsDumpPeriodSec(0)` - Disables stats dumping

### SPINNING_DISK_OPTIMIZED Configuration

Optimized for traditional hard disk drives (HDDs), where random I/O is slow because every seek is expensive.

**Characteristics:**
- Increases parallelism for background operations (flushes and compactions)
- Uses level-based compaction with dynamic level sizes
- Removes the limit on open files, so table files do not have to be reopened
- Reduces random I/O by organizing data into sorted levels

**Usage:**

```java
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend();
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED);
```

**Configuration Details:**

*Database Options:*
- `setIncreaseParallelism(4)` - Uses 4 background threads
- `setUseFsync(false)` - Syncs with `fdatasync` instead of `fsync`
- `setMaxOpenFiles(-1)` - Unlimited open files
- `setInfoLogLevel(InfoLogLevel.HEADER_LEVEL)` - Minimal logging
- `setStatsDumpPeriodSec(0)` - Disables stats dumping

*Column Family Options:*
- `setCompactionStyle(CompactionStyle.LEVEL)` - Uses level-based compaction
- `setLevelCompactionDynamicLevelBytes(true)` - Enables dynamic level sizing

### SPINNING_DISK_OPTIMIZED_HIGH_MEM Configuration

Optimized for spinning disks with higher memory usage to reduce I/O operations.

**Characteristics:**
- All optimizations from SPINNING_DISK_OPTIMIZED
- Larger block cache (256 MB) to cache frequently accessed data
- Larger block size (128 KB) for better sequential reads
- Larger target file size (256 MB) for fewer files
- Larger write buffer (64 MB) to batch writes

**Usage:**

```java
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend();
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
```

**Configuration Details:**

*Includes all SPINNING_DISK_OPTIMIZED settings plus:*

*Enhanced Memory Usage:*
- Block cache size: 256 MB
- Block size: 128 KB
- Target file size: 256 MB
- Write buffer size: 64 MB
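
To get a feel for what these sizes imply, the following sketch converts them to bytes and derives a few ratios. It is plain arithmetic on the values listed above, not a call into RocksDB. The class `HighMemSizesSketch` and the per-state footprint estimate are assumptions made for illustration: the estimate supposes that every state keeps its own block cache and one write buffer, which may not hold in a given setup (for example, when managed memory is enabled).

```java
final class HighMemSizesSketch {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Sizes as listed for SPINNING_DISK_OPTIMIZED_HIGH_MEM.
    static final long BLOCK_CACHE_BYTES = 256 * MB;
    static final long BLOCK_BYTES = 128 * KB;
    static final long TARGET_FILE_BYTES = 256 * MB;
    static final long WRITE_BUFFER_BYTES = 64 * MB;

    /** Number of uncompressed blocks of the configured size that fit in the block cache. */
    static long blocksInCache() {
        return BLOCK_CACHE_BYTES / BLOCK_BYTES;
    }

    /** How many full write buffers add up to one target file's worth of data, ignoring compression. */
    static long writeBuffersPerTargetFile() {
        return TARGET_FILE_BYTES / WRITE_BUFFER_BYTES;
    }

    /** Rough estimate, ASSUMING one block cache and one write buffer per state. */
    static long roughFootprintBytes(int stateCount) {
        return stateCount * (BLOCK_CACHE_BYTES + WRITE_BUFFER_BYTES);
    }

    public static void main(String[] args) {
        System.out.println("Blocks in cache: " + blocksInCache());
        System.out.println("Write buffers per target file: " + writeBuffersPerTargetFile());
        System.out.println("Rough footprint for 3 states (MB): " + roughFootprintBytes(3) / MB);
    }
}
```

Under these assumptions the estimate grows linearly with the number of states, which is worth checking against the memory actually available before choosing this option.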

### FLASH_SSD_OPTIMIZED Configuration

Optimized for flash-based SSD storage with fast random I/O characteristics.

**Characteristics:**
- Increases parallelism for background operations
- Removes the limit on open files
- Leaves compaction and all other column family settings at the RocksDB defaults, which suit the fast random access of SSDs

**Usage:**

```java
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend();
stateBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
```

**Configuration Details:**

*Database Options:*
- `setIncreaseParallelism(4)` - Uses 4 background threads
- `setUseFsync(false)` - Syncs with `fdatasync` instead of `fsync`
- `setMaxOpenFiles(-1)` - Unlimited open files
- `setInfoLogLevel(InfoLogLevel.HEADER_LEVEL)` - Minimal logging
- `setStatsDumpPeriodSec(0)` - Disables stats dumping

*Column Family Options:*
- Uses the RocksDB default column family options (no overrides)

## Configuration Comparison

| Configuration | Use Case | Memory Usage | I/O Pattern | Parallelism |
|---------------|----------|--------------|-------------|-------------|
| DEFAULT | General purpose | Low | Balanced | RocksDB default |
| SPINNING_DISK_OPTIMIZED | HDD storage | Moderate | Sequential-optimized | 4 background threads |
| SPINNING_DISK_OPTIMIZED_HIGH_MEM | HDD with more RAM | High | Sequential-optimized | 4 background threads |
| FLASH_SSD_OPTIMIZED | SSD storage | Moderate | Random-optimized | 4 background threads |

## Hardware-Specific Recommendations

### Traditional Hard Drives (HDDs)

**Recommended:** `SPINNING_DISK_OPTIMIZED` or `SPINNING_DISK_OPTIMIZED_HIGH_MEM`

```java
// Choose one of the following.

// For HDDs with limited memory
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED);

// For HDDs with abundant memory (>8GB available for Flink)
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
```

**Benefits:**
- Reduces random I/O through better file organization
- Uses level-based compaction for better sequential access patterns
- Dynamic level sizing keeps level sizes proportional to the stored data, which helps bound space amplification

### Solid State Drives (SSDs)

**Recommended:** `FLASH_SSD_OPTIMIZED`

```java
stateBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
```

**Benefits:**
- Takes advantage of fast random I/O capabilities
- Runs more flushes and compactions in parallel, which SSDs can absorb
- Keeps all table files open to avoid reopen overhead

### Cloud Storage (EBS, Persistent Disks)

**Recommended:** Start with `FLASH_SSD_OPTIMIZED`, then tune based on measured performance

```java
// Choose one of the following.

// SSD-backed volumes behave like local SSDs; this fits most cloud storage
stateBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);

// For HDD-backed (throughput-oriented) volumes with abundant memory
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
```
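
The recommendations above can be condensed into a small decision helper. This is a sketch only: `PredefinedOptionChooser`, `StorageKind`, and `OptionName` are hypothetical names, and `OptionName` merely mirrors the constant names of `PredefinedOptions` so that the example runs without Flink on the classpath. Treat its result as a starting point to validate with measurements.

```java
final class PredefinedOptionChooser {

    /** Hypothetical classification of the underlying storage. */
    enum StorageKind { HDD, SSD }

    /** Local mirror of the PredefinedOptions constant names (illustration only). */
    enum OptionName {
        DEFAULT,
        SPINNING_DISK_OPTIMIZED,
        SPINNING_DISK_OPTIMIZED_HIGH_MEM,
        FLASH_SSD_OPTIMIZED
    }

    /**
     * Maps a storage kind and a memory hint to an option name.
     * Cloud volumes are classified by what backs them (SSD or HDD).
     */
    static OptionName choose(StorageKind storage, boolean abundantMemory) {
        switch (storage) {
            case HDD:
                return abundantMemory
                        ? OptionName.SPINNING_DISK_OPTIMIZED_HIGH_MEM
                        : OptionName.SPINNING_DISK_OPTIMIZED;
            case SSD:
                return OptionName.FLASH_SSD_OPTIMIZED;
            default:
                return OptionName.DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(choose(StorageKind.HDD, false));
        System.out.println(choose(StorageKind.HDD, true));
        System.out.println(choose(StorageKind.SSD, true));
    }
}
```

In a real job, the chosen name could be translated with `PredefinedOptions.valueOf(name.name())` before calling `setPredefinedOptions`.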

## Complete Configuration Examples

### Basic Setup with Predefined Options

```java
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Create state backend with incremental checkpointing
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend(true);

// Configure for SSD storage
stateBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
stateBackend.setDbStoragePath("/ssd/flink/rocksdb");

env.setStateBackend(stateBackend);
```

### Combining Predefined Options with Custom Settings

```java
import org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory;
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.contrib.streaming.state.PredefinedOptions;

// Start with predefined options
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend(true);
stateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED);

// Add custom settings; the factory is applied on top of the predefined options
DefaultConfigurableOptionsFactory customFactory = new DefaultConfigurableOptionsFactory()
    .setWriteBufferSize("128mb")       // Custom write buffer size
    .setBlockCacheSize("512mb")        // Custom block cache size
    .setUseBloomFilter(true)           // Enable Bloom filter
    .setBloomFilterBitsPerKey(10.0);   // Bits per key used by the Bloom filter

stateBackend.setRocksDBOptions(customFactory);
```
246
247
### Environment-Specific Configurations
248
249
```java
250
// Development/Testing Environment
251
EmbeddedRocksDBStateBackend devBackend = new EmbeddedRocksDBStateBackend(false);
252
devBackend.setPredefinedOptions(PredefinedOptions.DEFAULT);
253
254
// Production Environment with HDDs
255
EmbeddedRocksDBStateBackend prodBackend = new EmbeddedRocksDBStateBackend(true);
256
prodBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
257
prodBackend.setDbStoragePaths("/data1/rocksdb", "/data2/rocksdb", "/data3/rocksdb");
258
259
// Production Environment with SSDs
260
EmbeddedRocksDBStateBackend ssdBackend = new EmbeddedRocksDBStateBackend(true);
261
ssdBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
262
ssdBackend.setDbStoragePath("/nvme/flink/rocksdb");
263
```
264
265
## Performance Tuning Guidelines
266
267
### When to Use Each Option
268
269
1. **Start with appropriate predefined option** based on your storage type
270
2. **Monitor performance metrics** (checkpoint duration, state access latency)
271
3. **Fine-tune with custom options** if needed using DefaultConfigurableOptionsFactory
272
4. **Test under realistic load** before production deployment
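
For the monitoring step, it can help to summarize checkpoint durations with percentiles rather than averages, so that occasional slow checkpoints stay visible when comparing two option sets. The following is a minimal sketch using the nearest-rank method. `CheckpointDurationSummary` is a hypothetical helper, and the sample values are made up for illustration; how the durations are collected is left open.

```java
import java.util.Arrays;

final class CheckpointDurationSummary {

    /**
     * Nearest-rank percentile of the given samples.
     *
     * @param durationsMillis checkpoint durations in milliseconds (must not be empty)
     * @param p percentile in the range (0, 100]
     */
    static long percentile(long[] durationsMillis, double p) {
        if (durationsMillis.length == 0) {
            throw new IllegalArgumentException("No samples");
        }
        if (p <= 0.0 || p > 100.0) {
            throw new IllegalArgumentException("Percentile must be in (0, 100]");
        }
        long[] sorted = durationsMillis.clone();
        Arrays.sort(sorted);
        // Nearest rank: smallest rank r such that r / n >= p / 100.
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        rank = Math.min(Math.max(rank, 1), sorted.length);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up sample durations, for illustration only.
        long[] samples = {1200, 900, 1500, 1100, 4000};
        System.out.println("p50 = " + percentile(samples, 50) + " ms");
        System.out.println("p95 = " + percentile(samples, 95) + " ms");
    }
}
```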

### Combining with Memory Configuration

```java
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBMemoryConfiguration;

// Setup for a high-memory SSD environment
EmbeddedRocksDBStateBackend stateBackend = new EmbeddedRocksDBStateBackend(true);
stateBackend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);

// Configure memory allocation
RocksDBMemoryConfiguration memConfig = stateBackend.getMemoryConfiguration();
memConfig.setUseManagedMemory(true);
memConfig.setWriteBufferRatio(0.3);       // Smaller write-buffer share, leaving more memory for the block cache
memConfig.setHighPriorityPoolRatio(0.1);  // Share reserved for index and filter blocks

stateBackend.setNumberOfTransferThreads(8); // Threads that transfer files during checkpoint and restore
```
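
To see roughly what the two ratios mean for a given memory budget, the sketch below splits a total into a write-buffer share, a high-priority share, and the remainder. It deliberately reads both ratios as plain fractions of one budget, which is a simplifying assumption; Flink's internal accounting may differ. `ManagedMemorySplitSketch` and `split` are hypothetical names.

```java
final class ManagedMemorySplitSketch {

    /**
     * Splits a budget into {writeBuffers, highPriorityPool, remainder}.
     * ASSUMPTION: both ratios are treated as simple fractions of totalBytes.
     */
    static long[] split(long totalBytes, double writeBufferRatio, double highPriorityPoolRatio) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must be non-negative");
        }
        if (writeBufferRatio < 0 || highPriorityPoolRatio < 0
                || writeBufferRatio + highPriorityPoolRatio > 1.0) {
            throw new IllegalArgumentException("Ratios must be non-negative and sum to at most 1");
        }
        long writeBuffers = Math.round(totalBytes * writeBufferRatio);
        long highPriorityPool = Math.round(totalBytes * highPriorityPoolRatio);
        long remainder = totalBytes - writeBuffers - highPriorityPool;
        return new long[] {writeBuffers, highPriorityPool, remainder};
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Hypothetical 1000 MB budget with the ratios used in the example above.
        long[] parts = split(1000 * mb, 0.3, 0.1);
        System.out.println("Write buffers (MB): " + parts[0] / mb);
        System.out.println("High-priority pool (MB): " + parts[1] / mb);
        System.out.println("Remainder (MB): " + parts[2] / mb);
    }
}
```

Lowering the write-buffer ratio shifts the remainder toward cached data blocks, which is the trade-off the configuration above makes for read-heavy state access.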