A key-value store abstraction for storing application data locally with automatic serialization, indexing, and support for multiple storage backends.
npx @tessl/cli install tessl/maven-org-apache-spark--spark-kvstore_2-13@4.0.0
# Apache Spark KVStore

Apache Spark KVStore is a thread-safe abstraction layer for local key-value storage within Apache Spark applications. It serializes values automatically with Jackson and compresses them, generates unique storage keys for each type, and maintains indices so that data can be read in sorted order without loading every instance.

## Package Information

- **Package Name**: spark-kvstore_2.13
- **Package Type**: Maven JAR library
- **Language**: Java/Scala
- **Organization**: org.apache.spark
- **Installation**: Included as part of Apache Spark distribution
- **License**: Apache-2.0

## Core Imports

```java
import org.apache.spark.util.kvstore.*;
```

## Basic Usage

```java
import org.apache.spark.util.kvstore.*;
import java.io.File;

// Create a LevelDB-backed store
KVStore store = new LevelDB(new File("/path/to/store"));

// Define a data class with indexing
public class User {
  @KVIndex  // Natural index (primary key)
  public String name;

  @KVIndex("age")
  public int age;

  // No-arg constructor so Jackson can deserialize stored instances
  public User() {}

  public User(String name, int age) {
    this.name = name;
    this.age = age;
  }
}

// Store data
User user = new User("Alice", 30);
store.write(user);

// Read data back
User retrieved = store.read(User.class, "Alice");

// Query with views: iterate in "age" index order, starting at age 18,
// returning at most 100 entries (max() caps the count, not the age)
KVStoreView<User> adults = store.view(User.class)
    .index("age")
    .first(18)
    .max(100);

for (User adult : adults) {
  System.out.println(adult.name + " is " + adult.age);
}

// Release the underlying store when done
store.close();
```

## Architecture

Apache Spark KVStore is built around several key components:

- **Storage Abstraction**: The `KVStore` interface provides a uniform API across backend implementations
- **Multiple Backends**: In-memory, LevelDB, and RocksDB implementations cover different deployment scenarios
- **Serialization System**: Values are serialized automatically with Jackson and compressed with GZIP
- **Indexing Engine**: Annotation-based indices support efficient querying and sorting
- **Resource Management**: `KVStore` extends `Closeable`, so a store can be released with `close()` or try-with-resources
- **Thread Safety**: All operations are thread-safe for concurrent read/write access

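
The compression step named in the list above can be illustrated with the JDK alone. This is a minimal sketch, not the library's serializer: it assumes the payload is already a JSON string (the part Jackson would produce) and only shows a GZIP round trip. `GzipRoundTrip`, `compress`, and `decompress` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of spark-kvstore.
class GzipRoundTrip {
  // Compress an already-serialized payload (a JSON string as UTF-8 bytes).
  static byte[] compress(String json) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(json.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  // Reverse the step above: inflate the bytes and decode them as UTF-8.
  static String decompress(byte[] data) throws IOException {
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) throws IOException {
    String json = "{\"name\":\"Alice\",\"age\":30}";
    byte[] stored = compress(json);
    System.out.println(decompress(stored).equals(json)); // prints "true"
  }
}
```

For very small values the GZIP header can make the stored bytes larger than the input; compression pays off as values grow.
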
## Capabilities

### Core Store Operations

The `KVStore` interface is the main entry point. It provides CRUD operations, metadata management, and resource handling.

```java { .api }
public interface KVStore extends Closeable {
  <T> T getMetadata(Class<T> klass) throws Exception;
  void setMetadata(Object value) throws Exception;
  <T> T read(Class<T> klass, Object naturalKey) throws Exception;
  void write(Object value) throws Exception;
  void delete(Class<?> type, Object naturalKey) throws Exception;
  <T> KVStoreView<T> view(Class<T> type) throws Exception;
  long count(Class<?> type) throws Exception;
  long count(Class<?> type, String index, Object indexedValue) throws Exception;
  <T> boolean removeAllByIndexValues(Class<T> klass, String index, Collection<?> indexValues) throws Exception;
}
```

[Core Store Operations](./core-operations.md)

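
To show the shape of these operations without depending on the Spark JAR, here is a toy model written against the JDK only. It is a sketch under stated assumptions, not the library's implementation: it assumes that writing a value with an existing natural key replaces the earlier value and that reading a missing key raises `NoSuchElementException`. `ToyStoreSketch`, `ToyStore`, and `User` are hypothetical names.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical model of a natural-key store; not part of spark-kvstore.
class ToyStoreSketch {
  static final class User {
    final String name;
    final int age;
    User(String name, int age) { this.name = name; this.age = age; }
  }

  static final class ToyStore<T> implements Closeable {
    private final Map<Object, T> data = new HashMap<>();
    private final Function<T, Object> naturalKey;

    ToyStore(Function<T, Object> naturalKey) { this.naturalKey = naturalKey; }

    // Assumption: a write with an existing natural key replaces the old value.
    synchronized void write(T value) { data.put(naturalKey.apply(value), value); }

    // Assumption: a missing key is reported with NoSuchElementException.
    synchronized T read(Object key) {
      T value = data.get(key);
      if (value == null) throw new NoSuchElementException("No entry for key " + key);
      return value;
    }

    synchronized void delete(Object key) { data.remove(key); }

    synchronized long count() { return data.size(); }

    @Override public synchronized void close() { data.clear(); }
  }

  public static void main(String[] args) {
    // try-with-resources releases the store even if an operation throws
    try (ToyStore<User> store = new ToyStore<>(u -> u.name)) {
      store.write(new User("Alice", 30));
      store.write(new User("Alice", 31)); // same natural key: replaces
      System.out.println(store.count() + " entry, age " + store.read("Alice").age);
    }
  }
}
```

The same try-with-resources pattern applies to any `Closeable`, which is why the real `KVStore` interface extending `Closeable` matters for file-backed stores.
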
### Data Querying and Views

A `KVStoreView` configures iteration over stored data: which index to sort by, where to start and stop, how many entries to skip, and how many to return.

```java { .api }
public abstract class KVStoreView<T> implements Iterable<T> {
  public abstract KVStoreView<T> reverse();
  public abstract KVStoreView<T> index(String name);
  public abstract KVStoreView<T> parent(Object value);
  public abstract KVStoreView<T> first(Object value);
  public abstract KVStoreView<T> last(Object value);
  public abstract KVStoreView<T> max(long max);
  public abstract KVStoreView<T> skip(long n);
  public abstract KVStoreIterator<T> closeableIterator() throws Exception;
}

public interface KVStoreIterator<T> extends Iterator<T>, Closeable {
  List<T> next(int max) throws Exception;
  boolean skip(long n) throws Exception;
}
```

[Data Querying and Views](./querying-views.md)

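
To make the combined effect of `index`, `first`, `skip`, and `max` concrete, the sketch below models them with plain Java streams instead of calling the library. It assumes that `first` is an inclusive lower bound in ascending index order, that `skip` drops entries before `max` is counted, and that `max` caps the number of returned entries. `ViewModelSketch`, `User`, and `namesByAge` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of view semantics; not part of spark-kvstore.
class ViewModelSketch {
  static final class User {
    final String name;
    final int age;
    User(String name, int age) { this.name = name; this.age = age; }
  }

  // Models: view(User.class).index("age").first(first).skip(skip).max(max)
  static List<String> namesByAge(List<User> users, int first, long skip, long max) {
    return users.stream()
        .sorted(Comparator.comparingInt((User u) -> u.age)) // index("age"): ascending order
        .filter(u -> u.age >= first)                         // first(...): inclusive start
        .skip(skip)                                          // skip(...): dropped before counting
        .limit(max)                                          // max(...): cap on entry count
        .map(u -> u.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<User> users = Arrays.asList(
        new User("Alice", 30), new User("Bob", 17),
        new User("Carol", 45), new User("Dave", 18));
    System.out.println(namesByAge(users, 18, 0, 2)); // prints "[Dave, Alice]"
    System.out.println(namesByAge(users, 18, 1, 5)); // prints "[Alice, Carol]"
  }
}
```

Unlike this model, which sorts every element on each call, an index-backed store can seek directly to the starting value.
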
### Indexing System

The `@KVIndex` annotation marks the fields or methods whose values are indexed, so data can be sorted and accessed without loading all instances.

```java { .api }
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
public @interface KVIndex {
  String NATURAL_INDEX_NAME = "__main__";
  String value() default NATURAL_INDEX_NAME;
  String parent() default "";
  boolean copy() default false;
}
```

[Indexing System](./indexing-system.md)

### Storage Backends

Several `KVStore` implementations are available for different deployment scenarios and performance requirements.

```java { .api }
// In-memory storage for development and testing
public class InMemoryStore implements KVStore {
  public InMemoryStore();
}

// LevelDB backend for production use
public class LevelDB implements KVStore {
  public LevelDB(File path) throws Exception;
  public LevelDB(File path, KVStoreSerializer serializer) throws Exception;
}

// RocksDB backend for high-performance scenarios
public class RocksDB implements KVStore {
  public RocksDB(File path) throws Exception;
  public RocksDB(File path, KVStoreSerializer serializer) throws Exception;
}
```

[Storage Backends](./storage-backends.md)