tessl install tessl/maven-org-apache-spark--spark-kvstore_2-13@3.5.0

Local key/value store abstraction for Apache Spark with thread-safe operations, automatic serialization, and indexing capabilities.
Apache Spark KVStore provides three distinct storage implementations, each optimized for different use cases: InMemoryStore for high-performance temporary storage, LevelDB for reliable persistent storage, and RocksDB for high-throughput persistent storage with advanced compression.
import org.apache.spark.util.kvstore.KVStore;
import org.apache.spark.util.kvstore.InMemoryStore;
import org.apache.spark.util.kvstore.LevelDB;
import org.apache.spark.util.kvstore.RocksDB;
import org.apache.spark.util.kvstore.KVStoreSerializer;
import org.apache.spark.util.kvstore.UnsupportedStoreVersionException;
import java.io.File;
import java.util.List;

High-performance in-memory implementation that keeps all data deserialized in memory. Ideal for temporary caching, session storage, and development/testing scenarios.
public class InMemoryStore implements KVStore {
    public InMemoryStore();
}

Usage Example:
// Create in-memory store
KVStore store = new InMemoryStore();
// Use normally - all data kept in memory
User user = new User("user123", "Alice", "Engineering");
store.write(user);
User retrieved = store.read(User.class, "user123");
// Data is lost when store is closed or application exits
store.close();

Characteristics:
Persistent storage using Google's LevelDB with Jackson JSON serialization and GZIP compression. Provides reliable storage with good performance characteristics.
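To make the "JSON plus GZIP" encoding mentioned above concrete, here is a minimal sketch using only the JDK. It is not the library's `KVStoreSerializer`: the class name `GzipJsonSketch`, its methods, and the JSON field names are hypothetical, and the JSON string is hand-written where the real serializer would use Jackson.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative stand-in only; not the library's KVStoreSerializer.
class GzipJsonSketch {

    // Compress a JSON string into GZIP bytes (the "serialize" direction).
    static byte[] serialize(String json) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Decompress GZIP bytes back into the JSON string (the "deserialize" direction).
    static String deserialize(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical JSON form of a User object; field names are assumed.
        String json = "{\"id\":\"user123\",\"name\":\"Alice\",\"department\":\"Engineering\"}";
        byte[] stored = serialize(json);
        String restored = deserialize(stored);
        System.out.println("Round trip preserved the value: " + restored.equals(json));
    }
}
```

Because the stored bytes are compressed, they are not human-readable on disk; any tool that inspects the database directly would need to decompress values first.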
public class LevelDB implements KVStore {
    public LevelDB(File path) throws Exception;
    public LevelDB(File path, KVStoreSerializer serializer) throws Exception;
    public void writeAll(List<?> values) throws Exception;
    static final long STORE_VERSION = 1L;
}

Usage Example:
// Create LevelDB store with default serializer
File dbPath = new File("/path/to/leveldb");
KVStore store = new LevelDB(dbPath);
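// Or with a custom serializer instead (shown commented out because `store` is already declared above;
// MyCustomSerializer stands for your own KVStoreSerializer subclass)
// KVStoreSerializer customSerializer = new MyCustomSerializer();
// KVStore store = new LevelDB(dbPath, customSerializer);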
// Use normally - data persisted to disk
User user = new User("user123", "Alice", "Engineering");
store.write(user);
// Data survives application restarts
store.close();
// Reopen same database
KVStore reopened = new LevelDB(dbPath);
User retrieved = reopened.read(User.class, "user123"); // Still there!
reopened.close();

Characteristics:
Configuration:
High-performance persistent storage using Facebook's RocksDB with advanced compression algorithms and optimized block-based storage format.
public class RocksDB implements KVStore {
    public RocksDB(File path) throws Exception;
    public RocksDB(File path, KVStoreSerializer serializer) throws Exception;
    public void writeAll(List<?> values) throws Exception;
    static final long STORE_VERSION = 1L;
}

Usage Example:
import java.util.stream.Collectors;
import java.util.stream.IntStream;
// Create RocksDB store
File dbPath = new File("/path/to/rocksdb");
KVStore store = new RocksDB(dbPath);
// High-throughput operations
for (int i = 0; i < 100000; i++) {
    User user = new User("user" + i, "User " + i, "Department" + (i % 10));
    store.write(user);
}
// Efficient bulk operations
List<String> userIds = IntStream.range(0, 1000)
    .mapToObj(i -> "user" + i)
    .collect(Collectors.toList());
store.removeAllByIndexValues(User.class, "__main__", userIds);
store.close();

Characteristics:
Configuration:
Both LevelDB and RocksDB implementations provide optimized bulk write operations for improved performance when storing multiple objects at once.
public void writeAll(List<?> values) throws Exception;

Usage Example:
import java.util.Arrays;
import java.util.List;
// Create sample data
List<User> users = Arrays.asList(
    new User("user1", "Alice", "Engineering"),
    new User("user2", "Bob", "Marketing"),
    new User("user3", "Carol", "Engineering")
);
// Bulk write using LevelDB
LevelDB levelDbStore = new LevelDB(new File("/path/to/leveldb"));
levelDbStore.writeAll(users);
// Bulk write using RocksDB
RocksDB rocksDbStore = new RocksDB(new File("/path/to/rocksdb"));
rocksDbStore.writeAll(users);
// Close stores
levelDbStore.close();
rocksDbStore.close();

Performance Benefits:
Parameters:
values: List of objects to write in bulk (all objects must have natural key annotations)

Exceptions:
Exception: For serialization errors, storage backend issues, or duplicate key conflicts

Note: InMemoryStore does not provide a specific writeAll method but can achieve similar functionality through multiple individual write() calls.
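The note above suggests looping over `write()` when no `writeAll` is available. A minimal sketch of such a helper follows. `BulkWriteSketch` and its `ValueWriter` interface are hypothetical names introduced here so the example runs without the Spark jar; `ValueWriter` only mirrors the shape of `KVStore.write(Object)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative helper; not part of the KVStore API.
class BulkWriteSketch {

    // Hypothetical interface with the same shape as KVStore.write(Object).
    interface ValueWriter {
        void write(Object value) throws Exception;
    }

    // Writes each value in order and returns how many were written.
    // The first failure propagates, so earlier values stay written (no rollback).
    static int writeAll(ValueWriter writer, List<?> values) throws Exception {
        int written = 0;
        for (Object value : values) {
            writer.write(value);
            written++;
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        // A plain list stands in for a store so the sketch is self-contained.
        List<Object> sink = new ArrayList<>();
        int count = writeAll(sink::add, Arrays.asList("user1", "user2", "user3"));
        System.out.println(count + " values written");
    }
}
```

With a real store, the call would look like `BulkWriteSketch.writeAll(store::write, users)`. Unlike the persistent stores' own `writeAll`, this loop gives no batching benefit; it only offers the same calling convention.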
| Feature | InMemoryStore | LevelDB | RocksDB |
|---|---|---|---|
| Read Speed | Fastest | Fast | Fast |
| Write Speed | Fastest | Good | Excellent |
| Memory Usage | High | Low | Low |
| Disk Usage | None | Medium | Low (compressed) |
| Startup Time | Instant | Fast | Medium |
| Large Datasets | Limited by RAM | Good | Excellent |
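As a rough illustration, the comparison above can be reduced to a small decision rule. The sketch below is a simplification under assumed criteria: the names `StoreChoiceSketch`, `Backend`, and `choose` are hypothetical, and real selection would also weigh memory limits and startup time.

```java
// Illustrative decision helper; not part of the KVStore API.
class StoreChoiceSketch {

    enum Backend { IN_MEMORY, LEVELDB, ROCKSDB }

    // needsPersistence: data must survive application restarts.
    // heavyWrites: large datasets or high write throughput are expected.
    static Backend choose(boolean needsPersistence, boolean heavyWrites) {
        if (!needsPersistence) {
            return Backend.IN_MEMORY;   // fastest, but data is lost on close
        }
        return heavyWrites ? Backend.ROCKSDB : Backend.LEVELDB;
    }

    public static void main(String[] args) {
        System.out.println(choose(false, false));
        System.out.println(choose(true, false));
        System.out.println(choose(true, true));
    }
}
```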
Choose InMemoryStore when:
Choose LevelDB when:
Choose RocksDB when:
All persistent implementations support custom serializers for specialized encoding requirements:
import com.fasterxml.jackson.core.JsonGenerator;
import java.text.SimpleDateFormat;
public class CustomSerializer extends KVStoreSerializer {
    public CustomSerializer() {
        super();
        // Configure the inherited ObjectMapper (the protected `mapper` field) for specific needs
        mapper.configure(JsonGenerator.Feature.WRITE_NUMBERS_AS_STRINGS, true);
        mapper.setDateFormat(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
    }
}
KVStore store = new RocksDB(dbPath, new CustomSerializer());

try {
    KVStore store = new LevelDB(new File("/invalid/path"));
} catch (UnsupportedStoreVersionException e) {
    // Store created with incompatible version
    System.err.println("Database version incompatible");
} catch (Exception e) {
    // Other initialization errors (disk full, permissions, etc.)
    System.err.println("Failed to open database: " + e.getMessage());
}
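One possible way to act on such failures, instead of only logging them, is to fall back to a different store. The following is a minimal sketch under that assumption; `OpenWithFallbackSketch` and `openOrFallback` are hypothetical names, and strings stand in for stores so the example runs without the Spark jar.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Supplier;

// Illustrative helper; not part of the KVStore API.
class OpenWithFallbackSketch {

    // Tries the primary opener; if it throws any Exception, logs the failure
    // and returns whatever the fallback supplies instead.
    static <T> T openOrFallback(Callable<T> primary, Supplier<T> fallback) {
        try {
            return primary.call();
        } catch (Exception e) {
            System.err.println("Primary store unavailable, falling back: " + e.getMessage());
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // The primary opener simulates a failed open.
        String store = OpenWithFallbackSketch.<String>openOrFallback(
            () -> { throw new IOException("disk full"); },
            () -> "in-memory");
        System.out.println("Using store: " + store);
    }
}
```

With the real classes, a call might look like `OpenWithFallbackSketch.<KVStore>openOrFallback(() -> new LevelDB(dbPath), InMemoryStore::new)`. Note that this sketch also swallows `UnsupportedStoreVersionException`; an application that must not silently ignore a version mismatch would need to catch and rethrow it separately.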