tessl/maven-org-apache-spark--spark-kvstore_2-13

A key-value store abstraction for storing application data locally with automatic serialization, indexing, and support for multiple storage backends.

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.spark/spark-kvstore_2.13@4.0.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-kvstore_2-13@4.0.0


Apache Spark KVStore

Apache Spark KVStore is a thread-safe abstraction layer for local key-value storage within Apache Spark applications. It serializes stored objects automatically with Jackson and compresses them, generates unique storage keys automatically, and maintains indices so that data can be sorted and looked up without loading every instance.

Package Information

  • Package Name: spark-kvstore_2.13
  • Package Type: Maven JAR library
  • Language: Java/Scala
  • Organization: org.apache.spark
  • Installation: Included as part of Apache Spark distribution
  • License: Apache-2.0

Core Imports

import org.apache.spark.util.kvstore.*;

Basic Usage

import org.apache.spark.util.kvstore.*;
import java.io.File;

// Create a LevelDB-backed store
KVStore store = new LevelDB(new File("/path/to/store"));

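// Define a data class with indexing
public class User {
    @KVIndex  // Natural index (primary key)
    public String name;

    @KVIndex("age")
    public int age;

    // No-argument constructor so that the default Jackson-based
    // serializer can instantiate the class when reading it back
    public User() {}

    public User(String name, int age) {
        this.name = name;
        this.age = age;
    }
}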

// Store data
User user = new User("Alice", 30);
store.write(user);

// Read data back
User retrieved = store.read(User.class, "Alice");

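// Query with views: walk the "age" index in ascending order,
// starting at age 18 and returning at most 100 entries
// (max() caps the number of results, not the age)
KVStoreView<User> adults = store.view(User.class)
    .index("age")
    .first(18)
    .max(100);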

for (User adult : adults) {
    System.out.println(adult.name + " is " + adult.age);
}

Architecture

Apache Spark KVStore is built around several key components:

  • Storage Abstraction: KVStore interface provides uniform API across different backend implementations
  • Multiple Backends: Support for in-memory storage, LevelDB, and RocksDB for different deployment scenarios
  • Serialization System: Jackson-based automatic serialization with GZIP compression for efficient storage
  • Indexing Engine: Annotation-based indexing system for efficient querying and sorting
  • Resource Management: Proper resource cleanup with Closeable interface implementation
  • Thread Safety: All operations are thread-safe for concurrent read/write access
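
To make the serialization bullet above more concrete, the following is a minimal sketch of a GZIP round trip over an already serialized JSON string, using only the JDK. The class name `GzipRoundTrip` and the hand-written JSON are illustrative assumptions; this is not the library's serializer code, and Jackson is deliberately left out so the example stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compresses and restores a JSON payload with GZIP.
class GzipRoundTrip {

    // Compress the UTF-8 bytes of a JSON string.
    static byte[] compress(String json) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed here, so the trailer has been written.
        return bytes.toByteArray();
    }

    // Decompress back to the original JSON string.
    static String decompress(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `GzipRoundTrip.decompress(GzipRoundTrip.compress(json))` returns the original string. Compression pays off mainly for larger or repetitive payloads; very small values can grow slightly because of the GZIP header.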

Capabilities

Core Store Operations

Primary KVStore interface providing CRUD operations, metadata management, and resource handling. Essential for all data persistence operations.

public interface KVStore extends Closeable {
    <T> T getMetadata(Class<T> klass) throws Exception;
    void setMetadata(Object value) throws Exception;
    <T> T read(Class<T> klass, Object naturalKey) throws Exception;
    void write(Object value) throws Exception;
    void delete(Class<?> type, Object naturalKey) throws Exception;
    <T> KVStoreView<T> view(Class<T> type) throws Exception;
    long count(Class<?> type) throws Exception;
    long count(Class<?> type, String index, Object indexedValue) throws Exception;
    <T> boolean removeAllByIndexValues(Class<T> klass, String index, Collection<?> indexValues) throws Exception;
}
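
The read/write/delete/count contract listed above can be pictured with a toy model built on plain maps. This is a sketch under stated assumptions, not the library's implementation: `ToyStore` is a hypothetical class, it takes the natural key as an explicit argument instead of reading it from a `@KVIndex` annotation, and it assumes that writing twice with the same natural key replaces the earlier entry and that reading a missing key raises `NoSuchElementException`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical toy model: one map of natural key -> value per stored type.
class ToyStore {
    private final Map<Class<?>, Map<Object, Object>> data = new HashMap<>();

    // Upsert: a second write with the same natural key replaces the first.
    void write(Object naturalKey, Object value) {
        data.computeIfAbsent(value.getClass(), k -> new HashMap<>())
            .put(naturalKey, value);
    }

    // Look up a single entry by type and natural key.
    <T> T read(Class<T> type, Object naturalKey) {
        Map<Object, Object> byKey = data.get(type);
        if (byKey == null || !byKey.containsKey(naturalKey)) {
            throw new NoSuchElementException("No entry for key " + naturalKey);
        }
        return type.cast(byKey.get(naturalKey));
    }

    // Remove an entry; deleting a missing key is a no-op in this toy.
    void delete(Class<?> type, Object naturalKey) {
        Map<Object, Object> byKey = data.get(type);
        if (byKey != null) {
            byKey.remove(naturalKey);
        }
    }

    // Number of entries currently stored for a type.
    long count(Class<?> type) {
        Map<Object, Object> byKey = data.get(type);
        return byKey == null ? 0 : byKey.size();
    }
}
```

The point of the model is the keying scheme: entries are addressed by the pair of type and natural key, which is why `read`, `delete`, and `count` all take a `Class` argument.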


Data Querying and Views

Configurable view system for iterating over stored data with filtering, sorting, and pagination capabilities.

public abstract class KVStoreView<T> implements Iterable<T> {
    public abstract KVStoreView<T> reverse();
    public abstract KVStoreView<T> index(String name);
    public abstract KVStoreView<T> parent(Object value);
    public abstract KVStoreView<T> first(Object value);
    public abstract KVStoreView<T> last(Object value);
    public abstract KVStoreView<T> max(long max);
    public abstract KVStoreView<T> skip(long n);
    public abstract KVStoreIterator<T> closeableIterator() throws Exception;
}

public interface KVStoreIterator<T> extends Iterator<T>, Closeable {
    List<T> next(int max) throws Exception;
    boolean skip(long n) throws Exception;
}
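
How `first`, `last`, `skip`, and `max` combine is easiest to see on a small sorted index. The sketch below models an ascending walk over a `TreeMap` that stands in for an "age" index. The class `ViewSketch`, its `query` method, and the sample data are hypothetical, and the ordering of the steps (range bounds first, then skip, then the result cap) is an assumption about the intended semantics rather than a copy of the library's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an ascending view over a sorted index.
class ViewSketch {

    // first/last are inclusive bounds (null means unbounded);
    // skip drops leading matches; max caps the number of results.
    static List<String> query(TreeMap<Integer, String> ageIndex,
                              Integer first, Integer last, long skip, long max) {
        List<String> out = new ArrayList<>();
        long skipped = 0;
        for (Map.Entry<Integer, String> e : ageIndex.entrySet()) {
            if (first != null && e.getKey() < first) {
                continue;               // before the start of the range
            }
            if (last != null && e.getKey() > last) {
                break;                  // past the end of the range
            }
            if (skipped < skip) {
                skipped++;              // pagination offset
                continue;
            }
            if (out.size() >= max) {
                break;                  // result cap reached
            }
            out.add(e.getValue());
        }
        return out;
    }

    // Sample index: age -> name.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> index = new TreeMap<>();
        index.put(12, "Kid");
        index.put(30, "Alice");
        index.put(41, "Bob");
        index.put(67, "Carol");
        return index;
    }
}
```

With the sample data, `query(sample(), 18, null, 0, 2)` yields Alice and Bob, and raising `skip` to 1 yields Bob and Carol, which is the usual way to page through results. When iterating a real view over a disk-backed store, obtaining the iterator through `closeableIterator()` inside a try-with-resources block lets the iterator be closed once iteration ends.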


Indexing System

Annotation-based indexing for efficient data access and sorting without loading all instances.

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
public @interface KVIndex {
    String NATURAL_INDEX_NAME = "__main__";
    String value() default NATURAL_INDEX_NAME;
    String parent() default "";
    boolean copy() default false;
}
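
Annotation-based indexing relies on the annotation being retained at runtime so that a store can discover indexed members through reflection. The sketch below illustrates that mechanism with a stand-in annotation: `Indexed`, `IndexScanSketch`, and the nested `User` class are hypothetical names, and only fields are scanned (the real annotation also targets methods). It shows the general technique, not the library's own scanning code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: discover indexed fields via reflection.
class IndexScanSketch {

    // Stand-in for KVIndex; the default name marks the natural index.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD})
    @interface Indexed {
        String value() default "__main__";
    }

    static class User {
        @Indexed public String name = "Alice";
        @Indexed("age") public int age = 30;
        public String note = "not indexed";
    }

    // Returns index name -> current value for every annotated field.
    static Map<String, Object> indexValues(Object entity) {
        Map<String, Object> out = new TreeMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            Indexed idx = field.getAnnotation(Indexed.class);
            if (idx == null) {
                continue;               // field is not part of any index
            }
            try {
                field.setAccessible(true);
                out.put(idx.value(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return out;
    }
}
```

For the sample `User`, the scan produces two entries: the natural index `__main__` mapped to "Alice" and the `age` index mapped to 30. The unannotated `note` field is ignored, so it is stored with the object but cannot be used for sorting or lookups.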


Storage Backends

Multiple storage backend implementations for different deployment scenarios and performance requirements.

// In-memory storage for development and testing
public class InMemoryStore implements KVStore { }

// LevelDB backend for production use
public class LevelDB implements KVStore {
    public LevelDB(File path) throws Exception;
    public LevelDB(File path, KVStoreSerializer serializer) throws Exception;
}

// RocksDB backend for high-performance scenarios  
public class RocksDB implements KVStore {
    public RocksDB(File path) throws Exception;
    public RocksDB(File path, KVStoreSerializer serializer) throws Exception;
}
