LakeSoul Common

LakeSoul Common provides core utilities and metadata management for the LakeSoul lakehouse framework. It offers:

  • Database connection management for PostgreSQL, with HikariCP connection pooling
  • Role-Based Access Control (RBAC) authorization policies built on Casbin
  • Protobuf-based data serialization for cross-language compatibility
  • Native library integration with Rust components through JNR-FFI for high-performance operations
  • AspectJ support for cross-cutting concerns

The library is the foundational layer behind LakeSoul's ACID transactions, concurrent operations, and metadata consistency across the lakehouse ecosystem. It supports multiple computing engines, including Spark, Flink, and Presto, while maintaining unified metadata management and security policies.

Package Information

  • Package Name: lakesoul-common
  • Package Type: maven
  • Language: Java/Scala
  • Installation: Add the following Maven dependency
<dependency>
    <groupId>com.dmetasoul</groupId>
    <artifactId>lakesoul-common</artifactId>
    <version>2.6.2</version>
</dependency>

Core Imports

import com.dmetasoul.lakesoul.meta.DBManager;
import com.dmetasoul.lakesoul.meta.GlobalConfig;
import com.dmetasoul.lakesoul.meta.DBConnector;
import com.dmetasoul.lakesoul.meta.entity.*;

For Scala components:

import com.dmetasoul.lakesoul.meta.{DataOperation, MetaVersion}
import com.dmetasoul.lakesoul.meta.DataFileInfo

Basic Usage

// Initialize database manager
DBManager dbManager = new DBManager();

// Check if table exists
boolean exists = dbManager.isTableExists("/path/to/table");

// Create namespace
dbManager.createNewNamespace("my_namespace", "{}", "Test namespace");

// List tables in namespace
List<String> tables = dbManager.listTablePathsByNamespace("my_namespace");

// Get table information
TableInfo tableInfo = dbManager.getTableInfoByPath("/path/to/table");

// Create a new table (the properties argument is a com.alibaba.fastjson JSONObject)
dbManager.createNewTable(
    "table_id_123",
    "my_namespace", 
    "my_table",
    "/path/to/table",
    "schema_json",
    new JSONObject(),
    "partition_config"
);

Architecture

LakeSoul Common is built around several key architectural components:

  • Metadata Management Layer: Central DBManager class providing high-level operations for tables, partitions, and data commits
  • Database Connection Layer: DBConnector with HikariCP connection pooling for PostgreSQL backend
  • Entity Layer: Complete protobuf-based entity system for type-safe metadata serialization
  • DAO Layer: Data Access Objects for all metadata operations with transaction support
  • Native Integration Layer: JNR-FFI based integration with high-performance Rust components
  • Authorization Layer: RBAC system using Casbin for fine-grained access control
  • Configuration Layer: Centralized configuration management with global and local settings
  • Scala API Layer: High-level Scala objects for functional programming style operations
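
For orientation, here is a minimal sketch of how these layers surface to callers: application code talks to DBManager and receives protobuf entities, while connection pooling and native integration stay internal. The table path is illustrative.

// Callers touch only the metadata and entity layers;
// DBConnector/HikariCP pooling is managed underneath.
DBManager dbManager = new DBManager();
TableInfo info = dbManager.getTableInfoByPath("/data/my_table");
List<PartitionInfo> partitions = dbManager.getAllPartitionInfo(info.getTableId());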

Capabilities

Metadata Management

Core metadata operations for tables, partitions, namespaces, and data commits. Provides the primary interface for all LakeSoul metadata operations.

public class DBManager {
    public DBManager();
    public boolean isTableExists(String tablePath);
    public boolean isTableExistsByTableName(String tableName);
    public boolean isTableExistsByTableName(String tableName, String tableNamespace);
    public boolean isNamespaceExists(String tableNamespace);
    public void createNewTable(String tableId, String namespace, String tableName, 
                              String tablePath, String tableSchema, JSONObject properties, String partitions);
    public TableInfo getTableInfoByTableId(String tableId);
    public TableInfo getTableInfoByName(String tableName);
    public TableInfo getTableInfoByNameAndNamespace(String tableName, String namespace);
    public TableInfo getTableInfoByPath(String tablePath);
    public List<TableInfo> getTableInfosByNamespace(String tableNamespace);
    public List<String> listTables();
    public List<String> listTableNamesByNamespace(String tableNamespace);
    public List<String> listTablePathsByNamespace(String tableNamespace);
    public PartitionInfo getSinglePartitionInfo(String tableId, String partitionDesc);
    public PartitionInfo getSinglePartitionInfo(String tableId, String partitionDesc, int version);
    public List<PartitionInfo> getAllPartitionInfo(String tableId);
    public void updateTableSchema(String tableId, String tableSchema);
    public void deleteTableInfo(String tablePath, String tableId, String tableNamespace);
    public boolean commitData(MetaInfo metaInfo, boolean changeSchema, CommitOp commitOp);
    public List<String> listNamespaces();
    public void createNewNamespace(String name, String properties, String comment);
}

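A short example of read-side metadata calls combining the methods above; the namespace name is illustrative.

DBManager dbManager = new DBManager();
// Enumerate namespaces, then inspect the tables in one of them
List<String> namespaces = dbManager.listNamespaces();
for (TableInfo info : dbManager.getTableInfosByNamespace("my_namespace")) {
    List<PartitionInfo> partitions = dbManager.getAllPartitionInfo(info.getTableId());
    System.out.println(info.getTableName() + ": " + partitions.size() + " partition(s)");
}
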
See docs/metadata-management.md for full details.

Database Connection Management

PostgreSQL connection management with HikariCP connection pooling for scalable, efficient database operations.

public class DBConnector {
    public static DataSource getDS();
    public static Connection getConn() throws SQLException;
    public static void closeAllConnections();
    public static void closeConn(Connection conn);
    public static void closeConn(Statement statement, Connection conn);
    public static void closeConn(ResultSet set, Statement statement, Connection conn);
}

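The pooled connection can also be used directly for raw SQL. A minimal sketch; the closeConn overload below releases the result set, statement, and connection and returns the connection to the pool.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

Connection conn = null;
Statement stmt = null;
ResultSet rs = null;
try {
    conn = DBConnector.getConn();        // borrow from the HikariCP pool
    stmt = conn.createStatement();
    rs = stmt.executeQuery("SELECT 1");  // placeholder diagnostic query
    while (rs.next()) {
        System.out.println(rs.getInt(1));
    }
} catch (SQLException e) {
    throw new RuntimeException(e);
} finally {
    DBConnector.closeConn(rs, stmt, conn);  // release resources, return connection to pool
}
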
See docs/database-connection.md for full details.

Entity Data Models

Comprehensive protobuf-based entity system providing type-safe data transfer objects for all metadata operations.

// Core entities with builder patterns
public class TableInfo {
    public String getTableId();
    public String getTableNamespace();
    public String getTableName();
    public String getTablePath();
    public String getTableSchema();
    public String getProperties();
    public String getPartitions();
    public String getDomain();
    public static Builder newBuilder();
}

public class PartitionInfo {
    public String getTableId();
    public String getPartitionDesc();
    public int getVersion();
    public CommitOp getCommitOp();
    public long getTimestamp();
    public List<Uuid> getSnapshotList();
    public int getSnapshotCount();
    public String getExpression();
    public String getDomain();
    public static Builder newBuilder();
}

public class DataCommitInfo {
    public String getTableId();
    public String getPartitionDesc();
    public Uuid getCommitId();
    public List<DataFileOp> getFileOpsList();
    public CommitOp getCommitOp();
    public long getTimestamp();
    public boolean getCommitted();
    public String getDomain();
    public static Builder newBuilder();
}

public class MetaInfo {
    public List<PartitionInfo> getListPartitionList();
    public TableInfo getTableInfo();
    public List<PartitionInfo> getReadPartitionInfoList();
    public static Builder newBuilder();
}

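Entities follow the standard protobuf builder pattern. A sketch; the setter names below are assumed from protobuf conventions (each getter listed above has a matching setter on the Builder).

TableInfo tableInfo = TableInfo.newBuilder()
        .setTableId("table_id_123")       // setter names assumed from protobuf conventions
        .setTableNamespace("my_namespace")
        .setTableName("my_table")
        .setTablePath("/path/to/table")
        .build();
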
See docs/entity-models.md for full details.

High-Performance Native Operations

JNR-FFI based integration with Rust components for high-performance metadata operations, including built-in connection pooling and retry logic.

public class NativeMetadataJavaClient implements AutoCloseable {
    public static NativeMetadataJavaClient getInstance();
    public static Integer insert(NativeUtils.CodedDaoType insertType, JniWrapper jniWrapper);
    public static JniWrapper query(NativeUtils.CodedDaoType queryType, List<String> params);
    public static Integer update(NativeUtils.CodedDaoType updateType, List<String> params);
    public static List<String> queryScalar(NativeUtils.CodedDaoType queryScalarType, List<String> params);
    public void close();
}

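A sketch of the static call pattern. The CodedDaoType constant shown here is hypothetical; consult NativeUtils.CodedDaoType for the real DAO operation codes.

// Hypothetical constant name -- see NativeUtils.CodedDaoType for actual values
JniWrapper result = NativeMetadataJavaClient.query(
        NativeUtils.CodedDaoType.SelectTableInfoByTableId,
        java.util.Collections.singletonList("table_id_123"));
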
See docs/native-operations.md for full details.

Authorization and Security

Role-Based Access Control (RBAC) system using Casbin for fine-grained authorization policies with AOP-based enforcement.

public class AuthZEnforcer {
    public static SyncedEnforcer get();
    public static boolean authZEnabled();
}

@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface AuthZ {
    String value() default "";
    String object() default "object";
    String action() default "action";
}

public class AuthZContext {
    public static AuthZContext getInstance();
    public String getDomain();
    public void setDomain(String domain);
    public String getSubject();
    public void setSubject(String subject);
}

public class AuthZException extends RuntimeException {
    public AuthZException();
}

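A sketch of how the pieces combine, assuming the AspectJ aspect intercepts @AuthZ-annotated methods; the service method is hypothetical.

// Identify the caller before protected operations
AuthZContext.getInstance().setSubject("alice");
AuthZContext.getInstance().setDomain("default");

// Hypothetical service method guarded via AOP
@AuthZ(object = "table", action = "read")
public TableInfo readTable(String tablePath) {
    return new DBManager().getTableInfoByPath(tablePath);
}

// Callers should be prepared for denial
try {
    TableInfo info = readTable("/path/to/table");
} catch (AuthZException e) {
    // "lakesoul access denied!"
}
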
See docs/authorization-security.md for full details.

Configuration Management

Centralized configuration management for database connections, authorization settings, and operational parameters.

public class GlobalConfig {
    public static GlobalConfig get();
    public boolean isAuthZEnabled();
    public void setAuthZEnabled(boolean enabled);
    public String getAuthZCasbinModel();
}

public abstract class DBConfig {
    public static final String LAKESOUL_DEFAULT_NAMESPACE = "default";
    public static final String LAKESOUL_RANGE_PARTITION_SPLITTER = ",";
    public static final int MAX_COMMIT_ATTEMPTS = 3;
}

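A short sketch of reading configuration at startup; all calls below come from the listings above.

// Global, singleton-backed settings
if (GlobalConfig.get().isAuthZEnabled()) {
    String casbinModel = GlobalConfig.get().getAuthZCasbinModel();
    // hand the model to the Casbin enforcer ...
}

// Constants from DBConfig
String defaultNs = DBConfig.LAKESOUL_DEFAULT_NAMESPACE;  // "default"
int maxAttempts = DBConfig.MAX_COMMIT_ATTEMPTS;          // 3
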
See docs/configuration-management.md for full details.

Scala Functional API

High-level Scala objects providing functional programming style operations for data file management and metadata operations.

object DataOperation {
    def getTableDataInfo(tableId: String): Array[DataFileInfo]
    def getTableDataInfo(partitionList: util.List[PartitionInfo]): Array[DataFileInfo]
    def getTableDataInfo(partition_info_arr: Array[PartitionInfoScala]): Array[DataFileInfo]
    def getTableDataInfo(tableId: String, partitions: List[String]): Array[DataFileInfo]
    def getIncrementalPartitionDataInfo(table_id: String, partition_desc: String, 
                                       startTimestamp: Long, endTimestamp: Long, 
                                       readType: String): Array[DataFileInfo]
    def getSinglePartitionDataInfo(partition_info: PartitionInfoScala): ArrayBuffer[DataFileInfo]
    def getSinglePartitionDataInfo(table_id: String, partition_desc: String, 
                                  startTimestamp: Long, endTimestamp: Long): ArrayBuffer[DataFileInfo]
}

object MetaVersion {
    def createNamespace(namespace: String): Unit
    def listNamespaces(): Array[String]
    def isTableExists(table_name: String): Boolean
    def isTableIdExists(table_name: String, table_id: String): Boolean
    def isNamespaceExists(table_namespace: String): Boolean
    def isShortTableNameExists(short_table_name: String, table_namespace: String): (Boolean, String)
    def createNewTable(table_namespace: String, table_path: String, short_table_name: String,
                      table_id: String, table_schema: String, range_column: String,
                      hash_column: String, configuration: Map[String, String], bucket_num: Int): Unit
    def getAllPartitionInfo(table_id: String): util.List[PartitionInfo]
    def getAllPartitionInfoScala(table_id: String): Array[PartitionInfoScala]
}

// Scala-specific data file information
case class DataFileInfo(range_partitions: String, path: String, file_op: String, size: Long,
                       modification_time: Long = -1L, file_exist_cols: String = "") {
    lazy val file_bucket_id: Int
    lazy val range_version: String
    def expire(deleteTime: Long): DataFileInfo
}

// Scala wrapper for PartitionInfo
case class PartitionInfoScala(table_id: String, partition_desc: String, version: Int,
                             commit_op: String, timestamp: Long, snapshot: Array[String],
                             expression: String = "")

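Because DataOperation and MetaVersion are Scala objects, they compile to classes with static forwarders, so Java code can call them directly. A sketch under that assumption; case-class fields are exposed to Java as accessor methods.

// Scala `object` members are reachable from Java as static methods
String[] namespaces = MetaVersion.listNamespaces();
DataFileInfo[] files = DataOperation.getTableDataInfo("table_id_123");
for (DataFileInfo file : files) {
    System.out.println(file.path() + " (" + file.size() + " bytes)");  // case-class accessors
}
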
See docs/scala-functional-api.md for full details.

Types

// Commit operation types
public enum CommitOp {
    CompactionCommit,
    AppendCommit, 
    MergeCommit,
    UpdateCommit,
    DeleteCommit
}

// File operation types  
public enum FileOp {
    add,
    del
}

// Data file operation with metadata
public class DataFileOp {
    public String getPath();
    public FileOp getFileOp();
    public long getSize();
    public String getFileExistCols();
}

// Namespace information
public class Namespace {
    public String getNamespace();
    public String getProperties();
    public String getComment();
    public String getDomain();
    public static Builder newBuilder();
}

// JNI wrapper for native operations
public class JniWrapper {
    // Protobuf-serialized data for native calls
}

// Database property configuration
public class DataBaseProperty {
    public String getDriver();
    public String getUrl(); 
    public String getUsername();
    public String getPassword();
    public String getDbName();
    public String getHost();
    public int getPort();
    public void setDriver(String driver);
    public void setUrl(String url);
    // ... other setters
}

// UUID representation for protobuf compatibility
public class Uuid {
    public long getHigh();
    public long getLow();
    public static Builder newBuilder();
}
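
Uuid stores a 128-bit value as two longs. Assuming high and low map to the most and least significant bits, conversion to and from java.util.UUID is straightforward (Builder setter names assumed from protobuf conventions).

// Assumption: high/low correspond to the most/least significant bits
java.util.UUID toJavaUuid(Uuid u) {
    return new java.util.UUID(u.getHigh(), u.getLow());
}

Uuid fromJavaUuid(java.util.UUID u) {
    return Uuid.newBuilder()
            .setHigh(u.getMostSignificantBits())
            .setLow(u.getLeastSignificantBits())
            .build();
}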

Error Handling

The library throws several exception types for different error conditions:

  • AuthZException - Thrown when authorization fails, contains message "lakesoul access denied!"
  • IllegalStateException - Thrown for invalid operations, conflicts, or state inconsistencies
  • RuntimeException - General runtime errors from configuration or initialization issues
  • SQLException - Database operation errors from connection or query failures

Always handle these exceptions appropriately in your application code.
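
For example, a commit path can separate retryable conflicts from hard authorization failures. A sketch, assuming dbManager and metaInfo are already in scope and reusing MAX_COMMIT_ATTEMPTS from DBConfig:

for (int attempt = 0; attempt < DBConfig.MAX_COMMIT_ATTEMPTS; attempt++) {
    try {
        if (dbManager.commitData(metaInfo, false, CommitOp.AppendCommit)) {
            break;                   // commit succeeded
        }
    } catch (AuthZException e) {
        throw e;                     // not retryable: caller lacks permission
    } catch (IllegalStateException e) {
        // conflict or transient inconsistency: fall through and retry
    }
}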

Thread Safety

LakeSoul Common is designed for concurrent usage:

  • DBManager is thread-safe for concurrent metadata operations
  • Connection pooling is handled by HikariCP with configurable pool sizes
  • Native client includes built-in connection pooling and retry logic
  • Configuration classes use singleton pattern with thread-safe lazy initialization
  • All protobuf entity classes are immutable after construction
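
Since DBManager is safe for concurrent use, a single instance can be shared across worker threads. A minimal sketch:

import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

DBManager shared = new DBManager();  // one instance, many threads
ExecutorService pool = Executors.newFixedThreadPool(4);
for (String path : Arrays.asList("/path/to/t1", "/path/to/t2")) {
    pool.submit(() -> System.out.println(path + " exists: " + shared.isTableExists(path)));
}
pool.shutdown();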