# LakeSoul Common

LakeSoul Common provides core utilities and metadata management for the LakeSoul lakehouse framework. Its key features are PostgreSQL-backed metadata management with HikariCP connection pooling, role-based access control (RBAC) through Casbin authorization policies, protobuf-based serialization for cross-language compatibility, native integration with Rust components through JNR-FFI for high-performance operations, and AspectJ support for cross-cutting concerns.

The library is the foundational layer behind LakeSoul's ACID transactions, concurrent operations, and metadata consistency across the lakehouse ecosystem. It supports multiple computing engines, including Spark, Flink, and Presto, while maintaining unified metadata management and security policies.
## Package Information

- **Package Name**: lakesoul-common
- **Package Type**: maven
- **Language**: Java/Scala
- **Installation**: Add Maven dependency

```xml
<dependency>
    <groupId>com.dmetasoul</groupId>
    <artifactId>lakesoul-common</artifactId>
    <version>2.6.2</version>
</dependency>
```
## Core Imports

```java
import com.dmetasoul.lakesoul.meta.DBManager;
import com.dmetasoul.lakesoul.meta.GlobalConfig;
import com.dmetasoul.lakesoul.meta.DBConnector;
import com.dmetasoul.lakesoul.meta.entity.*;
```

For Scala components:

```scala
import com.dmetasoul.lakesoul.meta.{DataOperation, MetaVersion}
import com.dmetasoul.lakesoul.meta.DataFileInfo
```
37
## Basic Usage
38
39
```java
40
// Initialize database manager
41
DBManager dbManager = new DBManager();
42
43
// Check if table exists
44
boolean exists = dbManager.isTableExists("/path/to/table");
45
46
// Create namespace
47
dbManager.createNewNamespace("my_namespace", "{}", "Test namespace");
48
49
// List tables in namespace
50
List<String> tables = dbManager.listTablePathsByNamespace("my_namespace");
51
52
// Get table information
53
TableInfo tableInfo = dbManager.getTableInfoByPath("/path/to/table");
54
55
// Create a new table
56
dbManager.createNewTable(
57
"table_id_123",
58
"my_namespace",
59
"my_table",
60
"/path/to/table",
61
"schema_json",
62
new JSONObject(),
63
"partition_config"
64
);
65
```
66
## Architecture

LakeSoul Common is built around several key architectural components:

- **Metadata Management Layer**: Central `DBManager` class providing high-level operations for tables, partitions, and data commits
- **Database Connection Layer**: `DBConnector` with HikariCP connection pooling for the PostgreSQL backend
- **Entity Layer**: Complete protobuf-based entity system for type-safe metadata serialization
- **DAO Layer**: Data Access Objects for all metadata operations with transaction support
- **Native Integration Layer**: JNR-FFI based integration with high-performance Rust components
- **Authorization Layer**: RBAC system using Casbin for fine-grained access control
- **Configuration Layer**: Centralized configuration management with global and local settings
- **Scala API Layer**: High-level Scala objects for functional programming style operations
## Capabilities

### Metadata Management

Core metadata operations for tables, partitions, namespaces, and data commits. Provides the primary interface for all LakeSoul metadata operations.

```java { .api }
public class DBManager {
    public DBManager();
    public boolean isTableExists(String tablePath);
    public boolean isTableExistsByTableName(String tableName);
    public boolean isTableExistsByTableName(String tableName, String tableNamespace);
    public boolean isNamespaceExists(String tableNamespace);
    public void createNewTable(String tableId, String namespace, String tableName,
                               String tablePath, String tableSchema, JSONObject properties, String partitions);
    public TableInfo getTableInfoByTableId(String tableId);
    public TableInfo getTableInfoByName(String tableName);
    public TableInfo getTableInfoByNameAndNamespace(String tableName, String namespace);
    public TableInfo getTableInfoByPath(String tablePath);
    public List<TableInfo> getTableInfosByNamespace(String tableNamespace);
    public List<String> listTables();
    public List<String> listTableNamesByNamespace(String tableNamespace);
    public List<String> listTablePathsByNamespace(String tableNamespace);
    public PartitionInfo getSinglePartitionInfo(String tableId, String partitionDesc);
    public PartitionInfo getSinglePartitionInfo(String tableId, String partitionDesc, int version);
    public List<PartitionInfo> getAllPartitionInfo(String tableId);
    public void updateTableSchema(String tableId, String tableSchema);
    public void deleteTableInfo(String tablePath, String tableId, String tableNamespace);
    public boolean commitData(MetaInfo metaInfo, boolean changeSchema, CommitOp commitOp);
    public List<String> listNamespaces();
    public void createNewNamespace(String name, String properties, String comment);
}
```

[Metadata Management](./metadata-management.md)
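The lookup methods above compose into a typical read path. A minimal sketch, assuming a running metadata store; the table path is a placeholder:

```java
import com.dmetasoul.lakesoul.meta.DBManager;
import com.dmetasoul.lakesoul.meta.entity.PartitionInfo;
import com.dmetasoul.lakesoul.meta.entity.TableInfo;

import java.util.List;

public class ReadPathSketch {
    public static void main(String[] args) {
        DBManager dbManager = new DBManager();

        // Resolve the table by its path, then enumerate its partitions.
        if (dbManager.isTableExists("/path/to/table")) {
            TableInfo tableInfo = dbManager.getTableInfoByPath("/path/to/table");
            List<PartitionInfo> partitions =
                    dbManager.getAllPartitionInfo(tableInfo.getTableId());
            for (PartitionInfo p : partitions) {
                System.out.println(p.getPartitionDesc() + " @ version " + p.getVersion());
            }
        }
    }
}
```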
### Database Connection Management

PostgreSQL connection management with HikariCP connection pooling for scalable, efficient database operations.

```java { .api }
public class DBConnector {
    public static DataSource getDS();
    public static Connection getConn() throws SQLException;
    public static void closeAllConnections();
    public static void closeConn(Connection conn);
    public static void closeConn(Statement statement, Connection conn);
    public static void closeConn(ResultSet set, Statement statement, Connection conn);
}
```

[Database Connection Management](./database-connection.md)
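Since `closeConn` has overloads that also close the `Statement` and `ResultSet`, a raw query might be sketched as follows. The SQL is illustrative, and a reachable PostgreSQL metadata database is assumed:

```java
import com.dmetasoul.lakesoul.meta.DBConnector;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ConnectionSketch {
    public static void main(String[] args) {
        Connection conn = null;
        Statement stmt = null;
        ResultSet rs = null;
        try {
            conn = DBConnector.getConn();
            stmt = conn.createStatement();
            rs = stmt.executeQuery("SELECT 1");
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        } finally {
            // The three-argument overload closes the ResultSet and Statement,
            // then returns the Connection to the HikariCP pool.
            DBConnector.closeConn(rs, stmt, conn);
        }
    }
}
```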
### Entity Data Models

Comprehensive protobuf-based entity system providing type-safe data transfer objects for all metadata operations.

```java { .api }
// Core entities with builder patterns
public class TableInfo {
    public String getTableId();
    public String getTableNamespace();
    public String getTableName();
    public String getTablePath();
    public String getTableSchema();
    public String getProperties();
    public String getPartitions();
    public String getDomain();
    public static Builder newBuilder();
}

public class PartitionInfo {
    public String getTableId();
    public String getPartitionDesc();
    public int getVersion();
    public CommitOp getCommitOp();
    public long getTimestamp();
    public List<Uuid> getSnapshotList();
    public int getSnapshotCount();
    public String getExpression();
    public String getDomain();
    public static Builder newBuilder();
}

public class DataCommitInfo {
    public String getTableId();
    public String getPartitionDesc();
    public Uuid getCommitId();
    public List<DataFileOp> getFileOpsList();
    public CommitOp getCommitOp();
    public long getTimestamp();
    public boolean getCommitted();
    public String getDomain();
    public static Builder newBuilder();
}

public class MetaInfo {
    public List<PartitionInfo> getListPartitionList();
    public TableInfo getTableInfo();
    public List<PartitionInfo> getReadPartitionInfoList();
    public static Builder newBuilder();
}
```

[Entity Data Models](./entity-models.md)
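Because these entities are protobuf messages, instances are assembled through builders. A sketch with placeholder field values; the `setXxx` builder methods follow the standard protobuf naming convention, which is an assumption not spelled out in the API block above:

```java
import com.dmetasoul.lakesoul.meta.entity.TableInfo;

public class EntityBuilderSketch {
    public static void main(String[] args) {
        // Build an immutable TableInfo message via its builder.
        TableInfo tableInfo = TableInfo.newBuilder()
                .setTableId("table_id_123")
                .setTableNamespace("my_namespace")
                .setTableName("my_table")
                .setTablePath("/path/to/table")
                .build();
        System.out.println(tableInfo.getTableName());
    }
}
```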
### High-Performance Native Operations

JNR-FFI based integration with Rust components for high-performance metadata operations with connection pooling and retry logic.

```java { .api }
public class NativeMetadataJavaClient implements AutoCloseable {
    public static NativeMetadataJavaClient getInstance();
    public static Integer insert(NativeUtils.CodedDaoType insertType, JniWrapper jniWrapper);
    public static JniWrapper query(NativeUtils.CodedDaoType queryType, List<String> params);
    public static Integer update(NativeUtils.CodedDaoType updateType, List<String> params);
    public static List<String> queryScalar(NativeUtils.CodedDaoType queryScalarType, List<String> params);
    public void close();
}
```

[High-Performance Native Operations](./native-operations.md)
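Usage might look like the sketch below. The import paths, the `CodedDaoType` constant name, and the parameter list are all illustrative assumptions; only the static `query` signature comes from the API block above:

```java
// Import paths below are assumptions for illustration.
import com.dmetasoul.lakesoul.meta.jnr.NativeMetadataJavaClient;
import com.dmetasoul.lakesoul.meta.jnr.NativeUtils;
import com.dmetasoul.lakesoul.meta.entity.JniWrapper;

import java.util.Collections;

public class NativeQuerySketch {
    public static void main(String[] args) {
        // The singleton client manages its own native connection pool and
        // retry logic; callers just issue coded DAO requests.
        // The DAO type name here is a hypothetical placeholder.
        JniWrapper result = NativeMetadataJavaClient.query(
                NativeUtils.CodedDaoType.valueOf("SelectTableInfoByTablePath"),
                Collections.singletonList("/path/to/table"));
        System.out.println(result);
    }
}
```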
### Authorization and Security

Role-Based Access Control (RBAC) system using Casbin for fine-grained authorization policies with AOP-based enforcement.

```java { .api }
public class AuthZEnforcer {
    public static SyncedEnforcer get();
    public static boolean authZEnabled();
}

@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface AuthZ {
    String value() default "";
    String object() default "object";
    String action() default "action";
}

public class AuthZContext {
    public static AuthZContext getInstance();
    public String getDomain();
    public void setDomain(String domain);
    public String getSubject();
    public void setSubject(String subject);
}

public class AuthZException extends RuntimeException {
    public AuthZException();
}
```

[Authorization and Security](./authorization-security.md)
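With AOP-based enforcement, a metadata operation can be guarded declaratively via the `@AuthZ` annotation. A sketch; the `object`/`action` strings are illustrative, since the actual policy vocabulary depends on the configured Casbin model:

```java
public class NamespaceService {
    // When authorization is enabled, the AspectJ advice intercepts this
    // call, reads the subject/domain from AuthZContext, and checks the
    // tuple against the Casbin policy, throwing AuthZException on denial.
    @AuthZ(object = "namespace", action = "create")
    public void createNamespace(String name) {
        // ... metadata operation ...
    }
}
```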
### Configuration Management

Centralized configuration management for database connections, authorization settings, and operational parameters.

```java { .api }
public class GlobalConfig {
    public static GlobalConfig get();
    public boolean isAuthZEnabled();
    public void setAuthZEnabled(boolean enabled);
    public String getAuthZCasbinModel();
}

public abstract class DBConfig {
    public static final String LAKESOUL_DEFAULT_NAMESPACE = "default";
    public static final String LAKESOUL_RANGE_PARTITION_SPLITTER = ",";
    public static final int MAX_COMMIT_ATTEMPTS = 3;
}
```

[Configuration Management](./configuration-management.md)
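`LAKESOUL_RANGE_PARTITION_SPLITTER` implies that range-partition values are encoded as one comma-joined string. A self-contained sketch of that encoding; the `key=value` layout of each segment is an assumption for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class PartitionDesc {
    // Mirrors DBConfig.LAKESOUL_RANGE_PARTITION_SPLITTER above.
    static final String SPLITTER = ",";

    // Join individual range-partition segments into one descriptor string.
    static String join(List<String> parts) {
        return String.join(SPLITTER, parts);
    }

    // Split a descriptor string back into its segments.
    static List<String> split(String desc) {
        return Arrays.asList(desc.split(SPLITTER));
    }

    public static void main(String[] args) {
        String desc = join(Arrays.asList("date=2024-01-01", "region=cn"));
        System.out.println(desc);          // prints "date=2024-01-01,region=cn"
        System.out.println(split(desc).size()); // prints "2"
    }
}
```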
### Scala Functional API

High-level Scala objects providing functional programming style operations for data file management and metadata operations.

```scala { .api }
object DataOperation {
  def getTableDataInfo(tableId: String): Array[DataFileInfo]
  def getTableDataInfo(partitionList: util.List[PartitionInfo]): Array[DataFileInfo]
  def getTableDataInfo(partition_info_arr: Array[PartitionInfoScala]): Array[DataFileInfo]
  def getTableDataInfo(tableId: String, partitions: List[String]): Array[DataFileInfo]
  def getIncrementalPartitionDataInfo(table_id: String, partition_desc: String,
                                      startTimestamp: Long, endTimestamp: Long,
                                      readType: String): Array[DataFileInfo]
  def getSinglePartitionDataInfo(partition_info: PartitionInfoScala): ArrayBuffer[DataFileInfo]
  def getSinglePartitionDataInfo(table_id: String, partition_desc: String,
                                 startTimestamp: Long, endTimestamp: Long): ArrayBuffer[DataFileInfo]
}

object MetaVersion {
  def createNamespace(namespace: String): Unit
  def listNamespaces(): Array[String]
  def isTableExists(table_name: String): Boolean
  def isTableIdExists(table_name: String, table_id: String): Boolean
  def isNamespaceExists(table_namespace: String): Boolean
  def isShortTableNameExists(short_table_name: String, table_namespace: String): (Boolean, String)
  def createNewTable(table_namespace: String, table_path: String, short_table_name: String,
                     table_id: String, table_schema: String, range_column: String,
                     hash_column: String, configuration: Map[String, String], bucket_num: Int): Unit
  def getAllPartitionInfo(table_id: String): util.List[PartitionInfo]
  def getAllPartitionInfoScala(table_id: String): Array[PartitionInfoScala]
}

// Scala-specific data file information
case class DataFileInfo(range_partitions: String, path: String, file_op: String, size: Long,
                        modification_time: Long = -1L, file_exist_cols: String = "") {
  lazy val file_bucket_id: Int
  lazy val range_version: String
  def expire(deleteTime: Long): DataFileInfo
}

// Scala wrapper for PartitionInfo
case class PartitionInfoScala(table_id: String, partition_desc: String, version: Int,
                              commit_op: String, timestamp: Long, snapshot: Array[String],
                              expression: String = "")
```

[Scala Functional API](./scala-functional-api.md)
## Types

```java { .api }
// Commit operation types
public enum CommitOp {
    CompactionCommit,
    AppendCommit,
    MergeCommit,
    UpdateCommit,
    DeleteCommit
}

// File operation types
public enum FileOp {
    add,
    del
}

// Data file operation with metadata
public class DataFileOp {
    public String getPath();
    public FileOp getFileOp();
    public long getSize();
    public String getFileExistCols();
}

// Namespace information
public class Namespace {
    public String getNamespace();
    public String getProperties();
    public String getComment();
    public String getDomain();
    public static Builder newBuilder();
}

// JNI wrapper for native operations
public class JniWrapper {
    // Protobuf-serialized data for native calls
}

// Database property configuration
public class DataBaseProperty {
    public String getDriver();
    public String getUrl();
    public String getUsername();
    public String getPassword();
    public String getDbName();
    public String getHost();
    public int getPort();
    public void setDriver(String driver);
    public void setUrl(String url);
    // ... other setters
}

// UUID representation for protobuf compatibility
public class Uuid {
    public long getHigh();
    public long getLow();
    public static Builder newBuilder();
}
```
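The `Uuid` message stores a UUID as a pair of longs. The mapping to `java.util.UUID` can be sketched with the standard most/least-significant-bits split; that this is the exact correspondence to `getHigh()`/`getLow()` is an assumption:

```java
import java.util.UUID;

public class UuidSplit {
    // Hypothetical helper: split a java.util.UUID into the high/low
    // long pair used by the protobuf Uuid message.
    static long[] toHighLow(UUID id) {
        return new long[]{id.getMostSignificantBits(), id.getLeastSignificantBits()};
    }

    // Reassemble the pair back into a java.util.UUID.
    static UUID fromHighLow(long high, long low) {
        return new UUID(high, low);
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        long[] parts = toHighLow(original);
        UUID roundTrip = fromHighLow(parts[0], parts[1]);
        System.out.println(original.equals(roundTrip)); // prints "true"
    }
}
```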
## Error Handling

The library throws several exception types for different error conditions:

- `AuthZException` - Thrown when authorization fails; carries the message "lakesoul access denied!"
- `IllegalStateException` - Thrown for invalid operations, conflicts, or state inconsistencies
- `RuntimeException` - General runtime errors from configuration or initialization issues
- `SQLException` - Database operation errors from connection or query failures

Always handle these exceptions appropriately in your application code.
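In practice that means wrapping metadata calls in targeted catch blocks, ordered from most to least specific. A sketch; the import path for `AuthZException` is assumed, and the comments describe one reasonable handling policy rather than a library requirement:

```java
import com.dmetasoul.lakesoul.meta.DBManager;

public class ErrorHandlingSketch {
    public static void main(String[] args) {
        DBManager dbManager = new DBManager();
        try {
            dbManager.createNewNamespace("my_namespace", "{}", "Test namespace");
        } catch (AuthZException e) {
            // Authorization denied by the Casbin policy.
            System.err.println("access denied: " + e.getMessage());
        } catch (IllegalStateException e) {
            // Conflicting concurrent commit or inconsistent state; a caller
            // might retry, bounded by DBConfig.MAX_COMMIT_ATTEMPTS.
            System.err.println("conflict: " + e.getMessage());
        } catch (RuntimeException e) {
            // Configuration or initialization failure.
            System.err.println("runtime error: " + e.getMessage());
        }
    }
}
```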
## Thread Safety

LakeSoul Common is designed for concurrent usage:

- `DBManager` is thread-safe for concurrent metadata operations
- Connection pooling is handled by HikariCP with configurable pool sizes
- The native client includes built-in connection pooling and retry logic
- Configuration classes use the singleton pattern with thread-safe lazy initialization
- All protobuf entity classes are immutable after construction
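Because `DBManager` is thread-safe and connections are pooled, metadata operations can be issued from multiple threads against a single shared instance. A sketch; the pool size and task shape are illustrative:

```java
import com.dmetasoul.lakesoul.meta.DBManager;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentSketch {
    public static void main(String[] args) throws InterruptedException {
        // One shared DBManager; HikariCP hands each task its own
        // pooled connection under the hood.
        DBManager dbManager = new DBManager();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> System.out.println(dbManager.listNamespaces()));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```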