# Entity Data Models

The entity data model system provides comprehensive protobuf-based data transfer objects for all metadata operations in LakeSoul. These entities ensure type-safe serialization, cross-language compatibility, and efficient data transfer between components.

## Capabilities

### Core Entity Classes

The primary entity classes represent the fundamental data structures in the LakeSoul metadata system.

### TableInfo Entity

Represents complete metadata information for LakeSoul tables including schema, properties, and partition configuration.

```java { .api }
/**
 * Metadata information for LakeSoul tables
 * Immutable protobuf message with builder pattern
 */
public class TableInfo {
    /**
     * Get unique table identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace for organization and multi-tenancy
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get short table name for user-friendly reference
     * @return String table name (can be empty)
     */
    public String getTableName();

    /**
     * Get full storage path for table data
     * @return String absolute path to table storage location
     */
    public String getTablePath();

    /**
     * Get table schema definition
     * @return String JSON schema definition (Arrow or Spark format)
     */
    public String getTableSchema();

    /**
     * Get table properties as JSON string
     * @return String JSON object with key-value properties
     */
    public String getProperties();

    /**
     * Get partition configuration
     * @return String partition configuration (range and hash partitions)
     */
    public String getPartitions();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TableInfo builder
     * @return Builder instance for constructing TableInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing TableInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### PartitionInfo Entity

Contains version information and snapshot references for specific table range partitions.

```java { .api }
/**
 * Version information for specific table range partitions
 * Supports versioning, snapshots, and commit operations
 */
public class PartitionInfo {
    /**
     * Get table identifier this partition belongs to
     * @return String table ID
     */
    public String getTableId();

    /**
     * Get partition description/key
     * @return String partition identifier (e.g., "date=2023-01-01")
     */
    public String getPartitionDesc();

    /**
     * Get partition version number
     * @return int version number (incremental, starting from 0)
     */
    public int getVersion();

    /**
     * Get commit operation type for this partition version
     * @return CommitOp enum value (AppendCommit, CompactionCommit, etc.)
     */
    public CommitOp getCommitOp();

    /**
     * Get commit timestamp
     * @return long timestamp in UTC milliseconds
     */
    public long getTimestamp();

    /**
     * Get list of snapshot UUIDs for this partition version
     * @return List<Uuid> snapshot identifiers
     */
    public List<Uuid> getSnapshotList();

    /**
     * Get number of snapshots
     * @return int count of snapshots in this partition version
     */
    public int getSnapshotCount();

    /**
     * Get partition expression (for dynamic partitioning)
     * @return String expression used for partition filtering
     */
    public String getExpression();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new PartitionInfo builder
     * @return Builder instance for constructing PartitionInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing PartitionInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
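
The partition description returned by `getPartitionDesc()` is a plain string such as `"date=2023-01-01,hour=12"`. A minimal sketch of how a caller might split it into column/value pairs follows. `PartitionDescParser` is a hypothetical helper, not part of the LakeSoul API, and it assumes comma-separated `key=value` pairs whose values contain no commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the LakeSoul API. */
final class PartitionDescParser {
    private PartitionDescParser() {}

    /**
     * Splits a description such as "date=2023-01-01,hour=12" into
     * column/value pairs, keeping the order in which they appear.
     * Assumption: pairs are comma-separated and values contain no ','.
     */
    static Map<String, String> parse(String partitionDesc) {
        Map<String, String> columns = new LinkedHashMap<>();
        if (partitionDesc == null || partitionDesc.isEmpty()) {
            return columns;
        }
        for (String pair : partitionDesc.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Malformed partition pair: " + pair);
            }
            // Everything before the first '=' is the column, the rest is the value
            columns.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return columns;
    }
}
```

Calling `PartitionDescParser.parse(partitionInfo.getPartitionDesc())` would then give a map that can be queried per partition column.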

### DataCommitInfo Entity

Contains detailed information about data file operations for specific table partitions.

```java { .api }
/**
 * Data files commit information for specific table range partitions
 * Contains file operations, timestamps, and commit status
 */
public class DataCommitInfo {
    /**
     * Get table identifier
     * @return String table ID
     */
    public String getTableId();

    /**
     * Get partition description/key
     * @return String partition identifier
     */
    public String getPartitionDesc();

    /**
     * Get unique commit identifier
     * @return Uuid commit ID for this data operation
     */
    public Uuid getCommitId();

    /**
     * Get list of file operations in this commit
     * @return List<DataFileOp> file operations (add/delete)
     */
    public List<DataFileOp> getFileOpsList();

    /**
     * Get number of file operations
     * @return int count of file operations in this commit
     */
    public int getFileOpsCount();

    /**
     * Get commit operation type
     * @return CommitOp enum value
     */
    public CommitOp getCommitOp();

    /**
     * Get commit timestamp
     * @return long timestamp in UTC milliseconds
     */
    public long getTimestamp();

    /**
     * Get commit status
     * @return boolean true if commit is completed, false if pending
     */
    public boolean getCommitted();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new DataCommitInfo builder
     * @return Builder instance for constructing DataCommitInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing DataCommitInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### DataFileOp Entity

Represents individual file operations (add/delete) with metadata.

```java { .api }
/**
 * Single data file operation information
 * Represents add or delete operations on data files
 */
public class DataFileOp {
    /**
     * Get file path
     * @return String absolute path to data file
     */
    public String getPath();

    /**
     * Get file operation type
     * @return FileOp enum value (add or del)
     */
    public FileOp getFileOp();

    /**
     * Get file size in bytes
     * @return long file size
     */
    public long getSize();

    /**
     * Get existing columns information
     * @return String JSON array of column names present in file
     */
    public String getFileExistCols();

    /**
     * Create new DataFileOp builder
     * @return Builder instance for constructing DataFileOp
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing DataFileOp
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
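
Because each file operation carries both an operation type and a size, a commit's net effect on storage can be derived from its operation list. The sketch below illustrates that arithmetic with stand-in types: `FileOpSummary`, `FileChange`, and `Op` are hypothetical names that mirror the `path`, `fileOp`, and `size` fields documented above, so the block compiles without the generated classes.

```java
import java.util.List;

/** Stand-in types for illustration; these are not the generated LakeSoul classes. */
final class FileOpSummary {
    private FileOpSummary() {}

    enum Op { ADD, DEL }

    /** Mirrors the path, fileOp, and size fields of DataFileOp. */
    static final class FileChange {
        final String path;
        final Op op;
        final long size;

        FileChange(String path, Op op, long size) {
            this.path = path;
            this.op = op;
            this.size = size;
        }
    }

    /** Net change in bytes: sizes of added files minus sizes of deleted files. */
    static long netBytes(List<FileChange> changes) {
        long net = 0;
        for (FileChange change : changes) {
            net += (change.op == Op.ADD) ? change.size : -change.size;
        }
        return net;
    }
}
```

With the real entities, the same loop would iterate over `commitInfo.getFileOpsList()` and compare `getFileOp()` against `FileOp.add`.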

### Namespace Entity

Represents namespace metadata for table organization and multi-tenancy.

```java { .api }
/**
 * Namespace metadata for tables
 * Provides organization and multi-tenancy support
 */
public class Namespace {
    /**
     * Get namespace name
     * @return String namespace identifier
     */
    public String getNamespace();

    /**
     * Get namespace properties
     * @return String JSON object with namespace configuration
     */
    public String getProperties();

    /**
     * Get namespace comment/description
     * @return String optional description
     */
    public String getComment();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new Namespace builder
     * @return Builder instance for constructing Namespace
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing Namespace
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### Mapping Entities

Entities for managing table name and path to ID mappings.

```java { .api }
/**
 * Relationship between table namespace.name and table ID
 * Enables short name lookups for tables
 */
public class TableNameId {
    /**
     * Get table short name
     * @return String user-friendly table name
     */
    public String getTableName();

    /**
     * Get table unique identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TableNameId builder
     * @return Builder instance for constructing TableNameId
     */
    public static Builder newBuilder();
}

/**
 * Relationship between table namespace.path and table ID
 * Enables path-based lookups for tables
 */
public class TablePathId {
    /**
     * Get table storage path
     * @return String absolute path to table storage
     */
    public String getTablePath();

    /**
     * Get table unique identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TablePathId builder
     * @return Builder instance for constructing TablePathId
     */
    public static Builder newBuilder();
}
```

### UUID Entity

Protobuf-compatible UUID representation for cross-language compatibility.

```java { .api }
/**
 * UUID representation compatible with protobuf serialization
 * Stores UUID as high and low 64-bit values
 */
public class Uuid {
    /**
     * Get high 64 bits of UUID
     * @return long high order bits
     */
    public long getHigh();

    /**
     * Get low 64 bits of UUID
     * @return long low order bits
     */
    public long getLow();

    /**
     * Create new Uuid builder
     * @return Builder instance for constructing Uuid
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing Uuid
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
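
Since the message stores a UUID as two 64-bit halves, converting to and from `java.util.UUID` is a matter of splitting and rejoining those halves. The following sketch shows the round trip using only the JDK. `UuidBits` and `HighLow` are hypothetical stand-ins for a conversion helper and the generated `Uuid` message.

```java
import java.util.UUID;

/** Hypothetical helper; HighLow stands in for the generated Uuid message. */
final class UuidBits {
    private UuidBits() {}

    /** Holds the same two fields the Uuid entity exposes via getHigh()/getLow(). */
    static final class HighLow {
        final long high;
        final long low;

        HighLow(long high, long low) {
            this.high = high;
            this.low = low;
        }
    }

    /** Splits a java.util.UUID into its most and least significant 64 bits. */
    static HighLow split(UUID uuid) {
        return new HighLow(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
    }

    /** Rebuilds the java.util.UUID from the two halves. */
    static UUID join(HighLow bits) {
        return new UUID(bits.high, bits.low);
    }
}
```

With the generated class, `join` would read `uuid.getHigh()` and `uuid.getLow()` instead of the stand-in fields.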

### Collection Entities

Entities for batch operations and metadata collections.

```java { .api }
/**
 * Collection of partition information for one table
 * Used for batch operations and metadata exchange
 */
public class MetaInfo {
    /**
     * Get list of partitions to be committed
     * @return List<PartitionInfo> partitions for commit operation
     */
    public List<PartitionInfo> getListPartitionList();

    /**
     * Get number of partitions in commit
     * @return int count of partitions
     */
    public int getListPartitionCount();

    /**
     * Get table information
     * @return TableInfo table metadata
     */
    public TableInfo getTableInfo();

    /**
     * Get list of partitions read during operation
     * @return List<PartitionInfo> partitions accessed for read
     */
    public List<PartitionInfo> getReadPartitionInfoList();

    /**
     * Get number of read partitions
     * @return int count of read partitions
     */
    public int getReadPartitionInfoCount();

    /**
     * Create new MetaInfo builder
     * @return Builder instance for constructing MetaInfo
     */
    public static Builder newBuilder();
}

/**
 * Wrapper for JNI operations containing collections of various entities
 * Used for batch operations with native components
 */
public class JniWrapper {
    // Contains collections of all entity types for batch operations
    // Specific methods depend on protobuf generation

    /**
     * Create new JniWrapper builder
     * @return Builder instance for constructing JniWrapper
     */
    public static Builder newBuilder();
}
```

### Enumeration Types

Enumerations defining operation types and states.

```java { .api }
/**
 * Define specific operations for data commits
 * Determines how data changes are applied to partitions
 */
public enum CommitOp {
    /** Compaction operation - merge multiple files into fewer files */
    CompactionCommit,

    /** Append operation - add new data without modifying existing data */
    AppendCommit,

    /** Merge operation - combine new data with existing data using merge logic */
    MergeCommit,

    /** Update operation - modify existing data in-place */
    UpdateCommit,

    /** Delete operation - remove data or mark as deleted */
    DeleteCommit
}

/**
 * Define specific operations for files
 * Indicates whether file is being added or removed
 */
public enum FileOp {
    /** Add file operation - file is being added to partition */
    add,

    /** Delete file operation - file is being removed from partition */
    del
}
```

**Usage Examples:**

```java
import com.dmetasoul.lakesoul.meta.entity.*;
import com.alibaba.fastjson.JSONObject;

public class EntityUsageExample {

    public void createTableInfoExample() {
        // Create table info with builder pattern
        JSONObject properties = new JSONObject();
        properties.put("format", "parquet");
        properties.put("compression", "snappy");

        TableInfo tableInfo = TableInfo.newBuilder()
            .setTableId("tbl_001")
            .setTableNamespace("analytics")
            .setTableName("user_events")
            .setTablePath("/data/analytics/user_events")
            .setTableSchema("{\"type\":\"struct\",\"fields\":[...]}")
            .setProperties(properties.toJSONString())
            .setPartitions("date,hour")
            .setDomain("public")
            .build();

        System.out.println("Created table: " + tableInfo.getTableName());
        System.out.println("Schema: " + tableInfo.getTableSchema());
    }

    public void createPartitionInfoExample() {
        // Create partition info for new version
        PartitionInfo partitionInfo = PartitionInfo.newBuilder()
            .setTableId("tbl_001")
            .setPartitionDesc("date=2023-01-01,hour=12")
            .setVersion(5)
            .setCommitOp(CommitOp.AppendCommit)
            .setTimestamp(System.currentTimeMillis())
            .addSnapshot(createUuid())
            .setExpression("date >= '2023-01-01' AND hour = 12")
            .setDomain("public")
            .build();

        System.out.println("Partition version: " + partitionInfo.getVersion());
        System.out.println("Commit type: " + partitionInfo.getCommitOp());
        System.out.println("Snapshots: " + partitionInfo.getSnapshotCount());
    }

    public void createDataCommitInfoExample() {
        // Create data commit info with file operations
        DataFileOp addOp = DataFileOp.newBuilder()
            .setPath("/data/analytics/user_events/date=2023-01-01/hour=12/part-001.parquet")
            .setFileOp(FileOp.add)
            .setSize(1024000)
            .setFileExistCols("[\"user_id\",\"event_type\",\"timestamp\"]")
            .build();

        DataFileOp deleteOp = DataFileOp.newBuilder()
            .setPath("/data/analytics/user_events/date=2023-01-01/hour=12/part-000.parquet")
            .setFileOp(FileOp.del)
            .setSize(512000)
            .setFileExistCols("[\"user_id\",\"event_type\"]")
            .build();

        DataCommitInfo commitInfo = DataCommitInfo.newBuilder()
            .setTableId("tbl_001")
            .setPartitionDesc("date=2023-01-01,hour=12")
            .setCommitId(createUuid())
            .addFileOps(addOp)
            .addFileOps(deleteOp)
            .setCommitOp(CommitOp.CompactionCommit)
            .setTimestamp(System.currentTimeMillis())
            .setCommitted(true)
            .setDomain("public")
            .build();

        System.out.println("Commit ID: " + commitInfo.getCommitId());
        System.out.println("File operations: " + commitInfo.getFileOpsCount());
    }

    public void createMetaInfoExample() {
        // Create MetaInfo for batch operations
        TableInfo tableInfo = createTableInfo();
        PartitionInfo partition1 = createPartitionInfo("date=2023-01-01");
        PartitionInfo partition2 = createPartitionInfo("date=2023-01-02");

        MetaInfo metaInfo = MetaInfo.newBuilder()
            .setTableInfo(tableInfo)
            .addListPartition(partition1)
            .addListPartition(partition2)
            .addReadPartitionInfo(partition1) // Partition read during operation
            .build();

        System.out.println("Partitions to commit: " + metaInfo.getListPartitionCount());
        System.out.println("Partitions read: " + metaInfo.getReadPartitionInfoCount());
    }

    public void modifyEntityExample() {
        // Modify existing entity using builder
        TableInfo originalTable = createTableInfo();

        // Update table properties
        JSONObject newProps = new JSONObject();
        newProps.put("format", "delta");
        newProps.put("compression", "zstd");
        newProps.put("retention_days", "30");

        // toBuilder() copies the current values; originalTable is left unchanged
        TableInfo updatedTable = originalTable.toBuilder()
            .setProperties(newProps.toJSONString())
            .build();

        System.out.println("Original format: " +
            getPropertyValue(originalTable.getProperties(), "format"));
        System.out.println("Updated format: " +
            getPropertyValue(updatedTable.getProperties(), "format"));
    }

    // Helper used by the examples above: builds a minimal TableInfo
    private TableInfo createTableInfo() {
        return TableInfo.newBuilder()
            .setTableId("tbl_001")
            .setTableNamespace("analytics")
            .setTableName("user_events")
            .setTablePath("/data/analytics/user_events")
            .setProperties("{\"format\":\"parquet\"}")
            .setDomain("public")
            .build();
    }

    // Helper used by the examples above: builds a first-version PartitionInfo
    private PartitionInfo createPartitionInfo(String partitionDesc) {
        return PartitionInfo.newBuilder()
            .setTableId("tbl_001")
            .setPartitionDesc(partitionDesc)
            .setVersion(0)
            .setCommitOp(CommitOp.AppendCommit)
            .setTimestamp(System.currentTimeMillis())
            .addSnapshot(createUuid())
            .setDomain("public")
            .build();
    }

    private Uuid createUuid() {
        // Helper method to create UUID from the two 64-bit halves of a java.util.UUID
        java.util.UUID javaUuid = java.util.UUID.randomUUID();
        return Uuid.newBuilder()
            .setHigh(javaUuid.getMostSignificantBits())
            .setLow(javaUuid.getLeastSignificantBits())
            .build();
    }

    private String getPropertyValue(String propertiesJson, String key) {
        // Parse the JSON properties string and look up a single key
        JSONObject props = JSONObject.parseObject(propertiesJson);
        return props.getString(key);
    }
}
```

**Entity Relationships:**

The entities form a hierarchical relationship structure:

- **TableInfo** → Contains table metadata and references partitions
- **PartitionInfo** → Contains partition versions and references data commits via snapshots
- **DataCommitInfo** → Contains file operations and commit details
- **DataFileOp** → Contains individual file operation details
- **MetaInfo** → Aggregates table and partition information for batch operations
- **Namespace** → Provides organizational container for tables
- **TableNameId/TablePathId** → Provide lookup mappings for tables
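
The chain from partition to files described in the list above can be sketched as a lookup followed by a replay. This is only an illustration under two assumptions that the text does not spell out: that each snapshot UUID of a partition version matches the `commitId` of one data commit, and that replaying those commits in list order yields the files currently visible. `SnapshotResolver`, `Commit`, `FileChange`, and `Op` are hypothetical stand-ins, with `java.util.UUID` used in place of the `Uuid` message.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Stand-in types for illustration; these are not the generated LakeSoul classes. */
final class SnapshotResolver {
    private SnapshotResolver() {}

    enum Op { ADD, DEL }

    /** Mirrors the path and fileOp fields of DataFileOp. */
    static final class FileChange {
        final String path;
        final Op op;

        FileChange(String path, Op op) {
            this.path = path;
            this.op = op;
        }
    }

    /** Mirrors the commitId and fileOps fields of DataCommitInfo. */
    static final class Commit {
        final UUID commitId;
        final List<FileChange> fileOps;

        Commit(UUID commitId, List<FileChange> fileOps) {
            this.commitId = commitId;
            this.fileOps = fileOps;
        }
    }

    /**
     * Assumption: snapshot ids are commit ids. Replays the referenced commits
     * in order and returns the paths that were added and not later deleted.
     */
    static Set<String> liveFiles(List<UUID> snapshot, Map<UUID, Commit> commitsById) {
        Set<String> live = new LinkedHashSet<>();
        for (UUID id : snapshot) {
            Commit commit = commitsById.get(id);
            if (commit == null) {
                throw new IllegalStateException("Unknown commit: " + id);
            }
            for (FileChange change : commit.fileOps) {
                if (change.op == Op.ADD) {
                    live.add(change.path);
                } else {
                    live.remove(change.path);
                }
            }
        }
        return live;
    }
}
```

How commits are actually fetched by ID depends on the metadata client in use, which is outside the scope of the entity classes.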

**Builder Pattern Usage:**

All entities use the protobuf builder pattern for construction:

1. **Immutable Objects**: Entities are immutable after construction
2. **Builder Creation**: Use `newBuilder()` for new instances or `toBuilder()` for modifications
3. **Method Chaining**: Builders support method chaining for fluent API
4. **Validation**: Setters reject `null` arguments, and `build()` checks only the fields the schema marks as required; in a proto3 schema, which has no required fields, unset fields take their default values
5. **Type Safety**: Compile-time type checking for all fields and operations
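
To make the pattern concrete without depending on generated code, the sketch below hand-writes a tiny immutable class with the same `newBuilder()`/`toBuilder()`/`build()` shape. `TableRef` is a hypothetical stand-in with two string fields; the real entities are generated by the protobuf compiler and carry more behaviour than shown here.

```java
import java.util.Objects;

/** Hand-written stand-in that mimics the shape of a generated message. */
final class TableRef {
    private final String tableId;
    private final String properties;

    private TableRef(Builder builder) {
        this.tableId = builder.tableId;
        this.properties = builder.properties;
    }

    String getTableId() { return tableId; }

    String getProperties() { return properties; }

    /** Starts from default (empty) values. */
    static Builder newBuilder() { return new Builder(); }

    /** Starts from a copy of this instance's values; this instance is not modified. */
    Builder toBuilder() {
        return new Builder().setTableId(tableId).setProperties(properties);
    }

    static final class Builder {
        // Unset string fields default to the empty string, as in proto3
        private String tableId = "";
        private String properties = "";

        Builder setTableId(String value) {
            this.tableId = Objects.requireNonNull(value);
            return this; // returning the builder enables chaining
        }

        Builder setProperties(String value) {
            this.properties = Objects.requireNonNull(value);
            return this;
        }

        TableRef build() { return new TableRef(this); }
    }
}
```

Modifying a copy through `toBuilder()` leaves the original instance untouched, which is what makes sharing entities between components safe.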