# Entity Data Models

The entity data model system provides comprehensive protobuf-based data transfer objects for all metadata operations in LakeSoul. These entities ensure type-safe serialization, cross-language compatibility, and efficient data transfer between components.

## Capabilities

### Core Entity Classes

The primary entity classes represent the fundamental data structures in the LakeSoul metadata system.

### TableInfo Entity

Represents complete metadata information for LakeSoul tables including schema, properties, and partition configuration.

```java { .api }
/**
 * Metadata information for LakeSoul tables
 * Immutable protobuf message with builder pattern
 */
public class TableInfo {
    /**
     * Get unique table identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace for organization and multi-tenancy
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get short table name for user-friendly reference
     * @return String table name (can be empty)
     */
    public String getTableName();

    /**
     * Get full storage path for table data
     * @return String absolute path to table storage location
     */
    public String getTablePath();

    /**
     * Get table schema definition
     * @return String JSON schema definition (Arrow or Spark format)
     */
    public String getTableSchema();

    /**
     * Get table properties as JSON string
     * @return String JSON object with key-value properties
     */
    public String getProperties();

    /**
     * Get partition configuration
     * @return String partition configuration (range and hash partitions)
     */
    public String getPartitions();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TableInfo builder
     * @return Builder instance for constructing TableInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing TableInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### PartitionInfo Entity

Contains version information and snapshot references for specific table range partitions.

```java { .api }
/**
 * Version information for specific table range partitions
 * Supports versioning, snapshots, and commit operations
 */
public class PartitionInfo {
    /**
     * Get table identifier this partition belongs to
     * @return String table ID
     */
    public String getTableId();

    /**
     * Get partition description/key
     * @return String partition identifier (e.g., "date=2023-01-01")
     */
    public String getPartitionDesc();

    /**
     * Get partition version number
     * @return int version number (incremental, starting from 0)
     */
    public int getVersion();

    /**
     * Get commit operation type for this partition version
     * @return CommitOp enum value (AppendCommit, CompactionCommit, etc.)
     */
    public CommitOp getCommitOp();

    /**
     * Get commit timestamp
     * @return long timestamp in UTC milliseconds
     */
    public long getTimestamp();

    /**
     * Get list of snapshot UUIDs for this partition version
     * @return List<Uuid> snapshot identifiers
     */
    public List<Uuid> getSnapshotList();

    /**
     * Get number of snapshots
     * @return int count of snapshots in this partition version
     */
    public int getSnapshotCount();

    /**
     * Get partition expression (for dynamic partitioning)
     * @return String expression used for partition filtering
     */
    public String getExpression();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new PartitionInfo builder
     * @return Builder instance for constructing PartitionInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing PartitionInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
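
The partition descriptor returned by `getPartitionDesc()` is shown in this document as `key=value` text, with several keys joined by commas in the usage examples (`date=2023-01-01,hour=12`). The following is a minimal sketch of splitting such a string into an ordered map, assuming that format holds. `PartitionDescParser` is a hypothetical helper, not part of LakeSoul, and real descriptors may use other separators or special values for unpartitioned tables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of LakeSoul. */
final class PartitionDescParser {

    private PartitionDescParser() {}

    /**
     * Splits "k1=v1,k2=v2" into an insertion-ordered map.
     * Assumes keys and values contain no ','; only the first '=' in each
     * pair separates the key from the value.
     */
    static Map<String, String> parse(String partitionDesc) {
        Map<String, String> result = new LinkedHashMap<>();
        if (partitionDesc == null || partitionDesc.isEmpty()) {
            return result;
        }
        for (String pair : partitionDesc.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            result.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }
}
```

With the generated class, the input would typically come from `partitionInfo.getPartitionDesc()`.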

### DataCommitInfo Entity

Contains detailed information about data file operations for specific table partitions.

```java { .api }
/**
 * Data files commit information for specific table range partitions
 * Contains file operations, timestamps, and commit status
 */
public class DataCommitInfo {
    /**
     * Get table identifier
     * @return String table ID
     */
    public String getTableId();

    /**
     * Get partition description/key
     * @return String partition identifier
     */
    public String getPartitionDesc();

    /**
     * Get unique commit identifier
     * @return Uuid commit ID for this data operation
     */
    public Uuid getCommitId();

    /**
     * Get list of file operations in this commit
     * @return List<DataFileOp> file operations (add/delete)
     */
    public List<DataFileOp> getFileOpsList();

    /**
     * Get number of file operations
     * @return int count of file operations in this commit
     */
    public int getFileOpsCount();

    /**
     * Get commit operation type
     * @return CommitOp enum value
     */
    public CommitOp getCommitOp();

    /**
     * Get commit timestamp
     * @return long timestamp in UTC milliseconds
     */
    public long getTimestamp();

    /**
     * Get commit status
     * @return boolean true if commit is completed, false if pending
     */
    public boolean getCommitted();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new DataCommitInfo builder
     * @return Builder instance for constructing DataCommitInfo
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing DataCommitInfo
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### DataFileOp Entity

Represents individual file operations (add/delete) with metadata.

```java { .api }
/**
 * Single data file operation information
 * Represents add or delete operations on data files
 */
public class DataFileOp {
    /**
     * Get file path
     * @return String absolute path to data file
     */
    public String getPath();

    /**
     * Get file operation type
     * @return FileOp enum value (add or del)
     */
    public FileOp getFileOp();

    /**
     * Get file size in bytes
     * @return long file size
     */
    public long getSize();

    /**
     * Get existing columns information
     * @return String JSON array of column names present in file
     */
    public String getFileExistCols();

    /**
     * Create new DataFileOp builder
     * @return Builder instance for constructing DataFileOp
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing DataFileOp
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
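
Because each `DataFileOp` records either an `add` or a `del`, one way to reason about a partition's contents is to replay the operations in commit order and keep only the files that were added and not later deleted. The sketch below illustrates that idea with stand-in types: `FileOpKind` and `FileChange` are hypothetical substitutes for the generated `FileOp` and `DataFileOp` classes, so the example compiles without LakeSoul on the classpath. How LakeSoul itself resolves live files is not described in this document, so treat the replay rule as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-ins for FileOp / DataFileOp, for illustration only. */
final class FileOpReplay {

    enum FileOpKind { ADD, DEL }

    static final class FileChange {
        final String path;
        final FileOpKind op;
        final long size;

        FileChange(String path, FileOpKind op, long size) {
            this.path = path;
            this.op = op;
            this.size = size;
        }
    }

    private FileOpReplay() {}

    /** Replays changes in order; returns path -> size for files still present. */
    static Map<String, Long> liveFiles(List<FileChange> changes) {
        Map<String, Long> live = new LinkedHashMap<>();
        for (FileChange change : changes) {
            if (change.op == FileOpKind.ADD) {
                live.put(change.path, change.size);
            } else {
                live.remove(change.path);
            }
        }
        return live;
    }

    /** Sums the sizes of the files that remain after the replay. */
    static long totalBytes(Map<String, Long> live) {
        long sum = 0L;
        for (long size : live.values()) {
            sum += size;
        }
        return sum;
    }
}
```

With the generated classes, the same loop would read `getPath()`, `getFileOp()`, and `getSize()` from each element of `getFileOpsList()`.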

### Namespace Entity

Represents namespace metadata for table organization and multi-tenancy.

```java { .api }
/**
 * Namespace metadata for tables
 * Provides organization and multi-tenancy support
 */
public class Namespace {
    /**
     * Get namespace name
     * @return String namespace identifier
     */
    public String getNamespace();

    /**
     * Get namespace properties
     * @return String JSON object with namespace configuration
     */
    public String getProperties();

    /**
     * Get namespace comment/description
     * @return String optional description
     */
    public String getComment();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new Namespace builder
     * @return Builder instance for constructing Namespace
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing Namespace
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```

### Mapping Entities

Entities for managing table name and path to ID mappings.

```java { .api }
/**
 * Relationship between table namespace.name and table ID
 * Enables short name lookups for tables
 */
public class TableNameId {
    /**
     * Get table short name
     * @return String user-friendly table name
     */
    public String getTableName();

    /**
     * Get table unique identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TableNameId builder
     * @return Builder instance for constructing TableNameId
     */
    public static Builder newBuilder();
}

/**
 * Relationship between table namespace.path and table ID
 * Enables path-based lookups for tables
 */
public class TablePathId {
    /**
     * Get table storage path
     * @return String absolute path to table storage
     */
    public String getTablePath();

    /**
     * Get table unique identifier
     * @return String table ID (UUID format)
     */
    public String getTableId();

    /**
     * Get table namespace
     * @return String namespace name
     */
    public String getTableNamespace();

    /**
     * Get security domain for RBAC
     * @return String domain name for authorization
     */
    public String getDomain();

    /**
     * Create new TablePathId builder
     * @return Builder instance for constructing TablePathId
     */
    public static Builder newBuilder();
}
```
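
To make the purpose of the two mapping entities concrete, the sketch below keeps two in-memory indexes that resolve a table ID from either (namespace, name) or (namespace, path). `TableLookupIndex` and its methods are hypothetical and do not reflect how LakeSoul stores or queries these mappings. The sketch only shows the kind of lookup that `TableNameId` and `TablePathId` records make possible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory index; not how LakeSoul persists these mappings. */
final class TableLookupIndex {

    private final Map<String, String> idByName = new HashMap<>();
    private final Map<String, String> idByPath = new HashMap<>();

    /** Registers one table under both its short name and its storage path. */
    void register(String namespace, String tableName, String tablePath, String tableId) {
        if (!tableName.isEmpty()) { // the short name may be empty
            idByName.put(key(namespace, tableName), tableId);
        }
        idByPath.put(key(namespace, tablePath), tableId);
    }

    /** Looks up a table ID by namespace and short name. */
    Optional<String> findByName(String namespace, String tableName) {
        return Optional.ofNullable(idByName.get(key(namespace, tableName)));
    }

    /** Looks up a table ID by namespace and storage path. */
    Optional<String> findByPath(String namespace, String tablePath) {
        return Optional.ofNullable(idByPath.get(key(namespace, tablePath)));
    }

    // A NUL separator is assumed not to occur in namespaces, names, or paths.
    private static String key(String namespace, String value) {
        return namespace + '\0' + value;
    }
}
```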

### UUID Entity

Protobuf-compatible UUID representation for cross-language compatibility.

```java { .api }
/**
 * UUID representation compatible with protobuf serialization
 * Stores UUID as high and low 64-bit values
 */
public class Uuid {
    /**
     * Get high 64 bits of UUID
     * @return long high order bits
     */
    public long getHigh();

    /**
     * Get low 64 bits of UUID
     * @return long low order bits
     */
    public long getLow();

    /**
     * Create new Uuid builder
     * @return Builder instance for constructing Uuid
     */
    public static Builder newBuilder();

    /**
     * Create builder from existing Uuid
     * @return Builder instance initialized with current values
     */
    public Builder toBuilder();
}
```
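
The high/low pair lines up with the two halves exposed by `java.util.UUID`, so converting in either direction is a matter of splitting or rejoining two `long` values. The sketch below shows that round trip using only the JDK. `UuidBits` is a hypothetical helper, and with the generated class the two halves would come from `getHigh()` and `getLow()`.

```java
import java.util.UUID;

/** Hypothetical helper showing the high/low split; not part of LakeSoul. */
final class UuidBits {

    private UuidBits() {}

    /** Returns {high, low}: the most and least significant 64 bits. */
    static long[] split(UUID uuid) {
        return new long[] {
            uuid.getMostSignificantBits(),
            uuid.getLeastSignificantBits()
        };
    }

    /** Rebuilds a java.util.UUID from the two 64-bit halves. */
    static UUID join(long high, long low) {
        return new UUID(high, low);
    }
}
```

Either half can be negative when read as a signed `long`. That is expected and does not affect the round trip.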

### Collection Entities

Entities for batch operations and metadata collections.

```java { .api }
/**
 * Collection of partition information for one table
 * Used for batch operations and metadata exchange
 */
public class MetaInfo {
    /**
     * Get list of partitions to be committed
     * @return List<PartitionInfo> partitions for commit operation
     */
    public List<PartitionInfo> getListPartitionList();

    /**
     * Get number of partitions in commit
     * @return int count of partitions
     */
    public int getListPartitionCount();

    /**
     * Get table information
     * @return TableInfo table metadata
     */
    public TableInfo getTableInfo();

    /**
     * Get list of partitions read during operation
     * @return List<PartitionInfo> partitions accessed for read
     */
    public List<PartitionInfo> getReadPartitionInfoList();

    /**
     * Get number of read partitions
     * @return int count of read partitions
     */
    public int getReadPartitionInfoCount();

    /**
     * Create new MetaInfo builder
     * @return Builder instance for constructing MetaInfo
     */
    public static Builder newBuilder();
}

/**
 * Wrapper for JNI operations containing collections of various entities
 * Used for batch operations with native components
 */
public class JniWrapper {
    // Contains collections of all entity types for batch operations
    // Specific methods depend on protobuf generation

    /**
     * Create new JniWrapper builder
     * @return Builder instance for constructing JniWrapper
     */
    public static Builder newBuilder();
}
```

### Enumeration Types

Enumerations defining operation types and states.

```java { .api }
/**
 * Define specific operations for data commits
 * Determines how data changes are applied to partitions
 */
public enum CommitOp {
    /** Compaction operation - merge multiple files into fewer files */
    CompactionCommit,

    /** Append operation - add new data without modifying existing data */
    AppendCommit,

    /** Merge operation - combine new data with existing data using merge logic */
    MergeCommit,

    /** Update operation - modify existing data in-place */
    UpdateCommit,

    /** Delete operation - remove data or mark as deleted */
    DeleteCommit
}

/**
 * Define specific operations for files
 * Indicates whether file is being added or removed
 */
public enum FileOp {
    /** Add file operation - file is being added to partition */
    add,

    /** Delete file operation - file is being removed from partition */
    del
}
```

**Usage Examples:**

```java
import com.dmetasoul.lakesoul.meta.entity.*;
import com.alibaba.fastjson.JSONObject;

public class EntityUsageExample {

    public void createTableInfoExample() {
        // Create table info with builder pattern (see createTableInfo() below)
        TableInfo tableInfo = createTableInfo();

        System.out.println("Created table: " + tableInfo.getTableName());
        System.out.println("Schema: " + tableInfo.getTableSchema());
    }

    // Helper: builds the sample table reused by the other examples
    private TableInfo createTableInfo() {
        JSONObject properties = new JSONObject();
        properties.put("format", "parquet");
        properties.put("compression", "snappy");

        return TableInfo.newBuilder()
                .setTableId("tbl_001")
                .setTableNamespace("analytics")
                .setTableName("user_events")
                .setTablePath("/data/analytics/user_events")
                .setTableSchema("{\"type\":\"struct\",\"fields\":[...]}")
                .setProperties(properties.toJSONString())
                .setPartitions("date,hour")
                .setDomain("public")
                .build();
    }

    public void createPartitionInfoExample() {
        // Create partition info for new version
        PartitionInfo partitionInfo = PartitionInfo.newBuilder()
                .setTableId("tbl_001")
                .setPartitionDesc("date=2023-01-01,hour=12")
                .setVersion(5)
                .setCommitOp(CommitOp.AppendCommit)
                .setTimestamp(System.currentTimeMillis())
                .addSnapshot(createUuid())
                .setExpression("date >= '2023-01-01' AND hour = 12")
                .setDomain("public")
                .build();

        System.out.println("Partition version: " + partitionInfo.getVersion());
        System.out.println("Commit type: " + partitionInfo.getCommitOp());
        System.out.println("Snapshots: " + partitionInfo.getSnapshotCount());
    }

    // Helper: builds a first-version partition for the given descriptor
    private PartitionInfo createPartitionInfo(String partitionDesc) {
        return PartitionInfo.newBuilder()
                .setTableId("tbl_001")
                .setPartitionDesc(partitionDesc)
                .setVersion(0)
                .setCommitOp(CommitOp.AppendCommit)
                .setTimestamp(System.currentTimeMillis())
                .addSnapshot(createUuid())
                .setDomain("public")
                .build();
    }

    public void createDataCommitInfoExample() {
        // Create data commit info with file operations
        DataFileOp addOp = DataFileOp.newBuilder()
                .setPath("/data/analytics/user_events/date=2023-01-01/hour=12/part-001.parquet")
                .setFileOp(FileOp.add)
                .setSize(1024000)
                .setFileExistCols("[\"user_id\",\"event_type\",\"timestamp\"]")
                .build();

        DataFileOp deleteOp = DataFileOp.newBuilder()
                .setPath("/data/analytics/user_events/date=2023-01-01/hour=12/part-000.parquet")
                .setFileOp(FileOp.del)
                .setSize(512000)
                .setFileExistCols("[\"user_id\",\"event_type\"]")
                .build();

        DataCommitInfo commitInfo = DataCommitInfo.newBuilder()
                .setTableId("tbl_001")
                .setPartitionDesc("date=2023-01-01,hour=12")
                .setCommitId(createUuid())
                .addFileOps(addOp)
                .addFileOps(deleteOp)
                .setCommitOp(CommitOp.CompactionCommit)
                .setTimestamp(System.currentTimeMillis())
                .setCommitted(true)
                .setDomain("public")
                .build();

        System.out.println("Commit ID: " + commitInfo.getCommitId());
        System.out.println("File operations: " + commitInfo.getFileOpsCount());
    }

    public void createMetaInfoExample() {
        // Create MetaInfo for batch operations
        TableInfo tableInfo = createTableInfo();
        PartitionInfo partition1 = createPartitionInfo("date=2023-01-01");
        PartitionInfo partition2 = createPartitionInfo("date=2023-01-02");

        MetaInfo metaInfo = MetaInfo.newBuilder()
                .setTableInfo(tableInfo)
                .addListPartition(partition1)
                .addListPartition(partition2)
                .addReadPartitionInfo(partition1) // Partition read during operation
                .build();

        System.out.println("Partitions to commit: " + metaInfo.getListPartitionCount());
        System.out.println("Partitions read: " + metaInfo.getReadPartitionInfoCount());
    }

    public void modifyEntityExample() {
        // Modify existing entity using builder; the original instance is left unchanged
        TableInfo originalTable = createTableInfo();

        // Update table properties
        JSONObject newProps = new JSONObject();
        newProps.put("format", "delta");
        newProps.put("compression", "zstd");
        newProps.put("retention_days", "30");

        TableInfo updatedTable = originalTable.toBuilder()
                .setProperties(newProps.toJSONString())
                .build();

        System.out.println("Original format: " +
                getPropertyValue(originalTable.getProperties(), "format"));
        System.out.println("Updated format: " +
                getPropertyValue(updatedTable.getProperties(), "format"));
    }

    private Uuid createUuid() {
        // Helper method to create UUID from a random java.util.UUID
        java.util.UUID javaUuid = java.util.UUID.randomUUID();
        return Uuid.newBuilder()
                .setHigh(javaUuid.getMostSignificantBits())
                .setLow(javaUuid.getLeastSignificantBits())
                .build();
    }

    private String getPropertyValue(String propertiesJson, String key) {
        // Parse the JSON properties string and read a single key
        JSONObject props = JSONObject.parseObject(propertiesJson);
        return props.getString(key);
    }
}
```

**Entity Relationships:**

The entities form a hierarchical relationship structure:

- **TableInfo** → Contains table metadata and references partitions
- **PartitionInfo** → Contains partition versions and references data commits via snapshots
- **DataCommitInfo** → Contains file operations and commit details
- **DataFileOp** → Contains individual file operation details
- **MetaInfo** → Aggregates table and partition information for batch operations
- **Namespace** → Provides organizational container for tables
- **TableNameId/TablePathId** → Provide lookup mappings for tables

**Builder Pattern Usage:**

All entities use the protobuf builder pattern for construction:

1. **Immutable Objects**: Entities are immutable after construction
2. **Builder Creation**: Use `newBuilder()` for new instances or `toBuilder()` to derive a modified copy
3. **Method Chaining**: Builder setters return the builder, so calls can be chained into a fluent API
4. **Validation**: `build()` throws only when a field declared `required` in the schema is unset; other unset fields keep their default values, so validate field contents in application code
5. **Type Safety**: Compile-time type checking for all fields and operations
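
The shape of this pattern can be seen in a small hand-written class. The sketch below is not the generated protobuf code: `TableRef` and its two fields are hypothetical, chosen only to show how `newBuilder()`, chained setters, `build()`, and `toBuilder()` fit together while instances stay immutable.

```java
/** Hand-written stand-in that mimics the shape of a generated message class. */
final class TableRef {

    private final String tableId;
    private final String properties;

    private TableRef(Builder b) {
        this.tableId = b.tableId;
        this.properties = b.properties;
    }

    String getTableId() { return tableId; }

    String getProperties() { return properties; }

    /** Starts a builder with default (empty) values. */
    static Builder newBuilder() { return new Builder(); }

    /** Copies current values into a fresh builder; this instance is not changed. */
    Builder toBuilder() {
        return new Builder().setTableId(tableId).setProperties(properties);
    }

    static final class Builder {
        private String tableId = "";
        private String properties = "";

        // Each setter returns the builder so calls can be chained.
        Builder setTableId(String value) { this.tableId = value; return this; }

        Builder setProperties(String value) { this.properties = value; return this; }

        /** Produces a new immutable instance from the builder's current state. */
        TableRef build() { return new TableRef(this); }
    }
}
```

Calling `original.toBuilder().setProperties(...).build()` yields a second object with the new properties, while `original` keeps its old values. The `modifyEntityExample` method above relies on the same behavior.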