# Structured Records

`StructuredRecord` is a type-safe record instance that conforms to a defined schema. It acts as a runtime data container, is constructed through a builder, and provides specialized accessors for date/time logical types. CDAP applications use it for data pipeline processing, validation, and type-safe data manipulation.

## Capabilities

### Record Creation

Create structured records using the builder pattern with schema validation.

```java { .api }
/**
 * Create builder for constructing structured records
 * @param schema Record schema (must be RECORD type with at least one field)
 * @return Builder instance
 * @throws UnexpectedFormatException if schema is not a valid record schema
 */
public static StructuredRecord.Builder builder(Schema schema);
```

**Usage Example:**

```java
Schema schema = Schema.recordOf("Person",
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("age", Schema.of(Schema.Type.INT)),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN))
);

StructuredRecord.Builder builder = StructuredRecord.builder(schema);
```

### Data Access

Access field data from structured records with type safety and specialized accessors.

```java { .api }
/**
 * Get schema of the record
 * @return Record schema
 */
public Schema getSchema();

/**
 * Get value of a field (generic accessor)
 * @param fieldName Field name
 * @param <T> Expected type of field value
 * @return Field value or null
 */
public <T> T get(String fieldName);

/**
 * Get LocalDate from DATE logical type field
 * @param fieldName Date field name
 * @return LocalDate value or null
 * @throws UnexpectedFormatException if field is not DATE logical type
 */
public LocalDate getDate(String fieldName);

/**
 * Get LocalTime from TIME logical type field
 * @param fieldName Time field name
 * @return LocalTime value or null
 * @throws UnexpectedFormatException if field is not TIME_MILLIS or TIME_MICROS
 */
public LocalTime getTime(String fieldName);

/**
 * Get ZonedDateTime with UTC timezone from TIMESTAMP logical type field
 * @param fieldName Timestamp field name
 * @return ZonedDateTime value or null
 * @throws UnexpectedFormatException if field is not TIMESTAMP logical type
 */
public ZonedDateTime getTimestamp(String fieldName);

/**
 * Get ZonedDateTime with specified timezone from TIMESTAMP logical type field
 * @param fieldName Timestamp field name
 * @param zoneId Timezone for result
 * @return ZonedDateTime value or null
 * @throws UnexpectedFormatException if field is not TIMESTAMP logical type
 */
public ZonedDateTime getTimestamp(String fieldName, ZoneId zoneId);
```

**Usage Examples:**

```java
// JDK types used below; the CDAP imports (Schema, StructuredRecord) are omitted
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Create record with date/time fields
Schema schema = Schema.recordOf("Event",
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("count", Schema.of(Schema.Type.INT)),
    Schema.Field.of("eventDate", Schema.of(Schema.LogicalType.DATE)),
    Schema.Field.of("timestamp", Schema.of(Schema.LogicalType.TIMESTAMP_MILLIS))
);

StructuredRecord record = StructuredRecord.builder(schema)
    .set("name", "user_signup")
    .set("count", 42)
    .setDate("eventDate", LocalDate.of(2023, 6, 15))
    .setTimestamp("timestamp", ZonedDateTime.now())
    .build();

// Access field data
String name = record.get("name");                  // "user_signup"
Integer count = record.get("count");               // 42
LocalDate eventDate = record.getDate("eventDate"); // 2023-06-15
ZonedDateTime timestamp = record.getTimestamp("timestamp");

// Access with specific timezone
ZonedDateTime pstTime = record.getTimestamp("timestamp",
    ZoneId.of("America/Los_Angeles"));
```
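
The two `getTimestamp` overloads differ only in the zone attached to the result. The following is a minimal sketch using only `java.time`, assuming a TIMESTAMP_MILLIS field holds a count of milliseconds since the Unix epoch (as listed under Date/Time Type Storage). `TimestampZoneSketch` and `fromEpochMillis` are hypothetical names for illustration and are not part of CDAP.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration, not CDAP code: one stored instant viewed in two zones.
final class TimestampZoneSketch {

    // Interpret an epoch-millisecond count as a ZonedDateTime in the given zone.
    static ZonedDateTime fromEpochMillis(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone);
    }

    public static void main(String[] args) {
        long stored = 1_686_787_200_000L; // 2023-06-15T00:00:00Z

        ZonedDateTime utc = fromEpochMillis(stored, ZoneOffset.UTC);
        ZonedDateTime losAngeles = fromEpochMillis(stored, ZoneId.of("America/Los_Angeles"));

        System.out.println(utc);        // 2023-06-15T00:00Z
        System.out.println(losAngeles); // 2023-06-14T17:00-07:00[America/Los_Angeles]

        // Both values describe the same instant; only the wall-clock view differs.
        System.out.println(utc.toInstant().equals(losAngeles.toInstant())); // true
    }
}
```

Under that assumption, passing a `ZoneId` changes the local date and time shown, not the moment the record stores.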

## Builder Pattern

### Basic Field Assignment

Set field values with type checking and nullable field handling.

```java { .api }
/**
 * Set field to given value
 * @param fieldName Field name (must exist in schema)
 * @param value Field value (type must match schema)
 * @return Builder instance for chaining
 * @throws UnexpectedFormatException if field not in schema or invalid value
 */
public StructuredRecord.Builder set(String fieldName, Object value);
```

**Usage Examples:**

```java
Schema schema = Schema.recordOf("Product",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("price", Schema.nullableOf(Schema.of(Schema.Type.DOUBLE))),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN))
);

StructuredRecord product = StructuredRecord.builder(schema)
    .set("id", 12345L)
    .set("name", "Widget")
    .set("price", 29.99) // Can be null for nullable fields
    .set("active", true)
    .build();

// Nullable field example
StructuredRecord productNoPrice = StructuredRecord.builder(schema)
    .set("id", 67890L)
    .set("name", "Gadget")
    .set("price", null) // Explicitly set null
    .set("active", false)
    .build();
```

### Date and Time Field Assignment

Set date and time fields using Java 8 time types with automatic conversion to underlying storage format.

```java { .api }
/**
 * Set DATE logical type field
 * @param fieldName Field name (must be DATE logical type)
 * @param localDate Date value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not DATE type or date too large
 */
public StructuredRecord.Builder setDate(String fieldName, LocalDate localDate);

/**
 * Set TIME logical type field
 * @param fieldName Field name (must be TIME_MILLIS or TIME_MICROS)
 * @param localTime Time value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not TIME type or time too large
 */
public StructuredRecord.Builder setTime(String fieldName, LocalTime localTime);

/**
 * Set TIMESTAMP logical type field
 * @param fieldName Field name (must be TIMESTAMP_MILLIS or TIMESTAMP_MICROS)
 * @param zonedDateTime Timestamp value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not TIMESTAMP type or timestamp too large
 */
public StructuredRecord.Builder setTimestamp(String fieldName, ZonedDateTime zonedDateTime);
```

**Usage Examples:**

```java
// JDK types used below; the CDAP imports (Schema, StructuredRecord) are omitted
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZonedDateTime;

Schema eventSchema = Schema.recordOf("Event",
    Schema.Field.of("eventDate", Schema.of(Schema.LogicalType.DATE)),
    Schema.Field.of("eventTime", Schema.of(Schema.LogicalType.TIME_MILLIS)),
    Schema.Field.of("timestamp", Schema.of(Schema.LogicalType.TIMESTAMP_MILLIS))
);

LocalDate today = LocalDate.now();
LocalTime noon = LocalTime.of(12, 0, 0);
ZonedDateTime now = ZonedDateTime.now();

StructuredRecord event = StructuredRecord.builder(eventSchema)
    .setDate("eventDate", today)
    .setTime("eventTime", noon)
    .setTimestamp("timestamp", now)
    .build();

// Nullable date/time fields
Schema nullableEventSchema = Schema.recordOf("Event",
    Schema.Field.of("optionalDate", Schema.nullableOf(Schema.of(Schema.LogicalType.DATE)))
);

StructuredRecord eventWithNull = StructuredRecord.builder(nullableEventSchema)
    .setDate("optionalDate", null) // Null value for nullable field
    .build();
```

### String Conversion and Legacy Date Support

Convert string values to appropriate field types and handle legacy Date objects.

```java { .api }
/**
 * Convert string to field type and set value
 * @param fieldName Field name
 * @param strVal String value to convert
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails or field invalid
 */
public StructuredRecord.Builder convertAndSet(String fieldName, String strVal);

/**
 * Convert Date to field type and set value (deprecated)
 * @param fieldName Field name
 * @param date Date value
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails
 * @deprecated Use setDate, setTime, setTimestamp instead
 */
@Deprecated
public StructuredRecord.Builder convertAndSet(String fieldName, Date date);

/**
 * Convert Date with format to field type and set value (deprecated)
 * @param fieldName Field name
 * @param date Date value
 * @param dateFormat Format for string conversion
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails
 * @deprecated Use setDate, setTime, setTimestamp instead
 */
@Deprecated
public StructuredRecord.Builder convertAndSet(String fieldName, Date date, DateFormat dateFormat);
```

**Usage Examples:**

```java
Schema schema = Schema.recordOf("Data",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),
    Schema.Field.of("score", Schema.of(Schema.Type.DOUBLE)),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN)),
    Schema.Field.of("name", Schema.of(Schema.Type.STRING))
);

// String conversion automatically handles type conversion
StructuredRecord record = StructuredRecord.builder(schema)
    .convertAndSet("id", "12345")      // String "12345" -> Long 12345
    .convertAndSet("score", "98.5")    // String "98.5" -> Double 98.5
    .convertAndSet("active", "true")   // String "true" -> Boolean true
    .convertAndSet("name", "John Doe") // String -> String (no conversion)
    .build();

// Nullable field string conversion
Schema nullableSchema = Schema.recordOf("Data",
    Schema.Field.of("optionalValue", Schema.nullableOf(Schema.of(Schema.Type.INT)))
);

// The cast is required: a bare null is ambiguous between the
// (String, String) and (String, Date) overloads and does not compile
StructuredRecord withNull = StructuredRecord.builder(nullableSchema)
    .convertAndSet("optionalValue", (String) null) // null string -> null value
    .build();
```
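
Because the `Date` overloads are deprecated, code that still holds a `java.util.Date` would typically convert it to a `java.time` value first and then call `setDate` or `setTimestamp`. The following is a minimal sketch of that conversion using only the JDK. `LegacyDateSketch` and its methods are hypothetical helpers, not CDAP APIs, and the choice of UTC is an assumption that should match how the legacy values were produced.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Date;

// Hypothetical helpers, not part of CDAP: bridge java.util.Date to java.time.
final class LegacyDateSketch {

    // A Date is an instant; attaching a zone yields a value usable with setTimestamp.
    static ZonedDateTime toZonedDateTime(Date date, ZoneId zone) {
        return date.toInstant().atZone(zone);
    }

    // The calendar date depends on the zone, so the caller must choose one.
    static LocalDate toLocalDate(Date date, ZoneId zone) {
        return toZonedDateTime(date, zone).toLocalDate();
    }

    public static void main(String[] args) {
        Date legacy = new Date(1_686_787_200_000L); // 2023-06-15T00:00:00Z

        System.out.println(toZonedDateTime(legacy, ZoneOffset.UTC)); // 2023-06-15T00:00Z
        System.out.println(toLocalDate(legacy, ZoneOffset.UTC));     // 2023-06-15
    }
}
```

The converted values could then be passed to `setTimestamp` or `setDate` on the builder in place of the deprecated `convertAndSet(String, Date)` call.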

### Record Finalization

Build the final structured record with validation.

```java { .api }
/**
 * Build final StructuredRecord with validation
 * @return Completed StructuredRecord
 * @throws UnexpectedFormatException if non-nullable fields missing values
 */
public StructuredRecord build();
```

**Usage Example:**

```java
Schema schema = Schema.recordOf("User",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),                        // Required
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),                    // Required
    Schema.Field.of("email", Schema.nullableOf(Schema.of(Schema.Type.STRING))) // Optional
);

// Valid record - all required fields set
StructuredRecord validUser = StructuredRecord.builder(schema)
    .set("id", 123L)
    .set("name", "Alice")
    // email not set, but nullable so gets null value
    .build();

// Invalid record - missing required field will throw exception
try {
    StructuredRecord invalidUser = StructuredRecord.builder(schema)
        .set("id", 123L)
        // Missing required "name" field
        .build(); // Throws UnexpectedFormatException
} catch (UnexpectedFormatException e) {
    // Handle validation error
}
```

## Type Conversion Rules

### String to Type Conversion

The `convertAndSet(String fieldName, String strVal)` method supports automatic conversion:

- **BOOLEAN**: `Boolean.parseBoolean(strVal)`
- **INT**: `Integer.parseInt(strVal)`
- **LONG**: `Long.parseLong(strVal)`
- **FLOAT**: `Float.parseFloat(strVal)`
- **DOUBLE**: `Double.parseDouble(strVal)`
- **BYTES**: `Bytes.toBytesBinary(strVal)` (binary-escaped format)
- **STRING**: No conversion (direct assignment)
- **NULL**: Always returns null
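
The JDK parsing calls in this list have behavior worth knowing before relying on string conversion. The sketch below dispatches on a hypothetical `Kind` enum (not a CDAP type) and uses only the standard parsing methods named above. It leaves out BYTES because `Bytes.toBytesBinary` is a CDAP class, and it does not show how CDAP itself reports a failed parse.

```java
// Hypothetical illustration, not CDAP code: the JDK parsing calls from the list above.
final class StringConversionSketch {

    // Stand-in for the schema type of the target field.
    enum Kind { BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING }

    static Object convert(Kind kind, String strVal) {
        switch (kind) {
            case BOOLEAN: return Boolean.parseBoolean(strVal);
            case INT:     return Integer.parseInt(strVal);
            case LONG:    return Long.parseLong(strVal);
            case FLOAT:   return Float.parseFloat(strVal);
            case DOUBLE:  return Double.parseDouble(strVal);
            default:      return strVal; // STRING: direct assignment
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(Kind.LONG, "12345"));  // 12345
        System.out.println(convert(Kind.DOUBLE, "98.5")); // 98.5

        // parseBoolean never throws: anything other than "true" (ignoring case) is false.
        System.out.println(convert(Kind.BOOLEAN, "yes")); // false

        // Numeric parsers reject malformed input with NumberFormatException.
        try {
            convert(Kind.INT, "12.5");
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

One practical consequence is that a value such as `"yes"` or `"1"` for a BOOLEAN field silently becomes `false` under `Boolean.parseBoolean`, so inputs may need normalizing before conversion.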

### Nullable Field Handling

- Nullable fields (union with null) accept null values
- Non-nullable fields throw `UnexpectedFormatException` for null values
- In `build()`, unset nullable fields are given a null value, while unset required fields cause an `UnexpectedFormatException`
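
These three rules can be modeled in a few lines. The following is a simplified sketch, not the CDAP implementation: `NullableRulesSketch` is a hypothetical class, the schema is reduced to a map from field name to a nullable flag, and standard JDK exceptions stand in for `UnexpectedFormatException`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the nullable rules above; not CDAP code.
final class NullableRulesSketch {
    private final Map<String, Boolean> nullableByField; // field name -> is nullable
    private final Map<String, Object> values = new HashMap<>();

    NullableRulesSketch(Map<String, Boolean> nullableByField) {
        this.nullableByField = new LinkedHashMap<>(nullableByField);
    }

    NullableRulesSketch set(String field, Object value) {
        Boolean nullable = nullableByField.get(field);
        if (nullable == null) {
            throw new IllegalArgumentException("field " + field + " is not in the schema.");
        }
        if (value == null && !nullable) {
            throw new IllegalArgumentException("field " + field + " cannot be set to a null value.");
        }
        values.put(field, value);
        return this;
    }

    Map<String, Object> build() {
        Map<String, Object> result = new HashMap<>();
        for (Map.Entry<String, Boolean> entry : nullableByField.entrySet()) {
            String field = entry.getKey();
            if (values.containsKey(field)) {
                result.put(field, values.get(field));
            } else if (entry.getValue()) {
                result.put(field, null); // unset nullable field defaults to null
            } else {
                throw new IllegalStateException("Field " + field + " must contain a value.");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> schema = new LinkedHashMap<>();
        schema.put("id", false);   // required
        schema.put("email", true); // nullable

        Map<String, Object> user = new NullableRulesSketch(schema).set("id", 123L).build();
        System.out.println(user.get("id"));           // 123
        System.out.println(user.containsKey("email")); // true
        System.out.println(user.get("email"));         // null
    }
}
```

In this model, null checks on individual values happen in `set`, while the check for missing required fields is deferred to `build`, which matches the split described in the list.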

### Date/Time Type Storage

- **DATE**: Stored as INT (days since Unix epoch; the INT range covers dates millions of years past 1970, so the year-2038 limit of 32-bit second counters does not apply)
- **TIME_MILLIS**: Stored as INT (milliseconds since midnight)
- **TIME_MICROS**: Stored as LONG (microseconds since midnight)
- **TIMESTAMP_MILLIS**: Stored as LONG (milliseconds since Unix epoch)
- **TIMESTAMP_MICROS**: Stored as LONG (microseconds since Unix epoch)
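
The storage formats above map directly onto standard `java.time` calls. The sketch below shows one way to compute each stored value; `DateTimeStorageSketch` and its methods are hypothetical, and the `ArithmeticException` raised by `Math.toIntExact` only stands in for whatever range check CDAP performs, which this document does not detail.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration, not CDAP code: java.time values to stored numbers.
final class DateTimeStorageSketch {

    // DATE: days since the Unix epoch, narrowed to int with an overflow check.
    static int dateToInt(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    // TIME_MILLIS: milliseconds since midnight (always fits in an int).
    static int timeToMillis(LocalTime time) {
        return Math.toIntExact(time.toNanoOfDay() / 1_000_000L);
    }

    // TIME_MICROS: microseconds since midnight.
    static long timeToMicros(LocalTime time) {
        return time.toNanoOfDay() / 1_000L;
    }

    // TIMESTAMP_MILLIS: milliseconds since the Unix epoch.
    static long timestampToMillis(ZonedDateTime timestamp) {
        return timestamp.toInstant().toEpochMilli();
    }

    // TIMESTAMP_MICROS: microseconds since the Unix epoch, with overflow checks.
    static long timestampToMicros(ZonedDateTime timestamp) {
        Instant instant = timestamp.toInstant();
        long wholeSeconds = Math.multiplyExact(instant.getEpochSecond(), 1_000_000L);
        return Math.addExact(wholeSeconds, instant.getNano() / 1_000L);
    }

    public static void main(String[] args) {
        ZonedDateTime midnightUtc = ZonedDateTime.of(2023, 6, 15, 0, 0, 0, 0, ZoneOffset.UTC);

        System.out.println(dateToInt(LocalDate.of(2023, 6, 15))); // 19523
        System.out.println(timeToMillis(LocalTime.NOON));         // 43200000
        System.out.println(timeToMicros(LocalTime.NOON));         // 43200000000
        System.out.println(timestampToMillis(midnightUtc));       // 1686787200000
        System.out.println(timestampToMicros(midnightUtc));       // 1686787200000000

        // A date whose epoch-day count exceeds the int range is rejected.
        try {
            dateToInt(LocalDate.MAX);
        } catch (ArithmeticException e) {
            System.out.println("too large: " + e.getMessage());
        }
    }
}
```

Using the exact-arithmetic helpers rather than plain casts means an out-of-range value fails loudly instead of wrapping around to a wrong date.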

## Validation and Error Handling

### Field Validation

- Field names must exist in the record schema
- Field values must be compatible with schema types
- Null values only allowed for nullable (union with null) fields
- Date/time values must fit within storage type ranges

### Common Exceptions

```java { .api }
// Thrown when field not in schema
throw new UnexpectedFormatException("field " + fieldName + " is not in the schema.");

// Thrown when setting null to non-nullable field
throw new UnexpectedFormatException("field " + fieldName + " cannot be set to a null value.");

// Thrown when required field missing in build()
throw new UnexpectedFormatException("Field " + fieldName + " must contain a value.");

// Thrown when date/time value too large for storage type
throw new UnexpectedFormatException("Field " + fieldName + " was set to a date that is too large.");
```

## Performance Considerations

- StructuredRecord instances are immutable after construction
- Builder pattern allows efficient field-by-field construction
- Schema validation occurs during builder operations, not at record access time
- Date/time conversions handle precision and timezone conversions automatically
- Field access by name uses efficient hash-based lookup