# Structured Records

`StructuredRecord` is a type-safe runtime data container whose instances conform to a defined schema. Records are constructed through a builder and expose specialized accessors for date/time logical types. StructuredRecord is essential for data pipeline processing, validation, and type-safe data manipulation in CDAP applications.

## Capabilities

### Record Creation

Create structured records using the builder pattern with schema validation.

```java { .api }
/**
 * Create builder for constructing structured records
 * @param schema Record schema (must be RECORD type with at least one field)
 * @return Builder instance
 * @throws UnexpectedFormatException if schema is not a valid record schema
 */
public static StructuredRecord.Builder builder(Schema schema);
```

**Usage Example:**

```java
Schema schema = Schema.recordOf("Person",
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("age", Schema.of(Schema.Type.INT)),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN))
);

StructuredRecord.Builder builder = StructuredRecord.builder(schema);
```

### Data Access

Access field data from structured records with type safety and specialized accessors.

```java { .api }
/**
 * Get schema of the record
 * @return Record schema
 */
public Schema getSchema();

/**
 * Get value of a field (generic accessor)
 * @param fieldName Field name
 * @param <T> Expected type of field value
 * @return Field value or null
 */
public <T> T get(String fieldName);

/**
 * Get LocalDate from DATE logical type field
 * @param fieldName Date field name
 * @return LocalDate value or null
 * @throws UnexpectedFormatException if field is not DATE logical type
 */
public LocalDate getDate(String fieldName);

/**
 * Get LocalTime from TIME logical type field
 * @param fieldName Time field name
 * @return LocalTime value or null
 * @throws UnexpectedFormatException if field is not TIME_MILLIS or TIME_MICROS
 */
public LocalTime getTime(String fieldName);

/**
 * Get ZonedDateTime with UTC timezone from TIMESTAMP logical type field
 * @param fieldName Timestamp field name
 * @return ZonedDateTime value or null
 * @throws UnexpectedFormatException if field is not TIMESTAMP logical type
 */
public ZonedDateTime getTimestamp(String fieldName);

/**
 * Get ZonedDateTime with specified timezone from TIMESTAMP logical type field
 * @param fieldName Timestamp field name
 * @param zoneId Timezone for result
 * @return ZonedDateTime value or null
 * @throws UnexpectedFormatException if field is not TIMESTAMP logical type
 */
public ZonedDateTime getTimestamp(String fieldName, ZoneId zoneId);
```

**Usage Examples:**

```java
// Create record with date/time fields
Schema schema = Schema.recordOf("Event",
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("count", Schema.of(Schema.Type.INT)),
    Schema.Field.of("eventDate", Schema.of(Schema.LogicalType.DATE)),
    Schema.Field.of("timestamp", Schema.of(Schema.LogicalType.TIMESTAMP_MILLIS))
);

StructuredRecord record = StructuredRecord.builder(schema)
    .set("name", "user_signup")
    .set("count", 42)
    .setDate("eventDate", LocalDate.of(2023, 6, 15))
    .setTimestamp("timestamp", ZonedDateTime.now())
    .build();

// Access field data
String name = record.get("name");                   // "user_signup"
Integer count = record.get("count");                // 42
LocalDate eventDate = record.getDate("eventDate");  // 2023-06-15
ZonedDateTime timestamp = record.getTimestamp("timestamp");

// Access with specific timezone
ZonedDateTime pstTime = record.getTimestamp("timestamp",
    ZoneId.of("America/Los_Angeles"));
```
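
The two `getTimestamp` variants return the same instant viewed in different zones. The following is a minimal sketch of that idea using only `java.time`; it does not call CDAP, and the class `TimestampZoneSketch` and helper `fromEpochMillis` are hypothetical names introduced for illustration. It assumes the stored value is a count of milliseconds since the Unix epoch, as for a TIMESTAMP_MILLIS field.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Illustration only: plain java.time, not the CDAP implementation.
final class TimestampZoneSketch {

    // Hypothetical helper: view epoch milliseconds in the requested zone.
    static ZonedDateTime fromEpochMillis(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone);
    }

    public static void main(String[] args) {
        // Assumed stored value: noon UTC on 2023-06-15, as epoch milliseconds.
        long stored = ZonedDateTime.of(2023, 6, 15, 12, 0, 0, 0, ZoneOffset.UTC)
            .toInstant()
            .toEpochMilli();

        ZonedDateTime utc = fromEpochMillis(stored, ZoneOffset.UTC);
        ZonedDateTime losAngeles = fromEpochMillis(stored, ZoneId.of("America/Los_Angeles"));

        System.out.println(utc.getHour());        // 12
        System.out.println(losAngeles.getHour()); // 5 (daylight time, UTC-7)

        // Both views describe the same instant; only the wall-clock rendering differs.
        System.out.println(utc.toInstant().equals(losAngeles.toInstant())); // true
    }
}
```

Because only the zone changes, comparing two such values with `toInstant()` is safer than comparing their local date-time parts.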

## Builder Pattern

### Basic Field Assignment

Set field values with type checking and nullable field handling.

```java { .api }
/**
 * Set field to given value
 * @param fieldName Field name (must exist in schema)
 * @param value Field value (type must match schema)
 * @return Builder instance for chaining
 * @throws UnexpectedFormatException if field not in schema or invalid value
 */
public StructuredRecord.Builder set(String fieldName, Object value);
```

**Usage Examples:**

```java
Schema schema = Schema.recordOf("Product",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),
    Schema.Field.of("price", Schema.nullableOf(Schema.of(Schema.Type.DOUBLE))),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN))
);

StructuredRecord product = StructuredRecord.builder(schema)
    .set("id", 12345L)
    .set("name", "Widget")
    .set("price", 29.99)  // Can be null for nullable fields
    .set("active", true)
    .build();

// Nullable field example
StructuredRecord productNoPrice = StructuredRecord.builder(schema)
    .set("id", 67890L)
    .set("name", "Gadget")
    .set("price", null)   // Explicitly set null
    .set("active", false)
    .build();
```

### Date and Time Field Assignment

Set date and time fields using Java 8 time types with automatic conversion to underlying storage format.

```java { .api }
/**
 * Set DATE logical type field
 * @param fieldName Field name (must be DATE logical type)
 * @param localDate Date value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not DATE type or date too large
 */
public StructuredRecord.Builder setDate(String fieldName, LocalDate localDate);

/**
 * Set TIME logical type field
 * @param fieldName Field name (must be TIME_MILLIS or TIME_MICROS)
 * @param localTime Time value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not TIME type or time too large
 */
public StructuredRecord.Builder setTime(String fieldName, LocalTime localTime);

/**
 * Set TIMESTAMP logical type field
 * @param fieldName Field name (must be TIMESTAMP_MILLIS or TIMESTAMP_MICROS)
 * @param zonedDateTime Timestamp value
 * @return Builder instance
 * @throws UnexpectedFormatException if field is not TIMESTAMP type or timestamp too large
 */
public StructuredRecord.Builder setTimestamp(String fieldName, ZonedDateTime zonedDateTime);
```

**Usage Examples:**

```java
Schema eventSchema = Schema.recordOf("Event",
    Schema.Field.of("eventDate", Schema.of(Schema.LogicalType.DATE)),
    Schema.Field.of("eventTime", Schema.of(Schema.LogicalType.TIME_MILLIS)),
    Schema.Field.of("timestamp", Schema.of(Schema.LogicalType.TIMESTAMP_MILLIS))
);

LocalDate today = LocalDate.now();
LocalTime noon = LocalTime.of(12, 0, 0);
ZonedDateTime now = ZonedDateTime.now();

StructuredRecord event = StructuredRecord.builder(eventSchema)
    .setDate("eventDate", today)
    .setTime("eventTime", noon)
    .setTimestamp("timestamp", now)
    .build();

// Nullable date/time fields
Schema nullableEventSchema = Schema.recordOf("Event",
    Schema.Field.of("optionalDate", Schema.nullableOf(Schema.of(Schema.LogicalType.DATE)))
);

StructuredRecord eventWithNull = StructuredRecord.builder(nullableEventSchema)
    .setDate("optionalDate", null)  // Null value for nullable field
    .build();
```
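
To make the "automatic conversion to underlying storage format" concrete, here is a hedged sketch of how `java.time` values can be reduced to integer counts. It uses only the JDK and is not the CDAP implementation; `DateTimeStorageSketch` and its helper methods are hypothetical names. `Math.toIntExact` stands in for the range check that the setters describe as "too large".

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Illustration only: plain java.time arithmetic, not the CDAP setters.
final class DateTimeStorageSketch {

    // DATE-like encoding: days since the Unix epoch, narrowed to int.
    static int dateToEpochDays(LocalDate date) {
        return Math.toIntExact(date.toEpochDay()); // throws ArithmeticException on overflow
    }

    // TIME_MILLIS-like encoding: milliseconds since midnight, narrowed to int.
    static int timeToMillisOfDay(LocalTime time) {
        return Math.toIntExact(time.toNanoOfDay() / 1_000_000L);
    }

    // TIME_MICROS-like encoding: microseconds since midnight, kept as long.
    static long timeToMicrosOfDay(LocalTime time) {
        return time.toNanoOfDay() / 1_000L;
    }

    // TIMESTAMP_MILLIS-like encoding: milliseconds since the Unix epoch.
    static long timestampToEpochMillis(ZonedDateTime timestamp) {
        return timestamp.toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(dateToEpochDays(LocalDate.of(2023, 6, 15)));  // 19523
        System.out.println(timeToMillisOfDay(LocalTime.of(12, 0, 0)));   // 43200000
        System.out.println(timeToMicrosOfDay(LocalTime.of(12, 0, 0)));   // 43200000000
        System.out.println(timestampToEpochMillis(
            ZonedDateTime.of(1970, 1, 1, 0, 0, 1, 0, ZoneOffset.UTC)));  // 1000
    }
}
```

Note that the millisecond encodings truncate any sub-millisecond part of the input, so a round trip through such a field may not return the exact original value.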

### String Conversion and Legacy Date Support

Convert string values to appropriate field types and handle legacy Date objects.

```java { .api }
/**
 * Convert string to field type and set value
 * @param fieldName Field name
 * @param strVal String value to convert
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails or field invalid
 */
public StructuredRecord.Builder convertAndSet(String fieldName, String strVal);

/**
 * Convert Date to field type and set value (deprecated)
 * @param fieldName Field name
 * @param date Date value
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails
 * @deprecated Use setDate, setTime, setTimestamp instead
 */
@Deprecated
public StructuredRecord.Builder convertAndSet(String fieldName, Date date);

/**
 * Convert Date with format to field type and set value (deprecated)
 * @param fieldName Field name
 * @param date Date value
 * @param dateFormat Format for string conversion
 * @return Builder instance
 * @throws UnexpectedFormatException if conversion fails
 * @deprecated Use setDate, setTime, setTimestamp instead
 */
@Deprecated
public StructuredRecord.Builder convertAndSet(String fieldName, Date date, DateFormat dateFormat);
```

**Usage Examples:**

```java
Schema schema = Schema.recordOf("Data",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),
    Schema.Field.of("score", Schema.of(Schema.Type.DOUBLE)),
    Schema.Field.of("active", Schema.of(Schema.Type.BOOLEAN)),
    Schema.Field.of("name", Schema.of(Schema.Type.STRING))
);

// String conversion automatically handles type conversion
StructuredRecord record = StructuredRecord.builder(schema)
    .convertAndSet("id", "12345")       // String "12345" -> Long 12345
    .convertAndSet("score", "98.5")     // String "98.5" -> Double 98.5
    .convertAndSet("active", "true")    // String "true" -> Boolean true
    .convertAndSet("name", "John Doe")  // String -> String (no conversion)
    .build();

// Nullable field string conversion
Schema nullableSchema = Schema.recordOf("Data",
    Schema.Field.of("optionalValue", Schema.nullableOf(Schema.of(Schema.Type.INT)))
);

StructuredRecord withNull = StructuredRecord.builder(nullableSchema)
    // The cast selects the String overload; a bare null is ambiguous
    // between convertAndSet(String, String) and convertAndSet(String, Date).
    .convertAndSet("optionalValue", (String) null)  // null string -> null value
    .build();
```

### Record Finalization

Build the final structured record with validation.

```java { .api }
/**
 * Build final StructuredRecord with validation
 * @return Completed StructuredRecord
 * @throws UnexpectedFormatException if non-nullable fields missing values
 */
public StructuredRecord build();
```

**Usage Example:**

```java
Schema schema = Schema.recordOf("User",
    Schema.Field.of("id", Schema.of(Schema.Type.LONG)),                         // Required
    Schema.Field.of("name", Schema.of(Schema.Type.STRING)),                     // Required
    Schema.Field.of("email", Schema.nullableOf(Schema.of(Schema.Type.STRING)))  // Optional
);

// Valid record - all required fields set
StructuredRecord validUser = StructuredRecord.builder(schema)
    .set("id", 123L)
    .set("name", "Alice")
    // email not set, but nullable so gets null value
    .build();

// Invalid record - missing required field will throw exception
try {
    StructuredRecord invalidUser = StructuredRecord.builder(schema)
        .set("id", 123L)
        // Missing required "name" field
        .build();  // Throws UnexpectedFormatException
} catch (UnexpectedFormatException e) {
    // Handle validation error
}
```

## Type Conversion Rules

### String to Type Conversion

The `convertAndSet(String fieldName, String strVal)` method supports automatic conversion:

- **BOOLEAN**: `Boolean.parseBoolean(strVal)`
- **INT**: `Integer.parseInt(strVal)`
- **LONG**: `Long.parseLong(strVal)`
- **FLOAT**: `Float.parseFloat(strVal)`
- **DOUBLE**: `Double.parseDouble(strVal)`
- **BYTES**: `Bytes.toBytesBinary(strVal)` (binary-escaped format)
- **STRING**: No conversion (direct assignment)
- **NULL**: Always returns null
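
The mapping above can be sketched as a simple dispatch on the target type. This is an illustration built on JDK parse methods only, not the CDAP source: the class `StringConversionSketch`, the enum `FieldType`, and the method `convert` are hypothetical names, and BYTES is left out because `Bytes.toBytesBinary` belongs to CDAP rather than the JDK.

```java
// Illustration only: mirrors the JDK parse calls listed above.
final class StringConversionSketch {

    // Hypothetical stand-in for the schema's simple types (BYTES omitted).
    enum FieldType { BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING, NULL }

    // Convert a string to the boxed Java value for the given target type.
    static Object convert(FieldType type, String strVal) {
        switch (type) {
            case BOOLEAN:
                return Boolean.parseBoolean(strVal);
            case INT:
                return Integer.parseInt(strVal);
            case LONG:
                return Long.parseLong(strVal);
            case FLOAT:
                return Float.parseFloat(strVal);
            case DOUBLE:
                return Double.parseDouble(strVal);
            case STRING:
                return strVal;
            case NULL:
                return null;
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(FieldType.LONG, "12345"));   // 12345
        System.out.println(convert(FieldType.DOUBLE, "98.5"));  // 98.5
        // parseBoolean never throws: anything other than "true"
        // (case-insensitive) silently becomes false.
        System.out.println(convert(FieldType.BOOLEAN, "yes"));  // false
    }
}
```

One practical consequence of relying on `Boolean.parseBoolean` is that malformed boolean input is not reported as an error, whereas malformed numeric input raises `NumberFormatException` from the JDK parse methods.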
339
340
### Nullable Field Handling
341
342
- Nullable fields (union with null) accept null values
343
- Non-nullable fields throw `UnexpectedFormatException` for null values
344
- Missing fields in build() get null for nullable fields or throw exception for required fields
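
These three rules can be modeled with a small stand-alone builder. The sketch below is a simplified analogy, not the CDAP `StructuredRecord.Builder`: `NullableRulesSketch` is a hypothetical class, the schema is reduced to a map from field name to a nullable flag, and standard JDK exceptions stand in for `UnexpectedFormatException`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a toy builder that applies the nullable rules above.
final class NullableRulesSketch {

    // Hypothetical schema stand-in: field name -> whether null is allowed.
    private final Map<String, Boolean> nullableByField;
    private final Map<String, Object> values = new HashMap<>();

    NullableRulesSketch(Map<String, Boolean> nullableByField) {
        this.nullableByField = new LinkedHashMap<>(nullableByField);
    }

    // Reject unknown fields and nulls for non-nullable fields at set time.
    NullableRulesSketch set(String field, Object value) {
        Boolean nullable = nullableByField.get(field);
        if (nullable == null) {
            throw new IllegalArgumentException("field " + field + " is not in the schema");
        }
        if (value == null && !nullable) {
            throw new IllegalArgumentException("field " + field + " cannot be null");
        }
        values.put(field, value);
        return this;
    }

    // Fill unset nullable fields with null; fail on unset required fields.
    Map<String, Object> build() {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> entry : nullableByField.entrySet()) {
            String field = entry.getKey();
            if (!values.containsKey(field) && !entry.getValue()) {
                throw new IllegalStateException("field " + field + " must contain a value");
            }
            result.put(field, values.get(field));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> schema = new LinkedHashMap<>();
        schema.put("id", false);
        schema.put("name", false);
        schema.put("email", true);

        // "email" is never set, so it is filled with null at build time.
        Map<String, Object> user = new NullableRulesSketch(schema)
            .set("id", 123L)
            .set("name", "Alice")
            .build();
        System.out.println(user.get("email") == null); // true
    }
}
```

Checking nulls in `set` and missing values in `build` means each mistake surfaces at the call that caused it, which is the same split the rules above describe.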
345
346
### Date/Time Type Storage
347
348
- **DATE**: Stored as INT (days since Unix epoch, max value ~2038-01-01)
349
- **TIME_MILLIS**: Stored as INT (milliseconds since midnight)
350
- **TIME_MICROS**: Stored as LONG (microseconds since midnight)
351
- **TIMESTAMP_MILLIS**: Stored as LONG (milliseconds since Unix epoch)
352
- **TIMESTAMP_MICROS**: Stored as LONG (microseconds since Unix epoch)
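
A quick way to see why the time-of-day types use different widths is to compute the largest value each can hold. The sketch below relies only on `java.time` and is not CDAP code; `StorageRangeSketch` and its methods are hypothetical names chosen for this example.

```java
import java.time.LocalDate;
import java.time.LocalTime;

// Illustration only: checks which counts fit in a 32-bit int.
final class StorageRangeSketch {

    // Largest millisecond-of-day value (23:59:59.999).
    static long maxMillisOfDay() {
        return LocalTime.MAX.toNanoOfDay() / 1_000_000L;
    }

    // Largest microsecond-of-day value (23:59:59.999999).
    static long maxMicrosOfDay() {
        return LocalTime.MAX.toNanoOfDay() / 1_000L;
    }

    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(maxMillisOfDay());            // 86399999
        System.out.println(fitsInInt(maxMillisOfDay())); // true: INT is enough
        System.out.println(maxMicrosOfDay());            // 86399999999
        System.out.println(fitsInInt(maxMicrosOfDay())); // false: needs LONG

        // A day count stored as int spans far more than any practical date.
        System.out.println(LocalDate.ofEpochDay(Integer.MAX_VALUE).getYear());
    }
}
```

Milliseconds in a day stay below the int limit of 2,147,483,647, while microseconds in a day exceed it, which matches the INT and LONG choices listed above.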
353
354
## Validation and Error Handling
355
356
### Field Validation
357
358
- Field names must exist in the record schema
359
- Field values must be compatible with schema types
360
- Null values only allowed for nullable (union with null) fields
361
- Date/time values must fit within storage type ranges

### Common Exceptions

```java { .api }
// Thrown when field not in schema
throw new UnexpectedFormatException("field " + fieldName + " is not in the schema.");

// Thrown when setting null to non-nullable field
throw new UnexpectedFormatException("field " + fieldName + " cannot be set to a null value.");

// Thrown when required field missing in build()
throw new UnexpectedFormatException("Field " + fieldName + " must contain a value.");

// Thrown when date/time value too large for storage type
throw new UnexpectedFormatException("Field " + fieldName + " was set to a date that is too large.");
```

## Performance Considerations

- StructuredRecord instances are immutable after construction
- Builder pattern allows efficient field-by-field construction
- Schema validation occurs during builder operations, not at record access time
- Date/time setters and getters handle precision and timezone conversion automatically
- Field access by name uses efficient hash-based lookup