# Schema Utilities

Utility functions for schema traversal, cloning, and manipulation, plus a visitor pattern for custom schema processing.

## Capabilities

### Schemas Utility Class

Static utility methods for common schema operations.

```java { .api }
/**
 * Avro Schema utilities, to traverse and manipulate schemas.
 * All methods are static and thread-safe.
 */
public final class Schemas {

    /**
     * Copy aliases from one schema to another (only for named types: RECORD, ENUM, FIXED)
     * @param from Source schema
     * @param to Destination schema
     */
    public static void copyAliases(Schema from, Schema to);

    /**
     * Copy aliases from one field to another
     * @param from Source field
     * @param to Destination field
     */
    public static void copyAliases(Schema.Field from, Schema.Field to);

    /**
     * Copy logical types from one schema to another
     * @param from Source schema with logical type
     * @param to Destination schema
     */
    public static void copyLogicalTypes(Schema from, Schema to);

    /**
     * Copy properties from one JsonProperties object to another
     * @param from Source properties
     * @param to Destination properties
     */
    public static void copyProperties(JsonProperties from, JsonProperties to);

    /**
     * Check if schema generates a Java class (ENUM, RECORD, FIXED types)
     * @param schema Schema to check
     * @return true if schema generates Java class
     */
    public static boolean hasGeneratedJavaClass(Schema schema);

    /**
     * Get the Java class name for a schema
     * @param schema Schema to get class name for
     * @return Fully qualified Java class name
     */
    public static String getJavaClassName(Schema schema);

    /**
     * Depth-first visit of schema tree using visitor pattern
     * @param start Starting schema for traversal
     * @param visitor Visitor implementation
     * @return Result from visitor.get()
     */
    public static <T> T visit(Schema start, SchemaVisitor<T> visitor);
}
```

### SchemaVisitor Interface

Visitor pattern interface for custom schema processing and traversal.

```java { .api }
/**
 * Visitor pattern interface for schema traversal.
 * Generic type T represents the result type of the visit operation.
 */
public interface SchemaVisitor<T> {

    /**
     * Invoked for schemas that do not have "child" schemas (like string, int ...)
     * or for a previously encountered schema with children, which will be treated
     * as a terminal to avoid circular recursion.
     * @param terminal Terminal schema node
     * @return Action to control traversal
     */
    SchemaVisitorAction visitTerminal(Schema terminal);

    /**
     * Invoked for schema with children before proceeding to visit the children.
     * @param nonTerminal Non-terminal schema node
     * @return Action to control traversal
     */
    SchemaVisitorAction visitNonTerminal(Schema nonTerminal);

    /**
     * Invoked for schemas with children after its children have been visited.
     * @param nonTerminal Non-terminal schema node after visiting children
     * @return Action to control traversal
     */
    SchemaVisitorAction afterVisitNonTerminal(Schema nonTerminal);

    /**
     * Invoked when visiting is complete.
     * @return Final result of the visit operation
     */
    T get();
}
```

### SchemaVisitorAction Enum

Controls the flow of schema traversal operations.

```java { .api }
/**
 * Actions for controlling schema traversal behavior
 */
public enum SchemaVisitorAction {
    /** Continue normal traversal */
    CONTINUE,

    /** Skip the current subtree but continue with siblings */
    SKIP_SUBTREE,

    /** Skip remaining siblings at current level */
    SKIP_SIBLINGS,

    /** Terminate the entire traversal immediately */
    TERMINATE
}
```

### CloningVisitor Class

Built-in visitor implementation for cloning schemas with customizable property copying.

```java { .api }
/**
 * Visitor implementation for cloning schemas.
 * Creates a clone of the original Schema with docs and other nonessential
 * fields stripped by default. What attributes are copied is customizable.
 */
public final class CloningVisitor implements SchemaVisitor<Schema> {

    /** Interface for customizing property copying behavior */
    public interface PropertyCopier {
        void copy(Schema first, Schema second);
        void copy(Schema.Field first, Schema.Field second);
    }

    /**
     * Create cloning visitor that copies only serialization necessary fields
     * @param root Root schema to clone
     */
    public CloningVisitor(Schema root);

    /**
     * Create cloning visitor with custom property copying behavior
     * @param copyProperties Custom property copier
     * @param copyDocs Whether to copy documentation
     * @param root Root schema to clone
     */
    public CloningVisitor(PropertyCopier copyProperties, boolean copyDocs, Schema root);
}
```

### ResolvingVisitor Class

Visitor implementation for resolving schema references in IDL contexts.

```java { .api }
/**
 * Visitor for resolving unresolved schema references.
 * Used primarily in IDL processing contexts.
 */
public final class ResolvingVisitor implements SchemaVisitor<Schema> {

    /**
     * Create resolving visitor
     * @param root Root schema
     * @param replace Map of schemas to replace
     * @param symbolTable Function to resolve symbol names to schemas
     */
    public ResolvingVisitor(Schema root, IdentityHashMap<Schema, Schema> replace,
                            Function<String, Schema> symbolTable);
}
```

### IsResolvedSchemaVisitor Class

Visitor to check if a schema is fully resolved (no unresolved references).

```java { .api }
/**
 * Visitor that checks if the current schema is fully resolved.
 * Returns true if schema has no unresolved parts.
 */
public final class IsResolvedSchemaVisitor implements SchemaVisitor<Boolean> {

    /** Create visitor to check schema resolution status */
    IsResolvedSchemaVisitor();
}
```

## Usage Examples

### Basic Schema Utilities

```java
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.Schemas;

// Check if schema generates Java class
// (Schema.Parser.parse(File) throws IOException, so the caller must handle it)
Schema userSchema = new Schema.Parser().parse(new File("user.avsc"));
boolean hasClass = Schemas.hasGeneratedJavaClass(userSchema);  // true for RECORD

Schema stringSchema = Schema.create(Schema.Type.STRING);
boolean hasClass2 = Schemas.hasGeneratedJavaClass(stringSchema);  // false

// Get Java class name
String className = Schemas.getJavaClassName(userSchema);
// Returns: "com.example.User" (if namespace is com.example)
```

### Schema Property Copying

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.Schemas;

// Copy aliases between schemas
Schema originalSchema = /* ... */;
Schema clonedSchema = /* ... */;

Schemas.copyAliases(originalSchema, clonedSchema);
Schemas.copyLogicalTypes(originalSchema, clonedSchema);
Schemas.copyProperties(originalSchema, clonedSchema);
```

### Custom Schema Visitor

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.*;

// Count the record schemas in a complex schema, each record once
public class RecordCountingVisitor implements SchemaVisitor<Integer> {
    private int recordCount = 0;

    @Override
    public SchemaVisitorAction visitTerminal(Schema terminal) {
        // A RECORD arrives here only when it was encountered before (see the
        // visitTerminal contract above); it was already counted in
        // visitNonTerminal, so it is not counted a second time.
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction visitNonTerminal(Schema nonTerminal) {
        if (nonTerminal.getType() == Schema.Type.RECORD) {
            recordCount++;
        }
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction afterVisitNonTerminal(Schema nonTerminal) {
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public Integer get() {
        return recordCount;
    }
}

// Usage
Schema complexSchema = /* ... */;
Integer recordCount = Schemas.visit(complexSchema, new RecordCountingVisitor());
System.out.println("Found " + recordCount + " record schemas");
```

### Schema Name Collection Visitor

```java
import java.util.*;

import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.*;

// Collect the full names of all named schema types (RECORD, ENUM, FIXED)
public class NameCollectingVisitor implements SchemaVisitor<Set<String>> {
    private final Set<String> names = new HashSet<>();

    // Schema.getName() is never null: for unnamed types it returns the type
    // name (for example "string"), so the schema type is checked instead.
    private static boolean isNamed(Schema schema) {
        switch (schema.getType()) {
            case RECORD:
            case ENUM:
            case FIXED:
                return true;
            default:
                return false;
        }
    }

    @Override
    public SchemaVisitorAction visitTerminal(Schema terminal) {
        if (isNamed(terminal)) {
            names.add(terminal.getFullName());
        }
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction visitNonTerminal(Schema nonTerminal) {
        if (isNamed(nonTerminal)) {
            names.add(nonTerminal.getFullName());
        }
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction afterVisitNonTerminal(Schema nonTerminal) {
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public Set<String> get() {
        return names;
    }
}

// Usage
Schema schema = /* ... */;
Set<String> schemaNames = Schemas.visit(schema, new NameCollectingVisitor());
```

### Conditional Traversal Control

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.*;

// Visitor that stops at the first record whose name ends with "Error"
public class ErrorFindingVisitor implements SchemaVisitor<Schema> {
    private Schema errorSchema = null;

    private static boolean isErrorRecord(Schema schema) {
        return schema.getType() == Schema.Type.RECORD
            && schema.getName().endsWith("Error");
    }

    @Override
    public SchemaVisitorAction visitTerminal(Schema terminal) {
        if (isErrorRecord(terminal)) {
            errorSchema = terminal;
            return SchemaVisitorAction.TERMINATE;  // Stop immediately
        }
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction visitNonTerminal(Schema nonTerminal) {
        // A record is reported here on its first encounter, so the check
        // must be made in this callback as well as in visitTerminal.
        if (isErrorRecord(nonTerminal)) {
            errorSchema = nonTerminal;
            return SchemaVisitorAction.TERMINATE;  // Stop immediately
        }
        if (nonTerminal.getType() == Schema.Type.UNION) {
            // Skip all union types and their children
            return SchemaVisitorAction.SKIP_SUBTREE;
        }
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public SchemaVisitorAction afterVisitNonTerminal(Schema nonTerminal) {
        return SchemaVisitorAction.CONTINUE;
    }

    @Override
    public Schema get() {
        return errorSchema;  // null if no matching record was found
    }
}
```

### Schema Cloning

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.schema.CloningVisitor;
import org.apache.avro.compiler.schema.Schemas;

// Clone a schema structure (default: copy only essential fields)
Schema originalSchema = /* ... */;
Schema clonedSchema = Schemas.visit(originalSchema, new CloningVisitor(originalSchema));

// Clone with custom property copying
CloningVisitor.PropertyCopier customCopier = new CloningVisitor.PropertyCopier() {
    @Override
    public void copy(Schema first, Schema second) {
        Schemas.copyAliases(first, second);
        Schemas.copyLogicalTypes(first, second);
        // Copy custom properties
        first.forEachProperty(second::addProp);
    }

    @Override
    public void copy(Schema.Field first, Schema.Field second) {
        Schemas.copyAliases(first, second);
    }
};

// copyDocs = true keeps documentation in the clone
Schema clonedWithDocs = Schemas.visit(originalSchema,
    new CloningVisitor(customCopier, true, originalSchema));
```

### Schema Resolution Checking

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.idl.IsResolvedSchemaVisitor;
import org.apache.avro.compiler.schema.Schemas;

// Check if schema is fully resolved (no unresolved references).
// Note: the constructor is listed above without a `public` modifier; if it is
// package-private, this call only compiles from code placed in the
// org.apache.avro.compiler.idl package.
Schema schema = /* ... */;
Boolean isResolved = Schemas.visit(schema, new IsResolvedSchemaVisitor());

if (isResolved) {
    System.out.println("Schema is fully resolved");
} else {
    System.out.println("Schema has unresolved references");
}
```

### Schema Reference Resolution

```java
import org.apache.avro.Schema;
import org.apache.avro.compiler.idl.ResolvingVisitor;
import org.apache.avro.compiler.schema.Schemas;
import java.util.*;
import java.util.function.Function;

// Resolve schema references using symbol table
IdentityHashMap<Schema, Schema> replacements = new IdentityHashMap<>();
Function<String, Schema> symbolTable = schemaName -> {
    // Look up schema by name in symbol table.
    // findSchemaByName is a placeholder for the application's own lookup.
    return findSchemaByName(schemaName);
};

Schema unresolvedSchema = /* ... */;
Schema resolvedSchema = Schemas.visit(unresolvedSchema,
    new ResolvingVisitor(unresolvedSchema, replacements, symbolTable));
```

### Field Alias Copying

```java
// Copy aliases between record fields
Schema.Field originalField = /* ... */;
Schema.Field newField = /* ... */;

Schemas.copyAliases(originalField, newField);
// Now newField has all the aliases from originalField
```

## Traversal Behavior

The `Schemas.visit()` method performs a depth-first traversal with these characteristics:

- **Circular Reference Handling**: A schema that was already visited is passed to `visitTerminal` instead of being expanded again, which prevents infinite recursion on self-referencing schemas
- **Order**: Array elements, record fields, union types, and map values are visited in definition order
- **State Management**: The visitor maintains its own state across the traversal
- **Early Termination**: `TERMINATE` stops the traversal immediately; `visit()` then returns whatever `visitor.get()` yields at that point
- **Subtree Skipping**: `SKIP_SUBTREE` skips the children of the current schema but continues with its siblings
- **Sibling Skipping**: `SKIP_SIBLINGS` skips the remaining siblings at the current level
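
The rules above can be tried out without Avro on the classpath. The following is a simplified, recursive sketch of an action-controlled depth-first walk, not the library's implementation: `TraversalSketch`, `Node`, `Action`, `Visitor`, `walk`, and `trace` are hypothetical names introduced for illustration, and the handling of `SKIP_SUBTREE` (children and the post-visit callback are both skipped) is an assumption of this model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Stand-alone model of an action-controlled depth-first walk (not Avro's code). */
final class TraversalSketch {

    enum Action { CONTINUE, SKIP_SUBTREE, SKIP_SIBLINGS, TERMINATE }

    /** Hypothetical stand-in for a schema: a name plus child nodes. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) { children.add(child); return this; }
    }

    /** Mirrors the three traversal callbacks of the visitor interface. */
    interface Visitor {
        Action visitTerminal(Node node);
        Action visitNonTerminal(Node node);
        Action afterVisitNonTerminal(Node node);
    }

    static void walk(Node start, Visitor visitor) {
        // Identity-based set: a node is "seen" only if it is the same object.
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        walk(start, visitor, seen);
    }

    /** Returns the action that the caller (the parent's child loop) reacts to. */
    private static Action walk(Node node, Visitor visitor, Set<Node> seen) {
        // Leaves and already-seen nodes are reported as terminals.
        if (node.children.isEmpty() || !seen.add(node)) {
            return visitor.visitTerminal(node);
        }
        Action pre = visitor.visitNonTerminal(node);
        if (pre == Action.SKIP_SUBTREE) {
            // Modelling assumption: skip the children and the post-visit callback.
            return Action.CONTINUE;
        }
        if (pre != Action.CONTINUE) {
            return pre; // SKIP_SIBLINGS or TERMINATE is handled by the caller
        }
        for (Node child : node.children) { // definition order
            Action result = walk(child, visitor, seen);
            if (result == Action.TERMINATE) {
                return Action.TERMINATE;
            }
            if (result == Action.SKIP_SIBLINGS) {
                break;
            }
        }
        return visitor.afterVisitNonTerminal(node);
    }

    /** Logs every callback; answers SKIP_SUBTREE for the node named {@code skip}. */
    static List<String> trace(Node start, String skip) {
        List<String> log = new ArrayList<>();
        walk(start, new Visitor() {
            @Override public Action visitTerminal(Node node) {
                log.add("terminal:" + node.name);
                return Action.CONTINUE;
            }
            @Override public Action visitNonTerminal(Node node) {
                log.add("pre:" + node.name);
                return node.name.equals(skip) ? Action.SKIP_SUBTREE : Action.CONTINUE;
            }
            @Override public Action afterVisitNonTerminal(Node node) {
                log.add("post:" + node.name);
                return Action.CONTINUE;
            }
        });
        return log;
    }
}
```

For a `User` node with children `name`, `friends` (whose only child is the same `User` node), and `age`, `trace(user, "")` logs `pre:User`, `terminal:name`, `pre:friends`, `terminal:User`, `post:friends`, `terminal:age`, `post:User`. The second appearance of `User` is reported as a terminal, which is how the cycle is broken. Calling `trace(user, "friends")` drops everything between `pre:friends` and `terminal:age`.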

## Thread Safety

- All `Schemas` utility methods are thread-safe
- `SchemaVisitor` implementations should be designed for single-threaded use
- Each visitor instance should be used for only one traversal operation
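
The last point matters because visitors usually accumulate state. The sketch below illustrates the effect with plain Java types; `VisitorReuseSketch`, `CountingVisitor`, `traverse`, and `traverseWithFreshVisitor` are hypothetical names, and the list iteration merely stands in for a schema traversal.

```java
import java.util.List;
import java.util.function.Supplier;

/** Stand-alone illustration of visitor reuse; these types are not Avro classes. */
final class VisitorReuseSketch {

    /** Hypothetical stateful visitor that counts the items it is shown. */
    static final class CountingVisitor {
        private int count;

        void visit(String item) { count++; }

        int get() { return count; }
    }

    /** Stand-in for one traversal: shows every item to the visitor, then asks for the result. */
    static int traverse(List<String> items, CountingVisitor visitor) {
        for (String item : items) {
            visitor.visit(item);
        }
        return visitor.get();
    }

    /** Safer pattern: obtain a new visitor for each traversal. */
    static int traverseWithFreshVisitor(List<String> items, Supplier<CountingVisitor> factory) {
        return traverse(items, factory.get());
    }
}
```

Passing the same `CountingVisitor` to `traverse` twice over a two-item list returns 2 and then 4, because the count from the first run carries over. `traverseWithFreshVisitor(items, CountingVisitor::new)` returns 2 on every call.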