# Transformations

Bidirectional transformation system for preprocessing input graphs and postprocessing algorithm results. Transformations enable type conversions, data scaling, and format adaptations while maintaining result consistency.

## Capabilities

### Transform Interface

Base interface for bidirectional transformations supporting both input preprocessing and result postprocessing.

```java { .api }
/**
 * Bidirectional transformation interface
 * @param <II> Input data type (before transformation)
 * @param <IO> Input output type (after input transformation)
 * @param <RI> Result input type (before reverse transformation)
 * @param <RO> Result output type (after reverse transformation)
 */
public interface Transform<II, IO, RI, RO> extends Parameterized {
    /** Human-readable transform identifier */
    String getIdentity();

    /** Forward transformation applied to input data */
    IO transformInput(II input) throws Exception;

    /** Reverse transformation applied to algorithm results */
    RO transformResult(RI result) throws Exception;
}
```
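To make the forward/reverse contract concrete, here is a minimal, self-contained sketch. `SimpleTransform` is a hypothetical stand-in for the interface above (it drops `Parameterized` so the example compiles without the library), and `NarrowKeyTransform` is an illustrative transform invented for this example, not part of the documented API.

```java
import java.util.Arrays;

// Hypothetical stand-in for Transform; omits Parameterized on purpose.
interface SimpleTransform<II, IO, RI, RO> {
    String getIdentity();
    IO transformInput(II input) throws Exception;
    RO transformResult(RI result) throws Exception;
}

// Illustrative transform: narrows long keys to short for processing and
// widens short results back to long afterwards.
class NarrowKeyTransform implements SimpleTransform<long[], short[], short[], long[]> {
    @Override
    public String getIdentity() {
        return "narrow-key";
    }

    @Override
    public short[] transformInput(long[] input) {
        short[] out = new short[input.length];
        for (int i = 0; i < input.length; i++) {
            // Reject keys that would be corrupted by the narrowing cast.
            if (input[i] < Short.MIN_VALUE || input[i] > Short.MAX_VALUE) {
                throw new IllegalArgumentException("Key out of short range: " + input[i]);
            }
            out[i] = (short) input[i];
        }
        return out;
    }

    @Override
    public long[] transformResult(short[] result) {
        long[] out = new long[result.length];
        for (int i = 0; i < result.length; i++) {
            out[i] = result[i]; // widening is always lossless
        }
        return out;
    }
}

class TransformDemo {
    public static void main(String[] args) {
        NarrowKeyTransform t = new NarrowKeyTransform();
        short[] compact = t.transformInput(new long[] {1L, 2L, 300L});
        long[] restored = t.transformResult(compact);
        System.out.println(t.getIdentity() + ": " + Arrays.toString(restored));
        // prints "narrow-key: [1, 2, 300]"
    }
}
```

The four type parameters let the input and result sides differ: here the forward direction maps `long[]` to `short[]`, while the reverse direction maps `short[]` back to `long[]`.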

### Transformable Interface

Interface for components that support transformations.

```java { .api }
/**
 * Indicates that a component supports transformations
 */
public interface Transformable {
    /** Get list of transforms supported by this component */
    List<Transform> getTransformers();
}
```

## Graph Key Type Transformations

### Graph Key Type Transform

Transforms graph vertex and edge key types between different numeric representations with automatic bounds checking and reverse mapping support.

```java { .api }
/**
 * Transform graph key types for memory optimization and compatibility
 * @param <VV> Vertex value type (unchanged)
 * @param <EV> Edge value type (unchanged)
 */
public class GraphKeyTypeTransform<VV, EV> extends ParameterizedBase
        implements Transform<Graph<LongValue, VV, EV>, Graph<?, VV, EV>, DataSet<?>, DataSet<LongValue>> {

    /** Target key type for transformation */
    ChoiceParameter type;

    /** Disable reverse transformation for results */
    BooleanParameter disableTypeReversal;

    /** Create transform with vertex count for bounds checking */
    public GraphKeyTypeTransform(long vertexCount);
}
```

**Supported Key Types:**

```java { .api }
// Available transformation target types
public enum KeyType {
    BYTE,    // 8-bit signed integer (-128 to 127)
    SHORT,   // 16-bit signed integer (-32,768 to 32,767)
    INT,     // 32-bit signed integer
    LONG,    // 64-bit signed integer (default)
    FLOAT,   // 32-bit floating point
    DOUBLE,  // 64-bit floating point
    STRING   // String representation
}
```

**Usage Examples:**

```java
// Transform large vertex IDs to smaller integers
GraphKeyTypeTransform<NullValue, NullValue> transform =
    new GraphKeyTypeTransform<>(10000); // Max 10K vertices

transform.configure(ParameterTool.fromArgs(new String[]{
    "--type", "short" // Use 16-bit integers instead of 64-bit
}));

// Apply transformation. transformInput is declared to return
// Graph<?, VV, EV>, so the key type stays a wildcard here.
Graph<?, NullValue, NullValue> compactGraph =
    transform.transformInput(originalGraph);

// Results are automatically transformed back to original type
DataSet<LongValue> originalResults =
    transform.transformResult(algorithmResults);
```

**Command-Line Usage:**

```bash
# Transform to 32-bit integers for memory efficiency
--type int

# Transform to bytes for very small graphs (up to 127 vertices)
--type byte

# Transform to strings for debugging
--type string

# Disable reverse transformation (keep results in transformed type)
--type short --disable_type_reversal
```

### Long Value with Proper Hash Code

Enhanced LongValue implementation with correct hash code computation for use in transformations.

```java { .api }
/**
 * LongValue with proper hash code implementation
 * Used internally by transformations for consistent hashing
 */
public class LongValueWithProperHashCode extends LongValue {
    /** Construct from long value */
    public LongValueWithProperHashCode(long value);

    /** Proper hash code implementation */
    public int hashCode();

    /** Enhanced equals implementation */
    public boolean equals(Object obj);
}
```
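The listing above does not show how the hash is computed. As an illustration only, the sketch below shows one way a well-distributed hash for a 64-bit key could be built, using the MurmurHash3 64-bit finalizer as an assumed mixing function. `MixedHashLong` is a hypothetical class and does not reproduce the library's implementation.

```java
// Hypothetical stand-in for a long-valued key; not the library class.
final class MixedHashLong {
    private final long value;

    MixedHashLong(long value) {
        this.value = value;
    }

    // MurmurHash3 fmix64 finalizer: spreads every input bit across the
    // whole 64-bit word before the value is folded down to 32 bits.
    static long mix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    @Override
    public int hashCode() {
        long m = mix(value);
        return (int) (m ^ (m >>> 32)); // fold high and low halves together
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof MixedHashLong && ((MixedHashLong) obj).value == value;
    }
}

class HashDemo {
    public static void main(String[] args) {
        // Consecutive keys are scattered rather than mapped to consecutive hashes.
        for (long v = 0; v < 4; v++) {
            System.out.println(v + " -> " + new MixedHashLong(v).hashCode());
        }
    }
}
```

Whatever mixing function is used, `equals` and `hashCode` must agree: two keys holding the same long value have to produce the same hash so that hash-based joins and groupings treat them as one key.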

## Transformation Usage Patterns

### Input Transformation

Transformations are automatically applied to inputs that implement the `Transformable` interface.

```java { .api }
// Example of transformable input
public class TransformableInput extends GeneratedGraph implements Transformable {
    private GraphKeyTypeTransform<NullValue, NullValue> keyTransform;

    @Override
    public List<Transform> getTransformers() {
        return Arrays.asList(keyTransform);
    }

    @Override
    public Graph create(ExecutionEnvironment env) throws Exception {
        Graph originalGraph = generateGraph(env); // Generate with LongValue keys
        return keyTransform.transformInput(originalGraph); // Transform to smaller keys
    }
}
```

### Algorithm Result Transformation

Results are automatically transformed back to their original format after algorithm execution.

```java { .api }
// Transformation pipeline in Runner
List<Transform> transforms = new ArrayList<>();

// Collect transforms from input and algorithm
if (input instanceof Transformable) {
    transforms.addAll(((Transformable) input).getTransformers());
}
if (algorithm instanceof Transformable) {
    transforms.addAll(((Transformable) algorithm).getTransformers());
}

// Apply forward transforms to input
Graph transformedGraph = input.create(env);
for (Transform transform : transforms) {
    transformedGraph = transform.transformInput(transformedGraph);
}

// Run algorithm on transformed graph
DataSet result = algorithm.plan(transformedGraph);

// Apply reverse transforms to results (in reverse order)
Collections.reverse(transforms);
for (Transform transform : transforms) {
    result = transform.transformResult(result);
}
```
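The reverse ordering matters because each reverse step has to undo the most recent forward step first. The following self-contained sketch illustrates this with plain `long` values in place of graphs and data sets. `LongTransform` and `PipelineDemo` are hypothetical names used only for this example, and the "algorithm" is assumed to pass its input through unchanged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongUnaryOperator;

// Hypothetical pair of forward/reverse functions standing in for a Transform.
final class LongTransform {
    final String name;
    final LongUnaryOperator forward;
    final LongUnaryOperator reverse;

    LongTransform(String name, LongUnaryOperator forward, LongUnaryOperator reverse) {
        this.name = name;
        this.forward = forward;
        this.reverse = reverse;
    }
}

class PipelineDemo {
    // Apply forward transforms in list order.
    static long applyForward(List<LongTransform> transforms, long input) {
        long value = input;
        for (LongTransform t : transforms) {
            value = t.forward.applyAsLong(value);
        }
        return value;
    }

    // Apply reverse transforms in the opposite order.
    static long applyReverse(List<LongTransform> transforms, long result) {
        // Copy before reversing so the caller's list keeps its order.
        List<LongTransform> reversed = new ArrayList<>(transforms);
        Collections.reverse(reversed);
        long value = result;
        for (LongTransform t : reversed) {
            value = t.reverse.applyAsLong(value);
        }
        return value;
    }

    public static void main(String[] args) {
        List<LongTransform> transforms = new ArrayList<>();
        transforms.add(new LongTransform("offset", x -> x + 3, x -> x - 3));
        transforms.add(new LongTransform("scale", x -> x * 2, x -> x / 2));

        long transformed = applyForward(transforms, 5);        // (5 + 3) * 2 = 16
        long restored = applyReverse(transforms, transformed); // 16 / 2 - 3 = 5
        System.out.println(transformed + " -> " + restored);   // prints "16 -> 5"
    }
}
```

Applying the reverse functions in forward order instead would compute `(16 - 3) / 2 = 6`, which is not the original value.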

### Memory Optimization Example

```java
// Large graph with 1M vertices - use integer keys to save memory
String[] args = {
    "--algorithm", "PageRank",
    "--input", "CompleteGraph",
    "--vertex_count", "1000000",
    "--type", "int", // Use 32-bit integers instead of 64-bit
    "--output", "CSV",
    "--output_filename", "pagerank_results.csv"
};

Runner runner = new Runner(args);
runner.run(); // Automatically applies transformations
```

### Type Compatibility Example

```java
// Transform string vertex IDs to numeric for algorithm compatibility
String[] args = {
    "--algorithm", "ConnectedComponents",
    "--input", "CSV",
    "--input_filename", "string_vertices.csv",
    "--type", "long", // Convert strings to longs
    "--output", "CSV",
    "--output_filename", "components.csv"
    // Results will be transformed back to original string format
};
```

## Transform Configuration

### Bounds Checking

Transformations automatically validate that vertex counts fit within target type ranges.

```java { .api }
// Automatic bounds validation
GraphKeyTypeTransform<NullValue, NullValue> transform = new GraphKeyTypeTransform<>(100000);
transform.configure(ParameterTool.fromArgs(new String[]{"--type", "short"}));
// Throws exception: vertex count 100000 exceeds SHORT range (-32768 to 32767)
```
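The check itself can be pictured as a comparison between the vertex count and the largest value of the target type. The sketch below is a minimal illustration under that assumption. `KeyBounds` and its methods are hypothetical and only mirror the limits quoted in this document, not the library's actual validation code.

```java
class KeyBounds {
    // Assumption: a vertex count "fits" when it does not exceed the
    // largest value the target key type can hold.
    static boolean fits(long vertexCount, long maxKeyValue) {
        return vertexCount <= maxKeyValue;
    }

    // Fail fast with a descriptive message when the count does not fit.
    static void requireFits(long vertexCount, String typeName, long minKeyValue, long maxKeyValue) {
        if (!fits(vertexCount, maxKeyValue)) {
            throw new IllegalArgumentException(
                "Vertex count " + vertexCount + " exceeds range for type '" + typeName
                    + "' (" + minKeyValue + " to " + maxKeyValue + ")");
        }
    }

    public static void main(String[] args) {
        requireFits(10_000, "short", Short.MIN_VALUE, Short.MAX_VALUE); // passes silently
        try {
            requireFits(100_000, "short", Short.MIN_VALUE, Short.MAX_VALUE);
        } catch (IllegalArgumentException e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
```

Running the check at configuration time, before any data is read, means an unsuitable `--type` choice is reported immediately rather than partway through a job.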

### Type Selection Guidelines

| Target Type | Range | Key Size Savings vs. `long` | Use Case |
|-------------|-------|-----------------------------|----------|
| `byte` | -128 to 127 | 87.5% | Tiny graphs (< 128 vertices) |
| `short` | -32,768 to 32,767 | 75% | Small graphs (< 32K vertices) |
| `int` | -2,147,483,648 to 2,147,483,647 | 50% | Medium graphs (< 2B vertices) |
| `long` | Full 64-bit range | 0% | Default (no transformation) |
| `float` | 32-bit float | 50% | Approximate numeric keys |
| `double` | 64-bit float | 0% | High-precision numeric keys |
| `string` | Variable | Variable | Debugging and compatibility |
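The savings percentages follow directly from the key width: a key of `n` bytes saves `(8 - n) / 8` of the space of an 8-byte `long`. The short sketch below reproduces that arithmetic. `KeySavings` is a hypothetical helper, and the figures cover only the raw key payload, not per-object or serialization overhead.

```java
import java.util.Locale;

class KeySavings {
    // Percentage of key storage saved relative to an 8-byte long key.
    static double savingsPercent(int keyBytes) {
        return 100.0 * (Long.BYTES - keyBytes) / Long.BYTES;
    }

    public static void main(String[] args) {
        String[] names = {"byte", "short", "int", "float", "double"};
        int[] sizes = {Byte.BYTES, Short.BYTES, Integer.BYTES, Float.BYTES, Double.BYTES};
        for (int i = 0; i < names.length; i++) {
            // Locale.ROOT keeps the decimal separator a dot on every system.
            System.out.printf(Locale.ROOT, "%-6s %5.1f%%%n", names[i], savingsPercent(sizes[i]));
        }
    }
}
```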

### Performance Considerations

- **Memory Usage**: Smaller key types reduce memory consumption for large graphs
- **CPU Overhead**: Transformation adds computational cost during input/output phases
- **Hash Performance**: Proper hash code implementation ensures efficient hash-based operations
- **Serialization**: Smaller types reduce network and disk I/O overhead

### Error Handling

Transformations report type conversion problems through a dedicated exception:

```java { .api }
// Common transformation errors
public class TransformationException extends Exception {
    // Thrown for:
    // - Vertex count exceeds target type range
    // - Invalid type conversion (e.g., non-numeric strings to numbers)
    // - Reverse transformation failures
    // - Memory allocation errors during transformation
}
```

**Error Examples:**

```
Error: Vertex count 100000 exceeds range for type 'short' (-32768 to 32767)
Error: Cannot convert vertex key 'invalid_number' to numeric type
Error: Transformation failed: insufficient memory for key mapping table
```
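As an illustration of the invalid-conversion case, the sketch below wraps a failed numeric parse in a checked exception with a message in the style shown above. `KeyConversionException` and `KeyParser` are hypothetical names chosen for this example. They stand in for whatever exception type the library actually throws.

```java
// Hypothetical checked exception used only in this sketch.
class KeyConversionException extends Exception {
    KeyConversionException(String message, Throwable cause) {
        super(message, cause);
    }
}

class KeyParser {
    // Convert a textual vertex key to a long, preserving the original cause.
    static long toLong(String key) throws KeyConversionException {
        try {
            return Long.parseLong(key.trim());
        } catch (NumberFormatException e) {
            throw new KeyConversionException(
                "Cannot convert vertex key '" + key + "' to numeric type", e);
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(toLong("42"));
            System.out.println(toLong("invalid_number"));
        } catch (KeyConversionException e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
```

Keeping the original `NumberFormatException` as the cause preserves the stack trace for debugging while the outer message names the offending key.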