# Performance Testing Programs

Manual programs for performance benchmarking, scalability testing, and resource usage validation. These standalone programs are designed for manual execution and analysis, providing insights into Flink application performance characteristics under various workloads and configurations.

## Capabilities

### Massive String Sorting

Large-scale string sorting performance test for batch processing.

```java { .api }
/**
 * Large-scale string sorting test for batch processing performance
 */
public class MassiveStringSorting {

    /**
     * Main entry point for massive string sorting performance test
     * Tests DataSet sorting performance with large string datasets
     * @param args Command line arguments: [numElements] [outputPath]
     * @throws Exception if sorting test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Sort 10 million strings and output to file
java -cp flink-tests.jar org.apache.flink.test.manual.MassiveStringSorting 10000000 /tmp/sorted-output

# Monitor memory usage and execution time
# (-XX:+PrintGCDetails is a JDK 8 flag; JDK 9+ uses -Xlog:gc* instead)
java -Xmx4g -XX:+PrintGCDetails -cp flink-tests.jar \
    org.apache.flink.test.manual.MassiveStringSorting 10000000 /tmp/output
```

### Massive StringValue Sorting

Large-scale StringValue sorting performance test optimized for Flink Value types.

```java { .api }
/**
 * Large-scale StringValue sorting test for optimized performance
 */
public class MassiveStringValueSorting {

    /**
     * Main entry point for massive StringValue sorting performance test
     * Tests DataSet sorting performance with Flink StringValue types for reduced GC pressure
     * @param args Command line arguments: [numElements] [outputPath]
     * @throws Exception if sorting test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Sort 10 million StringValues for comparison with regular strings
java -cp flink-tests.jar org.apache.flink.test.manual.MassiveStringValueSorting 10000000 /tmp/stringvalue-output

# Compare GC behavior with regular string sorting
java -Xmx4g -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
    -cp flink-tests.jar org.apache.flink.test.manual.MassiveStringValueSorting 10000000 /tmp/output
```

### Streaming Scalability and Latency

Streaming performance test measuring throughput and latency characteristics.

```java { .api }
/**
 * Streaming performance test for scalability and latency measurement
 */
public class StreamingScalabilityAndLatency {

    /**
     * Main entry point for streaming scalability and latency test
     * Tests streaming throughput, latency, and scalability under various loads
     * @param args Command line arguments: [throughputTarget] [duration] [parallelism]
     * @throws Exception if streaming test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Test with 100k events/sec for 60 seconds with parallelism 8
java -cp flink-tests.jar org.apache.flink.test.manual.StreamingScalabilityAndLatency 100000 60 8

# High throughput test with monitoring
java -Xmx8g -XX:+UseG1GC -cp flink-tests.jar \
    org.apache.flink.test.manual.StreamingScalabilityAndLatency 1000000 300 16
```

### Reduce Performance

Performance test for reduce operations measuring computation efficiency.

```java { .api }
/**
 * Performance test for reduce operations
 */
public class ReducePerformance {

    /**
     * Main entry point for reduce performance test
     * Tests performance of various reduce operations and aggregations
     * @param args Command line arguments: [numElements] [numKeys] [numReduceIterations]
     * @throws Exception if reduce test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Test reduce with 1M elements, 1000 keys, 10 iterations
java -cp flink-tests.jar org.apache.flink.test.manual.ReducePerformance 1000000 1000 10

# CPU-intensive reduce test
java -Xmx2g -XX:+UseParallelGC -cp flink-tests.jar \
    org.apache.flink.test.manual.ReducePerformance 10000000 10000 5
```

### Object Overwrite Performance

Test for object reuse and memory allocation patterns.

```java { .api }
/**
 * Test for object overwrite and reuse performance
 */
public class OverwriteObjects {

    /**
     * Main entry point for object overwrite performance test
     * Tests impact of object reuse vs allocation on performance and GC
     * @param args Command line arguments: [numIterations] [objectSize] [reuseObjects]
     * @throws Exception if overwrite test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Test with object reuse enabled
java -cp flink-tests.jar org.apache.flink.test.manual.OverwriteObjects 1000000 1024 true

# Test without object reuse for comparison
java -cp flink-tests.jar org.apache.flink.test.manual.OverwriteObjects 1000000 1024 false
```

### Hash Table Record Width Combinations

Performance test for hash table operations with various record widths.

```java { .api }
/**
 * Performance test for hash table operations with different record widths
 */
public class HashTableRecordWidthCombinations {

    /**
     * Main entry point for hash table record width performance test
     * Tests hash table performance with various record sizes and configurations
     * @param args Command line arguments: [numRecords] [recordWidth] [numBuckets]
     * @throws Exception if hash table test fails
     */
    public static void main(String[] args) throws Exception;
}
```

**Usage:**
```bash
# Test hash table with 1M records, width 128 bytes, 1024 buckets
java -cp flink-tests.jar org.apache.flink.test.manual.HashTableRecordWidthCombinations 1000000 128 1024

# Memory-intensive hash table test
java -Xmx4g -XX:+PrintGCDetails -cp flink-tests.jar \
    org.apache.flink.test.manual.HashTableRecordWidthCombinations 5000000 512 4096
```

### Performance Testing Patterns

Common patterns for running and analyzing performance tests:

**Batch Performance Testing:**

```bash
#!/bin/bash
# Batch performance test script

# Test different data sizes
for size in 1000000 5000000 10000000 50000000; do
    echo "Testing with $size elements..."

    # Run string sorting test. -Xloggc writes the GC log to a file;
    # JDK 8 GC logging goes to stdout by default, so "2>" would miss it.
    start_time=$(date +%s)
    java -Xmx4g -XX:+PrintGCDetails -Xloggc:"gc-$size.log" \
        -cp flink-tests.jar org.apache.flink.test.manual.MassiveStringSorting \
        "$size" "/tmp/output-$size.txt"
    end_time=$(date +%s)

    duration=$((end_time - start_time))
    echo "Size: $size, Duration: ${duration}s"

    # Show the last few GC events
    grep "GC" "gc-$size.log" | tail -5
done
```
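
The durations printed by the batch loop are easier to compare across data sizes once they are converted to a rate. The following is a minimal sketch, assuming whole-second durations as produced by `date +%s`; `elements_per_second` is a hypothetical helper name and is not part of flink-tests.

```bash
#!/bin/bash
# Sketch: convert an element count and a wall-clock duration into elements/sec.
# elements_per_second is a hypothetical helper, not part of flink-tests.
elements_per_second() {
    local elements=$1 seconds=$2
    # date +%s has one-second resolution, so very short runs can report 0s
    if [ "$seconds" -le 0 ]; then
        echo "n/a"
        return 0
    fi
    # Integer division is enough for a rough comparison between runs
    echo $(( elements / seconds ))
}
```

Inside the loop it could be called as `elements_per_second "$size" "$duration"`. A rate that drops sharply between two sizes may indicate that the sort started spilling to disk, although that has to be confirmed from the logs.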

**Streaming Performance Testing:**

```bash
#!/bin/bash
# Streaming performance test script

# Test different throughput targets
for throughput in 10000 50000 100000 500000 1000000; do
    echo "Testing throughput: $throughput events/sec"

    # Run streaming test for 60 seconds; -Xloggc writes the GC log to a file
    java -Xmx8g -XX:+UseG1GC -XX:+PrintGCDetails \
        -Xloggc:"streaming-gc-$throughput.log" \
        -cp flink-tests.jar org.apache.flink.test.manual.StreamingScalabilityAndLatency \
        "$throughput" 60 8 &

    # Monitor system resources
    pid=$!
    top -p "$pid" -b -n 60 -d 1 > "cpu-$throughput.log" &
    top_pid=$!

    wait "$pid"
    # Stop top in case the test finished before its 60 samples were taken
    kill "$top_pid" 2>/dev/null

    echo "Throughput $throughput completed"
done
```
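
The `cpu-*.log` files written by `top` in batch mode can be reduced to a single average per run. This is a sketch under two assumptions: the log uses the default `top` column layout, where `%CPU` is the ninth field of a process row, and the JVM process ID is known. `average_cpu_from_top` is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch: average the %CPU samples that top -b recorded for one process.
# Assumes the default top layout (PID in field 1, %CPU in field 9).
average_cpu_from_top() {
    local log=$1 pid=$2
    awk -v pid="$pid" '
        # Only process rows start with the PID; summary and header rows are skipped
        $1 == pid { sum += $9; n++ }
        END {
            if (n == 0) { print "n/a"; exit }
            printf "%.1f\n", sum / n
        }' "$log"
}
```

In the streaming loop this could be called as `average_cpu_from_top "cpu-$throughput.log" "$pid"`. Values above 100 are expected on multi-core machines, because `top` reports per-process usage relative to a single core by default.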

**Memory Usage Analysis:**

```bash
#!/bin/bash
# Memory usage analysis script

# Test object reuse impact
echo "Testing object reuse vs allocation..."

# With object reuse (-Xloggc writes the GC log to a file;
# JDK 8 GC logging goes to stdout by default, not stderr)
java -Xmx2g -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc-reuse.log \
    -cp flink-tests.jar org.apache.flink.test.manual.OverwriteObjects \
    1000000 1024 true

# Without object reuse
java -Xmx2g -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc-no-reuse.log \
    -cp flink-tests.jar org.apache.flink.test.manual.OverwriteObjects \
    1000000 1024 false

# Compare the number of full GCs
echo "With object reuse:"
grep -c "Full GC" gc-reuse.log
echo "Without object reuse:"
grep -c "Full GC" gc-no-reuse.log
```
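
Counting full GCs shows how often the collector ran, but not how long it paused the program. The sketch below sums pause times from a GC log, assuming the JDK 8 `-XX:+PrintGCDetails` line format in which each event ends with `, <seconds> secs]`. Other collectors and JDK 9+ unified logging use different formats and would need a different pattern. `gc_pause_summary` is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch: summarize GC events and total pause time from a JDK 8-style GC log.
# Assumes each event line contains ", <seconds> secs]" for the pause duration.
gc_pause_summary() {
    local log=$1
    awk '
        match($0, /, [0-9.]+ secs\]/) {
            # Strip the leading ", " (2 chars) and trailing " secs]" (6 chars)
            pause = substr($0, RSTART + 2, RLENGTH - 8)
            total += pause
            events++
            if ($0 ~ /Full GC/) full++
        }
        END { printf "events=%d full=%d total_pause=%.3fs\n", events, full, total }
    ' "$log"
}
```

Running `gc_pause_summary gc-reuse.log` and `gc_pause_summary gc-no-reuse.log` side by side gives a rough view of both frequency and duration for the two configurations.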

**Performance Comparison Script:**

```bash
#!/bin/bash
# Performance comparison between String and StringValue

sizes=(1000000 5000000 10000000)

for size in "${sizes[@]}"; do
    echo "Comparing String vs StringValue for size: $size"

    # Test regular strings
    echo "Testing String sorting..."
    time java -Xmx4g -cp flink-tests.jar \
        org.apache.flink.test.manual.MassiveStringSorting \
        "$size" "/tmp/string-$size.txt"

    # Test StringValue
    echo "Testing StringValue sorting..."
    time java -Xmx4g -cp flink-tests.jar \
        org.apache.flink.test.manual.MassiveStringValueSorting \
        "$size" "/tmp/stringvalue-$size.txt"

    echo "---"
done
```
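
Once both durations are known, the comparison can be expressed as a single ratio. This is a minimal sketch, assuming the two durations have already been collected in seconds (for example from the `real` value printed by `time`); `speedup` is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch: express a candidate run time relative to a baseline run time.
# A result above 1.00x means the candidate finished faster than the baseline.
speedup() {
    local baseline=$1 candidate=$2
    # awk is used because bash arithmetic has no floating point
    awk -v b="$baseline" -v c="$candidate" 'BEGIN {
        if (c <= 0) { print "n/a"; exit }
        printf "%.2fx\n", b / c
    }'
}
```

For example, `speedup 12.5 10` prints `1.25x`, which would mean the second run (here, hypothetically, StringValue) took 80% of the time of the first. A single pair of runs is noisy, so repeating each measurement a few times before drawing conclusions is advisable.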

**Resource Monitoring Script:**

```bash
#!/bin/bash
# Comprehensive resource monitoring during performance tests

test_name=$1
test_command=$2

echo "Starting performance test: $test_name"

# Start resource monitoring
vmstat 1 > "vmstat-$test_name.log" &
vmstat_pid=$!

iostat 1 > "iostat-$test_name.log" &
iostat_pid=$!

# Run the test
start_time=$(date +%s)
eval "$test_command"
end_time=$(date +%s)

# Stop monitoring
kill "$vmstat_pid" "$iostat_pid"

duration=$((end_time - start_time))
echo "Test completed in ${duration} seconds"

# Generate summary report
# (Linux vmstat layout assumed: field 4 = free memory in KB, field 15 = idle CPU %)
report="report-$test_name.txt"
echo "Performance Test Report: $test_name" > "$report"
echo "Duration: ${duration}s" >> "$report"
echo "Average CPU:" >> "$report"
# Use numeric rows only (vmstat repeats its headers) and skip the first
# sample, which reports averages since boot
awk '$1 ~ /^[0-9]+$/ && seen++ {sum+=$15; count++} END {if (count) print 100-(sum/count)"%"}' \
    "vmstat-$test_name.log" >> "$report"
# vmstat has no "used" column, so report the lowest free memory observed
echo "Minimum free memory (KB):" >> "$report"
awk '$1 ~ /^[0-9]+$/ {print $4}' "vmstat-$test_name.log" | sort -n | head -1 >> "$report"
```
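
The same wrapper idea can be written as a function that takes the test command as separate arguments instead of a string passed to `eval`, and that returns the test's exit status to the caller. This is a sketch under stated assumptions: `run_monitored` and the `MONITOR_CMD` variable are hypothetical names introduced here, and the monitor defaults to `vmstat 1`.

```bash
#!/bin/bash
# Sketch: run a command while a background monitor writes to a log file.
# Usage: run_monitored LOGFILE COMMAND [ARGS...]
# MONITOR_CMD (hypothetical) overrides the default monitor, "vmstat 1".
run_monitored() {
    local log=$1
    shift
    local monitor=${MONITOR_CMD:-vmstat 1}

    # Word splitting of $monitor is intentional: it holds a command plus arguments
    $monitor > "$log" 2>&1 &
    local monitor_pid=$!

    # Run the test command exactly as given, without eval
    "$@"
    local status=$?

    # Stop the monitor and reap it so no background process is left behind
    kill "$monitor_pid" 2>/dev/null
    wait "$monitor_pid" 2>/dev/null

    return "$status"
}
```

A call could look like `run_monitored vmstat-sort.log java -Xmx4g -cp flink-tests.jar org.apache.flink.test.manual.MassiveStringSorting 10000000 /tmp/output`. Returning the exit status lets a calling script stop a test series as soon as one run fails.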

Together, these performance testing programs cover the main dimensions of Flink application performance: throughput, latency, memory usage, CPU utilization, and scalability.