# Apache Flink Gelly Examples

Apache Flink Gelly Examples is a collection of example applications that demonstrate graph processing algorithms built on Apache Flink's Gelly graph API. It provides both a command-line execution framework and standalone example implementations of common graph algorithms, including PageRank, Connected Components, Single Source Shortest Paths, Triangle Listing, and graph similarity measures.

## Package Information

- **Package Name**: flink-gelly-examples_2.12
- **Package Type**: Maven
- **Group ID**: org.apache.flink
- **Language**: Java and Scala
- **Version**: 1.16.3
- **Installation**: Add Maven dependency:

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly-examples_2.12</artifactId>
    <version>1.16.3</version>
</dependency>
```

## Core Imports

```java
import org.apache.flink.graph.Runner;
import org.apache.flink.graph.drivers.*;
import org.apache.flink.graph.drivers.input.*;
import org.apache.flink.graph.drivers.output.*;
import org.apache.flink.graph.examples.*;
```

## Basic Usage

### Command-Line Execution

```bash
# Run PageRank on a complete graph with 1000 vertices
flink run examples/flink-gelly-examples_2.12-1.16.3.jar \
  --algorithm PageRank \
  --input CompleteGraph --vertex_count 1000 \
  --output Print

# Run Connected Components on CSV input
flink run examples/flink-gelly-examples_2.12-1.16.3.jar \
  --algorithm ConnectedComponents \
  --input CSV --input_filename graph.csv \
  --output CSV --output_filename results.csv
```

### Programmatic Usage

```java
import org.apache.flink.graph.Runner;

public class MyGraphAnalysis {
    public static void main(String[] args) throws Exception {
        // Same arguments as the PageRank command-line invocation
        String[] params = {
            "--algorithm", "PageRank",
            "--input", "CompleteGraph", "--vertex_count", "1000",
            "--output", "Print"
        };

        // Runner.main parses the arguments, builds the job, and executes it
        Runner.main(params);
    }
}
```

## Architecture

Apache Flink Gelly Examples is built around several key architectural components:

- **Runner Framework**: Central orchestrator (`Runner` class) that coordinates input sources, algorithms, and output handlers through a pluggable architecture
- **Driver System**: Standardized interface (`Driver`) for implementing graph algorithms with consistent parameter handling and analytics reporting
- **Input Sources**: Pluggable input system supporting both file-based data (`CSV`) and algorithmic graph generators (`CompleteGraph`, `RMatGraph`, etc.)
- **Output Handlers**: Flexible output system supporting multiple formats including console (`Print`), files (`CSV`), and verification (`Hash`)
- **Parameter System**: Type-safe parameter handling with automatic validation, usage generation, and command-line parsing
- **Transform System**: Bidirectional transformation capabilities for preprocessing inputs and postprocessing results

## Capabilities

### Command-Line Execution Framework

Central execution framework that coordinates graph algorithms with configurable inputs and outputs through a command-line interface.

```java { .api }
public class Runner extends ParameterizedBase {
    public Runner(String[] args);
    public ExecutionEnvironment getExecutionEnvironment();
    public DataSet getResult();
    public Runner run() throws Exception;
    public static void main(String[] args) throws Exception;
}
```

[Execution Framework](./execution-framework.md)

### Graph Algorithm Drivers

Comprehensive collection of graph processing algorithms including centrality measures, clustering, and similarity computations.

```java { .api }
public interface Driver<K, VV, EV> extends Parameterized {
    String getShortDescription();
    String getLongDescription();
    DataSet plan(Graph<K, VV, EV> graph) throws Exception;
    void printAnalytics(PrintStream out);
}
```

Available algorithms: PageRank, ConnectedComponents, HITS, AdamicAdar, JaccardIndex, ClusteringCoefficient, TriangleListing, GraphMetrics, EdgeList.

[Graph Algorithms](./graph-algorithms.md)

### Graph Input Sources

Flexible input system supporting both file-based graph data and algorithmic graph generation for testing and benchmarking.

```java { .api }
public interface Input<K, VV, EV> extends Parameterized {
    String getIdentity();
    Graph<K, VV, EV> create(ExecutionEnvironment env) throws Exception;
}
```

Available inputs: CSV file reader, CompleteGraph, GridGraph, RMatGraph, StarGraph, CycleGraph, PathGraph, HypercubeGraph, CirculantGraph, and more.

[Input Sources](./input-sources.md)
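
To make the idea of an algorithmic graph generator concrete, here is a minimal plain-Java sketch of what a complete-graph generator computes: one directed edge for every ordered pair of distinct vertices. It does not use Flink. The class name `CompleteGraphSketch` and the method `edges` are hypothetical, and emitting both directions of each pair is an assumption of this sketch rather than a statement about the Gelly `CompleteGraph` input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Flink-free illustration; not part of flink-gelly-examples.
final class CompleteGraphSketch {

    // Returns every directed edge {source, target} with source != target
    // over the vertex IDs 0 .. vertexCount - 1.
    static List<long[]> edges(long vertexCount) {
        if (vertexCount < 0) {
            throw new IllegalArgumentException("vertexCount must be non-negative");
        }
        List<long[]> result = new ArrayList<>();
        for (long source = 0; source < vertexCount; source++) {
            for (long target = 0; target < vertexCount; target++) {
                if (source != target) {
                    result.add(new long[] {source, target});
                }
            }
        }
        return result;
    }
}
```

Under these assumptions a graph with `n` vertices has `n * (n - 1)` edges, which is why generated inputs are convenient for benchmarking: the size of the workload follows directly from a single parameter.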

### Result Output Handlers

Multiple output formats for algorithm results including console display, file export, and verification utilities.

```java { .api }
public interface Output<T> extends Parameterized {
    void write(String executionName, PrintStream out, DataSet<T> data) throws Exception;
}
```

Available outputs: Print (console), CSV (file), Hash (verification).

[Output Handlers](./output-handlers.md)
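
One common way to verify a result set without storing it is an order-independent checksum: count the records and sum their hash codes, so the same records yield the same pair no matter in which order parallel tasks emit them. The plain-Java sketch below illustrates that idea only. `ChecksumSketch` is a hypothetical name, and this is not claimed to be the exact computation performed by the `Hash` output.

```java
import java.util.Objects;

// Hypothetical, Flink-free illustration of order-independent verification.
final class ChecksumSketch {
    final long count;
    final long checksum;

    private ChecksumSketch(long count, long checksum) {
        this.count = count;
        this.checksum = checksum;
    }

    // Folds all records into a (count, checksum) pair.
    static ChecksumSketch of(Iterable<?> records) {
        long count = 0;
        long checksum = 0;
        for (Object record : records) {
            count++;
            // Treat the 32-bit hash as unsigned; addition is commutative,
            // so the iteration order does not affect the sum.
            checksum += Objects.hashCode(record) & 0xffffffffL;
        }
        return new ChecksumSketch(count, checksum);
    }

    @Override
    public String toString() {
        return String.format("count=%d, checksum=0x%016x", count, checksum);
    }
}
```

Comparing two runs then reduces to comparing two short strings, which is practical when the full result is too large to print or diff.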

### Parameter Management System

Type-safe parameter system providing validation, default values, and automatic usage string generation for configurable components.

```java { .api }
public interface Parameterized {
    String getName();
    String getUsage();
    void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
}

public interface Parameter<T> {
    String getUsage();
    boolean isHidden();
    void configure(ParameterTool parameterTool);
    T getValue();
}
```

Available parameter types: BooleanParameter, LongParameter, DoubleParameter, StringParameter, ChoiceParameter.

[Parameter System](./parameter-system.md)
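
The pattern described above (a typed value, an optional default, validation at configure time, and a generated usage string) can be sketched in plain Java as follows. `LongParamSketch` and its methods are hypothetical stand-ins written for illustration. They are not the Gelly `LongParameter` API, and a `Map` takes the place of Flink's `ParameterTool`.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, Flink-free illustration of a typed, validated parameter.
final class LongParamSketch {
    private final String name;
    private Long defaultValue;            // null means the option is required
    private long minimum = Long.MIN_VALUE;
    private long value;

    LongParamSketch(String name) {
        this.name = name;
    }

    LongParamSketch setDefault(long defaultValue) {
        this.defaultValue = defaultValue;
        return this;
    }

    LongParamSketch setMinimum(long minimum) {
        this.minimum = minimum;
        return this;
    }

    // Optional parameters are shown in brackets, required ones without.
    String getUsage() {
        String option = "--" + name + " " + name.toUpperCase(Locale.ROOT);
        return defaultValue == null ? option : "[" + option + "]";
    }

    // Reads the value, falls back to the default, then validates the bound.
    void configure(Map<String, String> options) {
        String raw = options.get(name);
        if (raw == null) {
            if (defaultValue == null) {
                throw new IllegalArgumentException("Missing required option --" + name);
            }
            value = defaultValue;
        } else {
            value = Long.parseLong(raw);
        }
        if (value < minimum) {
            throw new IllegalArgumentException("--" + name + " must be at least " + minimum);
        }
    }

    long getValue() {
        return value;
    }

    // Parses "--key value" pairs; a trailing token without a value is ignored.
    static Map<String, String> parse(String[] argv) {
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i + 1 < argv.length; i += 2) {
            if (!argv[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected an option, got: " + argv[i]);
            }
            options.put(argv[i].substring(2), argv[i + 1]);
        }
        return options;
    }
}
```

Keeping the default, the bound, and the usage text on the parameter object means a component only declares its parameters once, and both the help output and the validation follow from that declaration.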

### Graph Transformations

Bidirectional transformation system for preprocessing input graphs and postprocessing algorithm results.

```java { .api }
public interface Transform<II, IO, RI, RO> extends Parameterized {
    String getIdentity();
    IO transformInput(II input) throws Exception;
    RO transformResult(RI result) throws Exception;
}

public interface Transformable {
    List<Transform> getTransformers();
}
```

[Transformations](./transformations.md)
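
A bidirectional transform pairs a step applied to the input with the inverse step applied to the result. The plain-Java sketch below shows one such pairing: string vertex labels are encoded as dense numeric IDs on the way in, and scores keyed by ID are decoded back to labels on the way out. `TransformSketch` and `LabelEncoder` are hypothetical names that only mirror the shape of the four type parameters above. They are not Gelly classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Flink-free mirror of the two-way contract described above.
interface TransformSketch<II, IO, RI, RO> {
    IO transformInput(II input);

    RO transformResult(RI result);
}

// Encodes labels as dense IDs for processing and decodes IDs in the result.
final class LabelEncoder
        implements TransformSketch<List<String>, List<Long>, Map<Long, Double>, Map<String, Double>> {

    private final Map<String, Long> idByLabel = new HashMap<>();
    private final List<String> labelById = new ArrayList<>();

    @Override
    public List<Long> transformInput(List<String> labels) {
        List<Long> ids = new ArrayList<>();
        for (String label : labels) {
            Long id = idByLabel.get(label);
            if (id == null) {
                // First occurrence: assign the next unused ID.
                id = (long) labelById.size();
                idByLabel.put(label, id);
                labelById.add(label);
            }
            ids.add(id);
        }
        return ids;
    }

    @Override
    public Map<String, Double> transformResult(Map<Long, Double> scoresById) {
        Map<String, Double> scoresByLabel = new HashMap<>();
        for (Map.Entry<Long, Double> entry : scoresById.entrySet()) {
            // IDs were assigned from list positions, so they index labelById.
            scoresByLabel.put(labelById.get(entry.getKey().intValue()), entry.getValue());
        }
        return scoresByLabel;
    }
}
```

The same instance must handle both directions, because the mapping built while transforming the input is the state needed to translate the result back.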

### Standalone Example Implementations

Ready-to-use implementations of common graph algorithms demonstrating different programming patterns and API usage approaches.

```java { .api }
// PageRank example with generic key type support
public class PageRank<K> {
    public PageRank(double beta, int maxIterations);
    public DataSet<Vertex<K, Double>> run(Graph<K, Double, Double> network);
}
```

Available examples: PageRank, SingleSourceShortestPaths, ConnectedComponents, IncrementalSSSP, MusicProfiles, EuclideanGraphWeighing.

[Example Implementations](./example-implementations.md)
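
For readers who want to see what the two constructor arguments control, the following plain-Java sketch runs the standard damped PageRank iteration on an edge list: `beta` is the damping factor and `maxIterations` bounds the number of update rounds. It is a single-machine illustration under stated assumptions, not the Gelly implementation. `PageRankSketch` is a hypothetical name, and the uniform redistribution of rank from vertices without outgoing edges is a choice made for this sketch.

```java
import java.util.Arrays;

// Hypothetical, Flink-free illustration of damped PageRank by power iteration.
final class PageRankSketch {

    // edges[i] = {source, target} with vertex IDs in 0 .. vertexCount - 1.
    static double[] run(int vertexCount, int[][] edges, double beta, int maxIterations) {
        int[] outDegree = new int[vertexCount];
        for (int[] edge : edges) {
            outDegree[edge[0]]++;
        }

        // Start from the uniform distribution.
        double[] rank = new double[vertexCount];
        Arrays.fill(rank, 1.0 / vertexCount);

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            double[] next = new double[vertexCount];

            // Rank held by vertices without out-edges is shared by everyone.
            double dangling = 0.0;
            for (int v = 0; v < vertexCount; v++) {
                if (outDegree[v] == 0) {
                    dangling += rank[v];
                }
            }

            // Each vertex splits its rank evenly among its out-neighbors.
            for (int[] edge : edges) {
                next[edge[1]] += rank[edge[0]] / outDegree[edge[0]];
            }

            // Apply damping: follow a link with probability beta, else jump.
            for (int v = 0; v < vertexCount; v++) {
                next[v] = (1.0 - beta) / vertexCount
                        + beta * (next[v] + dangling / vertexCount);
            }
            rank = next;
        }
        return rank;
    }
}
```

On a directed cycle every vertex keeps the same rank, which makes a cycle a convenient sanity check before trying irregular graphs.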

## Types

### Core Framework Types

```java { .api }
// Parameter factory for component instantiation
public class ParameterizedFactory<T extends Parameterized> implements Iterable<T> {
    public ParameterizedFactory<T> addClass(Class<? extends T> cls);
    public T get(String name);
}

// Base class for parameterized components
public abstract class ParameterizedBase implements Parameterized {
    public String getName();
    public String getUsage();
    public void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
}
```
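
The factory above resolves a component from the name given on the command line. A minimal plain-Java sketch of that lookup pattern is shown below. `NamedFactorySketch` is a hypothetical class that registers suppliers instead of classes, and its case-insensitive matching and `null` return for unknown names are assumptions of the sketch, not documented behavior of `ParameterizedFactory`.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, Flink-free illustration of name-based component lookup.
final class NamedFactorySketch<T> {
    // Insertion order is kept so that listings follow registration order.
    private final Map<String, Supplier<? extends T>> suppliers = new LinkedHashMap<>();

    // Registers a component under a name; returns this for chaining.
    NamedFactorySketch<T> add(String name, Supplier<? extends T> supplier) {
        suppliers.put(name.toLowerCase(Locale.ROOT), supplier);
        return this;
    }

    // Creates a fresh instance for the name, or returns null if unknown.
    T get(String name) {
        Supplier<? extends T> supplier = suppliers.get(name.toLowerCase(Locale.ROOT));
        return supplier == null ? null : supplier.get();
    }
}
```

Returning a fresh instance per lookup keeps configured state from leaking between runs, since each component is configured from the parsed arguments after it is created.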