A collection of example applications demonstrating graph processing algorithms using Apache Flink's Gelly Graph API
npx @tessl/cli install tessl/maven-org-apache-flink--flink-gelly-examples-2-12@1.16.0

Apache Flink Gelly Examples is a collection of example applications that demonstrate graph processing algorithms built on Apache Flink's Gelly Graph API. It provides both a command-line execution framework and standalone example implementations of common graph algorithms, including PageRank, Connected Components, Single Source Shortest Paths, Triangle Listing, and graph similarity measures.

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly-examples_2.12</artifactId>
    <version>1.16.3</version>
</dependency>

import org.apache.flink.graph.Runner;
import org.apache.flink.graph.drivers.*;
import org.apache.flink.graph.drivers.input.*;
import org.apache.flink.graph.drivers.output.*;
import org.apache.flink.graph.examples.*;

# Run PageRank on a complete graph with 1000 vertices
flink run examples/flink-gelly-examples_2.12-1.16.3.jar \
--algorithm PageRank \
--input CompleteGraph --vertex_count 1000 \
--output Print
# Run Connected Components on CSV input
flink run examples/flink-gelly-examples_2.12-1.16.3.jar \
--algorithm ConnectedComponents \
--input CSV --input_filename graph.csv \
--output CSV --output_filename results.csv

import org.apache.flink.graph.Runner;
public class MyGraphAnalysis {
    public static void main(String[] args) throws Exception {
        // Arguments for running PageRank on a generated complete graph
        String[] params = {
            "--algorithm", "PageRank",
            "--input", "CompleteGraph", "--vertex_count", "1000",
            "--output", "Print"
        };
        // Run the job through Runner's public entry point
        Runner.main(params);
    }
}

Apache Flink Gelly Examples is built around several key architectural components:
- The Runner class, which coordinates input sources, algorithms, and output handlers through a pluggable architecture
- The Driver interface, for implementing graph algorithms with consistent parameter handling and analytics reporting
- File-based input (CSV) and algorithmic graph generators (CompleteGraph, RMatGraph, etc.)
- Output to the console (Print), to files (CSV), and for verification (Hash)

Central execution framework that coordinates graph algorithms with configurable inputs and outputs through a command-line interface.

public class Runner extends ParameterizedBase {
    public Runner(String[] args);
    public ExecutionEnvironment getExecutionEnvironment();
    public DataSet getResult();
    public Runner run() throws Exception;
    public static void main(String[] args) throws Exception;
}

Comprehensive collection of graph processing algorithms including centrality measures, clustering, and similarity computations.

public interface Driver<K, VV, EV> extends Parameterized {
    String getShortDescription();
    String getLongDescription();
    DataSet plan(Graph<K, VV, EV> graph) throws Exception;
    void printAnalytics(PrintStream out);
}

Available algorithms: PageRank, ConnectedComponents, HITS, AdamicAdar, JaccardIndex, ClusteringCoefficient, TriangleListing, GraphMetrics, EdgeList.
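
To make one of the similarity measures concrete, here is a minimal plain-Java sketch of the Jaccard index between the neighbor sets of two vertices. It does not use Flink or Gelly: the class name `JaccardSketch`, the sample graph, and the choice to return 0.0 for two isolated vertices are all assumptions made for illustration, and the JaccardIndex driver may differ in how it scores and reports pairs.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, Flink-free illustration; not the Gelly JaccardIndex driver.
public class JaccardSketch {

    // Jaccard index: size of the intersection divided by size of the union.
    public static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // convention chosen here for two isolated vertices
        }
        Set<Integer> shared = new HashSet<>(a);
        shared.retainAll(b);
        return (double) shared.size() / union.size();
    }

    public static void main(String[] args) {
        // Undirected sample graph as adjacency sets (made-up data).
        Map<Integer, Set<Integer>> neighbors = Map.of(
            1, Set.of(2, 3, 4),
            2, Set.of(1, 3),
            3, Set.of(1, 2, 4),
            4, Set.of(1, 3));

        // Vertices 2 and 4 have identical neighbor sets {1, 3}.
        System.out.println(jaccard(neighbors.get(2), neighbors.get(4)));
        // Vertices 1 and 3 share {2, 4} out of the union {1, 2, 3, 4}.
        System.out.println(jaccard(neighbors.get(1), neighbors.get(3)));
    }
}
```

The first pair scores 1.0 and the second 0.5, which matches the intuition that vertices with more shared neighbors are more similar.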

Flexible input system supporting both file-based graph data and algorithmic graph generation for testing and benchmarking.

public interface Input<K, VV, EV> extends Parameterized {
    String getIdentity();
    Graph<K, VV, EV> create(ExecutionEnvironment env) throws Exception;
}

Available inputs: CSV file reader, CompleteGraph, GridGraph, RMatGraph, StarGraph, CycleGraph, PathGraph, HypercubeGraph, CirculantGraph, and more.
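
As a rough picture of what an algorithmic graph generator produces, the following plain-Java sketch enumerates the edges of a complete graph. It is independent of Flink: `CompleteGraphSketch` is a hypothetical name, and representing an undirected graph as a pair of directed edges per vertex pair is an assumption of this sketch rather than a statement about the CompleteGraph input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Flink-free illustration of a complete-graph generator.
public class CompleteGraphSketch {

    // Returns every ordered pair (source, target) of distinct vertex IDs in [0, vertexCount).
    public static List<long[]> edges(long vertexCount) {
        List<long[]> result = new ArrayList<>();
        for (long source = 0; source < vertexCount; source++) {
            for (long target = 0; target < vertexCount; target++) {
                if (source != target) {
                    result.add(new long[] {source, target});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<long[]> edges = edges(4);
        // A complete graph on n vertices has n * (n - 1) directed edges.
        System.out.println(edges.size());
        for (long[] edge : edges) {
            System.out.println(edge[0] + "," + edge[1]);
        }
    }
}
```

Because the edge count grows quadratically with the vertex count, generated inputs like this are convenient for benchmarking at a chosen scale without preparing input files.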

Multiple output formats for algorithm results including console display, file export, and verification utilities.

public interface Output<T> extends Parameterized {
    void write(String executionName, PrintStream out, DataSet<T> data) throws Exception;
}

Available outputs: Print (console), CSV (file), Hash (verification).
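
The idea behind a verification output can be sketched without Flink: reduce a result set to a record count plus an order-independent checksum, so that two runs can be compared without storing the full results. The sketch below sums `hashCode()` values, which is an assumption about one reasonable approach and not a description of what the Hash output computes internally. `ChecksumSketch` and the sample records are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, Flink-free illustration of an order-independent checksum.
public class ChecksumSketch {

    // Sums element hash codes; addition is commutative, so record order does not matter.
    public static long checksum(List<String> records) {
        long sum = 0;
        for (String record : records) {
            sum += record.hashCode();
        }
        return sum;
    }

    public static void main(String[] args) {
        // The same made-up records in two different orders, as parallel runs might emit them.
        List<String> run1 = Arrays.asList("(1,0.25)", "(2,0.25)", "(3,0.5)");
        List<String> run2 = Arrays.asList("(3,0.5)", "(1,0.25)", "(2,0.25)");

        System.out.println(run1.size() + " records, checksum " + checksum(run1));
        System.out.println(checksum(run1) == checksum(run2)); // prints "true"
    }
}
```

Order independence matters for distributed results, where the order in which records arrive is generally not deterministic.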

Type-safe parameter system providing validation, default values, and automatic usage string generation for configurable components.

public interface Parameterized {
    String getName();
    String getUsage();
    void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
}

public interface Parameter<T> {
    String getUsage();
    boolean isHidden();
    void configure(ParameterTool parameterTool);
    T getValue();
}

Available parameter types: BooleanParameter, LongParameter, DoubleParameter, StringParameter, ChoiceParameter.
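
The combination of default values, validation, and usage strings can be illustrated with a small stand-in that follows the shape of `Parameter<T>` but reads from a plain `Map` instead of a `ParameterTool`. Everything here is hypothetical: the class `LongOptionSketch`, its constructor arguments, the default of 100, and the usage format are illustrative choices, not the behavior of LongParameter.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, Flink-free stand-in for a typed parameter; not Gelly's LongParameter.
public class LongOptionSketch {
    private final String name;
    private final long defaultValue;
    private final long minimum;
    private long value;

    public LongOptionSketch(String name, long defaultValue, long minimum) {
        this.name = name;
        this.defaultValue = defaultValue;
        this.minimum = minimum;
        this.value = defaultValue;
    }

    // Usage fragment for help text; the brackets mark the option as optional.
    public String getUsage() {
        return "[--" + name + " " + name.toUpperCase(Locale.ROOT) + "]";
    }

    // Reads the option from parsed arguments, falling back to the default, then validates it.
    public void configure(Map<String, String> arguments) {
        String raw = arguments.get(name);
        long parsed = (raw == null) ? defaultValue : Long.parseLong(raw);
        if (parsed < minimum) {
            throw new IllegalArgumentException(name + " must be at least " + minimum);
        }
        value = parsed;
    }

    public long getValue() {
        return value;
    }

    public static void main(String[] args) {
        LongOptionSketch vertexCount = new LongOptionSketch("vertex_count", 100, 1);
        Map<String, String> arguments = new HashMap<>();
        arguments.put("vertex_count", "1000");
        vertexCount.configure(arguments);

        System.out.println(vertexCount.getUsage()); // [--vertex_count VERTEX_COUNT]
        System.out.println(vertexCount.getValue()); // 1000
    }
}
```

Keeping the name, default, and bounds in one object is what lets a component assemble its usage string automatically from the parameters it declares.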

Bidirectional transformation system for preprocessing input graphs and postprocessing algorithm results.

public interface Transform<II, IO, RI, RO> extends Parameterized {
    String getIdentity();
    IO transformInput(II input) throws Exception;
    RO transformResult(RI result) throws Exception;
}

public interface Transformable {
    List<Transform> getTransformers();
}

Ready-to-use implementations of common graph algorithms demonstrating different programming patterns and API usage approaches.

// PageRank example with generic key type support
public class PageRank<K> {
    public PageRank(double beta, int maxIterations);
    public DataSet<Vertex<K, Double>> run(Graph<K, Double, Double> network);
}

Available examples: PageRank, SingleSourceShortestPaths, ConnectedComponents, IncrementalSSSP, MusicProfiles, EuclideanGraphWeighing.
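
For readers who want to see the computation behind the PageRank example, here is a Flink-free sketch of power iteration on a small in-memory graph. It assumes that `beta` in the constructor above plays the role of the damping factor, and it requires every vertex to have at least one outgoing edge. `PageRankSketch`, the adjacency-array representation, and the sample graph are hypothetical and are not taken from the example's source.

```java
import java.util.Arrays;

// Hypothetical, Flink-free illustration of PageRank power iteration.
public class PageRankSketch {

    // outLinks[v] lists the targets of v's outgoing edges; every vertex needs at least one.
    public static double[] pageRank(int[][] outLinks, double dampingFactor, int maxIterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            double[] next = new double[n];
            // Every vertex receives the same "random jump" share.
            Arrays.fill(next, (1.0 - dampingFactor) / n);
            // Each vertex splits its damped rank evenly among its out-neighbors.
            for (int v = 0; v < n; v++) {
                double share = dampingFactor * rank[v] / outLinks[v].length;
                for (int target : outLinks[v]) {
                    next[target] += share;
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Made-up directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
        int[][] outLinks = {{1, 2}, {2}, {0}};
        double[] rank = pageRank(outLinks, 0.85, 50);
        System.out.println(Arrays.toString(rank));
    }
}
```

On a complete graph the update is symmetric, so every vertex keeps the same score of 1/n. Skewed inputs such as the sample graph here give more distinguishable rankings.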

// Parameter factory for component instantiation
public class ParameterizedFactory<T extends Parameterized> implements Iterable<T> {
    public ParameterizedFactory<T> addClass(Class<? extends T> cls);
    public T get(String name);
}

// Base class for parameterized components
public abstract class ParameterizedBase implements Parameterized {
    public String getName();
    public String getUsage();
    public void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
}
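
The name-based lookup that a component factory provides can be sketched in plain Java as follows. This is an illustration of the pattern only: `NamedFactorySketch`, the `Named` interface, and the two sample components are hypothetical, registration uses `Supplier` instead of `Class` objects to avoid reflection, and the case-insensitive match is a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical, Flink-free illustration of name-based component lookup.
public class NamedFactorySketch {

    // Minimal stand-in for a named component.
    public interface Named {
        String getName();
    }

    public static class PrintOutput implements Named {
        public String getName() { return "Print"; }
    }

    public static class CsvOutput implements Named {
        public String getName() { return "CSV"; }
    }

    private final List<Supplier<? extends Named>> suppliers = new ArrayList<>();

    // Returns this so that registrations can be chained.
    public NamedFactorySketch add(Supplier<? extends Named> supplier) {
        suppliers.add(supplier);
        return this;
    }

    // Creates a fresh instance of the component with the given name, or null if none matches.
    public Named get(String name) {
        for (Supplier<? extends Named> supplier : suppliers) {
            Named candidate = supplier.get();
            if (candidate.getName().equalsIgnoreCase(name)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        NamedFactorySketch outputs = new NamedFactorySketch()
            .add(PrintOutput::new)
            .add(CsvOutput::new);

        System.out.println(outputs.get("CSV").getName()); // CSV
        System.out.println(outputs.get("Unknown"));       // null
    }
}
```

A lookup like this is what allows a command-line value such as `--output CSV` to select a component without the runner hard-coding each implementation.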