Input sources provide various ways to create graphs for algorithm execution, including file readers and procedural graph generators. All input sources implement the Input interface and support configurable parameters through the unified parameter system.
Base interface for all graph input sources.
/**
* Input source for a Graph, such as a file reader or graph generator
* Produces graphs for algorithm processing
*/
public interface Input<K, VV, EV> extends Parameterized {
/**
* Human-readable identifier summarizing the input and configuration
* @return Unique identifier string for this input configuration
*/
String getIdentity();
/**
* Create the input graph using the specified execution environment
* @param env Flink ExecutionEnvironment for graph creation
* @return Generated or loaded graph ready for algorithm processing
* @throws Exception if graph creation fails
*/
Graph<K, VV, EV> create(ExecutionEnvironment env) throws Exception;
}

Reads graphs from CSV files with configurable key types and delimiters.
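To make the expected input concrete, here is a minimal plain-Java sketch of reading an edge list: lines that start with a comment prefix are skipped, and every other line holds a source and a target ID separated by a field delimiter. This is not the Flink reader. `EdgeListSketch` and `parseEdges` are hypothetical names, and the sketch assumes long vertex IDs and no header line.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a plain-Java edge-list parser, not the Flink CSV input.
class EdgeListSketch {

    // Parses lines of "source<delimiter>target", skipping blank lines and
    // lines that start with the comment prefix.
    static List<long[]> parseEdges(List<String> lines, String commentPrefix, String fieldDelimiter) {
        List<long[]> edges = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith(commentPrefix)) {
                continue;
            }
            // Locate the literal delimiter; indexOf avoids regex surprises.
            int split = trimmed.indexOf(fieldDelimiter);
            if (split < 0) {
                throw new IllegalArgumentException("Missing field delimiter: " + line);
            }
            long source = Long.parseLong(trimmed.substring(0, split).trim());
            long target = Long.parseLong(trimmed.substring(split + fieldDelimiter.length()).trim());
            edges.add(new long[] {source, target});
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("# a triangle", "1,2", "2,3", "3,1");
        for (long[] edge : parseEdges(lines, "#", ",")) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

Integer or string IDs would only change the two `Long.parseLong` calls; the skipping and splitting logic stays the same.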
/**
* CSV file input source with configurable key types and parsing options
* Supports integer, long, and string vertex IDs
*/
public class CSV<K extends Comparable<K>> extends ParameterizedBase
implements Input<K, NullValue, NullValue> {
public String getName(); // "CSV"
public String getIdentity(); // includes filename and configuration
public Graph<K, NullValue, NullValue> create(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
type (ChoiceParameter): Vertex ID type - "integer", "long", or "string" (default: "integer")
input_filename (StringParameter): Path to CSV file (required)
comment_prefix (StringParameter): Comment line prefix to skip (default: "#")
input_line_delimiter (StringParameter): Line delimiter (default: system default)
input_field_delimiter (StringParameter): Field delimiter (default: system default)
CSV Format:
# Comments start with specified prefix
source_vertex,target_vertex
1,2
2,3
3,1
Usage Examples:
# Read integer vertex IDs from CSV
--input CSV --input_filename graph.csv --type integer
# Read string vertex IDs with custom delimiters
--input CSV --input_filename graph.txt --type string \
--input_field_delimiter ";" --comment_prefix "//"

Abstract base class for procedural graph generators with type translation support.
/**
* Abstract base class for generated graphs with automatic type translation
* Supports output in integer, long, string, and native Flink value types
*/
public abstract class GeneratedGraph<K> extends ParameterizedBase
implements Input<K, NullValue, NullValue> {
/**
* Create a graph with automatic type translation based on configured type parameter
* Calls generate() internally and translates vertex IDs to requested type
* @param env Flink ExecutionEnvironment for graph creation
* @return Graph with vertex IDs in the requested type
* @throws Exception if graph generation or translation fails
*/
public Graph<K, NullValue, NullValue> create(ExecutionEnvironment env) throws Exception;
/**
* Get the human-readable name of the configured type
* @return Capitalized type name (e.g., "Integer", "Long", "String")
*/
public String getTypeName();
/**
* Generate the graph with native LongValue vertex IDs (implemented by subclasses)
* @param env Flink ExecutionEnvironment
* @return Generated graph with LongValue vertex IDs
* @throws Exception if generation fails
*/
protected abstract Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
/**
* Return vertex count for validation and metrics (implemented by subclasses)
* @return Number of vertices in the generated graph
*/
protected abstract long vertexCount();
/**
* Check if vertex count matches expected value and log discrepancies
* @param graph Generated graph to validate
* @throws Exception if validation fails
*/
protected void checkVertexCount(Graph<?, ?, ?> graph) throws Exception;
}
Configuration Parameters:
type (ChoiceParameter): Output type - "integer", "long", "string", plus native types (default varies by generator)

Extended base class for graphs supporting simplification and multiple edge handling.
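As a rough illustration of what simplification means for an edge list, the sketch below drops self-loops and collapses parallel edges in a directed graph. It is plain Java over an in-memory list, not the Gelly implementation. `SimplifySketch` and `simplifyDirected` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: in-memory simplification of a directed edge list.
class SimplifySketch {

    // Returns the distinct (source, target) pairs with self-loops removed,
    // preserving first-seen order.
    static Set<List<Long>> simplifyDirected(List<long[]> edges) {
        Set<List<Long>> simple = new LinkedHashSet<>();
        for (long[] edge : edges) {
            if (edge[0] == edge[1]) {
                continue; // self-loop
            }
            // Set membership removes parallel (duplicate) edges.
            simple.add(List.of(Long.valueOf(edge[0]), Long.valueOf(edge[1])));
        }
        return simple;
    }

    public static void main(String[] args) {
        List<long[]> edges = List.of(
            new long[] {1, 2}, new long[] {1, 2}, new long[] {2, 2}, new long[] {2, 1});
        System.out.println(simplifyDirected(edges));
    }
}
```

In this directed variant `1 -> 2` and `2 -> 1` count as different edges; an undirected variant would normalize each pair (for example, smaller ID first) before inserting it.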
/**
* Extended base class for graphs that may contain multiple edges or self-loops
* Provides automatic simplification options to remove parallel edges and self-loops
*/
public abstract class GeneratedMultiGraph<K extends Comparable<K>> extends GeneratedGraph<K> {
/**
* Create graph with optional simplification based on configured parameters
* Can remove parallel edges, self-loops, or both based on simplify parameter
* @param env Flink ExecutionEnvironment for graph creation
* @return Graph with optional simplification applied
* @throws Exception if graph generation or simplification fails
*/
public Graph<K, NullValue, NullValue> create(ExecutionEnvironment env) throws Exception;
/**
* Get short description of simplification settings for identity string
* @return Abbreviated simplification description (e.g., "undirected", "simple")
*/
protected String getSimplifyShortString();
}
Additional Configuration Parameters:
simplify (Simplify): Graph simplification options - remove parallel edges, self-loops, or both (default: no simplification)

Generates R-MAT (Recursive Matrix) scale-free random graphs with configurable skew parameters.
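The recursive-matrix idea can be sketched in a few lines: each edge is chosen by descending `scale` levels of the adjacency matrix, picking one of four quadrants per level with probabilities a, b, c, and d = 1 - a - b - c. The code below is a textbook-style sketch without noise, not the Flink generator. `RMatSketch`, `sampleEdge`, and `generate` are hypothetical names.

```java
import java.util.Random;

// Illustrative only: textbook R-MAT edge sampling without noise.
class RMatSketch {

    // Picks one edge by choosing a quadrant at each of `scale` levels.
    static long[] sampleEdge(int scale, double a, double b, double c, Random rng) {
        long source = 0;
        long target = 0;
        for (int level = 0; level < scale; level++) {
            double r = rng.nextDouble();
            source <<= 1;
            target <<= 1;
            if (r < a) {
                // top-left quadrant: both new bits stay 0
            } else if (r < a + b) {
                target |= 1; // top-right quadrant
            } else if (r < a + b + c) {
                source |= 1; // bottom-left quadrant
            } else {
                source |= 1; // bottom-right quadrant, probability d
                target |= 1;
            }
        }
        return new long[] {source, target};
    }

    // Samples edgeFactor * 2^scale edges over 2^scale vertices.
    static long[][] generate(int scale, int edgeFactor, double a, double b, double c, long seed) {
        Random rng = new Random(seed);
        int edgeCount = Math.multiplyExact(edgeFactor, 1 << scale);
        long[][] edges = new long[edgeCount][];
        for (int i = 0; i < edgeCount; i++) {
            edges[i] = sampleEdge(scale, a, b, c, rng);
        }
        return edges;
    }

    public static void main(String[] args) {
        long[][] edges = generate(4, 2, 0.6, 0.2, 0.15, 42L);
        System.out.println("edges: " + edges.length);
        System.out.println("first: " + edges[0][0] + " -> " + edges[0][1]);
    }
}
```

Independent sampling like this can produce self-loops and repeated edges, which is why such a generator is naturally paired with a simplification option.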
/**
* R-MAT graph generator for scale-free random graphs
* Uses recursive matrix approach with configurable skew parameters
*/
public class RMatGraph extends GeneratedMultiGraph<LongValue> {
public String getName(); // "RMatGraph"
public String getIdentity(); // includes scale, edge factor, and parameters
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
scale (LongParameter): Generate 2^scale vertices (default: 10, minimum: 1)
edge_factor (LongParameter): Generate edgeFactor * 2^scale edges (default: 16, minimum: 1)
a, b, c (DoubleParameter): R-MAT matrix skew parameters (defaults from generator)
noise_enabled (BooleanParameter): Enable noise perturbation (default: false)
noise (DoubleParameter): Noise level when enabled (default from generator, range: 0.0-2.0)
seed (LongParameter): Random seed for reproducible generation
little_parallelism (LongParameter): Parallelism setting
Usage Examples:
# Generate R-MAT graph with 2^15 vertices and edge factor 8
--input RMatGraph --scale 15 --edge_factor 8 --seed 42
# R-MAT with custom skew parameters and noise
--input RMatGraph --scale 12 --a 0.6 --b 0.2 --c 0.15 \
--noise_enabled true --noise 0.1

Generates complete graphs where every vertex is connected to every other vertex.
/**
* Complete graph generator creating fully connected graphs
* Generates N vertices with N*(N-1)/2 edges (undirected) or N*(N-1) edges (directed)
*/
public class CompleteGraph extends GeneratedGraph<LongValue> {
public String getName(); // "CompleteGraph"
public String getIdentity(); // includes vertex count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of vertices (minimum: implementation-defined)
little_parallelism (LongParameter): Parallelism setting (default: system default)
Usage Example:
# Generate complete graph with 500 vertices
--input CompleteGraph --vertex_count 500

Generates multi-dimensional grid graphs with configurable dimensions and endpoint wrapping.
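The following sketch shows how a per-dimension specification of the form `size:wrap` could be interpreted: the total vertex count is the product of the sizes, and wrapping decides whether the two endpoints of a dimension are neighbors. It is a plain-Java illustration under that assumption, not the generator's code. `GridSketch`, `Dimension`, `parse`, `vertexCount`, and `neighborsAlong` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: interpreting "size:wrap" dimension specifications.
class GridSketch {

    // One grid dimension: its length and whether its endpoints connect.
    static final class Dimension {
        final long size;
        final boolean wrap;

        Dimension(long size, boolean wrap) {
            this.size = size;
            this.wrap = wrap;
        }
    }

    static Dimension parse(String spec) {
        String[] parts = spec.split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected size:wrap but got " + spec);
        }
        return new Dimension(Long.parseLong(parts[0]), Boolean.parseBoolean(parts[1]));
    }

    // Total vertices in the grid: the product of all dimension sizes.
    static long vertexCount(List<Dimension> dims) {
        long count = 1;
        for (Dimension d : dims) {
            count = Math.multiplyExact(count, d.size);
        }
        return count;
    }

    // Neighbors of `position` along one dimension. Sizes 1 and 2 are not
    // special-cased, so wrapping there can repeat a neighbor.
    static List<Long> neighborsAlong(Dimension d, long position) {
        List<Long> result = new ArrayList<>();
        if (position - 1 >= 0) {
            result.add(position - 1);
        } else if (d.wrap) {
            result.add(d.size - 1);
        }
        if (position + 1 < d.size) {
            result.add(position + 1);
        } else if (d.wrap) {
            result.add(0L);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Dimension> dims = List.of(parse("5:false"), parse("8:false"), parse("4:false"));
        System.out.println("vertices: " + vertexCount(dims));
        System.out.println("neighbors of 0 in 10:true: " + neighborsAlong(parse("10:true"), 0));
    }
}
```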
/**
* Multi-dimensional grid graph generator
* Supports arbitrary dimensions with optional endpoint wrapping per dimension
*/
public class GridGraph extends GeneratedGraph<LongValue> {
public String getName(); // "GridGraph"
public String getIdentity(); // includes dimension configuration
public String getUsage(); // custom usage for dynamic dimensions
public void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
dim0, dim1, dim2, etc. with format "size:wrap_endpoints"
little_parallelism (LongParameter): Parallelism setting
Dimension Format: "<size>:<wrap>" where:
size: Number of vertices in this dimension
wrap: Boolean indicating whether endpoints connect (true/false)
Usage Examples:
# 2D grid: 10x10 with wrapping on both dimensions (torus)
--input GridGraph --dim0 "10:true" --dim1 "10:true"
# 3D grid: 5x8x4 with no wrapping
--input GridGraph --dim0 "5:false" --dim1 "8:false" --dim2 "4:false"

Generates circulant graphs with configurable offset ranges for neighbor connections.
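To show how offset ranges turn into connections, the sketch below expands `offset:length` specifications into the neighbor set of one vertex, with arithmetic taken modulo the vertex count. It assumes connections are treated as undirected (both `+d` and `-d`), which is an assumption of this illustration rather than a statement about the generator. `CirculantSketch` and `neighbors` are hypothetical names.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative only: expanding "offset:length" ranges into neighbor IDs.
class CirculantSketch {

    // Returns the neighbors of `vertex` in a ring of `vertexCount` vertices.
    // Each range covers offsets offset, offset + 1, ..., offset + length - 1.
    static SortedSet<Long> neighbors(long vertex, long vertexCount, String... ranges) {
        SortedSet<Long> result = new TreeSet<>();
        for (String range : ranges) {
            String[] parts = range.split(":");
            long offset = Long.parseLong(parts[0]);
            long length = Long.parseLong(parts[1]);
            for (long d = offset; d < offset + length; d++) {
                // floorMod keeps results in [0, vertexCount) for negative values.
                result.add(Math.floorMod(vertex + d, vertexCount));
                result.add(Math.floorMod(vertex - d, vertexCount));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 100 vertices, offsets 1-3 and 10-12.
        System.out.println(neighbors(0, 100, "1:3", "10:3"));
    }
}
```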
/**
* Circulant graph generator with configurable offset ranges
* Each vertex connects to neighbors at specified offset distances
*/
public class CirculantGraph extends GeneratedGraph<LongValue> {
public String getName(); // "CirculantGraph"
public String getIdentity(); // includes vertex count and ranges
public String getUsage(); // custom usage for dynamic ranges
public void configure(ParameterTool parameterTool) throws ProgramParametrizationException;
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of vertices in the circulant graph
range0, range1, etc. with format "offset:length"
little_parallelism (LongParameter): Parallelism setting
Range Format: "<offset>:<length>" where:
offset: Starting offset distance for connections
length: Number of consecutive offsets to include
Usage Example:
# Circulant graph with 100 vertices, connecting to neighbors at offsets 1-3 and 10-12
--input CirculantGraph --vertex_count 100 --range0 "1:3" --range1 "10:3"

Generates cycle/ring graphs where vertices form a circular chain.
/**
* Cycle graph generator creating ring topologies
* Each vertex connects to its two neighbors in a circular arrangement
*/
public class CycleGraph extends GeneratedGraph<LongValue> {
public String getName(); // "CycleGraph"
public String getIdentity(); // includes vertex count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of vertices in the cycle
Usage Example:
--input CycleGraph --vertex_count 1000

Generates star topology graphs with one central vertex connected to all others.
/**
* Star graph generator creating star topologies
* One central vertex connects to all other vertices
*/
public class StarGraph extends GeneratedGraph<LongValue> {
public String getName(); // "StarGraph"
public String getIdentity(); // includes vertex count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Total number of vertices (including center)
Usage Example:
--input StarGraph --vertex_count 5000

Generates linear path graphs where vertices form a straight chain.
/**
* Path graph generator creating linear chains
* Vertices connect in a single linear sequence
*/
public class PathGraph extends GeneratedGraph<LongValue> {
public String getName(); // "PathGraph"
public String getIdentity(); // includes vertex count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of vertices in the path
Usage Example:
--input PathGraph --vertex_count 10000

Generates N-dimensional hypercube graphs.
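The structure of a hypercube is easy to see in code: if vertices are numbered 0 to 2^N - 1, the neighbors of a vertex are the IDs that differ from it in exactly one bit. The sketch below is a plain-Java illustration of that rule, not the generator itself. `HypercubeSketch` and `neighbors` are hypothetical names.

```java
import java.util.Arrays;

// Illustrative only: hypercube adjacency via single-bit flips.
class HypercubeSketch {

    // Returns the `dimensions` neighbors of `vertex`, one per flipped bit.
    static long[] neighbors(long vertex, int dimensions) {
        long[] result = new long[dimensions];
        for (int bit = 0; bit < dimensions; bit++) {
            result[bit] = vertex ^ (1L << bit);
        }
        return result;
    }

    public static void main(String[] args) {
        int dimensions = 3;
        System.out.println("vertices: " + (1L << dimensions));
        System.out.println("neighbors of 5: " + Arrays.toString(neighbors(5, dimensions)));
    }
}
```

Every vertex has exactly N neighbors, so an N-dimensional hypercube has N * 2^(N-1) undirected edges.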
/**
* Hypercube graph generator creating N-dimensional hypercubes
* Vertices represent N-bit binary strings with edges between strings differing by one bit
*/
public class HypercubeGraph extends GeneratedGraph<LongValue> {
public String getName(); // "HypercubeGraph"
public String getIdentity(); // includes dimensions
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
dimensions (LongParameter): Number of hypercube dimensions (generates 2^dimensions vertices)
Usage Example:
# 10-dimensional hypercube (1024 vertices)
--input HypercubeGraph --dimensions 10

Generates graphs with vertices but no edges.
/**
* Empty graph generator creating graphs with no edges
* Useful for testing vertex-only algorithms
*/
public class EmptyGraph extends GeneratedGraph<LongValue> {
public String getName(); // "EmptyGraph"
public String getIdentity(); // includes vertex count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of isolated vertices

Generates echo graphs with specified vertex degree.
/**
* Echo graph generator with configurable vertex degree
* Each vertex has exactly the specified number of connections
*/
public class EchoGraph extends GeneratedGraph<LongValue> {
public String getName(); // "EchoGraph"
public String getIdentity(); // includes vertex count and degree
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_count (LongParameter): Number of vertices
vertex_degree (LongParameter): Degree for each vertex

Generates graphs composed only of singleton edges (vertex pairs).
/**
* Singleton edge graph generator creating graphs with only disconnected edges
* Each edge connects exactly two vertices with no shared vertices between edges
*/
public class SingletonEdgeGraph extends GeneratedGraph<LongValue> {
public String getName(); // "SingletonEdgeGraph"
public String getIdentity(); // includes vertex pair count
public Graph<LongValue, NullValue, NullValue> generate(ExecutionEnvironment env) throws Exception;
}
Configuration Parameters:
vertex_pairs (LongParameter): Number of vertex pairs (creates 2 * vertex_pairs vertices)

// Core graph types
class Graph<K, VV, EV> {
// Graph structure and operations
}
// Flink value types for efficient serialization
class LongValue implements CopyableValue<LongValue> {
public LongValue(long value);
public long getValue();
}
class NullValue implements Value {
public static final NullValue getInstance();
}
// ExecutionEnvironment for graph creation
class ExecutionEnvironment {
public static ExecutionEnvironment getExecutionEnvironment();
// Graph creation and data source methods
}
// Parameter parsing
class ParameterTool {
public String get(String key);
public String getRequired(String key) throws RuntimeException;
public boolean has(String key);
}