# Graph Construction

Complete API for creating graphs from various data sources.

## Factory Methods Overview

The `Graph` companion object provides multiple factory methods for creating graphs from different data sources:

- DataSets of vertices and edges (`fromDataSet`)
- Scala collections (`fromCollection`)
- Tuple-based data (`fromTupleDataSet`, `fromTuple2DataSet`)
- CSV files with extensive configuration options (`fromCsvReader`)

## DataSet-based Construction

### From Vertex and Edge DataSets

```scala { .api }
def fromDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  vertices: DataSet[Vertex[K, VV]],
  edges: DataSet[Edge[K, EV]],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from separate vertex and edge DataSets.

**Parameters:**

- `vertices` - DataSet containing graph vertices with IDs and values
- `edges` - DataSet containing graph edges with source, target, and values
- `env` - Flink execution environment
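
As a minimal sketch of this overload, assuming a local Flink environment with the Gelly Scala module on the classpath, the following builds a two-vertex graph from explicit vertex and edge DataSets. The object name `FromDataSetExample`, the method `build`, and the sample data are illustrative and not part of the API.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._
import org.apache.flink.graph.{Edge, Vertex}

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object FromDataSetExample {
  def build(env: ExecutionEnvironment): Graph[Long, String, Double] = {
    // Vertex DataSet: (id, value)
    val vertices: DataSet[Vertex[Long, String]] = env.fromCollection(Seq(
      new Vertex[Long, String](1L, "A"),
      new Vertex[Long, String](2L, "B")
    ))
    // Edge DataSet: (source id, target id, value)
    val edges: DataSet[Edge[Long, Double]] = env.fromCollection(Seq(
      new Edge[Long, Double](1L, 2L, 0.5)
    ))
    Graph.fromDataSet(vertices, edges, env)
  }
}
```

Annotating the DataSet types explicitly keeps `K`, `VV`, and `EV` unambiguous when the overload is resolved.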

### From Edge DataSet Only

```scala { .api }
def fromDataSet[K: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: DataSet[Edge[K, EV]],
  env: ExecutionEnvironment
): Graph[K, NullValue, EV]
```

Creates a graph from edges only. Vertices are derived from the source and target IDs of the edges and receive `NullValue` as their value.

**Parameters:**

- `edges` - DataSet containing graph edges
- `env` - Flink execution environment
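
The edges-only overload might be used as in the following sketch, which assumes the same local setup as any other Gelly Scala program. `EdgesOnlyExample` and `build` are hypothetical names chosen for illustration.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._
import org.apache.flink.graph.Edge
import org.apache.flink.types.NullValue

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object EdgesOnlyExample {
  def build(env: ExecutionEnvironment): Graph[Long, NullValue, Double] = {
    val edges: DataSet[Edge[Long, Double]] = env.fromCollection(Seq(
      new Edge[Long, Double](1L, 2L, 1.0),
      new Edge[Long, Double](2L, 3L, 2.0)
    ))
    // No vertex DataSet is passed; the vertex value type is NullValue.
    Graph.fromDataSet(edges, env)
  }
}
```

The return type `Graph[Long, NullValue, Double]` makes it visible at compile time that the vertices carry no values.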

### From Edges with Vertex Initializer

```scala { .api }
def fromDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: DataSet[Edge[K, EV]],
  vertexValueInitializer: MapFunction[K, VV],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from edges and computes each vertex value by applying a mapping function to the vertex ID.

**Parameters:**

- `edges` - DataSet containing graph edges
- `vertexValueInitializer` - Function to initialize vertex values from vertex IDs
- `env` - Flink execution environment
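
A sketch of the initializer variant follows. It assumes `MapFunction` refers to `org.apache.flink.api.common.functions.MapFunction`; the object `InitializerExample`, the method `build`, and the `"v" + id` naming scheme are illustrative only.

```scala
import org.apache.flink.api.common.functions.MapFunction
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._
import org.apache.flink.graph.Edge

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object InitializerExample {
  def build(env: ExecutionEnvironment): Graph[Long, String, Double] = {
    val edges: DataSet[Edge[Long, Double]] = env.fromCollection(Seq(
      new Edge[Long, Double](1L, 2L, 1.0),
      new Edge[Long, Double](2L, 3L, 2.0)
    ))
    // Each vertex value is computed from its ID.
    val initializer = new MapFunction[Long, String] {
      override def map(id: Long): String = s"v$id"
    }
    Graph.fromDataSet[Long, String, Double](edges, initializer, env)
  }
}
```

An anonymous `MapFunction` is used rather than a Scala lambda so the sketch does not depend on SAM conversion being available in the Scala version in use.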

## Collection-based Construction

### From Scala Collections

```scala { .api }
def fromCollection[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  vertices: Seq[Vertex[K, VV]],
  edges: Seq[Edge[K, EV]],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from Scala collections of vertices and edges.

```scala { .api }
def fromCollection[K: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: Seq[Edge[K, EV]],
  env: ExecutionEnvironment
): Graph[K, NullValue, EV]
```

Creates a graph from a collection of edges only; vertex values are `NullValue`.

```scala { .api }
def fromCollection[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: Seq[Edge[K, EV]],
  vertexValueInitializer: MapFunction[K, VV],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from edges with vertex value initialization.

## Tuple-based Construction

### From Tuple DataSets

```scala { .api }
def fromTupleDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  vertices: DataSet[(K, VV)],
  edges: DataSet[(K, K, EV)],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from tuple DataSets where:

- Vertex tuples: `(vertexId, vertexValue)`
- Edge tuples: `(sourceId, targetId, edgeValue)`

```scala { .api }
def fromTupleDataSet[K: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: DataSet[(K, K, EV)],
  env: ExecutionEnvironment
): Graph[K, NullValue, EV]
```

Creates a graph from edge tuples only; vertex values are `NullValue`.

```scala { .api }
def fromTupleDataSet[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  edges: DataSet[(K, K, EV)],
  vertexValueInitializer: MapFunction[K, VV],
  env: ExecutionEnvironment
): Graph[K, VV, EV]
```

Creates a graph from edge tuples with vertex value initialization.

### From Tuple2 DataSets (No Edge Values)

```scala { .api }
def fromTuple2DataSet[K: TypeInformation : ClassTag](
  edges: DataSet[(K, K)],
  env: ExecutionEnvironment
): Graph[K, NullValue, NullValue]
```

Creates a graph from `(sourceId, targetId)` pairs. Neither vertices nor edges carry values.

```scala { .api }
def fromTuple2DataSet[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag](
  edges: DataSet[(K, K)],
  vertexValueInitializer: MapFunction[K, VV],
  env: ExecutionEnvironment
): Graph[K, VV, NullValue]
```

Creates a graph from edge pairs with vertex value initialization.
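
For a purely structural graph, the value-free overload could be used roughly as sketched here. `Tuple2Example`, `build`, and the three-edge cycle are illustrative assumptions, not part of the API.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._
import org.apache.flink.types.NullValue

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object Tuple2Example {
  def build(env: ExecutionEnvironment): Graph[Long, NullValue, NullValue] = {
    // Each pair is (source id, target id); together they form a 3-cycle.
    val pairs: DataSet[(Long, Long)] = env.fromCollection(Seq(
      (1L, 2L),
      (2L, 3L),
      (3L, 1L)
    ))
    Graph.fromTuple2DataSet(pairs, env)
  }
}
```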

## CSV File Construction

```scala { .api }
def fromCsvReader[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag, EV: TypeInformation : ClassTag](
  env: ExecutionEnvironment,
  pathEdges: String,
  pathVertices: String = null,
  lineDelimiterVertices: String = "\n",
  fieldDelimiterVertices: String = ",",
  quoteCharacterVertices: Character = null,
  ignoreFirstLineVertices: Boolean = false,
  ignoreCommentsVertices: String = null,
  lenientVertices: Boolean = false,
  includedFieldsVertices: Array[Int] = null,
  lineDelimiterEdges: String = "\n",
  fieldDelimiterEdges: String = ",",
  quoteCharacterEdges: Character = null,
  ignoreFirstLineEdges: Boolean = false,
  ignoreCommentsEdges: String = null,
  lenientEdges: Boolean = false,
  includedFieldsEdges: Array[Int] = null,
  vertexValueInitializer: MapFunction[K, VV] = null
): Graph[K, VV, EV]
```

Creates a graph from CSV files. The vertices file and the edges file are configured independently through parallel sets of `...Vertices` and `...Edges` parameters.

**Parameters:**

- `env` - Flink execution environment
- `pathEdges` - File path containing the edges (required)
- `pathVertices` - File path containing the vertices (optional, default: `null`)
- `lineDelimiterVertices` - Line separator for vertices file (default: `"\n"`)
- `fieldDelimiterVertices` - Field separator for vertices file (default: `","`)
- `quoteCharacterVertices` - Quote character for vertices file parsing (default: `null`)
- `ignoreFirstLineVertices` - Whether to skip first line in vertices file (default: `false`)
- `ignoreCommentsVertices` - String prefix for comment lines to ignore in vertices file (default: `null`)
- `lenientVertices` - Whether to silently ignore malformed lines in vertices file (default: `false`)
- `includedFieldsVertices` - Array of field indices to read from vertices file (default: `null`)
- `lineDelimiterEdges` - Line separator for edges file (default: `"\n"`)
- `fieldDelimiterEdges` - Field separator for edges file (default: `","`)
- `quoteCharacterEdges` - Quote character for edges file parsing (default: `null`)
- `ignoreFirstLineEdges` - Whether to skip first line in edges file (default: `false`)
- `ignoreCommentsEdges` - String prefix for comment lines to ignore in edges file (default: `null`)
- `lenientEdges` - Whether to silently ignore malformed lines in edges file (default: `false`)
- `includedFieldsEdges` - Array of field indices to read from edges file (default: `null`)
- `vertexValueInitializer` - Function to initialize vertex values if no vertices file is provided (default: `null`)
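
The following sketch shows one way the edges-only form with a `vertexValueInitializer` might look, assuming the signature above and a tab-separated edges file with a header row. The object `CsvEdgesOnlyExample`, the helper `writeSampleEdges`, the file contents, and the `"vertex-" + id` naming are all hypothetical and exist only to make the example self-contained.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

import org.apache.flink.api.common.functions.MapFunction
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object CsvEdgesOnlyExample {
  // Hypothetical helper: writes a small tab-separated edges file to a temp location.
  def writeSampleEdges(): Path = {
    val path = Files.createTempFile("edges", ".tsv")
    val content = "src\tdst\tweight\n1\t2\t0.5\n2\t3\t1.5\n"
    Files.write(path, content.getBytes(StandardCharsets.UTF_8))
    path
  }

  def build(env: ExecutionEnvironment, edgesPath: String): Graph[Long, String, Double] =
    Graph.fromCsvReader[Long, String, Double](
      env = env,
      pathEdges = edgesPath,
      fieldDelimiterEdges = "\t",   // columns are tab-separated
      ignoreFirstLineEdges = true,  // skip the header row
      // No vertices file: vertex values are computed from the vertex IDs.
      vertexValueInitializer = new MapFunction[Long, String] {
        override def map(id: Long): String = s"vertex-$id"
      }
    )
}
```

Named arguments keep the call readable given the number of optional parameters; only the settings that differ from the defaults need to be listed.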

## Usage Examples

### Basic Graph Creation

```scala
import org.apache.flink.api.scala._
import org.apache.flink.graph.scala._
import org.apache.flink.graph.{Edge, Vertex}

val env = ExecutionEnvironment.getExecutionEnvironment

// From collections: each vertex is (id, value)
val vertices = List(
  new Vertex(1L, "Node A"),
  new Vertex(2L, "Node B"),
  new Vertex(3L, "Node C")
)

// Each edge is (source id, target id, value)
val edges = List(
  new Edge(1L, 2L, 1.0),
  new Edge(2L, 3L, 2.0),
  new Edge(1L, 3L, 3.0)
)

val graph = Graph.fromCollection(vertices, edges, env)
```

### From Tuples

```scala
// Reuses `env` and the imports from the previous example.
val vertexTuples = env.fromCollection(List(
  (1L, "A"),
  (2L, "B"),
  (3L, "C")
))

val edgeTuples = env.fromCollection(List(
  (1L, 2L, 1.0),
  (2L, 3L, 2.0)
))

val graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env)
```

### From CSV Files

```scala
// Reuses `env` and the imports from the first example.
val graph = Graph.fromCsvReader[Long, String, Double](
  env = env,
  pathEdges = "path/to/edges.csv",
  pathVertices = "path/to/vertices.csv",
  fieldDelimiterEdges = "\t",   // edges file is tab-separated
  ignoreFirstLineEdges = true   // skip the header row of the edges file
)
```