# Row Operations

Core interface for working with structured row data in Spark Catalyst, providing both generic access and type-safe methods for data manipulation.

## Capabilities

### Row Interface

The Row trait provides the primary interface for accessing structured data in Catalyst.

```scala { .api }
/**
 * Represents one row of output from a relational operator.
 * Allows both generic access by ordinal and native primitive access.
 */
trait Row extends Serializable {
  /** Number of elements in the Row */
  def size: Int

  /** Number of elements in the Row */
  def length: Int

  /** Schema for the row - returns null by default */
  def schema: StructType = null

  /** Returns the value at position i */
  def apply(i: Int): Any

  /** Returns the value at position i */
  def get(i: Int): Any

  /** Checks whether the value at position i is null */
  def isNullAt(i: Int): Boolean

  /** Make a copy of the current Row object */
  def copy(): Row

  /** Returns true if there are any NULL values in this row */
  def anyNull: Boolean

  /** Return a Scala Seq representing the row */
  def toSeq: Seq[Any]
}
```

**Usage Examples:**

```scala
import org.apache.spark.sql._

// Create a Row from values
val row = Row(1, "Alice", true, null)

// Access values by position
val id = row.getInt(0)          // 1
val name = row.getString(1)     // "Alice"
val active = row.getBoolean(2)  // true
val value = row.get(3)          // null

// Check for null values
val isNull = row.isNullAt(3)    // true

// Get row size and convert to sequence
val size = row.length           // 4
val seq = row.toSeq             // Seq(1, "Alice", true, null)
```

### Primitive Type Accessors

Typed accessors that cast the stored value to a primitive type. All primitive accessors internally use the `getAnyValAs` method, which checks for null before casting.

```scala { .api }
/**
 * Returns the value at position i as a primitive boolean.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getBoolean(i: Int): Boolean

/**
 * Returns the value at position i as a primitive byte.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getByte(i: Int): Byte

/**
 * Returns the value at position i as a primitive short.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getShort(i: Int): Short

/**
 * Returns the value at position i as a primitive int.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getInt(i: Int): Int

/**
 * Returns the value at position i as a primitive long.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getLong(i: Int): Long

/**
 * Returns the value at position i as a primitive float.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getFloat(i: Int): Float

/**
 * Returns the value at position i as a primitive double.
 * @throws ClassCastException when data type does not match
 * @throws NullPointerException when value is null
 */
def getDouble(i: Int): Double
```
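
The null and type-mismatch behaviour listed above can be illustrated with a minimal sketch. It assumes Spark SQL is on the classpath; the row contents and the names `present`, `missing`, `failure`, and `mismatch` are illustrative only and not part of the API.

```scala
import org.apache.spark.sql.Row
import scala.util.Try

// Illustrative row: one populated slot and one null slot.
val row = Row(42, null)

// Guard primitive access with isNullAt so that a null slot becomes None.
val present: Option[Int] = if (row.isNullAt(0)) None else Some(row.getInt(0))
val missing: Option[Int] = if (row.isNullAt(1)) None else Some(row.getInt(1))

// Unguarded access to a null slot fails. The signatures above document a
// NullPointerException; the failure is captured here rather than matched on type.
val failure: Try[Int] = Try(row.getInt(1))

// Reading an Int slot as a Boolean also fails (documented as ClassCastException).
val mismatch: Try[Boolean] = Try(row.getBoolean(0))
```

Checking `isNullAt` first keeps the happy path free of exception handling, which may be preferable when nulls are expected in the data.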

### Object Type Accessors

Accessors for object types including strings, decimals, dates, and timestamps.

```scala { .api }
/**
 * Returns the value at position i as a String object.
 * @throws ClassCastException when data type does not match
 */
def getString(i: Int): String

/**
 * Returns the value at position i of decimal type as java.math.BigDecimal.
 * @throws ClassCastException when data type does not match
 */
def getDecimal(i: Int): java.math.BigDecimal

/**
 * Returns the value at position i of date type as java.sql.Date.
 * @throws ClassCastException when data type does not match
 */
def getDate(i: Int): java.sql.Date

/**
 * Returns the value at position i of timestamp type as java.sql.Timestamp.
 * @throws ClassCastException when data type does not match
 */
def getTimestamp(i: Int): java.sql.Timestamp
```
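
A short sketch of the object accessors in use, assuming Spark SQL is on the classpath. The sample values and variable names are made up for illustration. The `Option` wrapper rests on the assumption that object accessors return `null` for a null slot instead of throwing, which is consistent with the signatures above listing only `ClassCastException`.

```scala
import org.apache.spark.sql.Row

// Illustrative values; the last slot is deliberately null.
val row = Row(
  "Alice",
  new java.math.BigDecimal("12.50"),
  java.sql.Date.valueOf("2024-01-15"),
  java.sql.Timestamp.valueOf("2024-01-15 10:30:00"),
  null
)

val name = row.getString(0)
val amount = row.getDecimal(1)
val day = row.getDate(2)
val ts = row.getTimestamp(3)

// Assumption: a null slot comes back as null, so Option(...) turns it into None.
val maybeNote: Option[String] = Option(row.getString(4))
```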

### Collection Type Accessors

Accessors for complex types including arrays, maps, and structs.

```scala { .api }
/**
 * Returns the value at position i of array type as a Scala Seq.
 * @throws ClassCastException when data type does not match
 */
def getSeq[T](i: Int): Seq[T]

/**
 * Returns the value at position i of array type as java.util.List.
 * @throws ClassCastException when data type does not match
 */
def getList[T](i: Int): java.util.List[T]

/**
 * Returns the value at position i of map type as a Scala Map.
 * @throws ClassCastException when data type does not match
 */
def getMap[K, V](i: Int): scala.collection.Map[K, V]

/**
 * Returns the value at position i of map type as java.util.Map.
 * @throws ClassCastException when data type does not match
 */
def getJavaMap[K, V](i: Int): java.util.Map[K, V]

/**
 * Returns the value at position i of struct type as a Row object.
 * @throws ClassCastException when data type does not match
 */
def getStruct(i: Int): Row
```
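
The collection accessors might be used as in the following sketch, which assumes Spark SQL is on the classpath. The row layout (an array, a map, and a nested struct) and all variable names are illustrative.

```scala
import org.apache.spark.sql.Row

// Illustrative row holding an array, a map, and a nested struct.
val row = Row(Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), Row("nested", 7))

val scores = row.getSeq[Int](0)                  // Scala Seq view of the array
val scoresJava = row.getList[Int](0)             // java.util.List view
val counts = row.getMap[String, Int](1)          // Scala Map
val countsJava = row.getJavaMap[String, Int](1)  // java.util.Map view
val inner = row.getStruct(2)                     // nested Row

val total = scores.sum
val label = inner.getString(0)
```

Because of JVM type erasure, the element type parameters are not verified when the accessor is called; a wrong element type typically surfaces only when the elements are used.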

### Generic Accessors

Generic accessors with type casting, plus access by field name.

```scala { .api }
/**
 * Returns the value at position i with generic type casting.
 * For primitive types, a null value yields the type's zero value (for example, 0 for Int);
 * use isNullAt to check for null first.
 * @throws ClassCastException when data type does not match
 */
def getAs[T](i: Int): T

/**
 * Returns the value of a given fieldName.
 * @throws UnsupportedOperationException when schema is not defined
 * @throws IllegalArgumentException when fieldName does not exist
 * @throws ClassCastException when data type does not match
 */
def getAs[T](fieldName: String): T

/**
 * Returns the index of a given field name.
 * Default implementation throws UnsupportedOperationException.
 * @throws UnsupportedOperationException when schema is not defined ("fieldIndex on a Row without schema is undefined.")
 * @throws IllegalArgumentException when fieldName does not exist
 */
def fieldIndex(name: String): Int = {
  throw new UnsupportedOperationException("fieldIndex on a Row without schema is undefined.")
}

/**
 * Returns a Map(name -> value) for the requested fieldNames
 * @throws UnsupportedOperationException when schema is not defined
 * @throws IllegalArgumentException when fieldName does not exist
 * @throws ClassCastException when data type does not match
 */
def getValuesMap[T](fieldNames: Seq[String]): Map[String, T]
```
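
Field-name access only works on rows that carry a schema. The sketch below contrasts the two cases, assuming Spark SQL and Catalyst are on the classpath. Attaching the schema through `GenericRowWithSchema` is one possible approach chosen for this illustration, and the two-column schema and variable names are hypothetical.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
import scala.util.Try

// Hypothetical two-column schema used only for this illustration.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)
))

// Assumption: GenericRowWithSchema (a Catalyst class) attaches the schema by hand.
val person: Row = new GenericRowWithSchema(Array[Any](1, "Alice"), schema)

val nameIndex = person.fieldIndex("name")
val name = person.getAs[String]("name")
val fields = person.getValuesMap[Any](Seq("id", "name"))

// A Row built with Row(...) carries no schema, so name lookups fail.
val bare = Row(1, null)
val lookup = Try(bare.fieldIndex("name"))

// Per the note on getAs above, a null read as a primitive yields its zero value.
val zero: Int = bare.getAs[Int](1)
```

Rows obtained from a DataFrame normally already carry a schema, so constructing one by hand as above is mainly useful in tests.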

### Row Factory Methods

Factory methods for creating Row instances from values, sequences, and tuples, plus helpers for merging rows and pattern matching.

```scala { .api }
object Row {
  /**
   * Pattern matching extractor for Row objects.
   * Example: case Row(key: Int, value: String) => key -> value
   */
  def unapplySeq(row: Row): Some[Seq[Any]]

  /**
   * Create a Row with the given values.
   */
  def apply(values: Any*): Row

  /**
   * Create a Row from a Seq of values.
   */
  def fromSeq(values: Seq[Any]): Row

  /**
   * Create a Row from a tuple.
   */
  def fromTuple(tuple: Product): Row

  /**
   * Merge multiple rows into a single row, one after another.
   */
  def merge(rows: Row*): Row

  /** Returns an empty row */
  val empty: Row
}
```

**Usage Examples:**

```scala
import org.apache.spark.sql._

// Create rows from different sources
val row1 = Row(1, "Alice", 25.5)
val row2 = Row.fromSeq(Seq(2, "Bob", 30.0))
val row3 = Row.fromTuple((3, "Charlie", 35.5))

// Merge rows
val merged = Row.merge(row1, row2)
// Result: Row with values (1, "Alice", 25.5, 2, "Bob", 30.0)

// Pattern matching
val pairs = Seq(Row(1, "Alice"), Row(2, "Bob")).map {
  case Row(id: Int, name: String) => id -> name
}

// Empty row
val empty = Row.empty
```

### Utility Methods

Additional utility methods for row manipulation and display.

```scala { .api }
/** Displays all elements of this sequence in a string (without a separator) */
def mkString: String

/** Displays all elements of this sequence in a string using a separator string */
def mkString(sep: String): String

/**
 * Displays all elements of this sequence in a string using start, end, and separator strings
 */
def mkString(start: String, sep: String, end: String): String
```

**Usage Examples:**

```scala
import org.apache.spark.sql.Row

val row = Row(1, "Alice", true)

// String representations
val str1 = row.mkString                   // "1Alicetrue"
val str2 = row.mkString(", ")             // "1, Alice, true"
val str3 = row.mkString("[", ", ", "]")   // "[1, Alice, true]"

// Default toString
val str4 = row.toString()                 // "[1,Alice,true]"
```