# Array Manipulation and Transformation

Structural operations for reshaping, filtering, combining, and transforming arrays while preserving type information and handling variable-length data gracefully. These operations form the core of Awkward Array's data manipulation capabilities.

## Capabilities

### Array Concatenation and Joining

Functions for combining multiple arrays along specified axes while handling complex nested structures and maintaining type consistency.

```python { .api }
def concatenate(arrays, axis=0, highlevel=True, behavior=None):
    """
    Concatenate arrays along the specified axis.

    Parameters:
    - arrays: sequence of Arrays to concatenate
    - axis: int, axis along which to concatenate
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array containing concatenated data
    """

def zip(arrays, depth_limit=None, parameters=None, with_name=None,
        highlevel=True, behavior=None):
    """
    Combine arrays into a single array of records (dict input) or tuples (sequence input).

    Parameters:
    - arrays: dict mapping field names to Arrays, or sequence of Arrays
    - depth_limit: int, maximum depth to zip (None for unlimited)
    - parameters: dict, parameters for the resulting record type
    - with_name: str, name for the record type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of records (or tuples) with the input arrays as fields
    """

def unzip(array, highlevel=True, behavior=None):
    """
    Split a record array into separate arrays for each field.

    Parameters:
    - array: Array with record structure to unzip
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    tuple of Arrays, one per field, in field order
    """
```

### Structure Manipulation

Operations that change the shape and organization of nested data while preserving content and maintaining data integrity.

```python { .api }
def flatten(array, axis=1, highlevel=True, behavior=None):
    """
    Flatten nested lists by removing one level of list structure.

    Parameters:
    - array: Array to flatten
    - axis: int or None, axis to flatten (default 1 merges the first level of
      nested lists into the outer dimension; None flattens every level)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with one fewer level of nesting (one-dimensional if axis is None)
    """

def unflatten(array, counts, axis=0, highlevel=True, behavior=None):
    """
    Add a level of list structure by partitioning elements according to counts.

    Parameters:
    - array: Array to unflatten
    - counts: int or Array of integers specifying the length of each new list
    - axis: int, axis along which to unflatten
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with additional level of list nesting
    """

def ravel(array, highlevel=True, behavior=None):
    """
    Flatten array to one dimension by removing all list structure.

    Parameters:
    - array: Array to ravel
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    One-dimensional Array containing all leaf elements
    """

def num(array, axis=1, highlevel=True, behavior=None):
    """
    Count elements in each nested structure.

    Parameters:
    - array: Array to count elements in
    - axis: int, axis along which to count
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of integers representing element counts
    """
```

### Array Filtering and Selection

Functions for selecting subsets of data based on conditions, indices, or structural patterns while maintaining array structure.

```python { .api }
def mask(array, mask, valid_when=True, highlevel=True, behavior=None):
    """
    Replace elements with None wherever a boolean mask marks them as invalid.

    Unlike boolean slicing (array[mask]), this keeps the array's length and structure.

    Parameters:
    - array: Array to mask
    - mask: Array of booleans, broadcast against array
    - valid_when: bool, the mask value that means "keep this element"
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of option type, with None wherever mask differs from valid_when
    """

def where(condition, x, y, highlevel=True, behavior=None):
    """
    Select elements from x or y depending on condition.

    Parameters:
    - condition: Array of booleans
    - x: Array of values to select when condition is True
    - y: Array of values to select when condition is False
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with elements selected based on condition
    """

def drop_none(array, axis=None, highlevel=True, behavior=None):
    """
    Remove None/missing values from array.

    Parameters:
    - array: Array to process
    - axis: int or None, axis along which to drop None values (None for all axes)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with None values removed
    """

def fill_none(array, value, axis=-1, highlevel=True, behavior=None):
    """
    Replace None/missing values with specified value.

    Parameters:
    - array: Array to process
    - value: Value to use as replacement for None
    - axis: int or None, axis along which to fill (default -1 fills the
      innermost level; None fills every level)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with None values replaced by value
    """

def pad_none(array, target, axis=1, clip=False, highlevel=True, behavior=None):
    """
    Pad variable-length lists to target length using None values.

    Parameters:
    - array: Array to pad
    - target: int, target length for lists
    - axis: int, axis along which to pad
    - clip: bool, if True also truncate lists longer than target
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with lists padded to at least target length (exactly target if clip is True)
    """
```

### Field and Attribute Manipulation

Operations for working with record structures, adding/removing fields, and managing metadata attributes.

```python { .api }
def fields(array):
    """
    Get field names from record array.

    Parameters:
    - array: Array with record structure

    Returns:
    list of str containing field names
    """

def with_field(array, what, where=None, highlevel=True, behavior=None):
    """
    Add or replace a field in record array.

    Parameters:
    - array: Array with record structure
    - what: Array containing values for the field
    - where: None, str, or sequence of str; the field name (a sequence is a
      path into nested records; None adds an unnamed, tuple-style slot)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with field added or modified
    """

def without_field(array, where, highlevel=True, behavior=None):
    """
    Remove a field from record array.

    Parameters:
    - array: Array with record structure
    - where: str or sequence of str; the field name to remove (a sequence is
      a path into nested records)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with the specified field removed
    """

def with_name(array, name, highlevel=True, behavior=None):
    """
    Attach a record name to the array's type (stored as the '__record__' parameter).

    Parameters:
    - array: Array to name
    - name: str, name for the type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with named type
    """

def with_parameter(array, key, value, highlevel=True, behavior=None):
    """
    Add a parameter to the array's type.

    Parameters:
    - array: Array to modify
    - key: str, parameter name
    - value: parameter value
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with parameter added
    """

def without_parameters(array, highlevel=True, behavior=None):
    """
    Remove all parameters from array's type.

    Parameters:
    - array: Array to modify
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array without type parameters
    """

def parameters(array):
    """
    Get parameters from array's type.

    Parameters:
    - array: Array to examine

    Returns:
    dict of parameters from the array's type
    """
```

### Advanced Structural Operations

Complex operations for generating combinations, cartesian products, and other advanced structural transformations.

```python { .api }
def combinations(array, n, axis=1, fields=None, parameters=None,
                 with_name=None, highlevel=True, behavior=None):
    """
    Generate n-element combinations from each list in the array.

    Parameters:
    - array: Array containing lists to generate combinations from
    - n: int, number of elements per combination
    - axis: int, axis along which to generate combinations
    - fields: list of str, field names for tuple elements
    - parameters: dict, parameters for the resulting type
    - with_name: str, name for the resulting type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of n-tuples containing all combinations
    """

def argcombinations(array, n, axis=1, fields=None, parameters=None,
                    with_name=None, highlevel=True, behavior=None):
    """
    Generate indices of n-element combinations from each list.

    Parameters:
    - array: Array containing lists to generate combination indices from
    - n: int, number of elements per combination
    - axis: int, axis along which to generate combinations
    - fields: list of str, field names for tuple elements
    - parameters: dict, parameters for the resulting type
    - with_name: str, name for the resulting type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of n-tuples containing indices of all combinations
    """

def cartesian(arrays, axis=1, nested=None, parameters=None, with_name=None,
              highlevel=True, behavior=None):
    """
    Generate cartesian product of arrays.

    Parameters:
    - arrays: dict mapping field names to Arrays, or sequence of Arrays
    - axis: int, axis along which to form cartesian product
    - nested: None, bool, or sequence of field names or positions; selects
      which inputs group the output into an extra level of nested lists
      (None or False keeps all combinations at one level)
    - parameters: dict, parameters for the resulting type
    - with_name: str, name for the resulting type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array containing cartesian product as records or tuples
    """

def argcartesian(arrays, axis=1, nested=None, parameters=None, with_name=None,
                 highlevel=True, behavior=None):
    """
    Generate indices of cartesian product elements.

    Parameters:
    - arrays: dict mapping field names to Arrays, or sequence of Arrays
    - axis: int, axis along which to form cartesian product
    - nested: None, bool, or sequence of field names or positions; selects
      which inputs group the output into an extra level of nested lists
      (None or False keeps all combinations at one level)
    - parameters: dict, parameters for the resulting type
    - with_name: str, name for the resulting type
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array containing indices of cartesian product elements
    """
```

### Array Transformation and Broadcasting

Functions for transforming array structure and ensuring compatible shapes for operations.

```python { .api }
def broadcast_arrays(*arrays, depth_limit=None, highlevel=True, behavior=None):
    """
    Broadcast arrays to a common structure.

    Parameters:
    - arrays: Arrays to broadcast
    - depth_limit: int, maximum depth to broadcast (None for unlimited)
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    list of Arrays broadcasted to common structure
    """

def broadcast_fields(*arrays, highlevel=True, behavior=None):
    """
    Broadcast record fields to common structure.

    Parameters:
    - arrays: Arrays with record structure to broadcast
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    list of Arrays with fields broadcasted to common structure
    """

def singletons(array, highlevel=True, behavior=None):
    """
    Convert possibly-missing values into lists: each value becomes a list of
    length 1 and each None becomes an empty list.

    Parameters:
    - array: Array to wrap in singletons
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array of lists with length 1 (value present) or 0 (value missing)
    """

def firsts(array, axis=1, highlevel=True, behavior=None):
    """
    Extract the first element from each list.

    Parameters:
    - array: Array containing lists
    - axis: int, axis along which to extract firsts
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array containing first element from each list (None for empty lists)
    """
```

### Utility Functions

Helper functions for copying arrays, checking array properties, and performing basic transformations.

```python { .api }
def copy(array, highlevel=True, behavior=None):
    """
    Create a deep copy of the array.

    Parameters:
    - array: Array to copy
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Deep copy of the input array
    """

def transform(transformation, array, *more_arrays, highlevel=True, behavior=None):
    """
    Apply a transformation function to each node of the array's layout tree.

    Parameters:
    - transformation: callable invoked as transformation(layout, **kwargs) on
      each Content node; return a replacement Content, or None to keep descending
    - array: Array to transform
    - more_arrays: additional Arrays to broadcast and transform alongside array
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array transformed by the function
    """

def materialize(array, highlevel=True, behavior=None):
    """
    Force materialization of lazy array operations.

    Parameters:
    - array: Array to materialize
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Materialized Array with all lazy operations evaluated
    """

def to_packed(array, highlevel=True, behavior=None):
    """
    Pack array into contiguous memory layout.

    Parameters:
    - array: Array to pack
    - highlevel: bool, if True return Array, if False return Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array with packed memory layout
    """
```

## Usage Examples

### Basic Manipulation

```python
import awkward as ak

# Create nested array
data = [[1, 2, 3], [4], [5, 6, 7, 8]]
array = ak.Array(data)

# Flatten one level
flat = ak.flatten(array)  # [1, 2, 3, 4, 5, 6, 7, 8]

# Count elements per list
counts = ak.num(array)  # [3, 1, 4]

# Unflatten back
reconstructed = ak.unflatten(flat, counts)
```

### Record Manipulation

```python
import awkward as ak

# Create record array
records = ak.Array([
    {"x": 1, "y": [1, 2]},
    {"x": 2, "y": [3, 4, 5]}
])

# Add field
with_z = ak.with_field(records, [10, 20], "z")

# Remove field
without_y = ak.without_field(records, "y")

# Get field names
field_names = ak.fields(records)  # ["x", "y"]
```

### Filtering and Selection

```python
import awkward as ak

data = ak.Array([[1, 2, 3], [4], [5, 6]])

# Create mask (lists with more than 1 element)
mask = ak.num(data) > 1

# Apply mask
filtered = data[mask]  # [[1, 2, 3], [5, 6]]

# Fill None values
with_nones = ak.Array([[1, None, 3], None, [5, 6]])
filled = ak.fill_none(with_nones, 0)
```

### Combinations and Products

```python
import awkward as ak

# Generate combinations
data = ak.Array([[1, 2, 3, 4], [5, 6]])
pairs = ak.combinations(data, 2)  # [[(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)], [(5,6)]]

# Cartesian product of two arrays
a = ak.Array([[1, 2], [3]])
b = ak.Array([["x", "y"], ["z"]])
product = ak.cartesian({"a": a, "b": b})
```