# Integration Modules

Seamless integration with high-performance computing frameworks, including Numba JIT compilation, JAX automatic differentiation, and specialized backends for GPU computing and scientific workflows. These integrations enable awkward arrays to participate in high-performance computing pipelines while maintaining their flexible data model.

## Capabilities

### Numba Integration

Just-in-time compilation support for high-performance computing with awkward arrays, enabling compiled functions that work directly with nested data structures.

```python { .api }
import numba
import awkward.numba

def enable_numba():
    """
    Enable Numba integration for awkward arrays.

    This function registers awkward array types with Numba's type system,
    allowing awkward arrays to be used in @numba.jit-decorated functions.
    """

# Numba-compilable operations
@numba.jit
def compute_with_awkward(array):
    """
    Example of a Numba-compiled function working with awkward arrays.

    Parameters:
    - array: Awkward array processed inside the compiled function

    Returns:
    Computed result with full JIT performance
    """
```

The Numba integration provides:

- **Type registration**: Awkward array types are registered with Numba's type-inference system
- **Layout support**: All awkward layout types can be used in compiled functions
- **Memory management**: Proper memory handling for nested structures in compiled code
- **Performance**: Near-C-speed execution for complex data-processing pipelines
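
Beyond reading nested data, compiled functions can also build new awkward arrays. Below is a minimal sketch using `ak.ArrayBuilder`, which the Numba integration supports inside compiled functions; the function name and filtering logic are illustrative:

```python
import awkward as ak
import numba

@numba.jit(nopython=True)
def keep_positive(array, builder):
    # Iterate the nested structure in compiled code and
    # assemble a filtered copy with the builder.
    for sublist in array:
        builder.begin_list()
        for value in sublist:
            if value > 0.0:
                builder.real(value)
        builder.end_list()
    return builder

data = ak.Array([[1.0, -2.0, 3.0], [], [-4.0, 5.0]])
builder = keep_positive(data, ak.ArrayBuilder())
print(builder.snapshot().to_list())  # [[1.0, 3.0], [], [5.0]]
```

The builder is created and snapshotted outside the compiled function, which keeps the compiled region free of Python-object handling.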

### JAX Integration

Automatic differentiation and GPU computing support through JAX integration, enabling machine learning and scientific computing workflows.

```python { .api }
import awkward.jax

def register_jax():
    """
    Register awkward arrays with the JAX transformation system.

    Enables awkward arrays to participate in JAX transformations like
    jit, grad, vmap, and pmap for automatic differentiation and
    parallelization.
    """

# JAX transformation support
def jax_compatible_function(array):
    """
    Function that can be transformed by JAX (jit, grad, etc.).

    Parameters:
    - array: Awkward array compatible with JAX transformations

    Returns:
    Result that supports automatic differentiation
    """
```

JAX integration features:

- **Automatic differentiation**: Compute gradients through nested data operations
- **JIT compilation**: Compile functions involving awkward arrays for GPU execution
- **Vectorization**: Apply functions across batches of nested data
- **Parallelization**: Multi-device execution for large-scale computations
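
As a brief sketch of the differentiation feature, assuming registration has been performed (the `ak.jax.register()` call used in the usage examples below) and that arrays are first moved to the `"jax"` backend:

```python
import awkward as ak
import jax

ak.jax.register()  # registration call as described above

# Differentiable arrays live on the "jax" backend.
data = ak.to_backend(ak.Array([[1.0, 2.0, 3.0], [4.0]]), "jax")

def sum_of_squares(array):
    # A reduction over the nested structure, returning a scalar.
    return ak.sum(array * array)

# The gradient has the same nested shape as the input.
grads = jax.grad(sum_of_squares)(data)
```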

### Backend Management

Unified interface for managing computational backends and moving arrays between different execution environments.

```python { .api }
def backend(array):
    """
    Get the computational backend currently used by the array.

    Parameters:
    - array: Array whose backend to report

    Returns:
    str indicating the current backend ("cpu", "cuda", "jax", etc.)
    """

def to_backend(array, backend, highlevel=True, behavior=None):
    """
    Move an array to the specified computational backend.

    Parameters:
    - array: Array to move
    - backend: str, target backend name
      - "cpu": Standard CPU backend using NumPy
      - "cuda": CUDA backend using CuPy
      - "jax": JAX backend for automatic differentiation
      - "typetracer": Type-inference backend without data
    - highlevel: bool, if True return an Array, if False return a Content layout
    - behavior: dict, custom behavior for the result

    Returns:
    Array moved to the target backend
    """

def copy_to(array, backend):
    """
    Copy array data to a different backend.

    Parameters:
    - array: Array to copy
    - backend: str, destination backend

    Returns:
    Array copy on the target backend
    """
```
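
A minimal sketch of this interface, using the `"typetracer"` backend listed above to check a pipeline without materializing data:

```python
import awkward as ak

array = ak.Array([[1, 2, 3], [4, 5]])
print(ak.backend(array))   # "cpu"

# A typetracer array carries type and shape information but no buffers,
# so downstream operations can be validated without touching data.
traced = ak.to_backend(array, "typetracer")
print(ak.backend(traced))  # "typetracer"
```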

### Type Tracer Integration

Lazy type-inference system that analyzes array operations without materializing data, enabling static analysis and optimization.

```python { .api }
import awkward.typetracer

class TypeTracer:
    """
    Lazy evaluation system for type inference without data materialization.

    TypeTracer arrays track type information and operations without
    storing actual data, enabling:
    - Static type checking
    - Memory usage analysis
    - Operation optimization
    - Schema inference
    """

    def touch_data(self, recursive=True):
        """
        Mark data as accessed for dependency tracking.

        Parameters:
        - recursive: bool, if True mark nested data as touched
        """

    def touch_shape(self, recursive=True):
        """
        Mark shape information as accessed.

        Parameters:
        - recursive: bool, if True mark nested shapes as touched
        """

def typetracer_with_report(form):
    """
    Create a type tracer that generates buffer-access reports.

    Parameters:
    - form: Form object describing the array structure

    Returns:
    tuple of (TypeTracer array, report object)
    """

def typetracer_from_form(form):
    """
    Create a type tracer directly from a Form description.

    Parameters:
    - form: Form object describing the array structure

    Returns:
    TypeTracer array matching the form
    """
```
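
A short sketch of report generation; the `highlevel=True` argument and the `data_touched` attribute are assumptions based on the current awkward v2 interface:

```python
import awkward as ak

# Form captured from an existing array (it could also be built directly).
form = ak.Array([{"x": [1.0], "y": 2}]).layout.form

# The report records which buffers an operation touches; this is the
# basis for column pruning in lazy, out-of-core workflows.
tracer, report = ak.typetracer.typetracer_with_report(form, highlevel=True)
_ = tracer["x"]             # only the "x" field is accessed
print(report.data_touched)  # keys of the buffers that were read
```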

### CppYY Integration

C++ interoperability through cppyy, enabling integration with C++ libraries and the ROOT ecosystem common in high-energy physics.

```python { .api }
import awkward.cppyy

def register_cppyy():
    """
    Register awkward types with cppyy for C++ interoperability.

    Enables:
    - Passing awkward arrays to C++ functions
    - Converting C++ containers to awkward arrays
    - Integration with the ROOT data-analysis framework
    - Zero-copy data sharing where possible
    """

def cpp_interface(array):
    """
    Create a C++-compatible interface for an array.

    Parameters:
    - array: Awkward array to create a C++ interface for

    Returns:
    C++-compatible proxy object
    """
```

### GPU Computing Support

Functions for GPU-accelerated computing using CUDA and related frameworks.

```python { .api }
def to_cuda(array):
    """
    Move an array to CUDA GPU memory.

    Parameters:
    - array: Array to move to the GPU

    Returns:
    Array with data in GPU memory
    """

def from_cuda(array):
    """
    Move an array from GPU to CPU memory.

    Parameters:
    - array: GPU array to move to the CPU

    Returns:
    Array with data in CPU memory
    """

def is_cuda(array):
    """
    Test whether array data resides in GPU memory.

    Parameters:
    - array: Array to test

    Returns:
    bool indicating whether the array is on the GPU
    """
```
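
These helpers correspond to backend moves through the interface shown earlier; a hedged sketch that degrades gracefully when CuPy or a CUDA device is missing:

```python
import awkward as ak

array = ak.Array([[1.0, 2.0], [3.0, 4.0, 5.0]])

try:
    gpu = ak.to_backend(array, "cuda")  # buffers copied into GPU memory
    print(ak.backend(gpu) == "cuda")    # True: equivalent to the is_cuda check
    back = ak.to_backend(gpu, "cpu")    # equivalent to from_cuda
except Exception as err:                # CuPy or a CUDA device is missing
    print(f"CUDA backend unavailable: {err}")
```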

### Framework-Specific Integration Utilities

Helper functions for specific integration scenarios and framework compatibility.

```python { .api }
def numba_array_typer(array_type):
    """
    Create a Numba type signature for an awkward array type.

    Parameters:
    - array_type: Awkward array type

    Returns:
    Numba type signature for compilation
    """

def jax_pytree_flatten(array):
    """
    Flatten an awkward array for JAX pytree operations.

    Parameters:
    - array: Array to flatten

    Returns:
    tuple of (leaves, tree_def) for JAX pytree operations
    """

def jax_pytree_unflatten(tree_def, leaves):
    """
    Reconstruct an awkward array from JAX pytree components.

    Parameters:
    - tree_def: Tree definition from the flatten operation
    - leaves: Leaf values from the flatten operation

    Returns:
    Reconstructed awkward array
    """

def dispatch_map():
    """
    Get the mapping of operations to backend-specific implementations.

    Returns:
    dict mapping operation names to backend implementations
    """
```
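
Once awkward arrays are registered as pytrees, JAX's own tree utilities exercise this flatten/unflatten machinery directly; a sketch assuming registration as in the JAX section:

```python
import awkward as ak
import jax

ak.jax.register()  # as in the JAX integration section

data = ak.to_backend(ak.Array([[1.0, 2.0], [3.0]]), "jax")

# tree_flatten/tree_unflatten drive the pytree hooks described above:
# the leaves are flat buffers, the treedef encodes the nested structure.
leaves, treedef = jax.tree_util.tree_flatten(data)
rebuilt = jax.tree_util.tree_unflatten(treedef, leaves)
print(rebuilt.to_list())  # [[1.0, 2.0], [3.0]]
```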

### Performance Optimization Utilities

Tools for analyzing and optimizing performance across different backends and integration scenarios.

```python { .api }
def benchmark_backends(array, operation, backends=None):
    """
    Benchmark operation performance across different backends.

    Parameters:
    - array: Array to benchmark with
    - operation: Function to benchmark
    - backends: list of str, backends to test (None for all available)

    Returns:
    dict mapping backend names to timing results
    """

def memory_usage(array, backend=None):
    """
    Analyze memory usage of an array on the specified backend.

    Parameters:
    - array: Array to analyze
    - backend: str, backend to check (None for the current one)

    Returns:
    dict with memory-usage statistics
    """

def optimize_for_backend(array, backend, operation_hint=None):
    """
    Optimize array layout for a specific backend and operation.

    Parameters:
    - array: Array to optimize
    - backend: str, target backend
    - operation_hint: str, hint about the intended operations

    Returns:
    Array optimized for the target backend
    """
```
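
For a quick look at memory footprint without these helpers, the high-level `nbytes` property reports the bytes held by an array's buffers; a minimal sketch:

```python
import awkward as ak

ragged = ak.Array([[1.5, 2.5, 3.5], [], [4.5]])

# nbytes sums the sizes of all buffers backing the layout
# (offsets as well as numeric data), on whatever backend holds them.
print(ragged.nbytes)

# Packing drops unreachable buffer regions, which can shrink nbytes
# ahead of a backend transfer.
print(ak.to_packed(ragged).nbytes)
```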

## Usage Examples

### Numba JIT Compilation

```python
import awkward as ak
import numba

# Enable numba integration
ak.numba.register()

@numba.jit
def fast_computation(events):
    """JIT-compiled function working with nested data."""
    total = 0.0
    for event in events:
        for particle in event.particles:
            if particle.pt > 10.0:
                total += particle.pt * particle.pt
    return total

# Use with nested data
events = ak.Array([
    {"particles": [{"pt": 15.0}, {"pt": 5.0}]},
    {"particles": [{"pt": 25.0}, {"pt": 12.0}]}
])

result = fast_computation(events)  # Runs at compiled speed
```

### JAX Integration

```python
import awkward as ak
import jax
import jax.numpy as jnp

# Register awkward arrays as JAX pytrees
ak.jax.register()

def physics_calculation(events):
    """Function that can be JAX-transformed."""
    pts = events.particles.pt
    return ak.sum(pts * pts, axis=1)

# Apply JAX transformations
jit_calc = jax.jit(physics_calculation)
vectorized_calc = jax.vmap(physics_calculation)

# Automatic differentiation
def loss_function(events, weights):
    result = physics_calculation(events)
    return jnp.sum(result * weights)

gradient_fn = jax.grad(loss_function, argnums=1)
```

### Backend Management

```python
import awkward as ak
import cupy as cp

# Create an array on the CPU
cpu_array = ak.Array([[1, 2, 3], [4, 5]])
print(ak.backend(cpu_array))  # "cpu"

# Check that CUDA is available before moving data
if cp.cuda.is_available():
    # Move to the GPU
    gpu_array = ak.to_backend(cpu_array, "cuda")
    print(ak.backend(gpu_array))  # "cuda"

    # Perform a GPU computation
    gpu_result = ak.sum(gpu_array * gpu_array, axis=1)

    # Move the result back to the CPU
    cpu_result = ak.to_backend(gpu_result, "cpu")
```

### Type Tracing for Optimization

```python
import awkward as ak

# Create a type tracer for schema analysis
form = ak.forms.RecordForm([
    ak.forms.ListForm("i64", "i64", ak.forms.NumpyForm("float64")),
    ak.forms.NumpyForm("int32")
], ["particles", "event_id"])

tracer = ak.typetracer.typetracer_from_form(form)

def analyze_operation(data):
    """Function to analyze without data."""
    return ak.sum(data.particles, axis=1) + data.event_id

# Trace the operation to understand access patterns
traced_result = analyze_operation(tracer)
print(f"Result type: {ak.type(traced_result)}")
```

### C++ Integration via CppYY

```python
import awkward as ak
import cppyy

# Register awkward arrays with cppyy
ak.cppyy.register()

# Define a C++ function (example)
cppyy.cppdef("""
#include <cmath>

double compute_mass(const std::vector<double>& pt,
                    const std::vector<double>& eta) {
    double total = 0.0;
    for (size_t i = 0; i < pt.size(); ++i) {
        total += pt[i] * std::cosh(eta[i]);
    }
    return total;
}
""")

# Use with awkward arrays
particles = ak.Array({
    "pt": [[10.0, 20.0], [15.0]],
    "eta": [[1.0, 0.5], [1.2]]
})

# Convert to a C++-compatible format and call
for event in particles:
    mass = cppyy.gbl.compute_mass(event.pt, event.eta)
    print(f"Event mass: {mass}")
```

### Performance Benchmarking

```python
import importlib.util
import time

import awkward as ak

# Create test data
large_array = ak.Array([
    [i + j for j in range(1000)]
    for i in range(1000)
])

def benchmark_operation(array, backend_name):
    """Benchmark an array operation on a specific backend."""
    # Move to the backend
    backend_array = ak.to_backend(array, backend_name)

    # Time the operation
    start = time.time()
    result = ak.sum(backend_array * backend_array, axis=1)
    end = time.time()

    return end - start

# Compare whichever backends are installed
backends = ["cpu"]
if importlib.util.find_spec("cupy") is not None:
    backends.append("cuda")
if importlib.util.find_spec("jax") is not None:
    backends.append("jax")

for backend in backends:
    duration = benchmark_operation(large_array, backend)
    print(f"{backend}: {duration:.4f} seconds")
```

### Integration Best Practices

```python
import awkward as ak

def optimize_for_computation(array, target_backend="cpu", operation="reduction"):
    """Optimize an array for a specific computation pattern."""

    # Pack the array for a better memory layout
    packed = ak.to_packed(array)

    # Move to the target backend
    backend_array = ak.to_backend(packed, target_backend)

    # Apply operation-specific optimizations
    if operation == "reduction" and target_backend == "cuda":
        # Tag the array; "gpu_optimized" is an illustrative parameter name
        return ak.with_parameter(backend_array, "gpu_optimized", True)

    return backend_array

# Example usage
data = ak.Array([[1, 2, 3], [4, 5, 6, 7], [8, 9]])
optimized = optimize_for_computation(data, "cuda", "reduction")
result = ak.sum(optimized, axis=1)  # Runs with optimizations
```