Tessl Tile for pypi/cupy@13.6.0

tessl/pypi-cupy

NumPy & SciPy-compatible array library for GPU-accelerated computing with Python

Workspace: tessl
Visibility: Public
Created: 3 months ago
Last updated: 3 months ago
Describes: pkg:pypi/cupy@13.6.x

To install, run

npx @tessl/cli install tessl/pypi-cupy@13.6.0

# CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. It acts as a drop-in replacement for running existing NumPy/SciPy code on NVIDIA CUDA or AMD ROCm platforms, and can substantially speed up mathematical computations, linear algebra, and other scientific computing workloads.

## Package Information

- **Package Name**: cupy
- **Language**: Python
- **Installation**: `pip install cupy` builds from source; prebuilt wheels are available as `cupy-cuda11x` or `cupy-cuda12x`, matching the installed CUDA version

## Core Imports

```python
import cupy as cp
```

For CUDA-specific functionality:

```python
import cupy.cuda as cuda
```

For extended functionality:

```python
import cupyx
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
x_gpu = cp.array([1, 2, 3, 4, 5])
y_gpu = cp.zeros((3, 3))

# Perform operations on GPU (same API as NumPy)
result_gpu = cp.sum(x_gpu)
z_gpu = cp.dot(x_gpu, x_gpu)

# Transfer data between CPU and GPU
x_cpu = cp.asnumpy(x_gpu)  # GPU to CPU
x_gpu_from_cpu = cp.asarray(x_cpu)  # CPU to GPU

# Linear algebra operations
A = cp.random.random((1000, 1000))
B = cp.random.random((1000, 1000))
C = cp.dot(A, B)  # Performed on GPU

# Element-wise operations with broadcasting
result = cp.sqrt(A) + cp.sin(B)
```

## Architecture

CuPy's architecture mirrors NumPy while enabling GPU acceleration:

- **ndarray**: GPU-accelerated equivalent of NumPy arrays, supporting the same interface and operations
- **CUDA Memory Management**: Automatic memory pooling and allocation on GPU devices
- **Universal Functions (ufuncs)**: Element-wise operations optimized for parallel GPU execution
- **Kernel System**: Custom CUDA kernels for specialized operations not covered by standard functions
- **Stream Management**: CUDA streams for asynchronous execution and memory operations
- **Multi-GPU Support**: Distribution of computations across multiple GPU devices

This design eases migration from NumPy to GPU computing: most NumPy code runs after swapping the import, CuPy covers a large subset of the NumPy API, and CUDA-specific extensions are available when more performance is needed.
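As a minimal sketch of the NumPy-mirroring design described above, the hypothetical `normalize` helper below uses `cupy.get_array_module` so the same code runs on NumPy or CuPy inputs. The import guard and the NumPy fallback are assumptions added so the snippet also runs on machines without a GPU.

```python
import numpy as np

try:
    import cupy as cp  # optional: needs a CUDA or ROCm capable GPU
    gpu_ok = cp.is_available()
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    cp, gpu_ok = None, False


def normalize(x):
    """Scale x to zero mean and unit variance on whichever device holds it."""
    # get_array_module returns cupy for CuPy arrays and numpy otherwise.
    xp = cp.get_array_module(x) if cp is not None else np
    return (x - xp.mean(x)) / xp.std(x)


host = np.array([1.0, 2.0, 3.0, 4.0])
host_result = normalize(host)

if gpu_ok:
    device_result = normalize(cp.asarray(host))
    # Copy the GPU result back to the host before comparing.
    assert np.allclose(cp.asnumpy(device_result), host_result)
```

Writing functions against `xp` rather than `np` or `cp` directly keeps a single code path for both CPU and GPU arrays.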

## Capabilities

### Core Array Class

The fundamental ndarray class providing GPU-accelerated multi-dimensional arrays.

```python { .api }
class ndarray:
    """
    GPU-accelerated multi-dimensional array object.

    Attributes:
    - shape: tuple, dimensions of the array
    - dtype: data type of array elements
    - size: int, total number of elements
    - ndim: int, number of dimensions
    - itemsize: int, size of each element in bytes
    - nbytes: int, total bytes consumed by elements
    - device: cupy.cuda.Device, GPU device where array resides
    """

    def __init__(self, shape, dtype=float, order='C'): ...
    def astype(self, dtype, order='K', casting='unsafe', subok=True, copy=True): ...
    def copy(self, order='C'): ...
    def flatten(self, order='C'): ...
    def ravel(self, order='C'): ...
    def reshape(self, *shape, order='C'): ...
    def squeeze(self, axis=None): ...
    def transpose(self, *axes): ...
    def swapaxes(self, axis1, axis2): ...
    def get(self, stream=None, order='C', out=None): ...
    def set(self, arr, stream=None): ...
    def sum(self, axis=None, dtype=None, out=None, keepdims=False): ...
    def mean(self, axis=None, dtype=None, out=None, keepdims=False): ...
    def std(self, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
    def var(self, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
    def max(self, axis=None, out=None, keepdims=False, initial=None, where=None): ...
    def min(self, axis=None, out=None, keepdims=False, initial=None, where=None): ...
    def dot(self, b, out=None): ...
    def sort(self, axis=-1, kind=None, order=None): ...
    def argsort(self, axis=-1, kind=None, order=None): ...
```
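The attributes and methods listed above can be exercised with a short sketch like the following. The `xp` alias and the fallback to NumPy when no GPU is usable are assumptions made so the snippet runs anywhere; on a GPU machine `xp` is `cupy`.

```python
import numpy as np

try:
    import cupy as cp
    # Use the GPU only when a usable device is present.
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

a = xp.arange(12, dtype=xp.float32).reshape(3, 4)

# Metadata attributes behave the same as on NumPy arrays.
layout = (a.shape, a.ndim, a.size, a.itemsize, a.nbytes)

col_sums = a.sum(axis=0)       # reduction along the first axis
as_int = a.astype(xp.int32)    # dtype conversion returns a new array
flat = a.T.ravel()             # transpose, then flatten to 1-D

# .get() exists only on CuPy arrays and copies the data to the host.
col_sums_host = col_sums.get() if hasattr(col_sums, "get") else col_sums
```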

### Array Creation and Manipulation

Core functionality for creating, reshaping, and manipulating GPU arrays with the same interface as NumPy.

```python { .api }
def array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=None, order='C'): ...
def ones(shape, dtype=None, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...
def reshape(a, newshape, order='C'): ...
def concatenate(arrays, axis=0, out=None, dtype=None, casting='same_kind'): ...
```

[Array Creation and Manipulation](./array-creation.md)
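A brief sketch of the creation functions above. The `xp` alias, the NumPy fallback, and the `to_host` helper are illustrative assumptions, not part of CuPy.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np


def to_host(a):
    """Return a NumPy array regardless of where `a` lives (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


zeros = xp.zeros((2, 3), dtype=xp.float32)     # 2x3 array of zeros
steps = xp.arange(0, 10, 2)                    # 0, 2, 4, 6, 8
grid = xp.linspace(0.0, 1.0, num=5)            # 5 evenly spaced points in [0, 1]
joined = xp.concatenate([steps, steps[::-1]])  # forward then reversed
```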

### Array Manipulation and Reshaping

Shape manipulation, joining, splitting, and rearranging array operations.

```python { .api }
def reshape(a, newshape, order='C'): ...
def ravel(a, order='C'): ...
def transpose(a, axes=None): ...
def moveaxis(a, source, destination): ...
def swapaxes(a, axis1, axis2): ...
def squeeze(a, axis=None): ...
def expand_dims(a, axis): ...
def atleast_1d(*arys): ...
def atleast_2d(*arys): ...
def atleast_3d(*arys): ...
def stack(arrays, axis=0, out=None): ...
def vstack(tup): ...
def hstack(tup): ...
def dstack(tup): ...
def split(ary, indices_or_sections, axis=0): ...
def hsplit(ary, indices_or_sections): ...
def vsplit(ary, indices_or_sections): ...
def repeat(a, repeats, axis=None): ...
def tile(A, reps): ...
def flip(m, axis=None): ...
def roll(a, shift, axis=None): ...
```
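The following sketch shows how a few of these shape operations combine. The `xp` alias, the NumPy fallback, and the `to_host` helper are assumptions for illustration.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np


def to_host(a):
    """Return a NumPy array regardless of where `a` lives (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


a = xp.arange(6).reshape(2, 3)           # [[0, 1, 2], [3, 4, 5]]
stacked = xp.stack([a, a + 10])          # new leading axis: shape (2, 2, 3)
moved = xp.moveaxis(stacked, 0, -1)      # leading axis moved last: (2, 3, 2)
wide = xp.hstack([a, a])                 # joined along columns: (2, 6)
left, right = xp.split(wide, 2, axis=1)  # two (2, 3) halves
rolled = xp.roll(a, 1, axis=1)           # each row shifted right by one
```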

### Mathematical Operations

Element-wise mathematical functions including trigonometric, logarithmic, arithmetic, and comparison operations.

```python { .api }
def add(x1, x2, /, out=None): ...
def multiply(x1, x2, /, out=None): ...
def sin(x, /, out=None): ...
def cos(x, /, out=None): ...
def exp(x, /, out=None): ...
def log(x, /, out=None): ...
def sqrt(x, /, out=None): ...
def maximum(x1, x2, /, out=None): ...
def sum(a, axis=None, dtype=None, out=None, keepdims=False, initial=None, where=None): ...
```

[Mathematical Operations](./math-functions.md)
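A small sketch of element-wise functions and broadcasting with the functions listed above. The `xp` alias and the NumPy fallback are assumptions so the snippet runs without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

# Element-wise ufuncs: sin^2 + cos^2 should be 1 at every angle.
angles = xp.linspace(0.0, 2.0 * np.pi, 8)
identity = xp.sin(angles) ** 2 + xp.cos(angles) ** 2
identity_error = float(xp.max(xp.abs(identity - 1.0)))

# Broadcasting: a (3, 1) column combined with a (4,) row gives a (3, 4) table.
col = xp.arange(3).reshape(3, 1)
row = xp.arange(4)
table = xp.add(col, row)
clipped = xp.maximum(table, 2)   # raise every entry below 2 up to 2
total = int(xp.sum(table))
```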

### Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decompositions, eigenvalue computation, and solving linear systems.

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, /, out=None, *, casting='same_kind', order='K', dtype=None, subok=True): ...
def einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False): ...
```

From `cupy.linalg`:

```python { .api }
def norm(x, ord=None, axis=None, keepdims=False): ...
def svd(a, full_matrices=True, compute_uv=True): ...
def inv(a): ...
def solve(a, b): ...
def eigh(a, UPLO='L'): ...
```

[Linear Algebra](./linear-algebra.md)
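As an illustration of the `linalg` functions above, this sketch solves a 2x2 system and checks the result. The matrix values, the `xp` alias, and the NumPy fallback are assumptions chosen for the example.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

# Solve A @ x = b for a small symmetric system.
A = xp.array([[3.0, 1.0], [1.0, 2.0]])
b = xp.array([9.0, 8.0])
x = xp.linalg.solve(A, b)

# The residual norm should be close to zero for a well-conditioned system.
residual = float(xp.linalg.norm(A @ x - b))

# eigh handles symmetric matrices; the eigenvalues sum to the trace of A.
eigenvalues, eigenvectors = xp.linalg.eigh(A)
eigen_sum = float(eigenvalues.sum())

solution = [float(x[0]), float(x[1])]
```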

### Random Number Generation

GPU-accelerated random number generation with multiple generators and probability distributions, provided by `cupy.random`.

```python { .api }
def rand(*args): ...
def randn(*args): ...
def randint(low, high=None, size=None, dtype=int): ...
def random_sample(size=None): ...
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
def choice(a, size=None, replace=True, p=None): ...
```

Generator API:

```python { .api }
def default_rng(seed=None): ...
class Generator:
    def random(self, size=None, dtype=float64, out=None): ...
    def integers(self, low, high=None, size=None, dtype=int64, endpoint=False): ...
```

[Random Number Generation](./random.md)
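A minimal sketch of the Generator API listed above. The seed value, the `xp` alias, and the NumPy fallback are assumptions. Do not expect the same seed to give the same values under NumPy and CuPy, since the two libraries use different underlying generators.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

rng = xp.random.default_rng(seed=42)

uniform = rng.random((1000,))          # floats in [0, 1)
dice = rng.integers(1, 7, size=1000)   # integers in [1, 7), i.e. 1 to 6

uniform_range = (float(uniform.min()), float(uniform.max()))
dice_range = (int(dice.min()), int(dice.max()))
```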

### Fast Fourier Transform

GPU-accelerated discrete Fourier transforms for signal processing and frequency domain analysis, provided by `cupy.fft`.

```python { .api }
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fftn(a, s=None, axes=None, norm=None): ...
def rfft(a, n=None, axis=-1, norm=None): ...
def fftshift(x, axes=None): ...
def fftfreq(n, d=1.0): ...
```

[Fast Fourier Transform](./fft.md)
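To illustrate frequency-domain analysis, the sketch below finds the dominant frequency of a synthetic 8 Hz tone. The signal parameters, the `xp` alias, and the NumPy fallback are assumptions; `rfftfreq` is the real-input counterpart of `fftfreq`.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

n = 64
t = xp.arange(n) / n                      # one second sampled at 64 Hz
signal = xp.sin(2.0 * np.pi * 8.0 * t)    # pure 8 Hz tone

spectrum = xp.fft.rfft(signal)            # one-sided spectrum of a real signal
freqs = xp.fft.rfftfreq(n, d=1.0 / n)     # frequency in Hz for each bin

# The bin with the largest magnitude marks the dominant frequency.
peak_bin = int(xp.argmax(xp.abs(spectrum)))
peak_hz = float(freqs[peak_bin])
```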

### CUDA Memory and Device Management

Low-level CUDA functionality for memory allocation, device management, and stream operations.

```python { .api }
def get_default_memory_pool(): ...
def get_default_pinned_memory_pool(): ...
def is_available(): ...
def asnumpy(a, stream=None, order='C', out=None, *, blocking=True): ...
def get_array_module(*args): ...
```

From `cupy.cuda`:

```python { .api }
class Device:
    def __init__(self, device=None): ...
    def __enter__(self): ...
    def __exit__(self, *args): ...

class Stream:
    def __init__(self, null=False, non_blocking=False, priority=0): ...
    def synchronize(self): ...

class MemoryPool:
    def __init__(self, allocator=None): ...
    def malloc(self, size): ...
    def free_all_blocks(self): ...
```

[CUDA Memory and Device Management](./cuda-management.md)
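The sketch below shows one way the device, stream, and memory-pool pieces above might fit together. It assumes device index 0 and runs the GPU part only when CuPy reports a usable device; the `gpu_ok` flag and `pool_stats` tuple are names invented for this example.

```python
try:
    import cupy as cp
    gpu_ok = cp.is_available()
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    cp, gpu_ok = None, False

pool_stats = None

if gpu_ok:
    pool = cp.get_default_memory_pool()

    with cp.cuda.Device(0):                        # select the first GPU
        stream = cp.cuda.Stream(non_blocking=True)
        with stream:                               # work below is queued on this stream
            x = cp.arange(1_000_000, dtype=cp.float32)
            total = x.sum()
        stream.synchronize()                       # wait for the queued work to finish

    used_before = pool.used_bytes()                # bytes held by live arrays
    del x, total                                   # drop references to the arrays
    pool.free_all_blocks()                         # return cached blocks to the driver
    pool_stats = (used_before, pool.used_bytes(), pool.total_bytes())
```

Freeing pool blocks only releases memory that no live array references, which is why the arrays are deleted first.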

### Custom Kernels and Performance

Tools for writing custom CUDA kernels and optimizing GPU performance.

```python { .api }
class ElementwiseKernel:
    def __init__(self, in_params, out_params, operation, name='kernel', **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class ReductionKernel:
    def __init__(self, in_params, out_params, map_expr, reduce_expr, post_map_expr, identity, name='reduce_kernel', **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class RawKernel:
    def __init__(self, code, name, **kwargs): ...
    def __call__(self, grid, block, args, *, shared_mem=0, stream=None): ...
```

[Custom Kernels and Performance](./kernels.md)
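A minimal `ElementwiseKernel` sketch computing a squared difference. The kernel name and inputs are chosen for illustration, and the GPU part runs only when CuPy reports a usable device (`gpu_ok` and `kernel_result` are names invented for this example).

```python
try:
    import cupy as cp
    gpu_ok = cp.is_available()
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    cp, gpu_ok = None, False

kernel_result = None

if gpu_ok:
    squared_diff = cp.ElementwiseKernel(
        "float32 x, float32 y",    # input parameters
        "float32 z",               # output parameter
        "z = (x - y) * (x - y)",   # operation applied to each element
        "squared_diff",            # kernel name
    )

    x = cp.arange(5, dtype=cp.float32)       # 0, 1, 2, 3, 4
    y = cp.full(5, 2.0, dtype=cp.float32)    # 2, 2, 2, 2, 2

    # The kernel is compiled on first call and cached afterwards.
    kernel_result = cp.asnumpy(squared_diff(x, y)).tolist()
```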

### Statistics and Data Analysis

Statistical functions for data analysis including descriptive statistics, correlations, and histograms.

```python { .api }
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def median(a, axis=None, out=None, overwrite_input=False, keepdims=False): ...
def corrcoef(x, y=None, rowvar=True, bias=None, ddof=None): ...
def histogram(a, bins=10, range=None, weights=None, density=None): ...
def percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False): ...
```

[Statistics and Data Analysis](./statistics.md)
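A short sketch of the descriptive statistics above on the values 1 through 10. The data, the `xp` alias, and the NumPy fallback are assumptions for illustration.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

data = xp.arange(1.0, 11.0)                  # 1.0, 2.0, ..., 10.0

mean = float(xp.mean(data))
median = float(xp.median(data))
sample_std = float(xp.std(data, ddof=1))     # ddof=1 gives the sample estimate
p90 = float(xp.percentile(data, 90))         # linear interpolation by default
```

Each `float(...)` call copies a single scalar back to the host, so in performance-sensitive code it is worth batching such transfers.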

### Indexing and Selection

Advanced indexing operations including multi-dimensional indexing, selection, and array generation utilities.

```python { .api }
def take(a, indices, axis=None, out=None, mode='raise'): ...
def take_along_axis(arr, indices, axis): ...
def choose(a, choices, out=None, mode='raise'): ...
def compress(condition, a, axis=None, out=None): ...
def extract(condition, arr): ...
def select(condlist, choicelist, default=0): ...
def indices(dimensions, dtype=int, sparse=False): ...
def ix_(*args): ...
def ravel_multi_index(multi_index, dims, mode='raise', order='C'): ...
def unravel_index(indices, shape, order='C'): ...
def diagonal(a, offset=0, axis1=0, axis2=1): ...
def diag_indices(n, ndim=2): ...
def triu_indices(n, k=0, m=None): ...
def tril_indices(n, k=0, m=None): ...
```
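The sketch below uses `take` and `take_along_axis` to pick columns and to sort each row by its own ordering. The sample values, the `xp` alias, the NumPy fallback, and the `to_host` helper are assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np


def to_host(a):
    """Return a NumPy array regardless of where `a` lives (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


scores = xp.array([[0.2, 0.9, 0.1],
                   [0.7, 0.3, 0.8]])

# take: select columns 2 and 0, in that order, from every row.
picked = xp.take(scores, xp.array([2, 0]), axis=1)

# take_along_axis: apply a per-row index array, here the ascending sort order.
order = xp.argsort(scores, axis=1)
sorted_rows = xp.take_along_axis(scores, order, axis=1)

best = xp.argmax(scores, axis=1)    # column index of each row's maximum
```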

### Sparse Matrix Operations

GPU-accelerated sparse matrix operations for large-scale scientific computing, provided by `cupyx.scipy.sparse`.

```python { .api }
class csr_matrix:
    def __init__(self, arg1, shape=None, dtype=None, copy=False): ...
    def dot(self, other): ...
    def transpose(self, axes=None, copy=False): ...

class csc_matrix:
    def __init__(self, arg1, shape=None, dtype=None, copy=False): ...

class coo_matrix:
    def __init__(self, arg1, shape=None, dtype=None, copy=False): ...
```

[Sparse Matrix Operations](./sparse.md)
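A minimal sketch of building a CSR matrix from a dense GPU array and multiplying it by a vector. The matrix values are illustrative, and the GPU part runs only when CuPy reports a usable device (`gpu_ok`, `product`, and `nnz` are names invented for this example).

```python
try:
    import cupy as cp
    gpu_ok = cp.is_available()
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    cp, gpu_ok = None, False

product = None
nnz = None

if gpu_ok:
    from cupyx.scipy import sparse

    dense = cp.array([[1.0, 0.0, 2.0],
                      [0.0, 0.0, 3.0],
                      [4.0, 0.0, 0.0]])
    csr = sparse.csr_matrix(dense)      # keeps only the non-zero entries
    v = cp.array([1.0, 1.0, 1.0])

    nnz = int(csr.nnz)                  # number of stored non-zeros
    product = cp.asnumpy(csr.dot(v)).tolist()   # sparse matrix-vector product
```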

### SciPy Compatibility Extensions

Extended scientific computing functions from cupyx.scipy for advanced mathematical operations.

From `cupyx.scipy`:

```python { .api }
# Signal processing (cupyx.scipy.signal)
def convolve(in1, in2, mode='full', method='auto'): ...
def correlate(in1, in2, mode='full', method='auto'): ...

# Image processing (cupyx.scipy.ndimage)
def gaussian_filter(input, sigma, order=0, output=None, mode='reflect', cval=0.0, truncate=4.0): ...
def sobel(input, axis=-1, output=None, mode='reflect', cval=0.0): ...

# Optimization
def minimize(fun, x0, args=(), method=None, jac=None, bounds=None, constraints=()): ...
```

[SciPy Extensions](./scipy-extensions.md)

### Input/Output Operations

File operations for loading and saving arrays in various formats.

```python { .api }
def load(file, mmap_mode=None, allow_pickle=True, fix_imports=True, encoding='ASCII'): ...
def save(file, arr, allow_pickle=True, fix_imports=True): ...
def savez(file, *args, **kwds): ...
def savez_compressed(file, *args, **kwds): ...
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None): ...
def loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None): ...
```
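A round-trip sketch for `save` and `load` using a temporary directory. The file name, the `xp` alias, and the NumPy fallback are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

arr = xp.arange(6, dtype=xp.float32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npy")   # hypothetical file name
    xp.save(path, arr)                        # written in the .npy format
    restored = xp.load(path)                  # read back as an array
    round_trip_ok = bool(xp.array_equal(arr, restored))
```

Because the on-disk format is the standard `.npy` format, a file written this way can also be read with NumPy on a machine without a GPU.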

### Logic and Comparison Functions

Element-wise logical operations, truth value testing, and array comparison functions.

```python { .api }
def allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False): ...
def array_equal(a1, a2, equal_nan=False): ...
def array_equiv(a1, a2): ...
def isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False): ...
def isfinite(x, /, out=None): ...
def isinf(x, /, out=None): ...
def isnan(x, /, out=None): ...
def isreal(x): ...
def iscomplex(x): ...
def in1d(ar1, ar2, assume_unique=False, invert=False): ...
def isin(element, test_elements, assume_unique=False, invert=False): ...
def intersect1d(ar1, ar2, assume_unique=False, return_indices=False): ...
def setdiff1d(ar1, ar2, assume_unique=False): ...
def union1d(ar1, ar2): ...
```
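The sketch below shows tolerance-based comparison, NaN handling, and membership tests with the functions above. The sample values, the `xp` alias, the NumPy fallback, and the `to_host` helper are assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np


def to_host(a):
    """Return a NumPy array regardless of where `a` lives (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


a = xp.array([1.0, 2.0, float("nan")])
b = a + 1e-9                                    # tiny perturbation

close = xp.isclose(a, b)                        # NaN never compares close by default
close_nan = xp.isclose(a, b, equal_nan=True)    # treat NaN as equal to NaN
nan_mask = xp.isnan(a)

# Membership test: which of 1, 4, 5 appear in the second array?
members = xp.isin(xp.array([1, 4, 5]), xp.array([4, 5, 6]))

all_close = bool(xp.allclose(a[:2], b[:2]))     # single truth value
```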

### Binary Operations

Bitwise operations and binary representations.

```python { .api }
def bitwise_and(x1, x2, /, out=None): ...
def bitwise_or(x1, x2, /, out=None): ...
def bitwise_xor(x1, x2, /, out=None): ...
def bitwise_not(x, /, out=None): ...
def left_shift(x1, x2, /, out=None): ...
def right_shift(x1, x2, /, out=None): ...
def packbits(a, axis=None, bitorder='big'): ...
def unpackbits(a, axis=None, count=None, bitorder='big'): ...
```
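A brief sketch of bitwise operations and bit packing with the default big-endian bit order. The flag values, the `xp` alias, the NumPy fallback, and the `to_host` helper are assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np


def to_host(a):
    """Return a NumPy array regardless of where `a` lives (hypothetical helper)."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


flags = xp.array([0b1100, 0b1010], dtype=xp.uint8)

both = int(xp.bitwise_and(flags[0], flags[1]))    # bits set in both: 0b1000
either = int(xp.bitwise_or(flags[0], flags[1]))   # bits set in either: 0b1110
shifted = xp.left_shift(flags, 1)                 # multiply each value by 2

# Pack eight 0/1 values into one byte, then unpack them again.
bits = xp.array([1, 0, 1, 0, 0, 0, 0, 1], dtype=xp.uint8)
packed = xp.packbits(bits)                        # 0b10100001
unpacked = xp.unpackbits(packed)
```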

## Error Handling

CuPy reuses NumPy's exception and warning classes and adds CUDA-specific exceptions:

```python { .api }
class AxisError(Exception): ...
class ComplexWarning(Warning): ...
class TooHardError(Exception): ...
class VisibleDeprecationWarning(Warning): ...
```

CUDA-related failures such as GPU memory exhaustion, device incompatibility, and kernel execution problems surface as Python exceptions with descriptive messages, for example `cupy.cuda.memory.OutOfMemoryError` and `cupy.cuda.runtime.CUDARuntimeError`.
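As a small sketch of handling a NumPy-style error, the snippet below requests a reduction over an axis that does not exist. It assumes the raised error derives from `ValueError`, as NumPy's `AxisError` does; the `xp` alias, the NumPy fallback, and the `caught` variable are also assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.is_available() else np
except Exception:  # CuPy not installed, or its GPU libraries failed to load
    xp = np

caught = None

try:
    # A 1-D array has no axis 2, so the reduction is rejected.
    xp.sum(xp.zeros(3), axis=2)
except ValueError as exc:  # AxisError subclasses ValueError and IndexError
    caught = type(exc).__name__
```

GPU-specific failures can be caught the same way by naming the CUDA exception class in the `except` clause.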