Tessl Tile for pypi/cupy-cuda11x@13.6.0

# CuPy

CuPy is a NumPy/SciPy-compatible array library that accelerates NumPy-based code on NVIDIA CUDA or AMD ROCm platforms. It serves as a drop-in replacement for NumPy arrays, with GPU-accelerated mathematical operations, linear algebra, signal processing, and statistical functions for scientific computing, machine learning, and data analysis.

## Package Information

- **Package Name**: cupy-cuda11x
- **Language**: Python
- **Installation**: `pip install cupy-cuda11x`
- **CUDA Compatibility**: CUDA 11.2 through 11.8
- **Platform Support**: Linux (x86_64, aarch64), Windows (x86_64)

## Core Imports

```python
import cupy as cp
```

For specific modules:

```python
import cupy
from cupy import fft, linalg, random
import cupyx
from cupyx import scipy
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
gpu_array = cp.array([1, 2, 3, 4])
gpu_zeros = cp.zeros((1000, 1000))

# NumPy-compatible operations
result = cp.sum(gpu_array)
matrix_mult = cp.dot(gpu_zeros, gpu_zeros.T)

# Transfer between CPU and GPU
cpu_array = cp.asnumpy(gpu_array)  # GPU to CPU
gpu_from_cpu = cp.asarray(cpu_array)  # CPU to GPU

# Mathematical operations
x = cp.linspace(0, 2*cp.pi, 1000)
y = cp.sin(x)
```

## Architecture

CuPy's architecture mirrors NumPy while providing GPU acceleration:

- **Core Array**: `cupy.ndarray` - GPU memory-resident array objects with NumPy-compatible interface
- **Mathematical Functions**: Element-wise and reduction operations leveraging CUDA kernels
- **Memory Management**: Automatic memory pooling with configurable allocators for optimal GPU memory usage
- **Stream Processing**: Asynchronous execution support through CUDA streams
- **Kernel Integration**: Custom CUDA kernel support via `RawKernel` and `ElementwiseKernel`
- **SciPy Extensions**: `cupyx.scipy` provides GPU-accelerated SciPy-compatible functions

## Capabilities

### Array Operations

Core array creation, manipulation, and mathematical operations that form the foundation of GPU-accelerated NumPy-compatible computing.

```python { .api }
def array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=float, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...
```
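As a rough illustration of the creation routines listed above, the sketch below builds a few small arrays. The `xp` alias and the NumPy fallback are assumptions added here so the snippet also runs on a machine without CuPy or a CUDA device; they are not part of the CuPy API.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

a = xp.array([[1, 2, 3], [4, 5, 6]], dtype=xp.float32)  # explicit dtype
z = xp.zeros((2, 3))                                     # float64 by default
r = xp.arange(0, 10, 2)                                  # start, stop (exclusive), step
grid = xp.linspace(0.0, 1.0, num=5)                      # endpoint included by default

print(a.shape, a.dtype)
print(z.dtype)
print(r)
print(grid)
```

Because the calls mirror NumPy, the same lines work unchanged whether `xp` is bound to `cupy` or `numpy`.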

[Array Operations](./array-operations.md)

### Mathematical Functions

Comprehensive mathematical operations including trigonometric, hyperbolic, exponential, logarithmic, and statistical functions optimized for GPU execution.

```python { .api }
def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def sqrt(x, out=None, **kwargs): ...
def sum(a, axis=None, dtype=None, out=None, keepdims=False): ...
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
```
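The element-wise functions and reductions above can be combined as in the following sketch. It assumes a NumPy fallback (the `xp` alias is a naming convention chosen here, not a CuPy name) so it stays runnable without a GPU.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

# Element-wise functions: sin^2 + cos^2 should equal 1 everywhere.
x = xp.linspace(0.0, 2.0 * xp.pi, 1000)
identity = xp.sin(x) ** 2 + xp.cos(x) ** 2

# Reductions along an axis of the 2x3 matrix [[0, 1, 2], [3, 4, 5]].
m = xp.arange(6, dtype=xp.float64).reshape(2, 3)
col_sums = xp.sum(m, axis=0)                   # one value per column
row_means = xp.mean(m, axis=1, keepdims=True)  # keeps a trailing axis of length 1

print(bool(xp.allclose(identity, 1.0)))
print(col_sums, row_means.shape)
```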

[Mathematical Functions](./mathematical-functions.md)

### Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decomposition, eigenvalue computation, and solving linear systems.

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None): ...
def einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False): ...
```
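For 2-D inputs, `dot`, `matmul`, and an `einsum` contraction over the shared index all compute the same matrix product. A minimal sketch, assuming a NumPy fallback so it runs without a GPU (the `xp` alias is introduced here for illustration):

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

a = xp.array([[1.0, 2.0], [3.0, 4.0]])
b = xp.array([[0.0, 1.0], [1.0, 0.0]])  # swaps the two columns of `a`

prod_dot = xp.dot(a, b)
prod_matmul = xp.matmul(a, b)
prod_einsum = xp.einsum('ij,jk->ik', a, b)  # contract over the shared index j

# Sum of element-wise products, written as a full contraction.
inner = xp.einsum('ij,ij->', a, b)

print(prod_dot)
print(float(inner))
```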

[Linear Algebra](./linear-algebra.md)

### Fast Fourier Transform

GPU-accelerated FFT operations supporting 1D, 2D, and N-D transforms with both forward and inverse operations.

```python { .api }
# Functions in the cupy.fft module
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fftn(a, s=None, axes=None, norm=None): ...
```
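A forward transform followed by its inverse should recover the input. The sketch below checks this on a sine wave with a known frequency; the signal parameters are arbitrary choices for illustration, and the NumPy fallback is an assumption that keeps the snippet runnable without a GPU.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

n = 64
t = xp.arange(n)
signal = xp.sin(2.0 * xp.pi * 4 * t / n)  # exactly 4 cycles over the window

spectrum = xp.fft.fft(signal)
# Look only at the non-negative frequencies to find the dominant bin.
peak_bin = int(xp.argmax(xp.abs(spectrum[: n // 2])))
recovered = xp.fft.ifft(spectrum).real    # inverse transform, imaginary part ~0

print(peak_bin)
print(bool(xp.allclose(recovered, signal)))
```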

[Fast Fourier Transform](./fft.md)

### Random Number Generation

Comprehensive random number generation including uniform, normal, and specialized distributions, all optimized for GPU parallel execution.

```python { .api }
# Functions in the cupy.random module
def random(size=None, dtype=float, out=None): ...
def normal(loc=0.0, scale=1.0, size=None, dtype=float): ...
def uniform(low=0.0, high=1.0, size=None, dtype=float): ...
def choice(a, size=None, replace=True, p=None): ...
```
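A short sketch of drawing samples with the functions above. Seeding through `random.seed` and the specific distribution parameters are illustrative choices; the NumPy fallback is an assumption added so the snippet runs without a GPU. The generated values differ between CuPy and NumPy even with the same seed, so only statistical properties are checked.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

xp.random.seed(0)  # make the run repeatable on a given backend

samples = xp.random.normal(loc=5.0, scale=2.0, size=100_000)
u = xp.random.uniform(low=-1.0, high=1.0, size=(4, 3))

print(float(samples.mean()), float(samples.std()))
print(u.shape)
```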

[Random Number Generation](./random.md)

### CUDA Integration

Direct CUDA device management, memory operations, kernel execution, and stream processing for advanced GPU programming.

```python { .api }
class Device:
    def __init__(self, device=None): ...
    def use(self): ...

def get_device_id(): ...
def synchronize(): ...
```
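One possible way to inspect the current device is sketched below. The helper name `describe_current_device` is hypothetical, and returning `None` when CuPy or a CUDA device is missing is an assumption made so the snippet can run anywhere; it relies on `cupy.cuda.Device` and its `id` and `mem_info` attributes.

```python
def describe_current_device():
    """Return basic info about the current GPU, or None without CuPy/CUDA.

    Hypothetical helper for illustration only.
    """
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() == 0:
            return None
    except Exception:
        return None

    device = cp.cuda.Device()  # no argument: the current device
    device.synchronize()       # wait for pending work on this device
    free_bytes, total_bytes = device.mem_info
    return {
        "id": device.id,
        "free_bytes": free_bytes,
        "total_bytes": total_bytes,
    }


info = describe_current_device()
print(info)
```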

[CUDA Integration](./cuda-integration.md)

### SciPy Extensions

GPU-accelerated SciPy-compatible functions including sparse matrices, signal processing, image processing, optimization, and statistical operations.

```python { .api }
# Available through cupyx.scipy
import cupyx.scipy as scipy
```

[SciPy Extensions](./scipy-extensions.md)

### Custom Kernel Development

Advanced CUDA kernel development enabling custom element-wise operations, reduction kernels, and raw CUDA programming for maximum performance and specialized computational tasks.

```python { .api }
class ElementwiseKernel:
    def __init__(self, in_params, out_params, operation, name="kernel", **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class ReductionKernel:
    def __init__(self, in_params, out_params, map_expr, reduce_expr, **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class RawKernel:
    def __init__(self, code, name, **kwargs): ...
    def __call__(self, grid, block, args=(), shared_mem=0, stream=None): ...
```
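As a minimal sketch of `ElementwiseKernel`, the snippet below defines a squared-difference kernel from a one-line CUDA expression. The kernel name `squared_diff` and the plain-NumPy fallback are assumptions added here so the example still runs on a machine without a GPU; on such a machine the kernel itself is not exercised.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

if xp is np:
    # CPU stand-in with the same call shape as the kernel below.
    def squared_diff(x, y):
        return (x - y) * (x - y)
else:
    # in_params, out_params, operation (CUDA C expression), kernel name
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',
        'float32 z',
        'z = (x - y) * (x - y)',
        'squared_diff',
    )

x = xp.arange(5, dtype=xp.float32)      # 0, 1, 2, 3, 4
y = xp.full(5, 2, dtype=xp.float32)     # 2, 2, 2, 2, 2
out = squared_diff(x, y)

print(out)
```

The kernel is called like a regular function; the output array is allocated automatically when it is not passed in.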

[Custom Kernel Development](./custom-kernels.md)

### JIT Compilation

Just-in-time compilation of Python functions to GPU kernels, enabling high-performance GPU programming with Python syntax and automatic optimization.

```python { .api }
def rawkernel(device=False): ...
def kernel(grid=None, block=None, shared_mem=0): ...
def elementwise(signature): ...
def reduction(signature, identity=None): ...
```

[JIT Compilation](./jit-compilation.md)

### Performance Profiling

Comprehensive performance analysis tools for measuring execution times, analyzing GPU utilization, memory usage profiling, and identifying optimization opportunities.

```python { .api }
def benchmark(func, args=(), kwargs=None, **params): ...
def time_range(): ...
def profile(): ...
def nvtx_push(message, color=None): ...
```
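GPU work is launched asynchronously, so wall-clock timing around a call can be misleading; `cupyx.profiler.benchmark` accounts for this by recording GPU times separately. The sketch below wraps it in a hypothetical helper, `mean_seconds`, and assumes a `time.perf_counter` fallback when no GPU is available.

```python
import time

import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np


def mean_seconds(func, *args, n_repeat=20):
    """Hypothetical helper: mean time per call of func(*args) in seconds."""
    if xp is np:
        start = time.perf_counter()
        for _ in range(n_repeat):
            func(*args)
        return (time.perf_counter() - start) / n_repeat
    from cupyx.profiler import benchmark
    result = benchmark(func, args, n_repeat=n_repeat)
    return float(result.gpu_times.mean())  # GPU-side time across repeats


def workload(a):
    return xp.sqrt(a).sum()


data = xp.ones((256, 256), dtype=xp.float32)
seconds = mean_seconds(workload, data)
print(f"{seconds:.6f} s per call")
```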

[Performance Profiling](./performance-profiling.md)

### Input/Output Operations

File I/O operations supporting various formats including binary, text, and compressed data with efficient GPU-CPU data transfer and memory management.

```python { .api }
def save(file, arr): ...
def load(file, **kwargs): ...
def loadtxt(fname, **kwargs): ...
def savetxt(fname, X, **kwargs): ...
```
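A round trip through `save` and `load` might look like the sketch below. The file name and temporary directory are illustrative, and the NumPy fallback is an assumption so the snippet runs without a GPU.

```python
import os
import tempfile

import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

data = xp.arange(12, dtype=xp.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.npy")  # illustrative file name
    xp.save(path, data)                   # written in NumPy's .npy format
    restored = xp.load(path)              # comes back as an array of the active backend

print(restored.shape, restored.dtype)
print(bool(xp.array_equal(restored, data)))
```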

[Input/Output Operations](./io-operations.md)

### Polynomial Operations

Mathematical operations with polynomials including arithmetic, evaluation, fitting, root finding, and advanced polynomial manipulations with support for various polynomial bases.

```python { .api }
class poly1d:
    def __init__(self, c_or_r, r=False, variable=None): ...
    def __call__(self, val): ...

def polyfit(x, y, deg, **kwargs): ...
def polyval(p, x): ...
def roots(p): ...
```
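Fitting and evaluation can be checked against a polynomial with known coefficients. In this sketch the quadratic `2x^2 - 3x + 1` is an arbitrary example, and the NumPy fallback is an assumption so the snippet runs without a GPU.

```python
import numpy as np

# Assumption: fall back to NumPy when CuPy or a CUDA device is unavailable.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp = cp
except Exception:
    xp = np

# Noise-free samples of 2x^2 - 3x + 1.
x = xp.linspace(-1.0, 1.0, 21)
y = 2.0 * x**2 - 3.0 * x + 1.0

coeffs = xp.polyfit(x, y, 2)     # coefficients, highest degree first
fitted = xp.polyval(coeffs, x)   # evaluate the fitted polynomial at x

print(coeffs)
print(bool(xp.allclose(fitted, y)))
```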

[Polynomial Operations](./polynomial-operations.md)

## Types

```python { .api }
class ndarray:
    """
    GPU-resident N-dimensional array object compatible with NumPy arrays.

    Attributes:
        shape: tuple of ints - dimensions of the array
        dtype: numpy.dtype - data type of array elements
        size: int - total number of elements
        ndim: int - number of dimensions
        device: cupy.cuda.Device - GPU device containing the array
    """
    def __init__(self, shape, dtype=float, memptr=None, strides=None, order='C'): ...
    def get(self, stream=None, order='C', out=None): ...
    def set(self, arr, stream=None): ...
    def copy(self, order='C'): ...
    def astype(self, dtype, order='K', casting='unsafe', subok=True, copy=True): ...

class ufunc:
    """Universal function for element-wise operations on arrays."""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False): ...
    def accumulate(self, a, axis=0, dtype=None, out=None): ...

def asnumpy(a, stream=None, order='C', out=None, *, blocking=True) -> numpy.ndarray:
    """Convert CuPy array to NumPy array on CPU."""

def get_array_module(*args):
    """Return cupy if any argument is a CuPy array, otherwise numpy."""
```
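`get_array_module` makes it possible to write a function once and have it operate on whichever array type it receives. The sketch below uses a softmax as the example; the helper names `array_module` and `softmax` are hypothetical, and the branch that skips CuPy when no GPU is available is an assumption added for portability.

```python
import numpy as np

# Assumption: treat CuPy as unavailable if it cannot be imported
# or no CUDA device can be found.
try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    have_gpu = True
except Exception:
    cp = None
    have_gpu = False


def array_module(*arrays):
    """Hypothetical wrapper: cupy or numpy, depending on the inputs."""
    return cp.get_array_module(*arrays) if have_gpu else np


def softmax(x):
    """Row-wise softmax that works for NumPy and CuPy arrays alike."""
    xp = array_module(x)
    shifted = x - xp.max(x, axis=-1, keepdims=True)  # for numerical stability
    e = xp.exp(shifted)
    return e / xp.sum(e, axis=-1, keepdims=True)


host = np.array([[1.0, 2.0, 3.0]])
probs = softmax(host)                    # stays on the CPU

if have_gpu:
    device_probs = softmax(cp.asarray(host))   # computed on the GPU
    probs_from_gpu = cp.asnumpy(device_probs)  # copied back to the CPU
    print(np.allclose(probs, probs_from_gpu))

print(probs)
```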
