# CuPy

CuPy is a NumPy & SciPy-compatible GPU-accelerated computing library that enables high-performance array operations on NVIDIA CUDA GPUs. It provides a drop-in replacement for NumPy, allowing existing NumPy/SciPy code to run on GPUs with minimal modifications while delivering significant performance improvements for large-scale numerical computations.

## Package Information

- **Package Name**: cupy-cuda112
- **Language**: Python
- **Installation**: `pip install cupy-cuda112`
- **GPU Requirements**: NVIDIA CUDA 11.2 or compatible
- **Homepage**: https://cupy.dev/
- **Documentation**: https://docs.cupy.dev/

## Core Imports

```python
import cupy as cp
```

For CUDA-specific functionality:

```python
import cupy.cuda
```

For SciPy-compatible extensions:

```python
import cupyx.scipy
```
## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
gpu_array = cp.array([1, 2, 3, 4, 5])
gpu_zeros = cp.zeros((3, 4))
gpu_random = cp.random.random((1000, 1000))

# Array operations (executed on GPU)
result = cp.sqrt(gpu_array)
matrix_mult = cp.dot(gpu_random, gpu_random.T)

# Convert back to NumPy for CPU operations
cpu_result = cp.asnumpy(result)

# Memory pool management
mempool = cp.get_default_memory_pool()
print(f"Used bytes: {mempool.used_bytes()}")
print(f"Total bytes: {mempool.total_bytes()}")

# Check GPU availability
if cp.cuda.is_available():
    print(f"CUDA devices available: {cp.cuda.runtime.getDeviceCount()}")
```
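
Because CuPy mirrors the NumPy API, the same function body can often serve both CPU and GPU data. The sketch below is illustrative rather than part of CuPy: it uses `cp.get_array_module` to dispatch to `numpy` or `cupy` depending on where the input array lives, and the `softmax` helper is a hypothetical example.

```python
import cupy as cp
import numpy as np

def softmax(x):
    # Resolve to numpy or cupy depending on whether `x` is a CPU or GPU array.
    xp = cp.get_array_module(x)
    e = xp.exp(x - x.max(axis=-1, keepdims=True))  # subtract the max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

# The same function body runs on NumPy (CPU) and CuPy (GPU) arrays.
cpu_out = softmax(np.random.random((4, 8)))
gpu_out = softmax(cp.random.random((4, 8)))
print(type(cpu_out), type(gpu_out))
```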
## Architecture

CuPy's architecture mirrors NumPy while adding GPU-specific capabilities:

- **Core Arrays**: `cupy.ndarray` provides GPU-accelerated N-dimensional arrays with a NumPy-compatible interface
- **Universal Functions**: GPU-accelerated element-wise operations through `cupy.ufunc`
- **Memory Management**: Automatic memory pooling with configurable allocators for optimal GPU memory usage
- **CUDA Integration**: Direct access to CUDA streams, events, memory management, and custom kernel compilation
- **Custom Kernels**: Support for user-defined CUDA kernels through `RawKernel`, `ElementwiseKernel`, and `ReductionKernel`
- **Multi-GPU**: Support for multi-GPU computation and memory management (see the device-selection sketch below)
- **CuPy Extensions (cupyx)**: Additional functionality including SciPy compatibility, profiling, JIT compilation, and advanced linear algebra

This design enables seamless migration from NumPy-based code to GPU-accelerated computation while providing advanced CUDA programming capabilities for performance-critical applications.
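
As a rough illustration of the multi-GPU support listed above, the sketch below allocates and computes under different device contexts. It assumes at least two CUDA devices are visible; on a single-GPU machine, only the `Device(0)` block applies.

```python
import cupy as cp

# Allocate an array on GPU 0 (the default device).
with cp.cuda.Device(0):
    a = cp.arange(10)

# Assumption: a second GPU is present; compute on GPU 1.
with cp.cuda.Device(1):
    b = cp.arange(10)
    c = b * 2  # executes on device 1

# Each array remembers the device it lives on.
print(a.device, c.device)
```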
## Capabilities

### Array Creation and Manipulation

Core functionality for creating, reshaping, and manipulating N-dimensional arrays on GPU, providing NumPy-compatible array creation routines with GPU memory allocation.

```python { .api }
def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=float, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): ...
def reshape(a, newshape, order='C'): ...
def transpose(a, axes=None): ...
def concatenate(arrays, axis=0, out=None): ...
```

[Array Operations](./array-operations.md)
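
A brief, illustrative combination of the creation and manipulation routines listed above (the values and shapes are arbitrary):

```python
import cupy as cp

x = cp.arange(12, dtype=cp.float32)                      # 0..11 on the GPU
m = cp.reshape(x, (3, 4))                                # view as a 3x4 matrix
t = cp.transpose(m)                                      # 4x3 transpose
stacked = cp.concatenate([m, cp.ones((1, 4))], axis=0)   # append a row of ones
print(m.shape, t.shape, stacked.shape)                   # (3, 4) (4, 3) (4, 4)
```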
### Mathematical Functions

Comprehensive collection of mathematical operations including trigonometric, hyperbolic, exponential, logarithmic, and arithmetic functions optimized for GPU execution.

```python { .api }
def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def sqrt(x, out=None, **kwargs): ...
def add(x1, x2, out=None, **kwargs): ...
def multiply(x1, x2, out=None, **kwargs): ...
def sum(a, axis=None, dtype=None, out=None, keepdims=False): ...
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
```

[Mathematical Operations](./math-operations.md)
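
These functions compose just as their NumPy counterparts do; the short example below is illustrative only (the synthetic signal has no special meaning):

```python
import cupy as cp

theta = cp.linspace(0, 2 * cp.pi, 1000)
signal = cp.sin(theta) + 0.5 * cp.cos(3 * theta)   # element-wise trig on the GPU
energy = cp.sum(signal ** 2)                       # reduction over the whole array
avg = cp.mean(cp.exp(-signal ** 2))                # element-wise exp followed by a mean
print(float(energy), float(avg))                   # pull scalars back to the host
```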
### Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decompositions, eigenvalue computation, and equation solving using cuBLAS and cuSOLVER.

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None): ...
def linalg.svd(a, full_matrices=True, compute_uv=True, hermitian=False): ...
def linalg.eigh(a, UPLO='L'): ...
def linalg.solve(a, b): ...
def linalg.inv(a): ...
def linalg.norm(x, ord=None, axis=None, keepdims=False): ...
def einsum(subscripts, *operands, **kwargs): ...
```

[Linear Algebra](./linear-algebra.md)
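
A small illustrative example combining the solver, norm, and decomposition routines listed above (the random system is arbitrary):

```python
import cupy as cp

a = cp.random.random((4, 4))
b = cp.random.random((4,))

x = cp.linalg.solve(a, b)                    # solve a @ x = b on the GPU
residual = cp.linalg.norm(cp.dot(a, x) - b)  # should be ~0

u, s, vt = cp.linalg.svd(a)                  # singular value decomposition
print(float(residual), s.shape)
```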
### Random Number Generation

GPU-accelerated random number generation supporting multiple bit generators and probability distributions for statistical computing and simulation.

```python { .api }
def random.random(size=None, dtype=float): ...
def random.rand(*args): ...
def random.randn(*args): ...
def random.randint(low, high=None, size=None, dtype=int): ...
def random.normal(loc=0.0, scale=1.0, size=None): ...
def random.uniform(low=0.0, high=1.0, size=None): ...
class random.Generator: ...
def random.default_rng(seed=None): ...
```

[Random Number Generation](./random-generation.md)
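
An illustrative sketch using both the legacy module-level functions and the newer `Generator` interface; the specific `Generator` methods shown (`random`, `standard_normal`) are assumed to be available in this CuPy build:

```python
import cupy as cp

# Legacy, RandomState-style interface
samples = cp.random.normal(loc=0.0, scale=1.0, size=(1000,))
noise = cp.random.uniform(low=-1.0, high=1.0, size=(32, 32))

# Newer Generator interface
rng = cp.random.default_rng(seed=42)
uniform = rng.random((3, 3))            # floats in [0, 1)
gauss = rng.standard_normal((2, 4))     # standard normal draws
print(float(samples.mean()), uniform.shape, gauss.shape)
```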
### CUDA Integration

Direct interface to CUDA runtime, memory management, stream processing, and custom kernel development for advanced GPU programming.

```python { .api }
class cuda.Device: ...
def cuda.get_device_id(): ...
class cuda.MemoryPool: ...
class cuda.Stream: ...
class cuda.Event: ...
def cuda.compile_with_cache(source, options=(), **kwargs): ...
class ElementwiseKernel: ...
class RawKernel: ...
```

[CUDA Interface](./cuda-interface.md)
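
As a sketch of the custom-kernel and stream facilities listed above, the example below defines a small `ElementwiseKernel` (the kernel body and the name `squared_add` are arbitrary illustrative choices) and runs work on a non-default stream:

```python
import cupy as cp

# User-defined element-wise kernel computing z = x * x + y
squared_add = cp.ElementwiseKernel(
    'float32 x, float32 y',   # input parameter declarations
    'float32 z',              # output parameter declaration
    'z = x * x + y',          # per-element CUDA C snippet
    'squared_add')            # kernel name

x = cp.arange(5, dtype=cp.float32)
y = cp.ones(5, dtype=cp.float32)
print(squared_add(x, y))      # [ 1.  2.  5. 10. 17.]

# Launch work on a non-default stream and wait for it to finish.
stream = cp.cuda.Stream()
with stream:
    z = cp.sqrt(cp.arange(1000, dtype=cp.float32))
stream.synchronize()
```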
### Fast Fourier Transform

GPU-accelerated FFT operations for signal processing and frequency domain analysis using cuFFT library.

```python { .api }
def fft.fft(a, n=None, axis=-1, norm=None): ...
def fft.ifft(a, n=None, axis=-1, norm=None): ...
def fft.fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fft.fftn(a, s=None, axes=None, norm=None): ...
def fft.rfft(a, n=None, axis=-1, norm=None): ...
def fft.fftfreq(n, d=1.0): ...
```

[FFT Operations](./fft-operations.md)
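
An illustrative example that locates the dominant frequency of a synthetic tone using the routines listed above (the 8 kHz sampling rate and 1 kHz tone are arbitrary):

```python
import cupy as cp

fs = 8000.0                                 # sampling rate in Hz
t = cp.arange(256) / fs
x = cp.sin(2 * cp.pi * 1000.0 * t)          # 1 kHz sine wave

spectrum = cp.fft.fft(x)                    # computed on the GPU via cuFFT
freqs = cp.fft.fftfreq(x.size, d=1.0 / fs)
idx = int(cp.argmax(cp.abs(spectrum)))      # index of the strongest bin
print(abs(float(freqs[idx])))               # ~1000.0
```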
### SciPy Compatibility

Extended functionality providing SciPy-compatible operations for sparse matrices, signal processing, image processing, and specialized mathematical functions.

```python { .api }
import cupyx.scipy.sparse
import cupyx.scipy.ndimage
import cupyx.scipy.signal
import cupyx.scipy.special
import cupyx.scipy.linalg
class cupyx.scipy.sparse.csr_matrix(arg1, shape=None, dtype=None, copy=False): ...
def cupyx.scipy.ndimage.gaussian_filter(input, sigma, **kwargs): ...
```

[SciPy Extensions](./scipy-extensions.md)
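
A short sketch of the SciPy-compatible layers, assuming the sparse constructor accepts a dense CuPy array and that `gaussian_filter` mirrors its SciPy counterpart:

```python
import cupy as cp
import cupyx.scipy.sparse as sparse
import cupyx.scipy.ndimage as ndimage

# Sparse matrix-vector product on the GPU
dense = cp.random.random((100, 100))
dense[dense < 0.95] = 0                  # zero out ~95% of the entries
csr = sparse.csr_matrix(dense)           # compressed sparse row format
v = cp.random.random(100)
result = csr.dot(v)

# Gaussian smoothing of a 2-D array (e.g. an image)
image = cp.random.random((256, 256))
smoothed = ndimage.gaussian_filter(image, sigma=2.0)
print(result.shape, smoothed.shape)
```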
### Input/Output Operations

File I/O operations for saving and loading arrays in various formats including NumPy's .npy and .npz formats.

```python { .api }
def save(file, arr, allow_pickle=True, fix_imports=True): ...
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'): ...
def savez(file, *args, **kwds): ...
def savez_compressed(file, *args, **kwds): ...
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None): ...
```

[Input/Output](./input-output.md)
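
An illustrative round trip through the NumPy-format I/O routines; the file names are arbitrary, and `cp.load` is assumed to return CuPy arrays (or an `NpzFile`-like wrapper for `.npz` archives):

```python
import cupy as cp

a = cp.arange(10)
b = cp.random.random((3, 3))

cp.save('a.npy', a)                           # single array -> .npy
cp.savez_compressed('arrays.npz', a=a, b=b)   # multiple arrays -> compressed .npz

a2 = cp.load('a.npy')                         # loaded back onto the GPU
archive = cp.load('arrays.npz')
print(bool(cp.allclose(a, a2)), archive['b'].shape)
```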
## Types

```python { .api }
class ndarray:
    """N-dimensional array object in GPU memory"""
    def __init__(self, shape, dtype=float, memptr=None, strides=None, order='C'): ...
    def get(self, stream=None, order='C', out=None): ...  # Transfer to CPU
    def set(self, arr, stream=None): ...  # Transfer from CPU
    @property
    def device(self): ...
    @property
    def data(self): ...
    @property
    def shape(self): ...
    @property
    def dtype(self): ...

class ufunc:
    """Universal function for element-wise operations"""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False): ...
    def accumulate(self, a, axis=0, dtype=None, out=None): ...

# Memory management types
class cuda.MemoryPointer: ...
class cuda.Memory: ...
class cuda.MemoryPool: ...
class cuda.PinnedMemory: ...

# Stream and event types
class cuda.Stream: ...
class cuda.Event: ...
class cuda.Device: ...

# Custom kernel types
class ElementwiseKernel: ...
class ReductionKernel: ...
class RawKernel: ...
class RawModule: ...
```
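
A minimal sketch of the `ndarray` transfer methods and properties listed above:

```python
import cupy as cp
import numpy as np

gpu = cp.zeros((2, 3), dtype=cp.float32)
print(gpu.device, gpu.shape, gpu.dtype)   # device the array lives on, plus metadata

# Explicit host <-> device transfers via ndarray methods
host = gpu.get()                                        # GPU -> CPU, returns numpy.ndarray
gpu.set(np.arange(6, dtype=np.float32).reshape(2, 3))   # CPU -> GPU copy, in place
print(type(host), cp.asnumpy(gpu))
```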