Tessl Tile for pypi/cupy-rocm-4-3@13.3.0

tessl/pypi-cupy-rocm-4-3

CuPy: NumPy & SciPy for GPU. A NumPy/SciPy-compatible array library for GPU-accelerated computing with Python, built specifically for the AMD ROCm 4.3 platform.

Workspace: tessl
Visibility: Public
Created: 3 months ago
Last updated: 3 months ago
Describes: pkg:pypi/cupy-rocm-4-3@13.3.x

To install, run

npx @tessl/cli install tessl/pypi-cupy-rocm-4-3@13.3.0

# CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. Because it mirrors most of the NumPy and SciPy APIs, existing CPU-based numerical code can often be moved to the GPU by swapping the imported module. CuPy supports both NVIDIA CUDA and AMD ROCm platforms; this package is the build for ROCm 4.3.

## Package Information

- **Package Name**: cupy-rocm-4-3
- **Language**: Python
- **Installation**: `pip install cupy-rocm-4-3`
- **GPU Platform**: AMD ROCm 4.3
- **Compatibility**: NumPy 1.26+, Python 3.8+

## Core Imports

```python
import cupy as cp
```

For specific functionality:

```python
import cupy
from cupy import cuda
import cupy.linalg
import cupy.random
import cupy.fft
import cupyx
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
x_gpu = cp.array([1, 2, 3, 4, 5])
y_gpu = cp.linspace(0, 10, 100)

# NumPy-compatible operations run on GPU
z_gpu = cp.sin(x_gpu) * 2
mean_val = cp.mean(y_gpu)

# Linear algebra operations
A = cp.random.rand(1000, 1000)
B = cp.random.rand(1000, 1000)
C = cp.dot(A, B)  # Matrix multiplication on GPU

# Convert back to CPU when needed
result = cp.asnumpy(C)  # Returns numpy array

# Memory management
mempool = cp.get_default_memory_pool()
print(f"Memory used: {mempool.used_bytes()} bytes")
```

## Architecture

CuPy provides GPU acceleration through several key architectural components:

- **GPU Arrays**: `cupy.ndarray` objects that mirror NumPy's ndarray API but execute on the GPU
- **Memory Management**: Automatic memory pools for efficient GPU memory allocation and deallocation
- **CUDA/ROCm Integration**: Direct access to the GPU runtime, streams, events, and kernel compilation
- **Kernel System**: Custom kernel creation through `ElementwiseKernel`, `ReductionKernel`, and `RawKernel`
- **Device Management**: Multi-GPU support with context switching and device selection

The library maintains NumPy API compatibility while providing GPU-specific extensions through the `cupy.cuda` and `cupyx` modules, so the same package serves both straightforward migration and advanced GPU programming.

## Capabilities

### Array Creation and Manipulation

Core array creation functions, data type handling, and array manipulation operations that mirror NumPy's functionality. Includes basic array creation, shape manipulation, indexing, and element access.

```python { .api }
def array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=None, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...
```

[Array Creation and Manipulation](./array-creation.md)

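The creation and manipulation functions listed above can be combined as in the following minimal sketch. The `xp` alias and the NumPy fallback are assumptions of this example, not part of the package; they are there so the snippet also runs on a machine without a ROCm stack.

```python
try:
    import cupy as xp   # arrays live on the GPU when CuPy is installed
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

# Creation: explicit values, constant fills, and evenly spaced ranges.
a = xp.array([[1, 2, 3], [4, 5, 6]], dtype=xp.float32)
z = xp.zeros((2, 3), dtype=xp.float32)
r = xp.arange(0, 6, dtype=xp.float32)
g = xp.linspace(0.0, 1.0, num=5)

# Manipulation: reshape, transpose, and slicing behave as in NumPy.
m = r.reshape(2, 3) + a
first_row = m.T[:, 0]   # column 0 of the transpose is row 0 of m

print(m.shape, first_row.tolist(), float(g[-1]))
```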
### Mathematical Functions

Comprehensive mathematical operations including trigonometric, exponential, logarithmic, hyperbolic, arithmetic, and special functions. All operations are GPU-accelerated and maintain NumPy compatibility.

```python { .api }
def sin(x): ...
def cos(x): ...
def exp(x): ...
def log(x): ...
def sqrt(x): ...
def add(x1, x2): ...
def multiply(x1, x2): ...
def power(x1, x2): ...
```

[Mathematical Functions](./mathematical-functions.md)

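As a quick illustration, the element-wise functions above can be checked against a few textbook identities. This is only a sketch: the `xp` alias with its NumPy fallback is an assumption made so the code runs without a GPU, and the input values are arbitrary.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

theta = xp.linspace(0.0, 2.0 * xp.pi, 1000)

# sin^2 + cos^2 should equal 1 at every sample.
identity = xp.sin(theta) ** 2 + xp.cos(theta) ** 2

# exp and log are inverses of each other on positive inputs.
x = xp.array([0.5, 1.0, 2.0, 4.0])
roundtrip = xp.log(xp.exp(x))

# A 3-4-5 triangle scaled by x: sqrt((3x)^2 + (4x)^2) == 5x.
hyp = xp.sqrt(xp.add(xp.power(3.0 * x, 2), xp.power(4.0 * x, 2)))

print(bool(xp.allclose(identity, 1.0)), bool(xp.allclose(roundtrip, x)))
```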
### Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decompositions, eigenvalue computations, and equation solving. `dot` and `matmul` are top-level functions; `solve`, `inv`, `svd`, and `eigh` live in `cupy.linalg`. On NVIDIA GPUs these are backed by cuBLAS and cuSOLVER; ROCm builds use the ROCm counterparts (rocBLAS and rocSOLVER).

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None): ...
def solve(a, b): ...
def inv(a): ...
def svd(a, full_matrices=True, compute_uv=True, hermitian=False): ...
def eigh(a, UPLO='L'): ...
```

[Linear Algebra](./linear-algebra.md)

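A minimal sketch of solving a linear system with the functions above. The matrix size, the seed, and the NumPy fallback behind the `xp` alias are assumptions chosen for illustration. The example deliberately avoids `eigh`, since the CuPy documentation has listed the Hermitian eigenvalue solver among the features not yet supported on ROCm.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

rng = xp.random.RandomState(0)
n = 64

# Adding n to the diagonal keeps the system well conditioned.
A = rng.rand(n, n) + n * xp.eye(n)
B = rng.rand(n, 3)                     # three right-hand sides at once

X = xp.linalg.solve(A, B)
residual = float(xp.linalg.norm(xp.matmul(A, X) - B))

# The explicit inverse gives the same answer but is rarely the better choice.
X_via_inv = xp.dot(xp.linalg.inv(A), B)
max_diff = float(xp.max(xp.abs(X - X_via_inv)))

print(f"residual={residual:.2e}, max_diff={max_diff:.2e}")
```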
### Random Number Generation

GPU-accelerated random number generation supporting multiple distributions and random number generators. The functions and classes below live in `cupy.random`, which provides both the legacy `RandomState` interface and the modern `Generator` interface.

```python { .api }
def random(size=None): ...
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
def choice(a, size=None, replace=True, p=None): ...
class RandomState: ...
class Generator: ...
```

[Random Number Generation](./random.md)

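The sketch below draws samples through the legacy `RandomState` interface and shows that re-seeding reproduces the same stream. The seed, sample count, and distribution parameters are arbitrary, and the NumPy fallback is an assumption added so the code runs without a GPU.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

rs = xp.random.RandomState(seed=1234)
samples = rs.normal(loc=5.0, scale=2.0, size=100_000)
u = rs.uniform(low=-1.0, high=1.0, size=100_000)

# A fresh generator with the same seed yields the same first draw.
again = xp.random.RandomState(seed=1234).normal(loc=5.0, scale=2.0, size=100_000)

print(f"mean={float(samples.mean()):.3f}, std={float(samples.std()):.3f}")
print("reproducible:", bool(xp.array_equal(samples, again)))
```

Sample statistics only approach the requested `loc` and `scale`, so any check on them needs a tolerance.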
### Fast Fourier Transform

GPU-accelerated Fast Fourier Transform operations supporting 1D, 2D, and N-dimensional transforms for both complex and real data. The functions are exposed through `cupy.fft` and follow NumPy's FFT interface.

```python { .api }
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...
def rfft(a, n=None, axis=-1, norm=None): ...
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fftn(a, s=None, axes=None, norm=None): ...
```

[Fast Fourier Transform](./fft.md)

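One common use of these transforms is finding the dominant frequency of a sampled signal. The following sketch assumes an illustrative 1 kHz sampling rate and a two-tone test signal; `rfftfreq` is the NumPy-compatible helper for bin frequencies, and the NumPy fallback is an assumption so the code runs without a GPU.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

fs = 1000.0   # sampling rate in Hz (illustrative)
n = 1000      # one second of samples, so bins are 1 Hz apart
t = xp.arange(n) / fs
signal = xp.sin(2 * xp.pi * 50.0 * t) + 0.5 * xp.sin(2 * xp.pi * 120.0 * t)

# Real-input FFT: only the non-negative frequencies are returned.
spectrum = xp.fft.rfft(signal)
freqs = xp.fft.rfftfreq(n, d=1.0 / fs)
peak_hz = float(freqs[int(xp.argmax(xp.abs(spectrum)))])

# A forward transform followed by its inverse recovers the signal.
roundtrip = xp.fft.ifft(xp.fft.fft(signal)).real

print(f"dominant frequency: {peak_hz:.1f} Hz")
```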
### Statistics and Sorting

Statistical functions, sorting algorithms, and searching functions. Includes descriptive statistics, histograms, correlations, and efficient sorting operations.

```python { .api }
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def sort(a, axis=-1, kind=None, order=None): ...
def argsort(a, axis=-1, kind=None, order=None): ...
def histogram(a, bins=10, range=None, normed=None, weights=None, density=None): ...
```

[Statistics and Sorting](./statistics.md)

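A short sketch of the statistics and sorting functions applied to a small, made-up set of scores. The data and the NumPy fallback are assumptions for illustration only.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

scores = xp.array([72.0, 88.5, 95.0, 61.0, 79.5, 88.5])

mean = float(xp.mean(scores))
sample_std = float(xp.std(scores, ddof=1))   # ddof=1 gives the sample estimate

order = xp.argsort(scores)        # indices that sort ascending
ranked = xp.sort(scores)[::-1]    # values, highest first

# Four equal-width bins over [60, 100].
counts, edges = xp.histogram(scores, bins=4, range=(60.0, 100.0))

print(mean, ranked.tolist(), counts.tolist())
```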
### CUDA Integration

Direct CUDA/ROCm integration providing low-level GPU control including memory management, stream operations, kernel compilation, and device management. ROCm builds keep the same `cupy.cuda` namespace.

```python { .api }
class Device: ...
class Stream: ...
class MemoryPool: ...
def get_device_id(): ...
def synchronize(): ...
def malloc(size): ...
```

[CUDA Integration](./cuda-integration.md)

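The sketch below ties together device selection, a stream, and the default memory pool. It needs a working GPU, so CuPy is imported lazily inside the function; that choice, the helper name `gpu_pool_report`, and the use of device 0 are assumptions of this example rather than anything the package prescribes.

```python
def gpu_pool_report(n_elements=1_000_000):
    """Run a small reduction on device 0 and report memory-pool usage in bytes."""
    import cupy as cp  # lazy import so this module loads without a GPU stack

    with cp.cuda.Device(0):
        pool = cp.get_default_memory_pool()
        stream = cp.cuda.Stream(non_blocking=True)
        with stream:                      # work below is queued on `stream`
            x = cp.ones(n_elements, dtype=cp.float32)
            total = float(x.sum())
        stream.synchronize()              # wait for the queued work to finish
        used = pool.used_bytes()
        del x
        pool.free_all_blocks()            # return cached blocks to the driver
        return {
            "device": cp.cuda.get_device_id(),
            "sum": total,
            "used_bytes": used,
        }


if __name__ == "__main__":
    print(gpu_pool_report())
```

The pool keeps freed blocks cached for reuse, which is why `free_all_blocks()` is called explicitly once the array has been deleted.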
### Custom Kernels

Create custom GPU kernels for specialized operations. Supports element-wise kernels, reduction kernels, and raw CUDA kernels with just-in-time compilation.

```python { .api }
class ElementwiseKernel: ...
class ReductionKernel: ...
class RawKernel: ...
def fuse(*args, **kwargs): ...
```

[Custom Kernels](./custom-kernels.md)

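As an illustration of the element-wise case, the sketch below defines a squared-difference kernel, following the pattern used in CuPy's user-defined kernel guide. The helper names `squared_diff_gpu` and `squared_diff_reference` are hypothetical, and CuPy is imported lazily (an assumption of this example) so the CPU reference can still be run on a machine without a GPU.

```python
import numpy as np


def squared_diff_reference(x, y):
    """CPU reference for the kernel below."""
    return (x - y) ** 2


def squared_diff_gpu(x, y):
    """Compute (x - y)^2 on the GPU with a JIT-compiled element-wise kernel."""
    import cupy as cp  # lazy import so this module loads without a GPU stack

    kernel = cp.ElementwiseKernel(
        'float32 x, float32 y',      # input arguments
        'float32 z',                 # output argument
        'z = (x - y) * (x - y)',     # body, executed once per element
        'squared_diff')              # kernel name
    result = kernel(cp.asarray(x, dtype=cp.float32),
                    cp.asarray(y, dtype=cp.float32))
    return cp.asnumpy(result)


x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
y = np.array([3.0, 2.0, 1.0], dtype=np.float32)
print(squared_diff_reference(x, y))
```

The kernel is compiled the first time it is called, so the first invocation is noticeably slower than later ones.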
### Extended Functionality (cupyx)

Extended functionality beyond NumPy compatibility including SciPy-compatible functions, JIT compilation, optimization utilities, and specialized GPU algorithms.

```python { .api }
def scatter_add(a, slices, value): ...
def rsqrt(x): ...
class GeneralizedUFunc: ...
def empty_pinned(shape, dtype=float, order='C'): ...
```

[Extended Functionality](./extended-functionality.md)

### Input/Output Functions

File input/output operations for saving and loading arrays in various formats. Supports NumPy-compatible binary formats (.npy, .npz) and text formats with automatic GPU-CPU data transfers.

```python { .api }
def save(file, arr, allow_pickle=True, fix_imports=True): ...
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'): ...
def savez(file, *args, **kwds): ...
def loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None): ...
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer=''): ...
```

[Input/Output Functions](./io-functions.md)

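A minimal round trip through the binary formats mentioned above. The file names and the temporary directory are illustrative, and the NumPy fallback is an assumption so the sketch runs without a GPU; with CuPy, loaded arrays are copied back to the device.

```python
import os
import tempfile

try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

data = xp.arange(12, dtype=xp.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    # Single array in .npy format.
    npy_path = os.path.join(tmp, "data.npy")
    xp.save(npy_path, data)
    restored = xp.load(npy_path)

    # Several named arrays in one .npz archive.
    npz_path = os.path.join(tmp, "bundle.npz")
    xp.savez(npz_path, features=data, offsets=data[:, 0])
    with xp.load(npz_path) as bundle:
        offsets = bundle["offsets"]

print(restored.shape, offsets.tolist())
```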
### Polynomial Functions

Polynomial operations including fitting, evaluation, arithmetic, and root finding. Provides both functional interface and object-oriented poly1d class for polynomial manipulation.

```python { .api }
def poly(seq_of_zeros): ...
def polyval(p, x): ...
def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False): ...
def roots(p): ...
def polyadd(a1, a2): ...
def polymul(a1, a2): ...
class poly1d: ...
```

[Polynomial Functions](./polynomial.md)

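The sketch below fits a quadratic to noise-free samples, evaluates the fit, and multiplies two polynomials. The coefficients are made up for illustration and the NumPy fallback is an assumption. `roots` is left out on purpose: the CuPy documentation has noted that root finding depends on the Hermitian eigenvalue solver, which it lists as not yet supported on ROCm.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

x = xp.linspace(-2.0, 2.0, 50)
y = 3.0 * x ** 2 - 2.0 * x + 1.0        # exact quadratic, no noise

coeffs = xp.polyfit(x, y, 2)            # highest power first
fitted = xp.polyval(coeffs, x)

# (x + 1)(x - 1) = x^2 - 1
product = xp.polymul(xp.array([1.0, 1.0]), xp.array([1.0, -1.0]))

print(coeffs.tolist(), product.tolist())
```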
### Data Type Functions

Data type utilities for type checking, conversion, and promotion. Essential functions for managing data types and ensuring compatibility between GPU array operations.

```python { .api }
def can_cast(from_, to, casting='safe'): ...
def result_type(*arrays_and_dtypes): ...
def common_type(*arrays): ...
def promote_types(type1, type2): ...
def finfo(dtype): ...
def iinfo(int_type): ...
```

[Data Type Functions](./data-types.md)

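To make the promotion rules concrete, here is a small sketch that queries them directly and then confirms the result on real arrays. The dtypes are chosen for illustration and the NumPy fallback is an assumption.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

widening_ok = xp.can_cast(xp.int32, xp.float64)     # no information lost
narrowing_ok = xp.can_cast(xp.float64, xp.int32)    # rejected under 'safe'

# Smallest dtype that can hold both signed and unsigned 8-bit values.
promoted = xp.promote_types(xp.int8, xp.uint8)

# float32 cannot represent every int32, so the pair promotes to float64.
mixed = xp.result_type(xp.float32, xp.int32)

a = xp.ones(3, dtype=xp.float32)
b = xp.ones(3, dtype=xp.int64)
sum_dtype = (a + b).dtype

eps32 = float(xp.finfo(xp.float32).eps)
max_i16 = int(xp.iinfo(xp.int16).max)

print(widening_ok, narrowing_ok, promoted, mixed, sum_dtype, eps32, max_i16)
```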
### Utility Functions

General utility functions for array inspection, memory management, and CuPy-specific operations. Includes functions for memory transfer, debugging, and functional programming patterns.

```python { .api }
def get_array_module(*args): ...
def asnumpy(a, stream=None, blocking=True): ...
def get_default_memory_pool(): ...
def vectorize(pyfunc, otypes=None, doc=None, excluded=None, cache=False): ...
def show_config(): ...
def who(vardict=None): ...
```

[Utility Functions](./utilities.md)

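`get_array_module` is what makes it possible to write one function that accepts either NumPy or CuPy arrays. The sketch below uses the softplus function as the example; the `ImportError` fallback is an assumption added here so the function also works where CuPy is not installed.

```python
import numpy as np


def softplus(x):
    """Numerically stable log(1 + exp(x)) for NumPy or CuPy input."""
    try:
        import cupy as cp
        xp = cp.get_array_module(x)   # numpy for CPU arrays, cupy for GPU arrays
    except ImportError:
        xp = np                       # CuPy absent: everything stays on the CPU
    return xp.maximum(x, 0) + xp.log1p(xp.exp(-xp.abs(x)))


host = np.array([-1000.0, 0.0, 1000.0])
print(softplus(host))
```

Passing a `cupy.ndarray` instead of `host` keeps the whole computation on the GPU, and `cupy.asnumpy` brings the result back when a NumPy array is needed.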
### Logic Functions

Logical operations, comparisons, and truth value testing. Includes element-wise logical operations, array comparisons, content testing for special values, and set operations.

```python { .api }
def logical_and(x1, x2): ...
def logical_or(x1, x2): ...
def equal(x1, x2): ...
def less(x1, x2): ...
def all(a, axis=None, out=None, keepdims=False): ...
def isfinite(x): ...
def in1d(ar1, ar2, assume_unique=False, invert=False): ...
```

[Logic Functions](./logic-functions.md)

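A typical use of these functions is building a boolean mask to filter data. The readings and the accepted range in this sketch are made up, and the NumPy fallback is an assumption.

```python
try:
    import cupy as xp
except ImportError:
    import numpy as xp  # CPU fallback, an assumption of this sketch

readings = xp.array([0.5, xp.nan, 2.0, xp.inf, -1.0, 3.5])

finite = xp.isfinite(readings)   # False for NaN and infinities
in_range = xp.logical_and(xp.greater_equal(readings, 0.0),
                          xp.less(readings, 3.0))
valid = xp.logical_and(finite, in_range)

clean = readings[valid]          # boolean-mask indexing keeps valid entries
all_finite = bool(xp.all(finite))

print(valid.tolist(), clean.tolist(), all_finite)
```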
## Types

```python { .api }
class ndarray:
    """GPU array class compatible with numpy.ndarray"""
    def __init__(self, shape, dtype=float, memptr=None, strides=None, order='C'): ...
    def get(self, stream=None, order='C', out=None): ...
    def set(self, arr, stream=None): ...
    def copy(self, order='K'): ...
    def astype(self, dtype, order='K', casting='unsafe', subok=True, copy=True): ...

    # Properties
    shape: tuple
    dtype: numpy.dtype
    size: int
    ndim: int
    data: cupy.cuda.MemoryPointer

class ufunc:
    """Universal function class for element-wise operations"""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False): ...
    def accumulate(self, array, axis=0, dtype=None, out=None): ...
```