# CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. It implements a large subset of the NumPy and SciPy APIs and runs them on NVIDIA CUDA GPUs, which can give substantial speedups on large, parallelizable workloads. For code that stays within the supported subset, CuPy can serve as a drop-in replacement for NumPy. It also provides explicit CPU/GPU data transfer, custom CUDA kernel integration, and routines for linear algebra, FFT, sparse matrices, and random number generation.

## Package Information

- **Package Name**: cupy-cuda111
- **Language**: Python
- **Installation**: `pip install cupy-cuda111`
- **CUDA Version**: 11.1
- **Homepage**: https://cupy.dev/
- **Documentation**: https://docs.cupy.dev/en/stable/

## Core Imports

```python
import cupy as cp
```

Common imports for specific functionality:

```python
# Core array operations (main namespace)
import cupy as cp

# GPU memory management
import cupy.cuda as cuda

# Linear algebra
import cupy.linalg as linalg

# Random number generation
import cupy.random as random

# Fast Fourier Transform
import cupy.fft as fft

# SciPy-compatible functions
import cupyx.scipy as scipy

# Sparse matrices (updated path)
import cupyx.scipy.sparse as sparse

# Testing utilities
import cupy.testing as testing
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
gpu_array = cp.array([1, 2, 3, 4, 5])
gpu_zeros = cp.zeros((3, 4))
gpu_random = cp.random.random((100, 100))

# NumPy-compatible operations on GPU
result = cp.sin(gpu_array) + cp.cos(gpu_array)
matrix_mult = cp.dot(gpu_random, gpu_random.T)

# Transfer between CPU and GPU
cpu_data = np.array([1, 2, 3, 4, 5])
gpu_data = cp.asarray(cpu_data)      # CPU to GPU
back_to_cpu = cp.asnumpy(gpu_data)   # GPU to CPU

# Memory management
mempool = cp.get_default_memory_pool()
print(f"Used bytes: {mempool.used_bytes()}")
print(f"Total bytes: {mempool.total_bytes()}")

# Check GPU availability
if cp.cuda.is_available():
    print(f"GPU device: {cp.cuda.Device().id}")
```

## Architecture

CuPy's architecture mirrors NumPy while adding GPU acceleration:

- **ndarray**: Core GPU array class providing NumPy-compatible interface
- **CUDA Integration**: Direct access to CUDA runtime, memory management, and custom kernels
- **Universal Functions (ufuncs)**: Element-wise operations optimized for GPU execution
- **Memory Pools**: Efficient GPU memory allocation and reuse
- **Stream Management**: Asynchronous execution and multi-stream operations
- **Custom Kernels**: Integration of user-defined CUDA kernels via ElementwiseKernel, ReductionKernel, and RawKernel

This design provides seamless NumPy compatibility while unlocking GPU performance for scientific computing, machine learning, and data analysis workloads.

## Capabilities

### Array Creation and Manipulation

Comprehensive array creation functions, shape manipulation, indexing, and data type operations. Provides all NumPy array creation patterns with GPU acceleration.

```python { .api }
# Basic creation
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=None, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def full(shape, fill_value, dtype=None, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...

# From data
def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def asarray(a, dtype=None, order=None): ...
def asanyarray(a, dtype=None, order=None): ...

# Shape manipulation
def reshape(a, newshape, order='C'): ...
def ravel(a, order='C'): ...
def transpose(a, axes=None): ...
def moveaxis(a, source, destination): ...
def expand_dims(a, axis): ...
def squeeze(a, axis=None): ...
```

[Array Operations](./array-operations.md)
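
As a minimal sketch of how the creation and shape functions above combine, the snippet below builds an array and reshapes it a few ways. The `xp` alias and the NumPy fallback are assumptions added here so the example also runs on a machine without CuPy or a CUDA device; they are not part of the package API.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

a = xp.arange(12, dtype=xp.float32)       # 0, 1, ..., 11
m = xp.reshape(a, (3, 4))                 # 3 rows, 4 columns
t = xp.transpose(m)                       # shape (4, 3)
col = xp.expand_dims(m[:, 0], axis=1)     # first column as a (3, 1) array
flat = xp.ravel(t)                        # back to one dimension

print(m.shape, t.shape, col.shape, flat.shape)
```

Because the calls are spelled the same in both libraries, the same lines run on the GPU when `xp` is `cupy` and on the CPU when it is `numpy`.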

### Mathematical Functions

Complete mathematical function library including trigonometric, hyperbolic, exponential, logarithmic, arithmetic, and special functions optimized for GPU execution.

```python { .api }
# Trigonometric
def sin(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def cos(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def tan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...

# Exponential and logarithmic
def exp(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def log(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def sqrt(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...

# Arithmetic
def add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def multiply(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
def power(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True): ...
```

[Mathematical Functions](./mathematical-functions.md)
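
The element-wise functions can be chained like their NumPy counterparts. The sketch below checks two identities numerically; the `xp` alias and NumPy fallback are assumptions for portability, not part of the documented API.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

x = xp.linspace(0.0, 2.0 * np.pi, 1000)

# sin^2 + cos^2 should equal 1 at every sample
identity = xp.sin(x) ** 2 + xp.cos(x) ** 2
max_err = float(xp.abs(identity - 1.0).max())

# exp and log are inverses, so this should equal sqrt(x + 1)
y = xp.sqrt(xp.exp(xp.log(x + 1.0)))
max_diff = float(xp.abs(y - xp.sqrt(x + 1.0)).max())

print(max_err, max_diff)
```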

### Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decompositions, eigenvalue problems, and solving linear systems using cuBLAS and cuSOLVER.

```python { .api }
# Matrix products
def dot(a, b, out=None): ...
def matmul(x1, x2, /, out=None, *, casting='same_kind', order='K', dtype=None, subok=True): ...
def einsum(subscripts, *operands, **kwargs): ...
def tensordot(a, b, axes=2): ...

# Decompositions
def svd(a, full_matrices=True, compute_uv=True, hermitian=False): ...
def qr(a, mode='reduced'): ...
def cholesky(a): ...

# Eigenvalues
def eigh(a, UPLO='L'): ...
def eigvalsh(a, UPLO='L'): ...

# Linear systems
def solve(a, b): ...
def inv(a): ...
def pinv(a, rcond=1e-15, hermitian=False): ...
```

[Linear Algebra](./linear-algebra.md)
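
A short sketch of solving a linear system and checking the residual follows. It assumes `solve` and `norm` are reached through the `linalg` submodule, as in NumPy; the `xp` alias, the NumPy fallback, and the test matrix are illustrative assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

rng_host = np.random.default_rng(0)  # data is generated on the host, then moved

# Adding 4 to the diagonal makes the matrix strictly diagonally dominant,
# so it is guaranteed to be non-singular.
a = xp.asarray(rng_host.random((4, 4)) + 4.0 * np.eye(4))
b = xp.asarray(rng_host.random(4))

x = xp.linalg.solve(a, b)
residual = float(xp.linalg.norm(a @ x - b))  # should be close to zero

print(residual)
```

Using `solve` rather than multiplying by `inv(a)` is the usual choice here, since it avoids forming the inverse explicitly.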

### Random Number Generation

Comprehensive random number generation using GPU-optimized generators with support for various distributions, modern generator APIs, and advanced bit generators.

```python { .api }
# Modern generator API
def default_rng(seed=None): ...
class Generator:
    def random(self, size=None, dtype=float32, out=None): ...
    def integers(self, low, high=None, size=None, dtype=int64, endpoint=False): ...

class BitGenerator: ...
class XORWOW(BitGenerator): ...
class MRG32k3a(BitGenerator): ...
class Philox4x3210(BitGenerator): ...

# Legacy API
def seed(seed=None): ...
def get_random_state(): ...
class RandomState: ...

# Simple random data
def rand(*args): ...
def randn(*args): ...
def randint(low, high=None, size=None, dtype=int): ...
def random_sample(size=None): ...
def choice(a, size=None, replace=True, p=None): ...

# Distributions
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
def exponential(scale=1.0, size=None): ...
def poisson(lam=1.0, size=None): ...
def gamma(shape, scale=1.0, size=None): ...
def beta(a, b, size=None): ...
def binomial(n, p, size=None): ...

# Multivariate distributions
def multivariate_normal(mean, cov, size=None, check_valid='warn', tol=1e-8): ...
def dirichlet(alpha, size=None): ...

# Permutations
def shuffle(x): ...
def permutation(x): ...
```

[Random Number Generation](./random-generation.md)
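
The sketch below uses the legacy API to show that reseeding reproduces the same samples within one library and device. The `xp` alias and NumPy fallback are assumptions for portability; the streams produced by CuPy and NumPy for the same seed are not expected to match each other.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

# Same seed, same call sequence: the two draws should be identical
xp.random.seed(42)
first = xp.random.normal(loc=0.0, scale=1.0, size=(1000,))
xp.random.seed(42)
second = xp.random.normal(loc=0.0, scale=1.0, size=(1000,))
same = bool((first == second).all())

# Uniform samples between -1 and 1
u = xp.random.uniform(low=-1.0, high=1.0, size=(3, 2))

print(same, u.shape)
```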

### Fast Fourier Transform

GPU-accelerated FFT operations using cuFFT for high-performance frequency domain analysis with comprehensive support for real and complex transforms in 1D, 2D, and N-dimensional cases.

```python { .api }
# 1D complex transforms
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...

# 1D real transforms (optimized for real input)
def rfft(a, n=None, axis=-1, norm=None): ...
def irfft(a, n=None, axis=-1, norm=None): ...

# 1D Hermitian transforms
def hfft(a, n=None, axis=-1, norm=None): ...
def ihfft(a, n=None, axis=-1, norm=None): ...

# 2D transforms
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def ifft2(a, s=None, axes=(-2, -1), norm=None): ...
def rfft2(a, s=None, axes=(-2, -1), norm=None): ...
def irfft2(a, s=None, axes=(-2, -1), norm=None): ...

# N-D transforms
def fftn(a, s=None, axes=None, norm=None): ...
def ifftn(a, s=None, axes=None, norm=None): ...
def rfftn(a, s=None, axes=None, norm=None): ...
def irfftn(a, s=None, axes=None, norm=None): ...

# Helper functions
def fftfreq(n, d=1.0): ...
def rfftfreq(n, d=1.0): ...
def fftshift(x, axes=None): ...
def ifftshift(x, axes=None): ...

# Configuration
import cupy.fft.config  # FFT planning and optimization
```

[FFT Operations](./fft-operations.md)
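
To illustrate the real-input transforms and the frequency helper together, the sketch below locates the dominant frequency of a pure tone and inverts the transform. The signal parameters, the `xp` alias, and the NumPy fallback are assumptions chosen for illustration.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

n, rate = 1024, 1024.0                      # number of samples, sampling rate in Hz
t = xp.arange(n) / rate                     # one second of sample times
signal = xp.sin(2.0 * np.pi * 50.0 * t)     # 50 Hz tone

spectrum = xp.fft.rfft(signal)              # n // 2 + 1 complex bins
freqs = xp.fft.rfftfreq(n, d=1.0 / rate)    # bin centre frequencies in Hz

# The bin with the largest magnitude marks the dominant frequency
peak_hz = float(freqs[int(xp.argmax(xp.abs(spectrum)))])

# irfft should reproduce the original samples up to rounding error
roundtrip = xp.fft.irfft(spectrum, n=n)
roundtrip_err = float(xp.abs(roundtrip - signal).max())

print(peak_hz, roundtrip_err)
```

With one second of data the bins are 1 Hz apart, so the 50 Hz tone falls exactly on a bin.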

### CUDA Integration

Direct CUDA functionality including memory management, device control, custom kernels, streams, and low-level GPU programming capabilities.

```python { .api }
# Device management
class Device:
    def __init__(self, device=None): ...
    def use(self): ...

def get_device_id(): ...
def is_available(): ...

# Memory management
def alloc(size): ...
class MemoryPool:
    def malloc(self, size): ...
    def free_all_blocks(self): ...
    def used_bytes(self): ...

# Stream management
class Stream:
    def __init__(self, null=False, non_blocking=False, ptds=False): ...
    def use(self): ...

# Custom kernels
class ElementwiseKernel:
    def __init__(self, in_params, out_params, operation, name='kernel'): ...

class RawKernel:
    def __init__(self, code, name, **kwargs): ...
```

[CUDA Integration](./cuda-integration.md)
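
As a rough sketch of a custom kernel, the snippet below defines a squared-difference `ElementwiseKernel` when a CUDA device is present. The kernel name, the `gpu_ok` flag, and the pure-Python stand-in used when no GPU is available are assumptions added so the example runs anywhere; only the GPU branch exercises the CuPy API.

```python
import numpy as np

try:
    import cupy as cp
    gpu_ok = cp.cuda.is_available()
except ImportError:
    cp, gpu_ok = None, False

if gpu_ok:
    # The operation string is CUDA C and is compiled on the first call.
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',      # input parameters
        'float32 z',                 # output parameter
        'z = (x - y) * (x - y)',     # per-element operation
        'squared_diff')              # kernel name
    xp = cp
else:
    # Hypothetical CPU stand-in so the sketch still runs without a GPU.
    def squared_diff(x, y):
        return (x - y) * (x - y)
    xp = np

x = xp.arange(5, dtype=xp.float32)        # 0, 1, 2, 3, 4
y = xp.full(5, 2.0, dtype=xp.float32)     # 2, 2, 2, 2, 2
z = squared_diff(x, y)

host_z = z.get() if gpu_ok else z         # copy back to the host if needed
print(host_z)
```

An element-wise kernel handles indexing and broadcasting itself, which is why the body only describes what happens to a single element.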

### Input/Output Operations

Comprehensive file I/O operations for loading, saving, and formatting array data with support for binary files, compressed archives, text files, and custom formatting.

```python { .api }
# Binary file operations
def save(file, arr, allow_pickle=True, fix_imports=True): ...
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'): ...
def savez(file, *args, **kwds): ...
def savez_compressed(file, *args, **kwds): ...

# Text file operations
def loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None): ...
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None): ...
def genfromtxt(fname, dtype=float, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None): ...

# Data conversion
def frombuffer(buffer, dtype=float, count=-1, offset=0): ...
def fromstring(string, dtype=float, count=-1, sep=''): ...
def fromfunction(func, shape, dtype=float, **kwargs): ...
def fromiter(iterable, dtype, count=-1): ...

# Array formatting
def array_repr(arr, max_line_width=None, precision=None, suppress_small=None): ...
def array_str(a, max_line_width=None, precision=None, suppress_small=None): ...
def array2string(a, max_line_width=None, precision=None, suppress_small=None, separator=' ', prefix='', formatter=None, threshold=None, edgeitems=None): ...
```

[I/O Operations](./io-operations.md)
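
A save and load round trip can be sketched with an in-memory buffer instead of a file on disk. The use of `io.BytesIO`, the `xp` alias, and the NumPy fallback are assumptions made to keep the example self-contained.

```python
import io

import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

original = xp.arange(6, dtype=xp.float64).reshape(2, 3)

buf = io.BytesIO()
xp.save(buf, original)       # writes NumPy's .npy format
buf.seek(0)                  # rewind before reading back
restored = xp.load(buf)

round_trip_ok = bool((restored == original).all())
print(restored.shape, round_trip_ok)
```

The data on disk is in the same `.npy` format NumPy uses, so files written this way should be readable from either library.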

### Polynomial Functions

Comprehensive polynomial operations including fitting, evaluation, arithmetic, root finding, and advanced polynomial manipulation with support for various polynomial types.

```python { .api }
# Basic operations
def poly(seq_of_zeros): ...
def roots(p): ...
def polyval(p, x): ...
def polyder(p, m=1): ...
def polyint(p, m=1, k=None): ...

# Arithmetic operations
def polyadd(a1, a2): ...
def polysub(a1, a2): ...
def polymul(a1, a2): ...
def polydiv(u, v): ...

# Curve fitting
def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False): ...
def polyvander(x, deg): ...

# Object-oriented interface
class poly1d:
    def __init__(self, c_or_r, r=False, variable=None): ...
    def __call__(self, val): ...
    def deriv(self, m=1): ...
    def integ(self, m=1, k=0): ...
    @property
    def roots(self): ...

# Specialized polynomial types
class Chebyshev: ...
class Legendre: ...
class Hermite: ...
class Laguerre: ...
```

[Polynomial Functions](./polynomial-functions.md)
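
The fitting and evaluation functions can be sketched on noise-free data, where the recovered coefficients are known in advance. This assumes the installed CuPy release provides `polyfit` and `polyval` in the main namespace; the `xp` alias and NumPy fallback are likewise assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

x = xp.linspace(-1.0, 1.0, 21)
y = 3.0 * x ** 2 - 2.0 * x + 1.0      # exact quadratic, no noise

coeffs = xp.polyfit(x, y, 2)          # coefficients, highest power first
fitted = xp.polyval(coeffs, x)        # evaluate the fitted polynomial

fit_err = float(xp.abs(fitted - y).max())
print(coeffs.tolist(), fit_err)
```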

### Statistics and Aggregation

Statistical functions, sorting, searching, and data aggregation operations optimized for GPU computation.

```python { .api }
# Aggregation
def sum(a, axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True): ...
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...

# Order statistics
def max(a, axis=None, out=None, keepdims=False, initial=None, where=None): ...
def min(a, axis=None, out=None, keepdims=False, initial=None, where=None): ...
def percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False): ...

# Sorting and searching
def sort(a, axis=-1, kind=None, order=None): ...
def argsort(a, axis=-1, kind=None, order=None): ...
def searchsorted(a, v, side='left', sorter=None): ...
```

*Note: Statistical functions available throughout cupy namespace*
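
A small worked example of the aggregation and sorting functions is sketched below on a 2x3 array whose results are easy to verify by hand. The data, the `xp` alias, and the NumPy fallback are illustrative assumptions.

```python
import numpy as np

try:
    import cupy as cp
    xp = cp if cp.cuda.is_available() else np  # assumption: fall back to NumPy without a GPU
except ImportError:
    xp = np

data = xp.asarray([[4.0, 1.0, 3.0],
                   [2.0, 6.0, 5.0]])

col_mean = xp.mean(data, axis=0)          # mean of each column
total = float(xp.sum(data))               # sum over all elements
row_sorted = xp.sort(data, axis=1)        # sort within each row
order = xp.argsort(data[0])               # indices that sort the first row
median = float(xp.percentile(data, 50))   # 50th percentile of all values

print(col_mean.tolist(), total, order.tolist(), median)
```

Reductions return arrays on the same device as the input, so `float(...)` or `.tolist()` is what brings a result back to a plain Python value.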

### SciPy Compatibility

Comprehensive SciPy-compatible functions through cupyx.scipy including sparse matrices, signal processing, image processing, special functions, statistics, and advanced linear algebra.

```python { .api }
# Sparse matrices (cupyx.scipy.sparse)
class csr_matrix:
    def __init__(self, arg1, shape=None, dtype=None, copy=False): ...
    def dot(self, other): ...

class csc_matrix:
    def __init__(self, arg1, shape=None, dtype=None, copy=False): ...

# Signal processing (cupyx.scipy.signal)
def convolve(in1, in2, mode='full', method='auto'): ...
def correlate(in1, in2, mode='full', method='auto'): ...

# Image processing (cupyx.scipy.ndimage)
def gaussian_filter(input, sigma, order=0, output=None, mode='reflect', cval=0.0, truncate=4.0): ...
def rotate(input, angle, axes=(1, 0), reshape=True, output=None, order=1, mode='constant', cval=0.0, prefilter=True): ...

# Special functions (cupyx.scipy.special)
def gamma(x): ...
def erf(x): ...
def betaln(a, b): ...

# Statistics (cupyx.scipy.stats)
def ttest_ind(a, b, axis=0, equal_var=True, nan_policy='propagate', alternative='two-sided'): ...
def pearsonr(x, y): ...
```

[SciPy Compatibility](./scipy-compatibility.md)

### Testing and Validation

Comprehensive testing utilities for GPU/CPU comparison, parameterized testing, and numerical accuracy validation with specialized decorators for scientific computing workflows.

```python { .api }
# Array comparison functions
def assert_allclose(actual, desired, rtol=1e-7, atol=0, err_msg='', verbose=True): ...
def assert_array_equal(x, y, err_msg='', verbose=True): ...
def assert_array_almost_equal(x, y, decimal=6, err_msg='', verbose=True): ...
def assert_array_less(x, y, err_msg='', verbose=True): ...

# Parameterized testing decorators
def parameterize(*params, **named_params): ...
def for_all_dtypes(name='dtype', no_bool=False, no_float16=False, no_complex=False): ...
def for_float_dtypes(name='dtype', no_float16=False): ...
def for_complex_dtypes(name='dtype'): ...
def for_signed_dtypes(name='dtype'): ...
def for_unsigned_dtypes(name='dtype'): ...

# NumPy compatibility testing
def numpy_cupy_allclose(rtol=1e-7, atol=0, err_msg='', verbose=True, name='xp', type_check=True, accept_error=False, contiguous_check=True, sp_name=None): ...
def numpy_cupy_array_equal(err_msg='', verbose=True, name='xp', type_check=True, accept_error=False, contiguous_check=True, sp_name=None): ...

# Test data generation
def shaped_random(shape, xp=None, dtype=float32, scale=1): ...
def shaped_arange(shape, xp=None, dtype=float32): ...
def generate_seed(): ...

# Error testing decorators
def numpy_cupy_raises(name='xp', sp_name=None, accept_error=Exception): ...
```

*Note: Comprehensive testing framework available in cupy.testing module*
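
The comparison functions can be used like their `numpy.testing` namesakes. In the sketch below, the `softmax` helper, the `use_gpu` flag, and the fallback to `numpy.testing` on machines without a GPU are assumptions for illustration.

```python
import numpy as np

try:
    import cupy as cp
    use_gpu = cp.cuda.is_available()
except ImportError:
    use_gpu = False

if use_gpu:
    from cupy import testing
    xp = cp
else:
    from numpy import testing   # same assertion names, CPU arrays
    xp = np


def softmax(v):
    """Numerically stable softmax (hypothetical function under test)."""
    e = xp.exp(v - v.max())
    return e / e.sum()


probs = softmax(xp.asarray([1.0, 2.0, 3.0]))

# Both assertions raise AssertionError on failure and return None on success
testing.assert_allclose(probs.sum(), 1.0, rtol=1e-7)
testing.assert_array_less(xp.zeros(3), probs)
print("checks passed")
```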

## Core Classes

```python { .api }
class ndarray:
    """GPU array class providing NumPy-compatible interface"""
    def __init__(self): ...
    @property
    def shape(self): ...
    @property
    def dtype(self): ...
    @property
    def size(self): ...
    def get(self, stream=None, order='C', out=None): ...
    def set(self, arr, stream=None): ...

class ufunc:
    """Universal function for element-wise operations"""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False, initial=None, where=True): ...
```

## Data Conversion

```python { .api }
def asnumpy(a, stream=None, order='C', out=None):
    """Convert CuPy array to NumPy array on CPU"""

def asarray(a, dtype=None, order=None):
    """Convert input to CuPy array"""

def get_array_module(*args):
    """Get appropriate array module (cupy/numpy) based on input types"""
```
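
One common use of `get_array_module` is writing a function once and letting it run on whichever library owns its input. The `softplus` function below is an illustrative sketch, and the stand-in `get_array_module` defined when CuPy is not installed is an assumption so the example runs anywhere.

```python
import numpy as np

try:
    import cupy as cp
    get_array_module = cp.get_array_module
except ImportError:
    # Hypothetical stand-in: without CuPy, every array is a NumPy array.
    def get_array_module(*args):
        return np


def softplus(x):
    """Compute log(1 + exp(x)) with the library that owns x."""
    xp = get_array_module(x)
    return xp.maximum(0, x) + xp.log1p(xp.exp(-abs(x)))


host_in = np.array([-1.0, 0.0, 1.0])
host_out = softplus(host_in)      # NumPy in, NumPy out
print(type(host_out).__name__, host_out)
```

Passing a CuPy array to the same function would keep the computation and the result on the GPU, with no change to the function body.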

## Memory Management

```python { .api }
def get_default_memory_pool():
    """Get the default GPU memory pool"""

def get_default_pinned_memory_pool():
    """Get the default pinned memory pool"""

class MemoryPool:
    def malloc(self, size): ...
    def free_all_blocks(self): ...
    def used_bytes(self): ...
    def total_bytes(self): ...
```
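
The sketch below watches the default pool while an array is allocated, dropped, and the cached blocks are released. The `pool_report` helper and the `gpu_ok` flag are hypothetical names introduced here; on a machine without a CUDA device the example only prints a message.

```python
try:
    import cupy as cp
    gpu_ok = cp.cuda.is_available()
except ImportError:
    cp, gpu_ok = None, False


def pool_report():
    """Return (used, total) bytes of the default pool, or None without a GPU."""
    if not gpu_ok:
        return None
    pool = cp.get_default_memory_pool()
    return pool.used_bytes(), pool.total_bytes()


if gpu_ok:
    block = cp.zeros((1024, 1024), dtype=cp.float32)   # about 4 MiB on the device
    print("after allocation:", pool_report())
    del block                                          # memory goes back to the pool
    print("after del:", pool_report())
    cp.get_default_memory_pool().free_all_blocks()     # release cached blocks
    print("after free_all_blocks:", pool_report())
else:
    print("no CUDA device available; nothing to report")
```

Deleting an array lowers the used byte count but typically not the total, because the pool keeps freed blocks for reuse until `free_all_blocks()` is called.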

## Utilities

```python { .api }
def is_available():
    """Check if CUDA is available"""

def show_config(*, _full=False):
    """Print runtime configuration"""

def clear_memo():
    """Clear memoization cache"""
```