tessl/pypi-cupy-cuda101

CuPy: NumPy & SciPy for GPU (CUDA 10.1 version)

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/cupy-cuda101@9.6.x

To install, run

npx @tessl/cli install tessl/pypi-cupy-cuda101@9.6.0

# CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. It is designed as a drop-in replacement for running existing NumPy/SciPy code on NVIDIA CUDA platforms, covering scientific computing, machine learning, and data analysis workloads while staying compatible with most NumPy-based code.

## Package Information

- **Package Name**: cupy-cuda101
- **Language**: Python
- **Installation**: `pip install cupy-cuda101`
- **CUDA Version**: 10.1

## Core Imports

```python
import cupy as cp
```

Common for extending functionality:

```python
import cupyx
```

For CUDA-specific operations:

```python
import cupy.cuda
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
x = cp.arange(6).reshape(2, 3).astype('f')
y = cp.ones((2, 3), dtype='float32')

# GPU array operations (NumPy-compatible API)
result = cp.dot(x, y.T)
print(result)

# Convert between CPU and GPU
gpu_array = cp.array([1, 2, 3, 4, 5])
cpu_array = cp.asnumpy(gpu_array)    # Transfer to CPU
gpu_array2 = cp.asarray(cpu_array)   # Transfer to GPU

# Memory management
mempool = cp.get_default_memory_pool()
print(f"Used bytes: {mempool.used_bytes()}")
print(f"Total bytes: {mempool.total_bytes()}")
```

## Architecture

CuPy provides a GPU computing framework built from the following parts:

- **Core Arrays**: `cupy.ndarray` objects that mirror NumPy arrays but reside in GPU memory
- **NumPy Compatibility**: Implements a large subset of the NumPy API, so most existing code migrates with few changes
- **CUDA Integration**: Direct access to CUDA features including kernels, streams, and memory management
- **Extended Functionality**: `cupyx` module provides SciPy-compatible functions and additional GPU-specific utilities
- **Memory Management**: Automatic memory pooling with customizable allocators
- **Kernel Fusion**: Element-wise operations can be fused into a single kernel with the `cupy.fuse` decorator

This design moves array operations onto the GPU while keeping existing NumPy-based code largely unchanged.
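As a rough illustration of the NumPy-compatible design, the sketch below writes one function that runs on whichever library owns its input. The `softmax` helper and the NumPy fallback are assumptions made for this example (so it also runs on a machine without a GPU); `cupy.get_array_module` is the CuPy call that picks the module.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
    xp = cp
except Exception:
    cp, xp = None, np


def softmax(a):
    """Row-wise softmax (hypothetical helper) for NumPy or CuPy input."""
    mod = cp.get_array_module(a) if cp is not None else np
    shifted = a - mod.max(a, axis=1, keepdims=True)  # avoid overflow in exp
    e = mod.exp(shifted)
    return e / mod.sum(e, axis=1, keepdims=True)


probs = softmax(xp.asarray([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]))
row_sums = probs.sum(axis=1)
```

The same function body serves both CPU and GPU arrays, which is the main practical benefit of the shared API.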

## Capabilities

### Array Creation and Manipulation

Core array creation functions including basic arrays, ranges, matrices, and data conversion. These functions mirror NumPy's array creation API but allocate the arrays in GPU memory.

```python { .api }
def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=None, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=None, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...
```

[Array Creation and Manipulation](./array-creation.md)
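A minimal sketch of the creation functions listed above. The `xp` alias and the NumPy fallback are assumptions for this example, so the snippet also runs without a GPU; with CuPy available, every array lives in device memory.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

zeros = xp.zeros((2, 3), dtype=xp.float32)              # zero-filled 2x3 array
steps = xp.arange(0, 10, 2)                             # even numbers below 10
grid = xp.linspace(0.0, 1.0, num=5)                     # 5 evenly spaced points
from_list = xp.array([[1, 2], [3, 4]], dtype=xp.int32)  # copy of host data
```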

### Mathematical Functions

Mathematical operations including trigonometric, hyperbolic, exponential, logarithmic, arithmetic, and special functions. All operations run on the GPU and expose NumPy-compatible interfaces.

```python { .api }
def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def sqrt(x, out=None, **kwargs): ...
def add(x1, x2, out=None, **kwargs): ...
def multiply(x1, x2, out=None, **kwargs): ...
```

[Mathematical Functions](./math-functions.md)
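The element-wise functions above can be combined as in the following sketch, which also shows the `out=` argument for writing into an existing array. The `xp` alias and NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

angles = xp.linspace(0.0, np.pi, 5)
identity = xp.sin(angles) ** 2 + xp.cos(angles) ** 2     # sin^2 + cos^2
roundtrip = xp.log(xp.exp(xp.asarray([0.5, 1.0, 2.0])))  # log undoes exp

acc = xp.zeros(3)
xp.add(acc, xp.asarray([1.0, 2.0, 3.0]), out=acc)        # result written into acc
root = xp.sqrt(xp.multiply(acc, acc))                    # sqrt(acc * acc)
```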

### Linear Algebra

GPU-accelerated linear algebra operations including matrix operations, decompositions, eigenvalue problems, and solving linear systems. Powered by cuBLAS and cuSOLVER libraries.

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None, **kwargs): ...
def einsum(subscripts, *operands, out=None, **kwargs): ...
```

[Linear Algebra](./linalg.md)
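A short sketch of the products listed above together with `linalg.solve`, which is assumed here to be what "solving linear systems" refers to. The `xp` alias and NumPy fallback are also assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

a = xp.asarray([[2.0, 1.0], [1.0, 3.0]])
b = xp.asarray([1.0, 2.0])

x = xp.linalg.solve(a, b)           # solve a @ x = b
residual = xp.dot(a, x) - b         # should be close to zero
squared = xp.matmul(a, a)           # matrix product a @ a
matvec = xp.einsum("ij,j->i", a, x) # same matrix-vector product as dot(a, x)
```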

### Fast Fourier Transform

GPU-accelerated FFT operations for 1D, 2D, and N-dimensional transforms. Supports real and complex transforms with comprehensive frequency domain processing capabilities.

```python { .api }
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fftn(a, s=None, axes=None, norm=None): ...
```

[Fast Fourier Transform](./fft.md)
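As an illustration of the 1D transforms, the sketch below finds the dominant frequency bin of a sine wave and inverts the transform. The signal parameters, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

n = 64
t = xp.arange(n)
signal = xp.sin(2 * np.pi * 5 * t / n)       # exactly 5 cycles over n samples

spectrum = xp.fft.fft(signal)
# Search only the first half: a real signal has a mirrored spectrum.
peak_bin = int(xp.argmax(xp.abs(spectrum[: n // 2])))
recovered = xp.fft.ifft(spectrum).real       # inverse transform, imaginary part dropped
```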

### Random Number Generation

GPU-based random number generation with comprehensive probability distributions. Supports both legacy RandomState interface and modern Generator API with various bit generators.

```python { .api }
def random(size=None): ...
def randn(*size): ...
def randint(low, high=None, size=None, dtype=int): ...
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
```

[Random Number Generation](./random.md)
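A minimal sketch of the legacy-style `random` functions. The `xp` alias and NumPy fallback are assumptions for this example; note that a given seed is not expected to produce the same values under NumPy and CuPy, so only shapes and ranges are meaningful here.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

xp.random.seed(0)                                     # reproducible within one library
u = xp.random.uniform(0.0, 1.0, size=1000)            # uniform samples
g = xp.random.normal(loc=0.0, scale=1.0, size=(100, 3))
k = xp.random.randint(0, 10, size=50)                 # integers in [0, 10)
```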

### CUDA Programming Interface

Direct access to CUDA features including custom kernels, memory management, streams, and device control. Enables low-level GPU programming within Python.

```python { .api }
class RawKernel: ...
class ElementwiseKernel: ...
class ReductionKernel: ...
class Stream: ...
class Device: ...
```

[CUDA Programming Interface](./cuda.md)
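The sketch below defines a custom element-wise kernel with `ElementwiseKernel`. The kernel name `squared_diff` and the CPU reference branch are assumptions for this example; the CPU branch only exists so the snippet still runs where no GPU is present.

```python
import numpy as np

try:  # assumption: take the CPU reference path when CuPy or a GPU is unavailable
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    cp = None

x_host = np.arange(5, dtype=np.float32)
y_host = np.full(5, 2.0, dtype=np.float32)

if cp is not None:
    squared_diff = cp.ElementwiseKernel(
        "float32 x, float32 y",   # input parameters
        "float32 z",              # output parameter
        "z = (x - y) * (x - y)",  # CUDA C++ body, applied per element
        "squared_diff",           # kernel name
    )
    with cp.cuda.Device(0):       # run on the first GPU
        z_gpu = squared_diff(cp.asarray(x_host), cp.asarray(y_host))
        z_host = cp.asnumpy(z_gpu)
else:
    z_host = (x_host - y_host) ** 2  # same computation on the CPU
```

The kernel is compiled on first call, so the first invocation is typically slower than later ones.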

### Statistics and Aggregation

Statistical functions and array aggregation operations including descriptive statistics, histograms, and correlation analysis. All operations are GPU-accelerated with NumPy-compatible interfaces.

```python { .api }
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def histogram(a, bins=10, range=None, weights=None, density=None): ...
def corrcoef(x, y=None, rowvar=True, bias=False, ddof=None): ...
```

[Statistics and Aggregation](./statistics.md)
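A small sketch of the aggregation functions above on a 2x3 array. The sample data, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

data = xp.asarray([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

col_mean = xp.mean(data, axis=0)           # mean of each column
row_std = xp.std(data, axis=1)             # population std (ddof=0) of each row
counts, edges = xp.histogram(data, bins=5) # 5 bins spanning min..max
corr = xp.corrcoef(data)                   # rows are treated as variables
```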

### Memory Management and Performance

Memory management functions, performance optimization utilities, and kernel fusion capabilities for maximizing GPU performance and managing memory usage efficiently.

```python { .api }
def get_default_memory_pool(): ...
def get_default_pinned_memory_pool(): ...
def fuse(*args, **kwargs): ...
def asnumpy(a, stream=None, order='C'): ...
```

[Memory Management and Performance](./memory-performance.md)
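The sketch below fuses a small expression with `cupy.fuse` and then releases cached blocks from the default memory pool. The `saxpy` function and the no-op decorator used on the CPU path are assumptions for this example.

```python
import numpy as np

try:  # assumption: take the CPU path when CuPy or a GPU is unavailable
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    cp = None

# cp.fuse() returns a decorator; the no-op stand-in keeps the sketch runnable on CPU.
fuse = cp.fuse() if cp is not None else (lambda func: func)


@fuse
def saxpy(a, x, y):
    """a * x + y; with CuPy this is compiled into one fused kernel."""
    return a * x + y


xp = cp if cp is not None else np
x = xp.arange(4, dtype=xp.float32)
y = xp.ones(4, dtype=xp.float32)
result = saxpy(2.0, x, y)

if cp is not None:
    host = cp.asnumpy(result)             # copy the result to host memory
    pool = cp.get_default_memory_pool()
    del x, y, result                      # drop references so blocks become free
    pool.free_all_blocks()                # return cached blocks to the driver
    held = pool.total_bytes()
else:
    host, held = result, 0
```

Freeing pool blocks is mainly useful when another library needs the GPU memory; otherwise the pool reuses cached blocks to avoid repeated allocations.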

### Array Manipulation

Array manipulation operations including reshaping, transposing, joining, splitting, and rearranging arrays. Provides comprehensive tools for transforming array structure and organization.

```python { .api }
def reshape(a, newshape, order='C'): ...
def transpose(a, axes=None): ...
def concatenate(arrays, axis=0, out=None, dtype=None, casting="same_kind"): ...
def split(ary, indices_or_sections, axis=0): ...
def stack(arrays, axis=0, out=None): ...
def expand_dims(a, axis): ...
```

[Array Manipulation](./array-manipulation.md)
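The shape-changing functions above compose as in this sketch. The `xp` alias and NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

a = xp.reshape(xp.arange(6), (2, 3))
at = xp.transpose(a)                        # swap the two axes
stacked = xp.stack([a, a], axis=0)          # new leading axis
joined = xp.concatenate([a, a], axis=1)     # side by side along columns
left, right = xp.split(joined, 2, axis=1)   # undo the concatenation
expanded = xp.expand_dims(a, axis=0)        # add a length-1 axis in front
```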

### Binary Operations

Bitwise operations for integer and boolean arrays including AND, OR, XOR, NOT operations and bit shifting. Essential for low-level data manipulation and boolean logic.

```python { .api }
def bitwise_and(x1, x2, out=None, **kwargs): ...
def bitwise_or(x1, x2, out=None, **kwargs): ...
def bitwise_xor(x1, x2, out=None, **kwargs): ...
def bitwise_not(x, out=None, **kwargs): ...
def left_shift(x1, x2, out=None, **kwargs): ...
def right_shift(x1, x2, out=None, **kwargs): ...
```

[Binary Operations](./binary-operations.md)
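A short sketch of the bitwise functions on `uint8` values. The sample bit patterns, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

flags = xp.asarray([0b1100, 0b1010], dtype=xp.uint8)  # 12, 10
mask = xp.asarray([0b1010, 0b0110], dtype=xp.uint8)   # 10, 6

both = xp.bitwise_and(flags, mask)      # bits set in both operands
either = xp.bitwise_or(flags, mask)     # bits set in at least one operand
differ = xp.bitwise_xor(flags, mask)    # bits set in exactly one operand
inverted = xp.bitwise_not(flags)        # flip all 8 bits
shifted_up = xp.left_shift(flags, 2)    # multiply by 4
shifted_down = xp.right_shift(flags, 1) # integer-divide by 2
```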

### Logic Functions

Logical operations and comparison functions including element-wise and array-wise logical operations, truth value testing, and type checking functions.

```python { .api }
def logical_and(x1, x2, out=None, **kwargs): ...
def logical_or(x1, x2, out=None, **kwargs): ...
def equal(x1, x2, out=None, **kwargs): ...
def greater(x1, x2, out=None, **kwargs): ...
def isfinite(x, out=None, **kwargs): ...
def all(a, axis=None, out=None, keepdims=False): ...
```

[Logic Functions](./logic-functions.md)
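The comparison and truth-testing functions above can be combined as in this sketch. The sample values, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

a = xp.asarray([1.0, np.inf, 3.0])
b = xp.asarray([1.0, 2.0, 4.0])

finite = xp.isfinite(a)                                # False where a is inf or nan
same = xp.equal(a, b)                                  # element-wise a == b
selected = xp.logical_and(finite, xp.greater(b, 1.5))  # finite and b > 1.5
all_finite = bool(xp.all(finite))                      # reduce to one Python bool
```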

### Indexing and Searching

Advanced indexing, searching, and selection operations including multi-dimensional indexing, conditional selection, and array searching functions.

```python { .api }
def take(a, indices, axis=None, out=None, mode='raise'): ...
def choose(a, choices, out=None, mode='raise'): ...
def where(condition, x=None, y=None): ...
def nonzero(a): ...
def argmax(a, axis=None, out=None): ...
def searchsorted(a, v, side='left', sorter=None): ...
```

[Indexing and Searching](./indexing-searching.md)
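A minimal sketch of conditional selection and searching with the functions above. The sample data, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

a = xp.asarray([10, 40, 30, 20])

capped = xp.where(a > 25, xp.full_like(a, 25), a)  # replace values above 25
hits = xp.nonzero(a > 25)[0]                       # indices where a > 25
picked = xp.take(a, xp.asarray([0, 3]))            # gather elements 0 and 3
best = int(xp.argmax(a))                           # index of the largest value
slot = xp.searchsorted(xp.sort(a), xp.asarray([25]))  # insertion point in sorted a
```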

### Sorting and Counting

Sorting, partitioning, and counting functions for ordering arrays and tallying their elements.

```python { .api }
def sort(a, axis=-1, kind=None, order=None): ...
def argsort(a, axis=-1, kind=None, order=None): ...
def lexsort(keys, axis=-1): ...
def partition(a, kth, axis=-1, kind=None, order=None): ...
def count_nonzero(a, axis=None, keepdims=False): ...
```

[Sorting and Counting](./sorting-counting.md)
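The sketch below sorts, partitions, and counts a small integer array. The sample data, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

a = xp.asarray([3, 1, 2, 0, 2])

ordered = xp.sort(a)                 # sorted copy
order = xp.argsort(a)                # indices that would sort a
via_index = a[order]                 # same result as xp.sort(a)
part = xp.partition(a, 2)            # element at index 2 is in its sorted position
nonzero = int(xp.count_nonzero(a))   # number of non-zero entries
```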

252

253

### Input and Output

254

255

File I/O operations for saving and loading arrays in various formats including NumPy's .npy and .npz formats with GPU memory optimization.

256

257

```python { .api }

258

def save(file, arr, allow_pickle=True, fix_imports=True): ...

259

def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'): ...

260

def savez(file, *args, **kwds): ...

261

def savez_compressed(file, *args, **kwds): ...

262

```

263

264

[Input and Output](./io-operations.md)
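A round trip through the .npy and .npz formats might look like the sketch below. It writes to in-memory buffers rather than files to stay self-contained; the buffer approach, the key names `raw` and `doubled`, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import io

import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

arr = xp.arange(6, dtype=xp.float32).reshape(2, 3)

buf = io.BytesIO()
xp.save(buf, arr)                # single array, .npy format
buf.seek(0)                      # rewind before reading back
restored = xp.load(buf)

bundle = io.BytesIO()
xp.savez(bundle, raw=arr, doubled=arr * 2)  # several named arrays, .npz format
bundle.seek(0)
archive = xp.load(bundle)
doubled = archive["doubled"]     # arrays are looked up by keyword name
```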

### Polynomial Functions

Polynomial operations including polynomial arithmetic, fitting, evaluation, and root finding with full GPU acceleration for mathematical analysis.

```python { .api }
def polyval(p, x): ...
def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False): ...
def polyadd(a1, a2): ...
def polymul(a1, a2): ...
def roots(p): ...
```

[Polynomial Functions](./polynomial-functions.md)
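A small sketch of polynomial evaluation and arithmetic, with coefficients ordered from the highest degree down as in NumPy. The example polynomial, the `xp` alias, and the NumPy fallback are assumptions for this example.

```python
import numpy as np

try:  # assumption: use NumPy as a stand-in when CuPy or a GPU is unavailable
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is usable
except Exception:
    xp = np

p = xp.asarray([1.0, -3.0, 2.0])                     # x^2 - 3x + 2
vals = xp.polyval(p, xp.asarray([0.0, 1.0, 2.0]))    # evaluate p at three points

# (x - 1) * (x - 2) expands back to p.
product = xp.polymul(xp.asarray([1.0, -1.0]), xp.asarray([1.0, -2.0]))
total = xp.polyadd(p, xp.asarray([1.0, 1.0]))        # p + (x + 1)
```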

## Types

```python { .api }
class ndarray:
    """N-dimensional array object on GPU.

    CuPy's main data structure that mirrors NumPy's ndarray but resides in GPU memory.
    Supports most NumPy array operations and attributes.
    """
    def __init__(self, shape, dtype=float, order='C'): ...

    # Properties
    shape: tuple
    dtype: numpy.dtype
    size: int
    ndim: int
    data: cupy.cuda.MemoryPointer

    # Methods
    def get(self, stream=None, order='C'): ...  # Transfer to CPU
    def set(self, arr, stream=None): ...  # Transfer from CPU
    def copy(self, order='C'): ...
    def astype(self, dtype, order='K', casting='unsafe', subok=True, copy=True): ...

class ufunc:
    """Universal function object for element-wise array operations."""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False, initial=None, where=True): ...

# Memory and Device Types
class MemoryPointer:
    """Pointer to GPU memory location."""
    ptr: int
    size: int
    device: Device

class Device:
    """CUDA device representation."""
    id: int

class Stream:
    """CUDA stream for asynchronous operations."""
    def __init__(self, non_blocking=False): ...
    def synchronize(self): ...

# Custom Kernel Types
class RawKernel:
    """Raw CUDA kernel wrapper."""
    def __init__(self, code, name, options=(), backend='nvrtc', translate_cucomplex=False): ...
    def __call__(self, grid, block, args, **kwargs): ...

class ElementwiseKernel:
    """Element-wise operation kernel."""
    def __init__(self, in_params, out_params, operation, name='kernel', **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class ReductionKernel:
    """Reduction operation kernel."""
    def __init__(self, in_params, out_params, map_expr, reduce_expr, post_map_expr='', **kwargs): ...
    def __call__(self, *args, **kwargs): ...
```