tessl/pypi-cupy-cuda12x

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/cupy-cuda12x@12.3.x

To install, run

npx @tessl/cli install tessl/pypi-cupy-cuda12x@12.3.0

# CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. It acts as a drop-in replacement for running existing NumPy/SciPy code on NVIDIA CUDA and AMD ROCm platforms, using GPU parallelism for high-performance scientific computing while remaining compatible with existing codebases.

## Package Information

- **Package Name**: cupy-cuda12x
- **Language**: Python
- **Installation**: `pip install cupy-cuda12x`
- **CUDA Requirement**: CUDA Toolkit 12.x
- **License**: MIT

## Core Imports

```python
import cupy as cp
```

For specific submodules:

```python
import cupy.cuda as cuda
import cupy.random as random
import cupy.linalg as linalg
import cupy.fft as fft
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
gpu_array = cp.array([1, 2, 3, 4, 5])
gpu_zeros = cp.zeros((1000, 1000))
gpu_random = cp.random.random((100, 100))

# NumPy compatibility - same API
result = cp.sum(gpu_array)
matrix_mult = cp.dot(gpu_random, gpu_random.T)

# Transfer between GPU and CPU
cpu_array = cp.asnumpy(gpu_array)  # GPU to CPU
gpu_from_numpy = cp.asarray(np.array([1, 2, 3]))  # CPU to GPU

# Memory management
memory_pool = cp.get_default_memory_pool()
print(f"Used bytes: {memory_pool.used_bytes()}")

# Context management
with cp.cuda.Device(0):  # Use specific GPU
    data = cp.random.random((1000, 1000))
    result = cp.linalg.svd(data)
```

## Architecture

CuPy's architecture mirrors NumPy while leveraging GPU acceleration:

- **cupy.ndarray**: GPU-accelerated multi-dimensional arrays with NumPy-compatible interface
- **cupy.cuda**: Low-level CUDA interface for device management, memory allocation, and kernel execution
- **cupy.random**: GPU-accelerated random number generation compatible with numpy.random
- **cupy.linalg**: Linear algebra operations using cuBLAS and cuSOLVER
- **cupy.fft**: Fast Fourier Transform operations using cuFFT
- **Custom Kernels**: ElementwiseKernel, ReductionKernel, and RawKernel for custom GPU operations

The design provides seamless NumPy compatibility while offering direct access to CUDA features for performance optimization.

## Capabilities

### Array Creation and Manipulation

Comprehensive array creation functions matching NumPy's API, including basic creation (zeros, ones, empty), data conversion (array, asarray), ranges (arange, linspace), and matrix creation (eye, diag). All functions create arrays directly in GPU memory.

```python { .api }
def array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=None, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0): ...
```

[Array Operations](./array-operations.md)
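
As a minimal sketch of the creation functions listed above, the snippet below builds a few small arrays. The NumPy fallback and the helper name `make_arrays` are illustrative assumptions, not part of CuPy; the fallback only exists so the sketch still runs on a machine without a CUDA device.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def make_arrays(xp):
    """Build a few small arrays with the creation functions listed above."""
    return {
        "zeros": xp.zeros((2, 3), dtype=xp.float32),
        "ones": xp.ones(4),
        "range": xp.arange(0, 10, 2),
        "grid": xp.linspace(0.0, 1.0, num=5),
        "identity": xp.eye(3),
    }


arrays = make_arrays(xp)
for name, arr in arrays.items():
    print(name, arr.shape, arr.dtype)
```

Because the helper only touches the shared NumPy-style API, the same function body works with either module.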

### Mathematical Functions

Element-wise mathematical operations including trigonometric, hyperbolic, exponential, logarithmic, arithmetic, and special functions. All functions are GPU-accelerated and maintain NumPy compatibility.

```python { .api }
def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def add(x1, x2, out=None, **kwargs): ...
def multiply(x1, x2, out=None, **kwargs): ...
```

[Mathematical Functions](./math-functions.md)
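
The following sketch shows element-wise functions combined with the `out=` argument, checking the identity sin²(x) + cos²(x) = 1 numerically. The helper name `pythagorean_identity` and the NumPy fallback are assumptions made for illustration.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def pythagorean_identity(xp, n=8):
    """Compute sin(x)**2 + cos(x)**2 into a preallocated output array."""
    x = xp.linspace(0.0, 2.0 * xp.pi, n)
    total = xp.empty_like(x)
    # `out=` writes the result into an existing array instead of allocating one.
    xp.add(xp.sin(x) ** 2, xp.cos(x) ** 2, out=total)
    return total


identity = pythagorean_identity(xp)
print(bool(xp.allclose(identity, 1.0)))
```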

### Linear Algebra

GPU-accelerated linear algebra operations using cuBLAS and cuSOLVER, including matrix multiplication, decompositions, eigenvalue problems, and system solving.

```python { .api }
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None, **kwargs): ...
def solve(a, b): ...
def svd(a, full_matrices=True, compute_uv=True, hermitian=False): ...
def eigh(a, UPLO='L'): ...
```

[Linear Algebra](./linear-algebra.md)
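
A small sketch of system solving: it solves a 2×2 system with `linalg.solve` and checks the residual with `matmul`. The matrix values, the helper name `solve_system`, and the NumPy fallback are illustrative assumptions.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def solve_system(xp):
    """Solve 3x + y = 9, x + 2y = 8 and report the residual norm."""
    a = xp.array([[3.0, 1.0], [1.0, 2.0]])
    b = xp.array([9.0, 8.0])
    x = xp.linalg.solve(a, b)
    residual = xp.linalg.norm(xp.matmul(a, x) - b)
    # float() copies the scalar result back to the host.
    return x, float(residual)


solution, residual = solve_system(xp)
print(solution, residual)
```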

### Random Number Generation

GPU-accelerated random number generation compatible with numpy.random, supporting multiple distributions and the modern Generator API with various bit generators (XORWOW, MRG32k3a, Philox).

```python { .api }
def random(size=None): ...
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
def choice(a, size=None, replace=True, p=None): ...
def default_rng(seed=None): ...
```

[Random Numbers](./random-numbers.md)
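
The Generator API can be sketched as follows: a seeded generator from `default_rng` draws uniform and standard-normal samples. The helper name `sample`, the seed, and the NumPy fallback are assumptions for illustration; only `random` and `standard_normal` are used because both are available on NumPy and CuPy generators.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def sample(xp, seed=0, n=1000):
    """Draw uniform [0, 1) and standard-normal samples from a seeded generator."""
    rng = xp.random.default_rng(seed)
    uniform_draws = rng.random(n)
    normal_draws = rng.standard_normal(n)
    return uniform_draws, normal_draws


u, z = sample(xp)
print(float(u.mean()), float(z.std()))
```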

### Fast Fourier Transform

GPU-accelerated FFT operations using cuFFT, including 1D, 2D, and N-D transforms for both complex-to-complex and real-to-complex transformations.

```python { .api }
def fft(a, n=None, axis=-1, norm=None): ...
def ifft(a, n=None, axis=-1, norm=None): ...
def rfft(a, n=None, axis=-1, norm=None): ...
def fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fftn(a, s=None, axes=None, norm=None): ...
```

[FFT Operations](./fft-operations.md)
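
To illustrate the real-to-complex transform, this sketch transforms a sine wave with `rfft`, locates the dominant frequency bin, and inverts the spectrum with `irfft` (the inverse of `rfft`, not listed above). The signal parameters, the helper name `fft_roundtrip`, and the NumPy fallback are assumptions.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def fft_roundtrip(xp, n=64, cycles=4):
    """Transform a sine wave, find its peak bin, and invert the transform."""
    t = xp.arange(n) / n
    signal = xp.sin(2 * xp.pi * cycles * t)
    spectrum = xp.fft.rfft(signal)
    peak_bin = int(xp.argmax(xp.abs(spectrum)))
    recovered = xp.fft.irfft(spectrum, n=n)
    return peak_bin, bool(xp.allclose(recovered, signal))


peak_bin, roundtrip_ok = fft_roundtrip(xp)
print(peak_bin, roundtrip_ok)
```

With an integer number of cycles over the window, the peak lands in the bin equal to the cycle count.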

### CUDA Interface

Direct access to CUDA functionality including device management, memory allocation, streams, events, and custom kernel compilation. Enables fine-grained control over GPU resources and performance optimization.

```python { .api }
def is_available(): ...

class Device:
    def __init__(self, device=None): ...

class MemoryPool:
    def __init__(self, allocator=None): ...

class Stream:
    def __init__(self, null=False, non_blocking=False, ptds=False): ...
```

[CUDA Interface](./cuda-interface.md)
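
A hedged sketch of device and stream handling: work is queued on a non-blocking stream inside a `Device` context, and the result is read only after the stream is synchronized. The helper name `device_report` is hypothetical, and returning `None` when no CUDA device is present is an assumption made so the snippet runs anywhere.

```python
def device_report():
    """Run a small reduction on device 0; return None when CUDA is unavailable."""
    try:
        import cupy as cp
        if not cp.cuda.is_available():
            return None
    except Exception:  # assumption: treat any import/driver problem as "no GPU"
        return None

    with cp.cuda.Device(0) as dev:
        stream = cp.cuda.Stream(non_blocking=True)
        with stream:  # kernels launched here are queued on `stream`
            data = cp.arange(1000)
            total = data.sum()
        stream.synchronize()  # wait for the queued work before reading results
        free_bytes, total_bytes = dev.mem_info
        return {
            "device_id": dev.id,
            "sum": int(total),
            "free_bytes": free_bytes,
            "total_bytes": total_bytes,
        }


report = device_report()
print(report)
```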

### Custom Kernels

Framework for creating custom GPU kernels including ElementwiseKernel for element-wise operations, ReductionKernel for reduction operations, and RawKernel for arbitrary CUDA code.

```python { .api }
class ElementwiseKernel:
    def __init__(self, in_params, out_params, operation, name='kernel', **kwargs): ...

class ReductionKernel:
    def __init__(self, in_params, out_params, map_expr, reduce_expr, post_map_expr, identity, **kwargs): ...

class RawKernel:
    def __init__(self, code, name, options=(), **kwargs): ...
```

[Custom Kernels](./custom-kernels.md)
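
The sketch below defines one element-wise kernel and one reduction kernel, following the squared-difference and L2-norm patterns from CuPy's user guide. The wrapper `run_kernels` is hypothetical, and returning `None` without a CUDA device is an assumption so the snippet runs anywhere; kernels are compiled on first call, so a working CUDA toolchain is needed for real output.

```python
def run_kernels():
    """Build and run two custom kernels; return None when CUDA is unavailable."""
    try:
        import cupy as cp
        if not cp.cuda.is_available():
            return None
    except Exception:  # assumption: treat any import/driver problem as "no GPU"
        return None

    # Element-wise kernel: z[i] = (x[i] - y[i])^2
    squared_diff = cp.ElementwiseKernel(
        "float32 x, float32 y", "float32 z",
        "z = (x - y) * (x - y)", "squared_diff")

    # Reduction kernel: map x -> x*x, reduce with +, finish with sqrt.
    l2norm = cp.ReductionKernel(
        "T x", "T y", "x * x", "a + b", "y = sqrt(a)", "0", "l2norm")

    x = cp.arange(5, dtype=cp.float32)
    y = cp.ones(5, dtype=cp.float32)
    v = cp.array([3.0, 4.0], dtype=cp.float32)
    return {
        "squared_diff": squared_diff(x, y).tolist(),
        "l2norm": float(l2norm(v)),
    }


kernel_results = run_kernels()
print(kernel_results)
```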

### Statistics and Sorting

Statistical functions including descriptive statistics, correlations, histograms, and sorting operations. All functions handle NaN values appropriately and support axis-specific operations.

```python { .api }
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def sort(a, axis=-1, kind=None, order=None): ...
def argsort(a, axis=-1, kind=None, order=None): ...
def histogram(a, bins=10, range=None, weights=None, density=None): ...
```

[Statistics and Sorting](./statistics-sorting.md)
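
As an illustrative sketch, the snippet below computes summary statistics, a sorted copy, and a histogram for a small sample. The data values, the helper name `summarize`, and the NumPy fallback are assumptions.

```python
try:
    import cupy as xp  # GPU path
    if not xp.cuda.is_available():
        raise ImportError("no CUDA device found")
except Exception:  # assumption: fall back to NumPy so the sketch still runs
    import numpy as xp


def summarize(xp):
    """Return mean, sample std, sorted values, and histogram counts."""
    data = xp.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
    counts, edges = xp.histogram(data, bins=4, range=(0.0, 10.0))
    return {
        "mean": float(xp.mean(data)),
        "std": float(xp.std(data, ddof=1)),  # ddof=1 gives the sample std
        "sorted": xp.sort(data).tolist(),
        "counts": counts.tolist(),
        "edges": edges.tolist(),
    }


summary = summarize(xp)
print(summary)
```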

## Core Utilities

Essential utility functions for GPU/CPU data transfer and array module selection.

```python { .api }
def asnumpy(a, stream=None, order='C', out=None):
    """
    Convert CuPy array to NumPy array on CPU.

    Parameters:
    - a: input CuPy array or array-like
    - stream: CUDA stream for async transfer
    - order: memory layout ('C', 'F', 'A')
    - out: output NumPy array

    Returns:
    numpy.ndarray: array on CPU memory
    """

def get_array_module(*args):
    """
    Return array module (cupy or numpy) based on input types.

    Parameters:
    - args: values to determine module

    Returns:
    module: cupy or numpy module
    """

def is_available():
    """
    Check if CUDA is available.

    Returns:
    bool: True if CUDA devices are available
    """
```
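
`get_array_module` is typically used to write functions that accept either NumPy or CuPy arrays. The sketch below follows the softplus pattern from CuPy's user guide; the wrapper `get_xp` is a hypothetical helper that falls back to NumPy when CuPy cannot be imported.

```python
import numpy as np

try:
    import cupy as cp
except Exception:  # assumption: CuPy may be missing on CPU-only machines
    cp = None


def get_xp(*arrays):
    """Pick cupy or numpy for the given inputs (numpy when CuPy is absent)."""
    return cp.get_array_module(*arrays) if cp is not None else np


def softplus(x):
    """Numerically stable log(1 + exp(x)) for NumPy or CuPy input."""
    xp = get_xp(x)
    return xp.maximum(0, x) + xp.log1p(xp.exp(-abs(x)))


cpu_result = softplus(np.array([0.0, 2.0]))
print(type(cpu_result).__name__, cpu_result)
```

Passing a CuPy array to `softplus` would run the same expression on the GPU and return a CuPy array.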

## Types

```python { .api }
class ndarray:
    """GPU-accelerated multi-dimensional array."""
    def __init__(self): ...
    @property
    def shape(self): ...
    @property
    def dtype(self): ...
    @property
    def size(self): ...
    def get(self, stream=None, order='C', out=None): ...
    def set(self, arr, stream=None): ...

class ufunc:
    """Universal function for element-wise operations."""
    def __call__(self, *args, **kwargs): ...

# Data types (from NumPy)
bool_ = numpy.bool_
int8 = numpy.int8
int16 = numpy.int16
int32 = numpy.int32
int64 = numpy.int64
uint8 = numpy.uint8
uint16 = numpy.uint16
uint32 = numpy.uint32
uint64 = numpy.uint64
float16 = numpy.float16
float32 = numpy.float32
float64 = numpy.float64
complex64 = numpy.complex64
complex128 = numpy.complex128
```
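
A short sketch of the `ndarray.set` and `ndarray.get` methods listed above: host data is copied into a preallocated device array, modified on the GPU, and copied back. The helper name `roundtrip` is hypothetical, and returning `None` without a CUDA device is an assumption so the snippet runs anywhere.

```python
import numpy as np


def roundtrip():
    """Copy host data to the GPU, double it there, and copy it back."""
    try:
        import cupy as cp
        if not cp.cuda.is_available():
            return None
    except Exception:  # assumption: treat any import/driver problem as "no GPU"
        return None

    host = np.arange(6, dtype=np.float32).reshape(2, 3)
    device = cp.empty(host.shape, dtype=cp.float32)
    device.set(host)     # host -> device; shape and dtype must match
    device *= 2          # runs on the GPU
    return device.get()  # device -> host, returns a numpy.ndarray


doubled = roundtrip()
print(doubled)
```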