# CuPy-CUDA110

A NumPy/SciPy-compatible, GPU-accelerated array library for Python. CuPy provides a NumPy-like API for GPU computing, so existing NumPy/SciPy code can run on NVIDIA CUDA GPUs with minimal changes. This package is the CuPy build for CUDA 11.0 and accelerates mathematical operations, linear algebra, and scientific computing workflows.

## Package Information

- **Package Name**: cupy-cuda110
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install cupy-cuda110`
- **Import Name**: cupy

## Core Imports

```python
import cupy as cp
import cupy.cuda as cuda
```

For specific functionality:

```python
from cupy import ndarray, array, arange, zeros, ones, asnumpy
from cupy.cuda import Device, Stream, MemoryPool
from cupy.linalg import solve, inv, svd
from cupy.random import random, normal
from cupy import fft, linalg, polynomial, sparse, testing
import cupyx.scipy as scipy
```

## Basic Usage

```python
import cupy as cp
import numpy as np

# Create arrays on GPU
x_gpu = cp.array([1, 2, 3, 4, 5])
y_gpu = cp.zeros((3, 4))

# NumPy-like operations run on GPU
z_gpu = cp.sin(x_gpu) * 2 + cp.cos(x_gpu)

# Mathematical operations
matrix_gpu = cp.random.random((1000, 1000))
result_gpu = cp.dot(matrix_gpu, matrix_gpu.T)

# Transfer between CPU and GPU
cpu_array = cp.asnumpy(result_gpu)  # GPU to CPU
gpu_array = cp.asarray(cpu_array)   # CPU to GPU

# Device management
with cp.cuda.Device(0):  # Select GPU device 0
    # Arrays created inside this block are allocated on the selected device
    data = cp.zeros((1000, 1000))
    # Operations on selected device
```

## Architecture

CuPy's design enables GPU acceleration through several key components:

- **GPU Arrays**: `cupy.ndarray` mirrors the interface of NumPy arrays but stores its data in GPU memory
- **CUDA Integration**: Built on the CUDA libraries cuBLAS, cuSOLVER, cuSPARSE, cuRAND, and cuFFT
- **Memory Management**: A GPU memory pool that reuses allocations to reduce the cost of allocation and deallocation
- **Kernel Creation**: Multiple approaches for custom GPU kernels (ElementwiseKernel, ReductionKernel, RawKernel, JIT compilation)
- **Device Management**: Multi-GPU support with device contexts and streams for concurrent execution
- **NumPy Compatibility**: Near drop-in replacement that follows the NumPy API while leveraging GPU parallelization

This architecture makes CuPy a foundation for GPU-accelerated scientific computing in Python, covering a large subset of the NumPy and SciPy APIs on GPU hardware.
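
The NumPy compatibility described above can be sketched with a function written once against a module alias. This is a minimal illustration, not CuPy's own code: the alias `xp`, the `softmax` helper, and the fallback to NumPy when no GPU is usable are all assumptions made for the example.

```python
try:
    import cupy as xp
    xp.zeros(1)  # touch the device so a missing GPU fails here, not later
except Exception:
    import numpy as xp  # assumption: fall back to NumPy when CuPy or a GPU is unavailable


def softmax(x, axis=-1):
    """Numerically stable softmax; runs unchanged on either backend."""
    shifted = x - x.max(axis=axis, keepdims=True)  # subtract the max to avoid overflow in exp
    e = xp.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


logits = xp.asarray([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
probs = softmax(logits)        # same shape as logits
row_sums = probs.sum(axis=1)   # each row sums to 1
```

Only the import changes between backends; the function body uses nothing beyond the shared array API.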

## Capabilities

### Array Creation and Manipulation

Comprehensive array creation functions and manipulation operations that mirror NumPy's API while operating on GPU memory.

```python { .api }
def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def asarray(a, dtype=None, order=None): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=float, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): ...
def asnumpy(a, stream=None, order='C', out=None): ...
```

[Array Operations](./array-operations.md)
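
A short sketch of the creation functions listed above, followed by a device-to-host round trip. The alias `xp`, the `to_host` helper, and the NumPy fallback are assumptions added so the snippet also runs on a machine without a GPU; with CuPy available, `to_host` is simply `cupy.asnumpy`.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)              # fails early if no CUDA device is usable
    to_host = xp.asnumpy     # device array -> numpy.ndarray
except Exception:
    xp = np                  # assumption: CPU fallback for illustration only
    to_host = np.asarray

grid = xp.linspace(0.0, 1.0, num=5)          # five evenly spaced points, endpoints included
steps = xp.arange(0, 10, 2)                  # start, stop (exclusive), step
block = xp.zeros((2, 3), dtype=xp.float32)   # explicit dtype instead of the float64 default

host_grid = to_host(grid)                    # always a numpy.ndarray
device_again = xp.asarray(host_grid)         # back to the active backend
```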

### Mathematical Functions

Complete set of mathematical functions including trigonometric, hyperbolic, exponential, logarithmic, and arithmetic operations, all GPU-accelerated.

```python { .api }
def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def add(x1, x2, out=None, **kwargs): ...
def multiply(x1, x2, out=None, **kwargs): ...
def sqrt(x, out=None, **kwargs): ...
def power(x1, x2, out=None, **kwargs): ...
```

[Mathematical Functions](./mathematical-functions.md)
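
The `out` parameter in the signatures above lets a result be written into an existing array instead of allocating a new one. The sketch below checks the identity sin²(x) + cos²(x) = 1 while reusing one buffer; the alias `xp` and the NumPy fallback are assumptions for portability, not part of the package.

```python
import math

try:
    import cupy as xp
    xp.zeros(1)              # fails early if no CUDA device is usable
except Exception:
    import numpy as xp       # assumption: CPU fallback for illustration only

x = xp.linspace(0.0, math.pi, 9)
buf = xp.empty_like(x)           # preallocated output buffer

xp.sin(x, out=buf)               # buf now holds sin(x)
xp.multiply(buf, buf, out=buf)   # buf now holds sin(x)**2, computed in place

check = buf + xp.power(xp.cos(x), 2)
max_error = float(xp.abs(check - 1.0).max())   # deviation from the identity
```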

### Linear Algebra

GPU-accelerated linear algebra operations leveraging optimized CUDA libraries for matrix operations, decompositions, and solving linear systems.

```python { .api }
# cupy
def dot(a, b, out=None): ...
def matmul(x1, x2, out=None): ...

# cupy.linalg
def solve(a, b): ...
def inv(a): ...
def svd(a, full_matrices=True, compute_uv=True): ...
def eigh(a, UPLO='L'): ...
```

[Linear Algebra](./linear-algebra.md)
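
As a small worked example of the routines above, the sketch below solves a 2x2 system and reconstructs the matrix from its singular value decomposition. The matrix values, the alias `xp`, and the NumPy fallback are assumptions chosen for illustration.

```python
try:
    import cupy as xp
    xp.zeros(1)              # fails early if no CUDA device is usable
except Exception:
    import numpy as xp       # assumption: CPU fallback for illustration only

a = xp.asarray([[4.0, 1.0],
                [1.0, 3.0]])
b = xp.asarray([1.0, 2.0])

# Solve a @ x = b; the exact solution is x = [1/11, 7/11]
x = xp.linalg.solve(a, b)
residual = float(xp.abs(a @ x - b).max())

# Reduced SVD: a == (u * s) @ vt up to rounding
u, s, vt = xp.linalg.svd(a, full_matrices=False)
recon_error = float(xp.abs((u * s) @ vt - a).max())
```

Checking the residual rather than comparing against a hard-coded solution keeps the example valid for any well-conditioned matrix.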

### Statistics and Reductions

Statistical functions and reduction operations for data analysis and aggregation across array dimensions.

```python { .api }
def sum(a, axis=None, dtype=None, out=None, keepdims=False): ...
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
def std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False): ...
def max(a, axis=None, out=None, keepdims=False): ...
def min(a, axis=None, out=None, keepdims=False): ...
```

[Statistics](./statistics.md)
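
The `axis`, `keepdims`, and `ddof` parameters behave as in NumPy. The sketch below shows each on a 2x3 array; the data values, the alias `xp`, and the NumPy fallback are assumptions for illustration.

```python
try:
    import cupy as xp
    xp.zeros(1)              # fails early if no CUDA device is usable
except Exception:
    import numpy as xp       # assumption: CPU fallback for illustration only

data = xp.asarray([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

col_means = xp.mean(data, axis=0)               # reduce over rows, shape (3,)
row_sums = xp.sum(data, axis=1, keepdims=True)  # shape (2, 1), broadcastable against data
centered = data - xp.mean(data, axis=1, keepdims=True)  # subtract each row's mean

pop_var = xp.var(data)             # ddof=0: divide by N
sample_var = xp.var(data, ddof=1)  # ddof=1: divide by N - 1
```

Keeping the reduced axis with `keepdims=True` is what allows `centered` to be computed without an explicit reshape.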

### CUDA Interface

Low-level CUDA functionality for device management, memory allocation, stream control, and integration with CUDA libraries.

```python { .api }
class Device:
    def __init__(self, device=None): ...
    def __enter__(self): ...
    def __exit__(self, *args): ...

class Stream:
    def __init__(self, null=False, non_blocking=False, ptds=False): ...
    def synchronize(self): ...

class MemoryPool:
    def malloc(self, size): ...
    def free_all_blocks(self): ...
```

[CUDA Interface](./cuda-interface.md)
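
The classes above are typically combined as in the sketch below: select a device, queue work on a stream, synchronize, then return cached blocks from the default memory pool. The function name `gpu_round_trip` and its return convention (`None` when no CUDA device is usable) are hypothetical choices for this example, which needs a GPU to do any real work.

```python
def gpu_round_trip(n=1 << 20):
    """Sum n ones on device 0 using a dedicated stream.

    Returns (checksum, pool_bytes_in_use), or None when CuPy or a
    CUDA device is not available (assumption made for portability).
    """
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() == 0:
            return None
    except Exception:
        return None

    pool = cp.get_default_memory_pool()
    with cp.cuda.Device(0):                       # make device 0 current
        stream = cp.cuda.Stream(non_blocking=True)
        with stream:                              # work below is queued on `stream`
            data = cp.ones(n, dtype=cp.float32)
            total = data.sum()
        stream.synchronize()                      # wait for the queued work to finish
        checksum = float(total)                   # copies the scalar result to the host
        del data, total                           # drop references so blocks return to the pool
        pool.free_all_blocks()                    # release cached blocks back to the driver
    return checksum, pool.used_bytes()


result = gpu_round_trip(1024)
```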

### Random Number Generation

GPU-accelerated random number generation supporting various probability distributions and random sampling operations.

```python { .api }
def random(size=None, dtype=float): ...
def normal(loc=0.0, scale=1.0, size=None): ...
def uniform(low=0.0, high=1.0, size=None): ...
def randint(low, high=None, size=None, dtype=int): ...
def choice(a, size=None, replace=True, p=None): ...
```

[Random Generation](./random-generation.md)
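
A brief sketch of seeded sampling with the functions above. The seed, sample sizes, and distribution parameters are arbitrary values picked for the example, and the alias `xp` with a NumPy fallback is an assumption. The two backends use different generators, so the same seed does not produce the same numbers on CPU and GPU.

```python
try:
    import cupy as xp
    xp.zeros(1)              # fails early if no CUDA device is usable
except Exception:
    import numpy as xp       # assumption: CPU fallback for illustration only

xp.random.seed(42)           # reproducible within one backend

uniform_samples = xp.random.uniform(-1.0, 1.0, size=100_000)
normal_samples = xp.random.normal(loc=5.0, scale=2.0, size=100_000)
die_rolls = xp.random.randint(1, 7, size=1_000)   # integers in [1, 7), i.e. 1..6

sample_mean = float(normal_samples.mean())        # close to loc
sample_std = float(normal_samples.std())          # close to scale
```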

### Custom Kernels

Advanced kernel creation mechanisms for implementing custom GPU operations using CUDA C/C++ code or element-wise operations.

```python { .api }
class ElementwiseKernel:
    def __init__(self, in_params, out_params, operation, name='kernel', **kwargs): ...
    def __call__(self, *args, **kwargs): ...

class RawKernel:
    def __init__(self, code, name, options=()): ...
    def __call__(self, grid, block, args, **kwargs): ...

class ReductionKernel:
    def __init__(self, in_params, out_params, map_expr, reduce_expr, post_map_expr, identity, name='reduce_kernel', **kwargs): ...
    def __call__(self, *args, **kwargs): ...
```

[Custom Kernels](./custom-kernels.md)
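
The sketch below defines one element-wise kernel and one reduction kernel, modeled on the `squared_diff` and `l2norm` examples in CuPy's user guide. The wrapper `run_kernels` and its convention of returning `None` when no CUDA device is usable are hypothetical additions; the kernels themselves only compile and run on a GPU.

```python
def run_kernels():
    """Run two custom kernels; returns None when no CUDA device is usable."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() == 0:
            return None
    except Exception:
        return None

    # Element-wise kernel: the body runs once per element, with indexing handled by CuPy
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',      # input parameters
        'float32 z',                 # output parameter
        'z = (x - y) * (x - y)',     # per-element operation
        'squared_diff',              # kernel name
    )

    # Reduction kernel: map each element, combine pairs, then post-process the result
    l2norm = cp.ReductionKernel(
        'T x',            # input parameter (generic type T)
        'T y',            # output parameter
        'x * x',          # map expression
        'a + b',          # reduce expression
        'y = sqrt(a)',    # post-reduction map
        '0',              # identity value
        'l2norm',         # kernel name
    )

    x = cp.arange(5, dtype=cp.float32)
    y = cp.full(5, 2, dtype=cp.float32)
    diffs = cp.asnumpy(squared_diff(x, y)).tolist()
    norm = float(l2norm(cp.asarray([3.0, 4.0], dtype=cp.float32)))
    return diffs, norm


kernel_result = run_kernels()
```

Both kernels are compiled on first call and cached, so the first invocation is noticeably slower than later ones.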

### SciPy Extensions

SciPy-compatible functions through `cupyx.scipy`, providing GPU acceleration for scientific computing workflows.

```python { .api }
# cupyx.scipy.linalg
def lu_factor(a, **kwargs): ...
def lu_solve(lu_and_piv, b, **kwargs): ...
def solve_triangular(a, b, **kwargs): ...

# cupyx.scipy.sparse
def csr_matrix(arg1, shape=None, dtype=None, copy=False): ...
def csc_matrix(arg1, shape=None, dtype=None, copy=False): ...

# cupyx.scipy.fft
def fft(x, n=None, axis=-1, norm=None): ...
def ifft(x, n=None, axis=-1, norm=None): ...
```

[SciPy Extensions](./scipy-extensions.md)

## Types

```python { .api }
class ndarray:
    """GPU array class providing NumPy-compatible interface."""
    def __init__(self, shape, dtype=float, memptr=None, strides=None, order='C'): ...

    # Properties
    @property
    def shape(self) -> tuple: ...
    @property
    def dtype(self) -> numpy.dtype: ...
    @property
    def device(self) -> Device: ...

    # Methods
    def get(self, stream=None, order='C', out=None): ...  # Transfer to CPU
    def astype(self, dtype, order='K', casting='unsafe', subok=True, copy=True): ...
    def reshape(self, *shape, order='C'): ...
    def transpose(self, *axes): ...
    def sum(self, axis=None, dtype=None, out=None, keepdims=False): ...

class ufunc:
    """Universal function for element-wise operations."""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, array, axis=0, dtype=None, out=None, keepdims=False): ...

# Memory management types
class MemoryPointer:
    def __init__(self, mem, offset): ...
    @property
    def device(self) -> Device: ...

# Kernel types
ElementwiseKernel = typing.Callable[..., ndarray]
RawKernel = typing.Callable[[tuple, tuple, tuple], None]
ReductionKernel = typing.Callable[..., ndarray]
```