tessl/pypi-cupy-cuda112

NumPy & SciPy-compatible GPU-accelerated computing library for CUDA 11.2 environments

CuPy

CuPy is a GPU array library, compatible with NumPy and SciPy, that runs array operations on NVIDIA CUDA GPUs. Because it implements a large subset of the NumPy API, most existing NumPy/SciPy code can run on the GPU after changing the import, and large numerical workloads typically run much faster than on the CPU.

Package Information

  • Package Name: cupy-cuda112
  • Language: Python
  • Installation: pip install cupy-cuda112
  • GPU Requirements: NVIDIA GPU with CUDA 11.2 or a compatible runtime
  • Homepage: https://cupy.dev/
  • Documentation: https://docs.cupy.dev/

Core Imports

import cupy as cp

For CUDA-specific functionality:

import cupy.cuda

For SciPy-compatible extensions:

import cupyx.scipy

Basic Usage

import cupy as cp
import numpy as np

# Create arrays on GPU
gpu_array = cp.array([1, 2, 3, 4, 5])
gpu_zeros = cp.zeros((3, 4))
gpu_random = cp.random.random((1000, 1000))

# Array operations (executed on GPU)
result = cp.sqrt(gpu_array)
matrix_mult = cp.dot(gpu_random, gpu_random.T)

# Convert back to NumPy for CPU operations
cpu_result = cp.asnumpy(result)

# Memory pool management
mempool = cp.get_default_memory_pool()
print(f"Used bytes: {mempool.used_bytes()}")
print(f"Total bytes: {mempool.total_bytes()}")

# Check GPU availability
if cp.cuda.is_available():
    print(f"CUDA devices available: {cp.cuda.runtime.getDeviceCount()}")

Architecture

CuPy's architecture mirrors NumPy while adding GPU-specific capabilities:

  • Core Arrays: cupy.ndarray provides GPU-accelerated N-dimensional arrays with NumPy-compatible interface
  • Universal Functions: GPU-accelerated element-wise operations through cupy.ufunc
  • Memory Management: Automatic memory pooling with configurable allocators for optimal GPU memory usage
  • CUDA Integration: Direct access to CUDA streams, events, memory management, and custom kernel compilation
  • Custom Kernels: Support for user-defined CUDA kernels through RawKernel, ElementwiseKernel, and ReductionKernel
  • Multi-GPU: Support for multi-GPU computation and memory management
  • CuPy Extensions (cupyx): Additional functionality including SciPy compatibility, profiling, JIT compilation, and advanced linear algebra

This design enables seamless migration from NumPy-based code to GPU-accelerated computation while providing advanced CUDA programming capabilities for performance-critical applications.
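One common way to use this NumPy compatibility is to write functions that accept either NumPy or CuPy arrays. The sketch below picks the matching module at call time with cupy.get_array_module. The softmax helper and the NumPy-only fallback (used when CuPy or a GPU is missing) are illustrative assumptions, not part of the package documentation.

```python
import numpy as np

try:
    import cupy as cp
    gpu_ok = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing or no usable CUDA device
    cp, gpu_ok = None, False


def softmax(x):
    """Row-wise softmax that runs on whichever device holds x."""
    # get_array_module returns cupy for GPU arrays and numpy otherwise
    xp = cp.get_array_module(x) if cp is not None else np
    shifted = x - xp.max(x, axis=-1, keepdims=True)  # numerical stability
    e = xp.exp(shifted)
    return e / xp.sum(e, axis=-1, keepdims=True)


cpu_probs = softmax(np.array([[1.0, 2.0, 3.0]]))
print(type(cpu_probs).__module__, float(cpu_probs.sum()))

if gpu_ok:
    gpu_probs = softmax(cp.array([[1.0, 2.0, 3.0]]))  # same code, GPU array
    assert np.allclose(cp.asnumpy(gpu_probs), cpu_probs)
```

The function body never names a device, so the caller decides where the computation runs by choosing the array type.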

Capabilities

Array Creation and Manipulation

Core functionality for creating, reshaping, and manipulating N-dimensional arrays on GPU, providing NumPy-compatible array creation routines with GPU memory allocation.

def array(obj, dtype=None, copy=True, order='K', subok=False, ndmin=0): ...
def zeros(shape, dtype=float, order='C'): ...
def ones(shape, dtype=float, order='C'): ...
def empty(shape, dtype=float, order='C'): ...
def arange(start, stop=None, step=1, dtype=None): ...
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): ...
def reshape(a, newshape, order='C'): ...
def transpose(a, axes=None): ...
def concatenate(arrays, axis=0, out=None): ...
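As a rough illustration of how these creation and manipulation routines combine, the sketch below builds a few small arrays and stacks them. The variable names are made up for the example, and the fallback to NumPy is an assumption added so the snippet also runs on machines without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # forces a device allocation; raises without a usable GPU
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch still runs

grid = xp.arange(6, dtype=xp.float32).reshape(2, 3)    # [[0 1 2], [3 4 5]]
ramp = xp.linspace(0.0, 1.0, num=3, dtype=xp.float32)  # [0, 0.5, 1]
pad = xp.zeros((1, 3), dtype=xp.float32)

# Stack the pieces row-wise, then swap the two axes.
stacked = xp.concatenate([grid, ramp.reshape(1, 3), pad], axis=0)  # shape (4, 3)
flipped = xp.transpose(stacked)                                    # shape (3, 4)
print(stacked.shape, flipped.shape)
```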

Array Operations (array-operations.md)

Mathematical Functions

Comprehensive collection of mathematical operations including trigonometric, hyperbolic, exponential, logarithmic, and arithmetic functions optimized for GPU execution.

def sin(x, out=None, **kwargs): ...
def cos(x, out=None, **kwargs): ...
def exp(x, out=None, **kwargs): ...
def log(x, out=None, **kwargs): ...
def sqrt(x, out=None, **kwargs): ...
def add(x1, x2, out=None, **kwargs): ...
def multiply(x1, x2, out=None, **kwargs): ...
def sum(a, axis=None, dtype=None, out=None, keepdims=False): ...
def mean(a, axis=None, dtype=None, out=None, keepdims=False): ...
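A short sketch of element-wise functions combined with reductions, assuming float64 inputs. The NumPy fallback is an assumption for portability; on a GPU, reductions such as sum and mean return zero-dimensional device arrays, and float() copies the value to the host.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU is present
except Exception:
    xp = np  # assumption: NumPy fallback so the sketch still runs

# sin^2 + cos^2 should equal 1 everywhere, up to rounding error.
angles = xp.linspace(0.0, 2.0 * np.pi, 1000)
identity = xp.sin(angles) ** 2 + xp.cos(angles) ** 2
max_err = float(xp.max(xp.abs(identity - 1.0)))

values = xp.arange(1, 5, dtype=xp.float64)           # [1, 2, 3, 4]
total = float(xp.sum(xp.sqrt(values * values)))      # 1 + 2 + 3 + 4
log_mean = float(xp.mean(xp.log(xp.exp(values))))    # mean of [1, 2, 3, 4]
print(max_err, total, log_mean)
```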

Mathematical Operations (math-operations.md)

Linear Algebra

GPU-accelerated linear algebra operations including matrix multiplication, decompositions, eigenvalue computation, and equation solving using cuBLAS and cuSOLVER.

def dot(a, b, out=None): ...
def matmul(x1, x2, out=None): ...
def linalg.svd(a, full_matrices=True, compute_uv=True): ...
def linalg.eigh(a, UPLO='L'): ...
def linalg.solve(a, b): ...
def linalg.inv(a): ...
def linalg.norm(x, ord=None, axis=None, keepdims=False): ...
def einsum(subscripts, *operands, **kwargs): ...
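The following is a minimal sketch of solving a small linear system and checking the result, using only functions listed above. The 2x2 matrix is an arbitrary example, and the NumPy fallback is an assumption so the snippet runs without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU is present
except Exception:
    xp = np  # assumption: NumPy fallback so the sketch still runs

a = xp.array([[4.0, 1.0], [1.0, 3.0]])  # symmetric positive definite
b = xp.array([1.0, 2.0])

# Solve a @ x = b and measure how well x satisfies the system.
x = xp.linalg.solve(a, b)
residual = float(xp.linalg.norm(xp.einsum('ij,j->i', a, x) - b))

# For a symmetric matrix the eigenvalues sum to the trace (4 + 3 = 7).
eigvals, eigvecs = xp.linalg.eigh(a)
trace_gap = float(xp.abs(xp.sum(eigvals) - 7.0))
print(residual, trace_gap)
```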

Linear Algebra (linear-algebra.md)

Random Number Generation

GPU-accelerated random number generation supporting multiple bit generators and probability distributions for statistical computing and simulation.

def random.random(size=None, dtype=float): ...
def random.rand(*args): ...
def random.randn(*args): ...
def random.randint(low, high=None, size=None, dtype=int): ...
def random.normal(loc=0.0, scale=1.0, size=None): ...
def random.uniform(low=0.0, high=1.0, size=None): ...
class random.Generator: ...
def random.default_rng(seed=None): ...
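The sketch below contrasts the legacy global-state functions with the Generator interface returned by default_rng. It assumes a CuPy release that ships cupy.random.default_rng and Generator.standard_normal; the seeds and sample sizes are arbitrary, and the NumPy fallback is an assumption for machines without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU is present
except Exception:
    xp = np  # assumption: NumPy fallback so the sketch still runs

# Legacy API: module-level functions backed by a global random state.
xp.random.seed(0)
u = xp.random.uniform(low=-1.0, high=1.0, size=(1000,))
dice = xp.random.randint(1, 7, size=100)  # upper bound is exclusive

# Generator API: an explicit generator object with its own state.
rng = xp.random.default_rng(seed=42)
z = rng.standard_normal(100_000)
z_mean = float(z.mean())
print(u.shape, int(dice.min()), int(dice.max()), z_mean)
```

Sampled values differ between NumPy and CuPy even with the same seed, so tests should check statistical properties rather than exact numbers.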

Random Number Generation (random-generation.md)

CUDA Integration

Direct interface to CUDA runtime, memory management, stream processing, and custom kernel development for advanced GPU programming.

class cuda.Device: ...
def cuda.get_device_id(): ...
class cuda.MemoryPool: ...
class cuda.Stream: ...
class cuda.Event: ...
def cuda.compile_with_cache(source, options=(), **kwargs): ...
class ElementwiseKernel: ...
class RawKernel: ...
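As a hedged illustration of the kernel and stream classes, the sketch below defines an ElementwiseKernel and launches it on a non-default stream. The kernel name squared_diff and the input values are illustrative. The GPU work is skipped when no CUDA device is found, in which case result stays None.

```python
import numpy as np

try:
    import cupy as cp
    has_gpu = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing or no usable CUDA device
    cp, has_gpu = None, False


def squared_diff(x, y):
    """Compute (x - y)^2 element-wise with a user-defined CUDA kernel."""
    kernel = cp.ElementwiseKernel(
        'float32 x, float32 y',   # input arguments
        'float32 z',              # output argument
        'z = (x - y) * (x - y)',  # per-element body, written in CUDA C
        'squared_diff',           # kernel name (illustrative)
    )
    return kernel(x, y)


result = None
if has_gpu:
    stream = cp.cuda.Stream(non_blocking=True)
    with stream:  # work issued in this block is queued on `stream`
        x = cp.arange(5, dtype=cp.float32)
        y = cp.full(5, 2, dtype=cp.float32)
        z = squared_diff(x, y)
    stream.synchronize()       # wait for the kernel before reading back
    result = cp.asnumpy(z)     # expected values: [4, 1, 0, 1, 4]
print(result)
```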

CUDA Interface (cuda-interface.md)

Fast Fourier Transform

GPU-accelerated FFT operations for signal processing and frequency domain analysis using cuFFT library.

def fft.fft(a, n=None, axis=-1, norm=None): ...
def fft.ifft(a, n=None, axis=-1, norm=None): ...
def fft.fft2(a, s=None, axes=(-2, -1), norm=None): ...
def fft.fftn(a, s=None, axes=None, norm=None): ...
def fft.rfft(a, n=None, axis=-1, norm=None): ...
def fft.fftfreq(n, d=1.0): ...
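A small sketch of locating the dominant frequency of a signal with fft and fftfreq. The 32 Hz test tone and 256 Hz sampling rate are arbitrary choices for the example, and the NumPy fallback is an assumption so it runs without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU is present
except Exception:
    xp = np  # assumption: NumPy fallback so the sketch still runs

n, rate = 256, 256.0                      # samples, sampling rate in Hz
t = xp.arange(n) / rate
signal = xp.sin(2.0 * np.pi * 32.0 * t)   # pure 32 Hz tone

spectrum = xp.fft.fft(signal)
freqs = xp.fft.fftfreq(n, d=1.0 / rate)

# Search only the positive-frequency half of the spectrum.
half = n // 2
peak_bin = int(xp.argmax(xp.abs(spectrum[:half])))
peak_hz = float(freqs[peak_bin])

# The inverse transform should reproduce the input up to rounding error.
roundtrip_err = float(xp.max(xp.abs(xp.fft.ifft(spectrum).real - signal)))
print(peak_hz, roundtrip_err)
```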

FFT Operations (fft-operations.md)

SciPy Compatibility

Extended functionality providing SciPy-compatible operations for sparse matrices, signal processing, image processing, and specialized mathematical functions.

import cupyx.scipy.sparse
import cupyx.scipy.ndimage
import cupyx.scipy.signal
import cupyx.scipy.special
import cupyx.scipy.linalg
def cupyx.scipy.sparse.csr_matrix(arg1, shape=None, dtype=None, copy=False): ...
def cupyx.scipy.ndimage.gaussian_filter(input, sigma, **kwargs): ...
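The two functions listed above could be exercised as in the sketch below, which builds a CSR matrix from a dense array and blurs a single-pixel image. The input data and the summary dictionary are illustrative. The GPU work is skipped when no CUDA device is found, in which case summary stays None.

```python
try:
    import cupy as cp
    from cupyx.scipy import ndimage, sparse
    has_gpu = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing or no usable CUDA device
    has_gpu = False

summary = None
if has_gpu:
    # Sparse: convert a dense GPU array to CSR and multiply by a vector.
    dense = cp.array([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]])
    csr = sparse.csr_matrix(dense)
    row_sums = csr.dot(cp.ones(3))  # row totals: [3., 3.]

    # ndimage: blur a single bright pixel placed away from the borders.
    image = cp.zeros((16, 16))
    image[8, 8] = 1.0
    blurred = ndimage.gaussian_filter(image, sigma=1.0)

    summary = {
        "nnz": int(csr.nnz),
        "row_sums": cp.asnumpy(row_sums).tolist(),
        "blur_total": float(blurred.sum()),  # the filter preserves total intensity
    }
print(summary)
```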

SciPy Extensions (scipy-extensions.md)

Input/Output Operations

File I/O operations for saving and loading arrays in various formats including NumPy's .npy and .npz formats.

def save(file, arr, allow_pickle=True, fix_imports=True): ...
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'): ...
def savez(file, *args, **kwds): ...
def savez_compressed(file, *args, **kwds): ...
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None): ...
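A minimal round-trip sketch for the .npy and .npz formats, writing into a temporary directory. The file names and array contents are illustrative, and the NumPy fallback is an assumption so the snippet runs without a GPU.

```python
import os
import tempfile

import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # raises if no usable GPU is present
except Exception:
    xp = np  # assumption: NumPy fallback so the sketch still runs

original = xp.arange(12, dtype=xp.float32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    # Single array: .npy
    npy_path = os.path.join(tmp, "matrix.npy")
    xp.save(npy_path, original)
    restored = xp.load(npy_path)

    # Several named arrays: .npz
    npz_path = os.path.join(tmp, "bundle.npz")
    xp.savez(npz_path, matrix=original, scale=xp.ones(3))
    with xp.load(npz_path) as bundle:  # close the archive before cleanup
        scale_shape = bundle["scale"].shape

roundtrip_ok = bool(xp.array_equal(original, restored))
print(roundtrip_ok, scale_shape)
```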

Input/Output (input-output.md)

Types

class ndarray:
    """N-dimensional array object on GPU memory"""
    def __init__(self, shape, dtype=float, memptr=None, strides=None, order='C'): ...
    def get(self, stream=None, order='C', out=None): ...  # Transfer to CPU
    def set(self, arr, stream=None): ...  # Transfer from CPU
    @property
    def device(self): ...
    @property
    def data(self): ...
    @property
    def shape(self): ...
    @property
    def dtype(self): ...

class ufunc:
    """Universal function for element-wise operations"""
    def __call__(self, *args, **kwargs): ...
    def reduce(self, a, axis=0, dtype=None, out=None, keepdims=False): ...
    def accumulate(self, a, axis=0, dtype=None, out=None): ...

# Memory management types
class cuda.MemoryPointer: ...
class cuda.Memory: ...
class cuda.MemoryPool: ...
class cuda.PinnedMemory: ...

# Stream and event types  
class cuda.Stream: ...
class cuda.Event: ...
class cuda.Device: ...

# Custom kernel types
class ElementwiseKernel: ...
class ReductionKernel: ...
class RawKernel: ...
class RawModule: ...
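To show how the ndarray transfer methods fit together, the sketch below copies a host array to the GPU with set(), modifies it there, and copies it back with get(). The variable names and the info dictionary are illustrative. The GPU work is skipped when no CUDA device is found, in which case info stays None.

```python
import numpy as np

try:
    import cupy as cp
    has_gpu = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing or no usable CUDA device
    cp, has_gpu = None, False

host = np.arange(6, dtype=np.float64).reshape(2, 3)

info = None
if has_gpu:
    dev = cp.empty((2, 3), dtype=cp.float64)  # uninitialized GPU buffer
    dev.set(host)      # host -> device copy (shape and dtype must match)
    dev *= 2           # in-place ufunc, executed on the GPU
    back = dev.get()   # device -> host copy, returns a numpy.ndarray

    info = {
        "device_id": dev.device.id,               # GPU holding the data
        "is_ufunc": isinstance(cp.multiply, cp.ufunc),
        "back": back,
    }
print(info)
```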