
tessl/pypi-pyopencl

Python wrapper for OpenCL enabling GPU and parallel computing with comprehensive array operations and mathematical functions

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/pyopencl@2025.2.x

To install, run

npx @tessl/cli install tessl/pypi-pyopencl@2025.2.0


# PyOpenCL


PyOpenCL is a Python wrapper for OpenCL that provides Pythonic access to parallel computing on GPUs and other massively parallel devices. It offers low-level OpenCL API access with automatic error checking, plus high-level convenience layers for array operations, mathematical functions, and algorithm primitives, for use in scientific computing, machine learning, and other high-performance applications.

## Package Information

- **Package Name**: pyopencl
- **Language**: Python
- **Installation**: `pip install pyopencl`

## Core Imports

```python
import pyopencl as cl
```

Common patterns for array operations:

```python
import pyopencl.array as cl_array
import pyopencl.clmath as clmath
```

For algorithm primitives:

```python
from pyopencl.scan import InclusiveScanKernel
from pyopencl.reduction import ReductionKernel
from pyopencl.elementwise import ElementwiseKernel
```

31

## Basic Usage

```python
import pyopencl as cl
import pyopencl.array as cl_array
import numpy as np

# Create OpenCL context and queue
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# Create arrays on device
a_host = np.random.randn(50000).astype(np.float32)
b_host = np.random.randn(50000).astype(np.float32)

a_gpu = cl_array.to_device(queue, a_host)
b_gpu = cl_array.to_device(queue, b_host)

# Perform operations on GPU
result_gpu = a_gpu + b_gpu
result_host = result_gpu.get()

print(f"Result shape: {result_host.shape}")
print(f"First 5 elements: {result_host[:5]}")
```

## Architecture


PyOpenCL follows OpenCL's hierarchical structure while providing pythonic interfaces:

- **Platform/Device Management**: Discover and select compute devices (GPUs, CPUs)
- **Context/CommandQueue**: Execution environment and command scheduling
- **Memory Objects**: Buffers, images, and shared virtual memory (SVM) for data transfer
- **Program/Kernel**: Compile and execute OpenCL kernels on devices
- **Array Operations**: High-level NumPy-like interface for GPU arrays
- **Algorithm Primitives**: Pre-built parallel algorithms (scan, reduction, sorting)

This layered design supports everything from simple array operations to custom kernel development.

## Capabilities

### Core OpenCL Objects and Management

Platform discovery, device selection, context creation, command queue management, program compilation, and kernel execution. These form the foundation of OpenCL computing and provide complete control over parallel execution.

```python { .api }
def get_platforms(): ...
def create_some_context(interactive=None, answers=None): ...
def choose_devices(interactive=None, answers=None): ...

class Platform: ...
class Device: ...
class Context: ...
class CommandQueue: ...
class Program: ...
class Kernel: ...
```

[Core OpenCL](./core-opencl.md)

### Memory Management and Data Transfer

Buffer creation, image handling, memory mapping, and data transfer between host and device. Includes advanced shared virtual memory (SVM) support for zero-copy operations in OpenCL 2.0+.

```python { .api }
class Buffer: ...
class Image: ...
def create_image(context, flags, format, shape=None, pitches=None, hostbuf=None): ...
def enqueue_copy(queue, dest, src, **kwargs): ...
def enqueue_fill(queue, dest, pattern, size, *, offset=0, wait_for=None): ...

# SVM (OpenCL 2.0+)
class SVM: ...
class SVMAllocation: ...
def svm_empty(ctx, flags, shape, dtype, order="C", alignment=None): ...
def csvm_empty(ctx, shape, dtype, order="C", alignment=None): ...
```

[Memory Management](./memory-management.md)

### Array Operations

High-level NumPy-like GPU array interface providing familiar array operations, mathematical functions, and data manipulation. Enables seamless transition from CPU to GPU computing.

```python { .api }
class Array: ...
def to_device(queue, ary, **kwargs): ...
def zeros(queue, shape, dtype=float, order="C", allocator=None): ...
def arange(queue, *args, **kwargs): ...

def sum(a, dtype=None, queue=None, slice=None): ...
def dot(a_gpu, b_gpu, dtype=None, queue=None): ...
def concatenate(arrays, axis=0, queue=None, allocator=None): ...
def transpose(a_gpu, axes=None): ...
```

[Array Operations](./array-operations.md)

### Mathematical Functions

Comprehensive set of mathematical functions optimized for GPU execution, including trigonometric, exponential, logarithmic, and special functions that operate element-wise on arrays.

```python { .api }
# Trigonometric functions
def sin(x, queue=None): ...
def cos(x, queue=None): ...
def tan(x, queue=None): ...
def asin(x, queue=None): ...

# Exponential/logarithmic functions
def exp(x, queue=None): ...
def log(x, queue=None): ...
def sqrt(x, queue=None): ...

# Special functions
def erf(x, queue=None): ...
def tgamma(x, queue=None): ...
```

[Mathematical Functions](./mathematical-functions.md)

### Algorithm Primitives

Pre-built parallel algorithms including scan (prefix sum), reduction, element-wise operations, and sorting. These provide building blocks for complex parallel computations.

```python { .api }
class ReductionKernel: ...
class InclusiveScanKernel: ...
class ExclusiveScanKernel: ...
class ElementwiseKernel: ...

class RadixSort: ...
class BitonicSort: ...
```

[Algorithm Primitives](./algorithm-primitives.md)

### Random Number Generation

Parallel random number generation using the counter-based Philox and Threefry generators, which suit Monte Carlo simulations and other stochastic computations. They are derived from cryptographic primitives but are not intended for security-sensitive use.

```python { .api }
class PhiloxGenerator: ...
class ThreefryGenerator: ...

def rand(queue, shape, dtype=float, luxury=None, generator=None): ...
def fill_rand(result, queue=None, luxury=None, generator=None): ...
```

[Random Number Generation](./random-number-generation.md)

### Tools and Utilities

Memory allocators, kernel argument handling, type management, device characterization, and debugging utilities that support efficient GPU computing and development workflows.

```python { .api }
class MemoryPool: ...
class ImmediateAllocator: ...
class DeferredAllocator: ...

def dtype_to_ctype(dtype): ...
def get_or_register_dtype(name, dtype): ...

# Device characterization
def has_double_support(device): ...
def get_simd_group_size(device, type_size): ...
```

[Tools and Utilities](./tools-and-utilities.md)

### OpenGL Interoperability

Integration with OpenGL for graphics/compute workflows, allowing sharing of buffers, textures, and renderbuffers between OpenGL and OpenCL contexts.

```python { .api }
class GLBuffer: ...
class GLRenderBuffer: ...
class GLTexture: ...

def enqueue_acquire_gl_objects(queue, mem_objects, wait_for=None): ...
def enqueue_release_gl_objects(queue, mem_objects, wait_for=None): ...
def have_gl(): ...
```

[OpenGL Interoperability](./opengl-interop.md)

## Error Handling

```python { .api }
class Error(Exception): ...
class MemoryError(Error): ...
class LogicError(Error): ...
class RuntimeError(Error): ...
```

PyOpenCL provides comprehensive error handling with automatic OpenCL error code translation to Python exceptions, enabling proper error recovery and debugging.

## Types

```python { .api }
# Type aliases for function signatures
WaitList = Sequence[Event] | None
KernelArg = Buffer | Array | LocalMemory | np.number | SVM
Allocator = Callable[[int], Buffer]

# OpenCL constants and enumerations
class mem_flags: ...
class device_type: ...
class command_queue_properties: ...
```