
# KDE Estimators

Three high-performance kernel density estimation algorithms with a unified API. Each is optimized for a different use case, and all share a consistent interface for fitting data and evaluating probability densities.

## Capabilities


### NaiveKDE

Direct-computation KDE with maximum flexibility in bandwidth, weights, norms, and grids. Suitable for datasets under 1000 points, where flexibility matters more than speed.

```python { .api }
class NaiveKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2):
        """
        Initialize naive KDE estimator.

        Parameters:
        - kernel: str or callable, kernel function name or custom function
        - bw: float, str, or array-like, bandwidth specification
        - norm: float, p-norm for distance computation (default: 2)
        """

    def fit(self, data, weights=None):
        """
        Fit KDE to data.

        Parameters:
        - data: array-like, shape (obs,) or (obs, dims), input data
        - weights: array-like or None, optional weights for data points

        Returns:
        - self: NaiveKDE instance for method chaining
        """

    def evaluate(self, grid_points=None):
        """
        Evaluate KDE on grid points.

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid
        """

    def __call__(self, grid_points=None):
        """
        Callable interface (equivalent to evaluate).

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid
        """
```


**Usage Example:**


```python
import numpy as np
from KDEpy import NaiveKDE

# Sample data
data = np.random.randn(500)
weights = np.random.exponential(1, 500)

# Variable bandwidth per point
bw_array = np.random.uniform(0.1, 1.0, 500)

# Flexible KDE with custom parameters
kde = NaiveKDE(kernel='triweight', bw=bw_array, norm=1.5)
kde.fit(data, weights=weights)
x, y = kde.evaluate()

# Custom grid evaluation
custom_grid = np.linspace(-4, 4, 200)
y_custom = kde.evaluate(custom_grid)
```
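To make "direct computation" concrete, the sketch below evaluates a weighted 1D Gaussian KDE by summing one kernel per data point at every grid point, which costs work proportional to the number of observations times the number of grid points. It uses only NumPy. `naive_gaussian_kde` is a hypothetical helper written for illustration; it is not part of KDEpy, whose internals may differ.

```python
import numpy as np


def naive_gaussian_kde(data, grid, bw=1.0, weights=None):
    """Evaluate a 1D Gaussian KDE on `grid` by direct summation (illustrative helper)."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Accept either a scalar bandwidth or one bandwidth per data point
    bw = np.broadcast_to(np.asarray(bw, dtype=float), data.shape)
    if weights is None:
        weights = np.ones_like(data)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the density integrates to 1

    # Scaled distances: one row per grid point, one column per data point
    z = (grid[:, None] - data[None, :]) / bw[None, :]
    kernels = np.exp(-0.5 * z**2) / (bw[None, :] * np.sqrt(2.0 * np.pi))
    return kernels @ weights


rng = np.random.default_rng(0)
data = rng.normal(size=500)
grid = np.linspace(-6.0, 6.0, 400)
density = naive_gaussian_kde(data, grid, bw=0.3, weights=rng.exponential(1.0, 500))
```

The full distance matrix is what makes this approach flexible (any bandwidth per point, any grid) and also what limits it to small datasets.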


### TreeKDE

Tree-based KDE that uses a k-d tree for efficient nearest-neighbor queries. It provides a good balance between speed and flexibility for medium-sized datasets.

```python { .api }
class TreeKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2.0):
        """
        Initialize tree-based KDE estimator.

        Parameters:
        - kernel: str or callable, kernel function name or custom function
        - bw: float, str, or array-like, bandwidth specification
        - norm: float, p-norm for distance computation (default: 2.0)
        """

    def fit(self, data, weights=None):
        """
        Fit KDE to data and build k-d tree structure.

        Parameters:
        - data: array-like, shape (obs,) or (obs, dims), input data
        - weights: array-like or None, optional weights for data points

        Returns:
        - self: TreeKDE instance for method chaining
        """

    def evaluate(self, grid_points=None, eps=10e-4):
        """
        Evaluate KDE using tree-based queries.

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification
        - eps: float, numerical precision parameter (default: 10e-4)

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid
        """

    def __call__(self, grid_points=None, eps=10e-4):
        """
        Callable interface (equivalent to evaluate).

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification
        - eps: float, numerical precision parameter (default: 10e-4)

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid
        """
```


**Usage Example:**


```python
import numpy as np
from KDEpy import TreeKDE

# Multi-dimensional data
data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 1000)

# Tree-based KDE with a numeric bandwidth
# (assumption: automatic selectors such as 'ISJ' support 1D data only)
kde = TreeKDE(kernel='gaussian', bw=0.5)
kde.fit(data)

# Evaluate on 2D grid
x, y = kde.evaluate((64, 64))  # 64x64 grid

# High precision evaluation on the same grid points
grid_points = x
y_precise = kde.evaluate(grid_points, eps=1e-6)
```
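The speed-up from a spatial index comes from visiting only the data points that can contribute to each grid point. The sketch below is a 1D stand-in for that idea and not KDEpy's implementation: it replaces the k-d tree with a sorted array and binary search (`np.searchsorted`), and it uses an Epanechnikov kernel whose finite support makes the neighbor cutoff exact. `local_epa_kde` is a hypothetical name, and `bw` here is the half-width of the kernel support, which is an assumption of this sketch.

```python
import numpy as np


def local_epa_kde(data, grid, bw=1.0):
    """1D Epanechnikov KDE that only visits data within `bw` of each grid point."""
    data = np.sort(np.asarray(data, dtype=float))
    grid = np.asarray(grid, dtype=float)
    # Binary search for the index range of data inside [g - bw, g + bw]
    lo = np.searchsorted(data, grid - bw, side="left")
    hi = np.searchsorted(data, grid + bw, side="right")
    density = np.empty_like(grid)
    for i, g in enumerate(grid):
        z = (g - data[lo[i]:hi[i]]) / bw  # only nearby points are touched
        contributions = 0.75 * np.maximum(0.0, 1.0 - z**2)
        density[i] = contributions.sum() / (data.size * bw)
    return density


rng = np.random.default_rng(1)
data = rng.normal(size=2000)
grid = np.linspace(-5.0, 5.0, 200)
density = local_epa_kde(data, grid, bw=0.4)
```

For kernels without finite support, such as the Gaussian, a cutoff has to be chosen so that the ignored tail stays below some tolerance, which is the kind of trade-off a precision parameter like `eps` controls.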


### FFTKDE

FFT-based convolution KDE for very fast computation on equidistant grids. It scales to millions of data points but requires a constant bandwidth and an equidistant evaluation grid.

```python { .api }
class FFTKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2):
        """
        Initialize FFT-based KDE estimator.

        Parameters:
        - kernel: str or callable, kernel function name or custom function
        - bw: float or str, bandwidth (must be constant) or selection method
        - norm: float, p-norm for distance computation (default: 2)
        """

    def fit(self, data, weights=None):
        """
        Fit KDE to data for FFT computation.

        Parameters:
        - data: array-like, shape (obs,) or (obs, dims), input data
        - weights: array-like or None, optional weights for data points

        Returns:
        - self: FFTKDE instance for method chaining
        """

    def evaluate(self, grid_points=None):
        """
        Evaluate KDE using FFT convolution on equidistant grid.

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification (must be equidistant)

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid

        Note: User-supplied grids must be equidistant for FFT computation
        """

    def __call__(self, grid_points=None):
        """
        Callable interface (equivalent to evaluate).

        Parameters:
        - grid_points: int, tuple, array-like, or None, grid specification (must be equidistant)

        Returns:
        - tuple (x, y) for auto-generated grid, or array y for user grid

        Note: User-supplied grids must be equidistant for FFT computation
        """
```
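Because user-supplied grids must be equidistant, it can help to validate a grid before passing it to `FFTKDE`. The helper below is a hypothetical check written for illustration (it is not a KDEpy function) and handles the 1D case only.

```python
import numpy as np


def is_equidistant(grid, rtol=1e-8):
    """Return True if a 1D grid is strictly increasing with constant spacing."""
    grid = np.asarray(grid, dtype=float)
    if grid.ndim != 1 or grid.size < 2:
        return False
    steps = np.diff(grid)
    # All steps must be positive and equal up to floating-point rounding
    return bool(np.all(steps > 0) and np.allclose(steps, steps[0], rtol=rtol, atol=0.0))


linear_grid = np.linspace(-4, 4, 200)      # constant spacing
geometric_grid = np.geomspace(0.1, 10, 50)  # spacing grows with each step
print(is_equidistant(linear_grid), is_equidistant(geometric_grid))
```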


**Usage Example:**


```python
import numpy as np
from KDEpy import FFTKDE

# Large dataset
data = np.random.randn(100000)

# Ultra-fast FFT-based KDE
kde = FFTKDE(kernel='gaussian', bw='scott')
kde.fit(data)

# Fast evaluation on fine grid
x, y = kde.evaluate(2048)  # 2048 equidistant points

# Weighted data
weights = np.random.exponential(1, 100000)
kde_weighted = FFTKDE(bw=0.5).fit(data, weights)
x, y = kde_weighted.evaluate()
```
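The constant-bandwidth and equidistant-grid requirements follow from how convolution-based KDE works: the data are first binned onto the grid, and the binned masses are then convolved with a single kernel sampled at the grid spacing. The NumPy sketch below illustrates this under simplifying assumptions. It uses nearest-grid-point binning for brevity (other binning schemes, such as linear binning, are common), truncates the Gaussian at `pad` bandwidths, and `fft_gaussian_kde` is a hypothetical name rather than KDEpy code.

```python
import numpy as np


def fft_gaussian_kde(data, num_points=1024, bw=1.0, pad=4.0):
    """1D Gaussian KDE via binning and FFT convolution (illustrative sketch)."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min() - pad * bw, data.max() + pad * bw
    grid = np.linspace(lo, hi, num_points)
    dx = grid[1] - grid[0]

    # Bin the data: each point adds mass 1/obs to its nearest grid point
    idx = np.rint((data - lo) / dx).astype(int)
    mass = np.bincount(idx, minlength=num_points) / data.size

    # Sample the kernel at the grid spacing, centred on offset 0
    half = int(np.ceil(pad * bw / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (offsets / bw) ** 2) / (bw * np.sqrt(2.0 * np.pi))

    # Linear convolution via zero-padded FFTs, then crop back to the grid
    size = num_points + kernel.size - 1
    conv = np.fft.irfft(np.fft.rfft(mass, size) * np.fft.rfft(kernel, size), size)
    return grid, conv[half:half + num_points]


rng = np.random.default_rng(3)
grid, density = fft_gaussian_kde(rng.normal(size=100_000), num_points=1024, bw=0.3)
```

After binning, the cost depends on the number of grid points rather than the number of observations, which is why this approach scales to very large datasets.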


## Common Usage Patterns


### Method Chaining


All KDE estimators support method chaining for concise usage:


```python
import numpy as np
from KDEpy import FFTKDE, NaiveKDE, TreeKDE

data = np.random.randn(1000)
weights = np.random.exponential(1, 1000)
custom_grid = np.linspace(-4, 4, 200)

# Concise single-line KDE
x, y = FFTKDE(bw='ISJ').fit(data).evaluate(512)

# With weights
x, y = NaiveKDE(kernel='epa').fit(data, weights).evaluate()

# Custom evaluation
result = TreeKDE(bw=1.5).fit(data).evaluate(custom_grid)
```
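Chaining works because `fit` returns the estimator itself, as the `Returns` entries in the API listings state. The toy class below shows the pattern in isolation. `ChainableEstimator` is a hypothetical class for illustration only and does not compute a density.

```python
class ChainableEstimator:
    """Toy estimator showing the return-self pattern (not a KDEpy class)."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.data = None

    def fit(self, data):
        self.data = list(data)
        return self  # returning self is what enables .fit(...).evaluate(...)

    def evaluate(self):
        return [self.scale * value for value in self.data]

    def __call__(self):
        return self.evaluate()  # calling the instance delegates to evaluate


result = ChainableEstimator(scale=2.0).fit([1, 2, 3]).evaluate()
print(result)  # [2.0, 4.0, 6.0]
```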


### Callable Interface

KDE instances can be called directly (equivalent to `evaluate`):

```python
kde = TreeKDE().fit(data)
y = kde(grid_points)  # Same as kde.evaluate(grid_points)
```


### Grid Specifications

All estimators accept flexible grid specifications (for `FFTKDE`, a user-supplied grid must be equidistant):

```python
# Integer: number of equidistant points
x, y = kde.evaluate(256)

# Tuple: points per dimension for multi-dimensional data
x, y = kde.evaluate((64, 64, 32))

# Array: explicit grid points
grid = np.linspace(-3, 3, 100)
y = kde.evaluate(grid)

# None: automatic grid generation
x, y = kde.evaluate()
```
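To show how an integer or tuple specification can translate into actual grid points, here is a minimal sketch that builds an equidistant grid covering the data and returns it with shape `(number of grid points, dims)`. `make_grid` and its `margin` parameter are hypothetical; KDEpy's own grid generation may use different margins and point ordering.

```python
import numpy as np


def make_grid(data, points_per_dim, margin=0.05):
    """Build an equidistant grid of shape (prod(points_per_dim), dims) covering the data."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]  # treat 1D input as (obs, 1)
    if isinstance(points_per_dim, int):
        points_per_dim = (points_per_dim,) * data.shape[1]
    axes = []
    for column, n in zip(data.T, points_per_dim):
        span = column.max() - column.min()
        # Extend each axis slightly beyond the observed range
        axes.append(np.linspace(column.min() - margin * span, column.max() + margin * span, n))
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.column_stack([m.ravel() for m in mesh])


rng = np.random.default_rng(2)
grid = make_grid(rng.normal(size=(1000, 2)), (64, 32))
print(grid.shape)  # (2048, 2)
```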


## Types


```python { .api }
from typing import Callable, Optional, Sequence, Tuple, Union
import numpy as np

# Constructor parameter types
KernelSpec = Union[str, Callable]
BandwidthSpec = Union[float, str, np.ndarray, Sequence]
NormSpec = float

# Method parameter types
DataSpec = Union[np.ndarray, Sequence]
WeightsSpec = Optional[Union[np.ndarray, Sequence]]
GridSpec = Optional[Union[int, Tuple[int, ...], np.ndarray, Sequence]]

# Return types
GridResult = Tuple[np.ndarray, np.ndarray]  # (x, y)
ValueResult = np.ndarray  # y values only
EvaluateResult = Union[GridResult, ValueResult]
```
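Since `evaluate` can return either an `(x, y)` tuple or a bare `y` array, calling code that handles both cases may want to normalize the result. The helper below is one possible way to do that. `density_values` is a hypothetical name, and the aliases are repeated so the snippet is self-contained.

```python
from typing import Tuple, Union

import numpy as np

GridResult = Tuple[np.ndarray, np.ndarray]
EvaluateResult = Union[GridResult, np.ndarray]


def density_values(result: EvaluateResult) -> np.ndarray:
    """Return only the y values, whichever form the evaluation produced."""
    if isinstance(result, tuple):
        _, y = result  # (x, y) from an auto-generated grid
        return y
    return np.asarray(result)  # y only, from a user-supplied grid


y_only = np.array([0.1, 0.2, 0.1])
x_and_y = (np.array([-1.0, 0.0, 1.0]), y_only)
print(density_values(y_only).shape, density_values(x_and_y).shape)  # (3,) (3,)
```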