# KDEpy

A comprehensive kernel density estimation library for Python that implements three high-performance algorithms through a unified API: NaiveKDE for accurate d-dimensional data with variable bandwidth support, TreeKDE for fast tree-based computation with arbitrary grid evaluation, and FFTKDE for ultra-fast convolution-based computation on equidistant grids.

## Package Information

- **Package Name**: KDEpy
- **Language**: Python
- **Installation**: `pip install KDEpy`
- **Requires**: numpy>=1.14.2, scipy>=1.0.1

## Core Imports

```python
import KDEpy
```

Import specific estimators:

```python
from KDEpy import FFTKDE, NaiveKDE, TreeKDE
```

## Basic Usage

```python
import numpy as np
from KDEpy import FFTKDE

# Generate sample data
data = np.random.randn(1000)

# Create and fit KDE with automatic bandwidth selection
kde = FFTKDE(kernel='gaussian', bw='ISJ')
kde.fit(data)

# Evaluate on automatic grid
x, y = kde.evaluate()

# Or evaluate on a custom grid; FFTKDE needs an equidistant grid
# that contains every data point, so span the data range with a margin
grid_points = np.linspace(data.min() - 1, data.max() + 1, 100)
y_custom = kde.evaluate(grid_points)

# Chain operations for concise usage
x, y = FFTKDE(bw='scott').fit(data).evaluate(256)
```

## Architecture

KDEpy provides three complementary algorithms optimized for different use cases:

- **NaiveKDE**: Direct computation with maximum flexibility for bandwidth, weights, norms, and grids. Suitable for <1000 data points.
- **TreeKDE**: k-d tree-based computation using scipy's cKDTree for efficient nearest neighbor queries. Good balance of speed and flexibility.
- **FFTKDE**: FFT-based convolution for ultra-fast computation on equidistant grids. Requires constant bandwidth but scales to millions of points.

All estimators inherit from BaseKDE, providing a consistent API while allowing algorithm-specific optimizations. The modular design enables easy bandwidth selection method integration and kernel function customization.
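
To make the trade-off between direct and grid-based computation concrete, the sketch below computes a 1D Gaussian KDE in two ways using only NumPy: a direct sum over all data points (the idea behind a naive estimator) and a binned convolution on an equidistant grid (the idea behind an FFT-based estimator). It is an illustration under simplifying assumptions (1D data, nearest-grid-point binning, one fixed bandwidth) and is not KDEpy's implementation; the names `direct_kde` and `binned_kde` are hypothetical.

```python
import numpy as np


def gaussian(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def direct_kde(data, grid, bw):
    """Direct sum: one kernel evaluation per (grid point, data point) pair."""
    u = (grid[:, None] - data[None, :]) / bw
    return gaussian(u).sum(axis=1) / (data.size * bw)


def binned_kde(data, grid, bw):
    """Bin the data on an equidistant grid, then convolve with the kernel.

    Assumes the grid is equidistant and contains every data point.
    Nearest-grid-point binning is used for brevity (an assumption of
    this sketch).
    """
    dx = grid[1] - grid[0]
    idx = np.rint((data - grid[0]) / dx).astype(int)
    counts = np.bincount(idx, minlength=grid.size) / data.size
    # Kernel sampled at the grid spacing, truncated at five bandwidths.
    half = int(np.ceil(5 * bw / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = gaussian(offsets / bw) / bw
    return np.convolve(counts, kernel, mode="same")


rng = np.random.default_rng(0)
data = rng.normal(size=2000)
grid = np.linspace(data.min() - 2.0, data.max() + 2.0, 512)
bw = 0.3

y_direct = direct_kde(data, grid, bw)
y_binned = binned_kde(data, grid, bw)
max_abs_diff = float(np.max(np.abs(y_direct - y_binned)))
area = float(y_direct.sum() * (grid[1] - grid[0]))
```

The direct version costs one kernel evaluation per data point per grid point, while the binned version touches each data point once and then works only on the grid. That is why the grid-based approach needs an equidistant grid and a single bandwidth, and why it scales to far larger data sets.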

## Capabilities

### KDE Estimators

Three high-performance kernel density estimation algorithms with unified API for fitting data and evaluating probability densities.

```python { .api }
class NaiveKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None): ...
    def __call__(self, grid_points=None): ...

class TreeKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2.0): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None, eps=10e-4): ...
    def __call__(self, grid_points=None): ...

class FFTKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None): ...
    def __call__(self, grid_points=None): ...
```

[KDE Estimators](./kde-estimators.md)
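
The shared call pattern (construct, `fit`, then `evaluate` or call the object) can be mimicked with a small stand-in class. The sketch below is a hypothetical `TinyKDE` written with NumPy only; it is not part of KDEpy. It shows why chaining works (`fit` returns the estimator itself) and how the return value depends on the grid argument. The default of 1024 grid points and the padding of three bandwidths are assumptions made for this illustration.

```python
import numpy as np


class TinyKDE:
    """Hypothetical stand-in that mimics the fit/evaluate call pattern."""

    def __init__(self, bw=1.0):
        self.bw = float(bw)

    def fit(self, data, weights=None):
        self.data = np.asarray(data, dtype=float)
        if weights is None:
            weights = np.ones_like(self.data)
        weights = np.asarray(weights, dtype=float)
        self.weights = weights / weights.sum()  # normalise to sum to one
        return self  # returning self is what makes chaining possible

    def evaluate(self, grid_points=None):
        # None or an integer means "build the grid for me".
        auto = grid_points is None or isinstance(grid_points, int)
        if auto:
            num = 1024 if grid_points is None else grid_points
            pad = 3 * self.bw
            grid = np.linspace(self.data.min() - pad, self.data.max() + pad, num)
        else:
            grid = np.asarray(grid_points, dtype=float)
        u = (grid[:, None] - self.data[None, :]) / self.bw
        k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
        y = (k * self.weights).sum(axis=1) / self.bw
        # Auto-generated grid: return (x, y). User-supplied grid: return y.
        return (grid, y) if auto else y

    def __call__(self, grid_points=None):
        return self.evaluate(grid_points)


data = np.array([-1.0, 0.0, 0.5, 2.0])
x, y = TinyKDE(bw=0.5).fit(data).evaluate(64)
y_only = TinyKDE(bw=0.5).fit(data)(np.array([0.0, 1.0]))
area = float(y.sum() * (x[1] - x[0]))
```

Passing an integer or nothing yields both the grid and the density values, while passing explicit grid points yields only the density values, since the caller already holds the grid.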

### Bandwidth Selection

Automatic bandwidth selection methods for optimal kernel density estimation without manual parameter tuning.

```python { .api }
def improved_sheather_jones(data, weights=None): ...
def scotts_rule(data, weights=None): ...
def silvermans_rule(data, weights=None): ...
```

[Bandwidth Selection](./bandwidth-selection.md)
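
For intuition about what a rule-of-thumb selector computes, the sketch below implements two commonly cited 1D textbook formulas with NumPy. These are illustrations only: KDEpy's `scotts_rule` and `silvermans_rule` may use different constants or spread estimates, and the function names here are hypothetical.

```python
import numpy as np


def scott_factor_bandwidth(data):
    """Scott-style factor in 1D: h = sigma * n^(-1/5)."""
    data = np.asarray(data, dtype=float)
    sigma = data.std(ddof=1)
    return sigma * data.size ** (-0.2)


def rule_of_thumb_bandwidth(data):
    """Robust rule of thumb: h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    data = np.asarray(data, dtype=float)
    sigma = data.std(ddof=1)
    q75, q25 = np.percentile(data, [75, 25])
    spread = min(sigma, (q75 - q25) / 1.34)  # guards against outliers
    return 0.9 * spread * data.size ** (-0.2)


rng = np.random.default_rng(42)
sample = rng.normal(loc=0.0, scale=2.0, size=500)
h_scott = scott_factor_bandwidth(sample)
h_robust = rule_of_thumb_bandwidth(sample)
```

Both rules shrink the bandwidth as the sample grows (proportionally to n^(-1/5)) and scale it with the spread of the data. They assume roughly normal, unimodal data, so they tend to oversmooth multimodal densities.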

### Kernel Functions

Built-in kernel functions with finite and infinite support for probability density estimation.

```python { .api }
# Available kernel names for use in KDE constructors
AVAILABLE_KERNELS = [
    "gaussian", "exponential", "box", "tri", "epa",
    "biweight", "triweight", "tricube", "cosine"
]

class Kernel:
    def __init__(self, function, var=1, support=3): ...
    def evaluate(self, x, bw=1, norm=2): ...
```

[Kernel Functions](./kernel-functions.md)
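
The difference between finite and infinite support is easiest to see in code. The sketch below writes out a few kernels in their textbook form on the interval [-1, 1], plus the Gaussian, and checks numerically that each integrates to one. It assumes the standard textbook scaling; KDEpy may scale its kernels differently (for example to unit variance), so the values here need not match the library's output.

```python
import numpy as np


def box(u):
    return np.where(np.abs(u) <= 1, 0.5, 0.0)


def tri(u):
    return np.where(np.abs(u) <= 1, 1.0 - np.abs(u), 0.0)


def epa(u):
    return np.where(np.abs(u) <= 1, 0.75 * (1.0 - u**2), 0.0)


def biweight(u):
    return np.where(np.abs(u) <= 1, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)


def gaussian(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


KERNELS = {"box": box, "tri": tri, "epa": epa,
           "biweight": biweight, "gaussian": gaussian}

# Riemann-sum check that every kernel is a valid density.
u = np.linspace(-8.0, 8.0, 160001)
du = u[1] - u[0]
areas = {name: float(k(u).sum() * du) for name, k in KERNELS.items()}

# Finite support: exactly zero outside [-1, 1]. Infinite support: never zero.
epa_outside = float(epa(np.array([2.0]))[0])
gaussian_outside = float(gaussian(np.array([2.0]))[0])
```

A finite-support kernel lets an estimator skip data points that are too far from the evaluation point, whereas a kernel with infinite support has to be truncated somewhere if the same shortcut is wanted.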

### Utility Functions

Helper functions for grid generation, array manipulation, and data processing in kernel density estimation workflows.

```python { .api }
def autogrid(data, boundary_abs=3, num_points=None, boundary_rel=0.05): ...
def cartesian(arrays): ...
def linear_binning(data, grid_points, weights=None): ...
```

[Utilities](./utilities.md)
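
As a rough picture of what two of these helpers do, the sketch below implements 1D linear binning and a Cartesian product with NumPy. Both functions are hypothetical illustrations, not KDEpy's code: the binning handles only 1D, unweighted data on an equidistant grid, and the library's versions may differ in ordering and edge handling.

```python
import numpy as np


def linear_binning_1d(data, grid):
    """Split each point's weight between its two neighbouring grid points."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    dx = grid[1] - grid[0]
    pos = (data - grid[0]) / dx                  # fractional grid index
    left = np.clip(np.floor(pos).astype(int), 0, grid.size - 2)
    frac = pos - left                            # share for the right neighbour
    out = np.zeros(grid.size)
    np.add.at(out, left, (1.0 - frac) / data.size)
    np.add.at(out, left + 1, frac / data.size)
    return out


def cartesian_product(arrays):
    """All combinations of the input arrays, one combination per row."""
    mesh = np.meshgrid(*arrays, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=1)


grid = np.linspace(0.0, 4.0, 5)                  # grid points 0, 1, 2, 3, 4
binned = linear_binning_1d([0.25, 2.0, 3.5], grid)
points = cartesian_product([np.array([0, 1]), np.array([10, 20, 30])])
```

In this example the point at 0.25 sends three quarters of its weight to grid point 0 and one quarter to grid point 1, the point at 2.0 lands entirely on grid point 2, and the point at 3.5 is split evenly between grid points 3 and 4. The Cartesian product turns per-axis coordinates into the full list of grid points that a multidimensional evaluation needs.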

## Types

```python { .api }
from typing import Callable, Optional, Sequence, Union
import numpy as np

# Data types
DataType = Union[np.ndarray, Sequence]
WeightsType = Optional[Union[np.ndarray, Sequence]]
GridType = Union[int, tuple, np.ndarray, Sequence]

# Bandwidth specification
BandwidthType = Union[
    float,       # Explicit bandwidth value
    str,         # Selection method: "ISJ", "scott", "silverman"
    np.ndarray,  # Per-point bandwidth array
    Sequence     # Per-point bandwidth sequence
]

# Kernel specification
KernelType = Union[str, Callable]  # Kernel name or custom function

# Return types
EvaluationResult = Union[
    tuple[np.ndarray, np.ndarray],  # (x, y) for auto-generated grid
    np.ndarray                      # y values for user-supplied grid
]
```