tessl/pypi-kdepy

Kernel Density Estimation in Python with three high-performance algorithms through a unified API.

KDEpy

KDEpy is a kernel density estimation library for Python that implements three high-performance algorithms behind a unified API. NaiveKDE computes the estimate directly and supports d-dimensional data and variable bandwidths. TreeKDE uses a tree structure for fast evaluation on arbitrary grids. FFTKDE uses convolution for very fast evaluation on equidistant grids.

Package Information

  • Package Name: KDEpy
  • Language: Python
  • Installation: pip install KDEpy
  • Requires: numpy>=1.14.2, scipy>=1.0.1

Core Imports

import KDEpy

Import specific estimators:

from KDEpy import FFTKDE, NaiveKDE, TreeKDE

Basic Usage

import numpy as np
from KDEpy import FFTKDE

# Generate sample data
data = np.random.randn(1000)

# Create and fit KDE with automatic bandwidth selection
kde = FFTKDE(kernel='gaussian', bw='ISJ')
kde.fit(data)

# Evaluate on automatic grid
x, y = kde.evaluate()

# Or evaluate on a custom grid (for FFTKDE it must be equidistant and enclose all data points)
grid_points = np.linspace(data.min() - 1, data.max() + 1, 100)
y_custom = kde.evaluate(grid_points)

# Chain operations for concise usage
x, y = FFTKDE(bw='scott').fit(data).evaluate(256)

Architecture

KDEpy provides three complementary algorithms optimized for different use cases:

  • NaiveKDE: Direct computation with maximum flexibility for bandwidth, weights, norms, and grids. Best suited to small data sets (fewer than about 1000 points), since every grid point is compared with every data point.
  • TreeKDE: k-d tree-based computation using scipy's cKDTree for efficient nearest neighbor queries. Good balance of speed and flexibility.
  • FFTKDE: FFT-based convolution for ultra-fast computation on equidistant grids. Requires constant bandwidth but scales to millions of points.

All estimators inherit from BaseKDE, which provides a consistent API while allowing algorithm-specific optimizations. The modular design makes it easy to plug in new bandwidth selection methods and custom kernel functions.
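To make "direct computation" concrete, here is a minimal NumPy-only sketch of a direct-sum Gaussian KDE in one dimension. It is an illustration of the technique, not KDEpy's implementation; the function name `naive_gaussian_kde` and the parameter choices are assumptions made for this example.

```python
import numpy as np

def naive_gaussian_kde(data, grid, bw):
    """Direct-sum KDE: average one Gaussian bump per data point at every grid point."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Pairwise scaled distances, shape (len(grid), len(data))
    z = (grid[:, None] - data[None, :]) / bw
    bumps = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return bumps.sum(axis=1) / (data.size * bw)

rng = np.random.default_rng(0)
data = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 601)
density = naive_gaussian_kde(data, grid, bw=0.4)

# A Riemann sum over the grid should be close to 1 for a valid density
area = density.sum() * (grid[1] - grid[0])
```

The intermediate array has one entry per (grid point, data point) pair, so time and memory grow with the product of the two sizes. That cost is what the tree-based and FFT-based estimators are designed to avoid.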

Capabilities

KDE Estimators

Three high-performance kernel density estimation algorithms with a unified API for fitting data and evaluating probability densities.

class NaiveKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None): ...
    def __call__(self, grid_points=None): ...

class TreeKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2.0): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None, eps=10e-4): ...
    def __call__(self, grid_points=None): ...

class FFTKDE:
    def __init__(self, kernel="gaussian", bw=1, norm=2): ...
    def fit(self, data, weights=None): ...
    def evaluate(self, grid_points=None): ...
    def __call__(self, grid_points=None): ...
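The equidistant-grid and constant-bandwidth requirements of a convolution-based estimator follow from how the computation works: the kernel is sampled once at the grid spacing and reused for every data point. The sketch below shows the idea in one dimension with plain NumPy. It is not KDEpy's code; `fft_gaussian_kde`, the nearest-point binning, and the truncation at 5 bandwidths are simplifying assumptions for this example.

```python
import numpy as np

def fft_gaussian_kde(data, grid, bw):
    """Approximate a 1D Gaussian KDE on an equidistant grid via binning and FFT."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if data.min() < grid[0] or data.max() > grid[-1]:
        raise ValueError("the grid must cover every data point")
    dx = grid[1] - grid[0]

    # Step 1: move each data point to its nearest grid point and count
    idx = np.rint((data - grid[0]) / dx).astype(int)
    counts = np.bincount(idx, minlength=grid.size) / data.size

    # Step 2: sample the kernel at the grid spacing, truncated at 5 bandwidths
    half = int(np.ceil(5 * bw / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (offsets / bw) ** 2) / (bw * np.sqrt(2.0 * np.pi))

    # Step 3: linear convolution through zero-padded FFTs
    n = counts.size + kernel.size - 1
    full = np.fft.irfft(np.fft.rfft(counts, n) * np.fft.rfft(kernel, n), n)
    return full[half:half + grid.size]

rng = np.random.default_rng(0)
data = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 601)
bw = 0.4

approx = fft_gaussian_kde(data, grid, bw)

# Direct-sum reference for comparison: O(len(grid) * len(data)) work
z = (grid[:, None] - data[None, :]) / bw
exact = np.exp(-0.5 * z**2).sum(axis=1) / (data.size * bw * np.sqrt(2.0 * np.pi))

max_error = np.max(np.abs(approx - exact))
```

The only approximation beyond kernel truncation is the binning step, so the error shrinks as the grid gets finer. Linear binning, which splits each point between its two neighbouring grid points, is a common refinement of the nearest-point binning used here.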

Bandwidth Selection

Automatic bandwidth selection methods for optimal kernel density estimation without manual parameter tuning.

def improved_sheather_jones(data, weights=None): ...
def scotts_rule(data, weights=None): ...
def silvermans_rule(data, weights=None): ...
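As an illustration of what a rule-of-thumb selector computes, the following sketch implements the textbook form of Silverman's rule for one-dimensional data. The function name `silverman_bandwidth` is hypothetical, and the constants used by KDEpy's own `silvermans_rule` may differ from the textbook values shown here.

```python
import numpy as np

def silverman_bandwidth(data):
    """Textbook Silverman rule of thumb for 1D data and a Gaussian kernel."""
    data = np.asarray(data, dtype=float)
    n = data.size
    std = data.std(ddof=1)
    q75, q25 = np.percentile(data, [75, 25])
    # Robust spread: the smaller of the std and the rescaled interquartile range
    spread = min(std, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-1 / 5)

rng = np.random.default_rng(0)
data = rng.normal(size=1000)
h = silverman_bandwidth(data)

# Rescaling the data rescales the bandwidth by the same factor
h_scaled = silverman_bandwidth(10.0 * data)
```

Rules of this kind assume roughly normal data and tend to oversmooth multimodal densities, which is the usual motivation for a data-driven selector such as the "ISJ" option.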

Kernel Functions

Built-in kernel functions with finite and infinite support for probability density estimation.

# Available kernel names for use in KDE constructors
AVAILABLE_KERNELS = [
    "gaussian", "exponential", "box", "tri", "epa", 
    "biweight", "triweight", "tricube", "cosine"
]

class Kernel:
    def __init__(self, function, var=1, support=3): ...
    def evaluate(self, x, bw=1, norm=2): ...
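The difference between finite and infinite support can be seen by writing a few kernels out by hand. The definitions below use the common textbook parametrization on [-1, 1] and are an illustrative sketch only; a library may rescale its kernels (for instance to unit variance), which this example does not attempt.

```python
import numpy as np

def box(x):
    """Uniform kernel: constant on [-1, 1], zero elsewhere."""
    return np.where(np.abs(x) <= 1, 0.5, 0.0)

def tri(x):
    """Triangular kernel: linear decay to zero at |x| = 1."""
    return np.clip(1 - np.abs(x), 0, None)

def epa(x):
    """Epanechnikov kernel: parabola clipped at zero."""
    return 0.75 * np.clip(1 - x**2, 0, None)

def gaussian(x):
    """Gaussian kernel: positive everywhere (infinite support)."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

kernels = {"box": box, "tri": tri, "epa": epa, "gaussian": gaussian}

# Every kernel should integrate to 1; check with a Riemann sum
x = np.linspace(-8.0, 8.0, 16001)
dx = x[1] - x[0]
areas = {name: float(f(x).sum() * dx) for name, f in kernels.items()}

# Finite-support kernels vanish outside [-1, 1]; the Gaussian does not
outside = {name: float(f(np.array(2.0))) for name, f in kernels.items()}
```

Finite support means only data points within a bounded distance of a grid point contribute to the estimate there, which is a property a tree-based estimator can exploit.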

Utility Functions

Helper functions for grid generation, array manipulation, and data processing in kernel density estimation workflows.

def autogrid(data, boundary_abs=3, num_points=None, boundary_rel=0.05): ...
def cartesian(arrays): ...
def linear_binning(data, grid_points, weights=None): ...
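To show what two of these helpers do conceptually, here is a NumPy-only sketch of one-dimensional linear binning and of a Cartesian product of grid axes. The names `linear_binning_1d` and `cartesian_grid` are hypothetical, and the normalization and row ordering are assumptions of this example rather than a description of KDEpy's functions.

```python
import numpy as np

def linear_binning_1d(data, grid):
    """Split each point's mass between its two neighbouring grid points."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    dx = grid[1] - grid[0]
    pos = (data - grid[0]) / dx        # fractional grid index of each point
    left = np.floor(pos).astype(int)   # index of the grid point to the left
    frac = pos - left                  # share assigned to the right neighbour
    mass = np.zeros(grid.size + 1)     # spare slot for points on the last grid point
    np.add.at(mass, left, 1.0 - frac)
    np.add.at(mass, left + 1, frac)
    return mass[:-1] / data.size       # normalize so the masses sum to 1

def cartesian_grid(arrays):
    """All combinations of the given 1D axes, one combination per row."""
    mesh = np.meshgrid(*arrays, indexing="ij")
    return np.stack(mesh, axis=-1).reshape(-1, len(arrays))

grid = np.array([0.0, 1.0, 2.0, 3.0])
binned = linear_binning_1d([0.25, 2.0], grid)
# The point 0.25 puts 0.75 of its mass on grid point 0 and 0.25 on grid point 1;
# the point 2.0 sits exactly on a grid point and keeps all of its mass there.

points = cartesian_grid([np.array([0, 1]), np.array([10, 20, 30])])
```

Linear binning is what lets a grid-based estimator work with grid masses instead of raw data points, and the Cartesian product is how per-axis grids become a grid of points in d dimensions.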

Types

from typing import Callable, Optional, Sequence, Tuple, Union
import numpy as np

# Data types
DataType = Union[np.ndarray, Sequence]
WeightsType = Optional[Union[np.ndarray, Sequence]]
GridType = Union[int, tuple, np.ndarray, Sequence]

# Bandwidth specification
BandwidthType = Union[
    float,       # Explicit bandwidth value
    str,         # Selection method: "ISJ", "scott", "silverman"
    np.ndarray,  # Per-point bandwidth array
    Sequence,    # Per-point bandwidth sequence
]

# Kernel specification
KernelType = Union[str, Callable]  # Kernel name or custom function

# Return types
EvaluationResult = Union[
    Tuple[np.ndarray, np.ndarray],  # (x, y) for auto-generated grid
    np.ndarray,                     # y values for user-supplied grid
]