tessl/pypi-bottleneck

Fast NumPy array functions written in C for high-performance numerical computing

Reduction Functions

Name: tessl/pypi-bottleneck
Author: tessl

Statistical and aggregation functions that reduce an array over all of its elements or along a single axis. Most of them ignore NaN values, and because they are implemented in C they are typically faster than the equivalent NumPy functions.

Capabilities

Sum Functions

Compute sums of array elements, ignoring NaNs, over the whole array or along one axis.

def nansum(a, axis=None):
    """
    Sum of array elements over given axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to sum (None sums the flattened array)

    Returns:
    ndarray or scalar, sum of array elements
    """

Mean Functions

Calculate arithmetic means with NaN-aware implementations.

def nanmean(a, axis=None):
    """
    Compute arithmetic mean along specified axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to compute mean (None uses the flattened array)

    Returns:
    ndarray or scalar, arithmetic mean
    """

Standard Deviation and Variance

Measures of dispersion that ignore NaNs. The ddof parameter sets the delta degrees of freedom: the divisor is N - ddof, where N is the number of non-NaN elements.

def nanstd(a, axis=None, ddof=0):
    """
    Compute standard deviation along specified axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to compute std (None uses the flattened array)
    - ddof: int, delta degrees of freedom (default 0)

    Returns:
    ndarray or scalar, standard deviation
    """

def nanvar(a, axis=None, ddof=0):
    """
    Compute variance along specified axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to compute variance (None uses the flattened array)
    - ddof: int, delta degrees of freedom (default 0)

    Returns:
    ndarray or scalar, variance
    """

Minimum and Maximum Functions

Find minimum and maximum values, or the indices where they occur, while ignoring NaNs.

def nanmin(a, axis=None):
    """
    Minimum values along axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to find minimum (None uses the flattened array)

    Returns:
    ndarray or scalar, minimum values
    """

def nanmax(a, axis=None):
    """
    Maximum values along axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to find maximum (None uses the flattened array)

    Returns:
    ndarray or scalar, maximum values
    """

def nanargmin(a, axis=None):
    """
    Indices of minimum values along axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to find indices (None returns an index into the flattened array)

    Returns:
    ndarray or scalar, indices of minimum values
    """

def nanargmax(a, axis=None):
    """
    Indices of maximum values along axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to find indices (None returns an index into the flattened array)

    Returns:
    ndarray or scalar, indices of maximum values
    """

Median Functions

Median of array elements, a measure of central tendency that is robust to outliers. The nanmedian variant ignores NaNs.

def median(a, axis=None):
    """
    Compute median along specified axis.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to compute median (None uses the flattened array)

    Returns:
    ndarray or scalar, median values
    """

def nanmedian(a, axis=None):
    """
    Compute median along specified axis, ignoring NaNs.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to compute median (None uses the flattened array)

    Returns:
    ndarray or scalar, median values
    """

Utility Functions

Additional reductions: the sum of squares, and tests for the presence of NaNs.

def ss(a, axis=None):
    """
    Sum of squares of array elements.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to sum squares (None uses the flattened array)

    Returns:
    ndarray or scalar, sum of squares
    """

def anynan(a, axis=None):
    """
    Test whether any array element along axis is NaN.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to test (None tests the whole array)

    Returns:
    ndarray or bool, True if any element is NaN
    """

def allnan(a, axis=None):
    """
    Test whether all array elements along axis are NaN.

    Parameters:
    - a: array_like, input array
    - axis: None or int, axis along which to test (None tests the whole array)

    Returns:
    ndarray or bool, True if all elements are NaN
    """

Usage Examples

Basic Statistical Analysis

import bottleneck as bn
import numpy as np

# Create data with missing values
data = np.array([[1.0, 2.0, np.nan],
                 [4.0, np.nan, 6.0],
                 [7.0, 8.0, 9.0]])

# Compute statistics ignoring NaN
mean_val = bn.nanmean(data)        # Overall mean: 37 / 7 ≈ 5.29
row_means = bn.nanmean(data, axis=1)  # Per-row means: [1.5, 5.0, 8.0]
col_means = bn.nanmean(data, axis=0)  # Per-column means: [4.0, 5.0, 7.5]

# Find extremes with their locations
min_val = bn.nanmin(data)          # 1.0
min_idx = bn.nanargmin(data)       # 0 (flattened index)
max_val = bn.nanmax(data)          # 9.0
max_idx = bn.nanargmax(data)       # 8 (flattened index)

Checking for Missing Data

import bottleneck as bn
import numpy as np

# Sample data with various NaN patterns
complete_row = np.array([1, 2, 3, 4, 5])
partial_nans = np.array([1, np.nan, 3, np.nan, 5])
all_nans = np.array([np.nan, np.nan, np.nan])

# Test for any NaN presence
bn.anynan(complete_row)  # False
bn.anynan(partial_nans)  # True
bn.anynan(all_nans)      # True

# Test if all values are NaN
bn.allnan(complete_row)  # False
bn.allnan(partial_nans)  # False
bn.allnan(all_nans)      # True

Robust Statistical Measures

import bottleneck as bn
import numpy as np

# Time series data with an outlier (100) and missing values
timeseries = np.array([10, 12, np.nan, 15, 100, 11, 13, np.nan, 14])

# Robust measures less affected by outliers
median_val = bn.nanmedian(timeseries)  # 13.0 (robust central tendency)
mean_val = bn.nanmean(timeseries)      # 25.0 (affected by outlier 100)

# Dispersion measures
std_val = bn.nanstd(timeseries)        # Standard deviation
var_val = bn.nanvar(timeseries)        # Variance

# Population vs sample statistics (using ddof parameter)
pop_std = bn.nanstd(timeseries, ddof=0)  # Population standard deviation
sample_std = bn.nanstd(timeseries, ddof=1)  # Sample standard deviation
