Fast numerical expression evaluator for NumPy that accelerates array operations through optimized implementations and multi-threading
Configuration of multi-threading behavior and performance optimization settings for CPU-intensive computations. NumExpr automatically parallelizes operations across available CPU cores and provides fine-grained control over threading behavior.
Control the number of threads used for NumExpr operations, balancing performance with system resource usage.
def set_num_threads(nthreads):
    """
    Set the number of threads to use for operations.

    Controls the parallelization level for NumExpr computations. The
    virtual machine distributes array chunks across the specified number
    of threads for parallel execution.

    Parameters:
    - nthreads (int): Number of threads to use (1 to MAX_THREADS)

    Returns:
        int: Previous thread count setting

    Raises:
        ValueError: If nthreads exceeds MAX_THREADS or is less than 1
    """

def get_num_threads():
    """
    Get the current number of threads in use for operations.

    Returns:
        int: Current thread count configuration
    """

Usage Examples:
import numexpr as ne
import numpy as np
# Check current thread configuration
print(f"Current threads: {ne.get_num_threads()}")
print(f"Max threads supported: {ne.MAX_THREADS}")
# Set specific thread count
old_threads = ne.set_num_threads(4)
print(f"Changed from {old_threads} to {ne.get_num_threads()} threads")
# Benchmark with different thread counts
data = np.random.random((1000000, 10))
expr = "sum(data**2 + sqrt(data), axis=1)"
for threads in [1, 2, 4, 8]:
    ne.set_num_threads(threads)
    # Time the operation...
    result = ne.evaluate(expr, local_dict={'data': data})

Automatically detect optimal threading configuration based on system capabilities and environment variables.
def detect_number_of_cores():
    """
    Detect the number of CPU cores available on the system.

    Uses platform-specific methods to determine the number of logical
    CPU cores, providing a basis for automatic thread configuration.

    Returns:
        int: Number of detected CPU cores
    """

def detect_number_of_threads():
    """
    DEPRECATED: Detect optimal number of threads.

    This function is deprecated. Use _init_num_threads() instead for
    environment-based thread initialization.

    Returns:
        int: Suggested thread count based on system and environment
    """

def _init_num_threads():
    """
    Initialize thread count based on environment variables.

    Checks environment variables in order of precedence:
    1. NUMEXPR_MAX_THREADS - maximum thread pool size
    2. NUMEXPR_NUM_THREADS - initial thread count
    3. OMP_NUM_THREADS - OpenMP thread count
    4. Defaults to detected core count (limited to safe maximum)

    Returns:
        int: Initialized thread count
    """

Usage Examples:
# Detect system capabilities
cores = ne.detect_number_of_cores()
print(f"System has {cores} CPU cores")
# Initialize with environment-based settings
import os
# NUMEXPR_MAX_THREADS sizes the thread pool when numexpr is first imported,
# so set it before the import; changing it afterwards does not resize the pool
os.environ['NUMEXPR_MAX_THREADS'] = '8'
os.environ['NUMEXPR_NUM_THREADS'] = '4'
# This happens automatically on import, but can be called manually
threads = ne._init_num_threads()
print(f"Initialized with {threads} threads")

Access to system-level constants that control NumExpr's performance characteristics.
# Threading limits
MAX_THREADS: int # Maximum number of threads supported by the C extension
# Virtual machine configuration
__BLOCK_SIZE1__: int # Block size used for chunking array operations
# Runtime state
ncores: int # Number of detected CPU cores (set at import)
nthreads: int # Current configured thread count (set at import)

Usage Examples:
print(f"Hardware threads: {ne.ncores}")
print(f"Configured threads: {ne.nthreads}")
print(f"Max supported: {ne.MAX_THREADS}")
print(f"Block size: {ne.__BLOCK_SIZE1__}")
# Ensure we don't exceed limits
desired_threads = min(16, ne.MAX_THREADS, ne.ncores)
ne.set_num_threads(desired_threads)

NUMEXPR_MAX_THREADS: Maximum size of the thread pool
NUMEXPR_NUM_THREADS: Initial number of active threads; can be changed at runtime with set_num_threads()
OMP_NUM_THREADS: OpenMP-compatible thread setting
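The way these three variables interact can be sketched in plain Python. The helper below is hypothetical (`resolve_thread_settings` and its `default_cap` parameter are not part of NumExpr); it only mirrors the precedence described in this document and is not NumExpr's actual initialization code. One deliberate simplification: where `set_num_threads()` would raise `ValueError` for a count above the pool size, this sketch clamps the value instead.

```python
def resolve_thread_settings(env, detected_cores, default_cap=16):
    """Return (pool_size, active_threads) from an environment mapping.

    Hypothetical helper that mirrors the documented precedence; it does
    not call into NumExpr.
    """
    def read_int(name):
        # Treat a missing or empty variable as "not set".
        value = env.get(name, "").strip()
        return int(value) if value else None

    # With no configuration, fall back to the detected core count,
    # capped at a conservative default (assumed to be 16 here).
    fallback = min(detected_cores, default_cap)

    # NUMEXPR_MAX_THREADS bounds the size of the thread pool.
    max_threads = read_int("NUMEXPR_MAX_THREADS")
    pool_size = max_threads if max_threads is not None else fallback

    # NUMEXPR_NUM_THREADS takes precedence over OMP_NUM_THREADS.
    active = read_int("NUMEXPR_NUM_THREADS")
    if active is None:
        active = read_int("OMP_NUM_THREADS")
    if active is None:
        active = pool_size

    # Assumption: clamp into [1, pool_size] rather than raising an error.
    return pool_size, max(1, min(active, pool_size))
```

Calling `resolve_thread_settings(os.environ, os.cpu_count() or 1)` would apply the same rules to the current process environment.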
# Example environment setup
export NUMEXPR_MAX_THREADS=8 # Allow up to 8 threads
export NUMEXPR_NUM_THREADS=4 # Start with 4 active threads
# Alternative using OMP standard
export OMP_NUM_THREADS=6 # Use 6 threads (if NUMEXPR_NUM_THREADS not set)

Optimal Thread Count:
Array Size Considerations:
SPARC Systems: Automatically limited to 1 thread due to known threading issues
Memory-Constrained Systems: NumExpr enforces safe limits (max 16 threads by default)
NUMA Systems: Thread affinity may affect performance on multi-socket systems
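Because the best setting depends on the platform and the workload, one reasonable approach is to choose the thread count from measured timings rather than from the core count alone. The sketch below assumes a dictionary that maps each tested thread count to its measured seconds per evaluation. `pick_thread_count` and its `tolerance` parameter are hypothetical names, not part of NumExpr.

```python
def pick_thread_count(timings, tolerance=0.05):
    """Pick the smallest thread count that is close to the fastest one.

    `timings` maps thread count -> seconds per evaluation. A setting is
    "close" if its time is within `tolerance` (relative) of the minimum.
    """
    if not timings:
        raise ValueError("timings must not be empty")

    fastest = min(timings.values())
    threshold = fastest * (1.0 + tolerance)

    # Among near-optimal settings, prefer the one that uses the fewest
    # threads, leaving more cores free for the rest of the system.
    return min(n for n, seconds in timings.items() if seconds <= threshold)
```

Preferring the fewest threads among near-equal timings reflects the trade-off between performance and system resource usage; setting `tolerance=0` selects the fastest measurement outright.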
import time
import numpy as np
import numexpr as ne
def benchmark_threads(expression, data_dict, thread_counts, repeats=10):
    """Benchmark expression with different thread configurations."""
    results = {}
    for num_threads in thread_counts:
        ne.set_num_threads(num_threads)
        # Warm up (compiles and caches the expression)
        ne.evaluate(expression, local_dict=data_dict)
        # Time multiple evaluations with a monotonic, high-resolution clock
        start = time.perf_counter()
        for _ in range(repeats):
            ne.evaluate(expression, local_dict=data_dict)
        elapsed = time.perf_counter() - start
        # Store the mean time per evaluation
        results[num_threads] = elapsed / repeats
        print(f"{num_threads} threads: {results[num_threads]:.4f}s per evaluation")
    return results

# Example usage
large_arrays = {
    'a': np.random.random(1000000),
    'b': np.random.random(1000000),
    'c': np.random.random(1000000)
}
benchmark_threads("a * b + sin(c) * exp(-a/100)",
                  large_arrays,
                  [1, 2, 4, 8])

NumExpr operations are thread-safe in the following contexts:
Not thread-safe:
Calls to set_num_threads() affect all threads