Tessl Tile for pypi/shap@0.48.0

# Model Explainers

SHAP provides specialized explainer algorithms optimized for different model types, each offering unique performance characteristics and mathematical guarantees. All explainers implement both modern (`__call__`) and legacy (`shap_values`) interfaces.

## Capabilities

### Tree Ensemble Explainers

High-speed exact algorithms for tree-based models including XGBoost, LightGBM, CatBoost, and scikit-learn tree ensembles.

```python { .api }
class TreeExplainer:
    """
    Exact SHAP values for tree ensemble models using optimized algorithms.

    Supports XGBoost, LightGBM, CatBoost, scikit-learn tree models with
    polynomial time complexity and exact mathematical guarantees.
    """
    def __init__(
        self,
        model,
        data=None,
        model_output="raw",
        feature_perturbation="auto",
        feature_names=None,
        link=None,
        linearize_link=None
    ):
        """
        Parameters:
        - model: Tree-based ML model (XGBoost, LightGBM, CatBoost, sklearn)
        - data: Background dataset for feature integration (optional)
        - model_output: Output format ("raw", "probability", "log_loss")
        - feature_perturbation: Perturbation method ("auto", "interventional", "tree_path_dependent")
        - feature_names: List of feature names (optional)
        """

    def __call__(self, X, y=None, interactions=False, check_additivity=True, approximate=False) -> Explanation:
        """
        Compute SHAP values for input samples.

        Parameters:
        - X: Input samples (array-like, DataFrame)
        - y: Target values for multi-output models (optional)
        - interactions: Compute interaction values (bool)
        - check_additivity: Verify SHAP values sum correctly (bool)
        - approximate: Use approximation for speed (bool)

        Returns:
        Explanation object with SHAP values and metadata
        """

    def shap_values(self, X, y=None, tree_limit=None, approximate=False, check_additivity=True):
        """Legacy interface returning raw numpy arrays."""

    def shap_interaction_values(self, X, y=None, tree_limit=None):
        """Compute SHAP interaction values (pairwise feature interactions)."""

    @property
    def expected_value(self):
        """Expected value of model output (baseline)."""

class GPUTreeExplainer(TreeExplainer):
    """
    GPU-accelerated tree explanations (experimental).

    Requires CUDA build with 'CUDA_PATH' environment variable.
    """
    def __init__(self, model):
        """Initialize GPU tree explainer for supported tree models."""
```

**Usage Example:**

```python
import shap
from xgboost import XGBClassifier

# Assumes X_train, y_train, and X_test are already defined

# Train model
model = XGBClassifier()
model.fit(X_train, y_train)

# Create explainer and compute SHAP values
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# Access components
print(f"Expected value: {explainer.expected_value}")
print(f"SHAP values shape: {shap_values.values.shape}")
```
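
The `check_additivity` option refers to the local accuracy property of SHAP values: for each sample, the baseline plus the per-feature attributions should reproduce the model output. Written out, with $\phi_0$ standing for `expected_value`, $\phi_i$ for the attribution of feature $i$, and $M$ for the number of features (symbols chosen here for illustration, not taken from the original text):

$$
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i(x), \qquad \phi_0 = \mathbb{E}\left[f(X)\right]
$$

With `model_output="raw"`, $f$ is the model's raw output, which for many gradient-boosted classifiers is a log-odds margin rather than a probability, so the sum should be compared against that raw output.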

### Model-Agnostic Explainers

Universal explainers that work with any model type through sampling-based approaches.

```python { .api }
class KernelExplainer:
    """
    Model-agnostic explainer using weighted linear regression.

    Works with any model by sampling around input and solving
    optimization problem for SHAP values. Provides theoretical guarantees.
    """
    def __init__(self, model, data, feature_names=None, link="identity"):
        """
        Parameters:
        - model: Function/model taking samples and returning predictions
        - data: Background dataset for masking (array, DataFrame, sparse matrix)
        - feature_names: List of feature names (optional)
        - link: Link function ("identity" or "logit")
        """

    def __call__(self, X, l1_reg="num_features(10)", silent=False) -> Explanation:
        """
        Compute SHAP values through sampling and optimization.

        Parameters:
        - X: Input samples to explain
        - l1_reg: Regularization ("num_features(int)", "aic", "bic", or float)
        - silent: Hide progress bar (bool)
        """

    def shap_values(self, X, nsamples="auto", l1_reg="num_features(10)", silent=False, gc_collect=False):
        """
        Legacy interface with additional parameters.

        Parameters:
        - nsamples: Number of samples ("auto" or int)
        - l1_reg: Regularization method
        - silent: Hide progress bar
        - gc_collect: Run garbage collection (bool)
        """

    @property
    def expected_value(self):
        """Expected value of model output."""

class PermutationExplainer:
    """
    Model-agnostic explainer using permutation sampling.

    Approximates SHAP values by iterating through feature permutations.
    Guarantees local accuracy with hierarchical structure support.
    """
    def __init__(self, model, masker, link="identity", feature_names=None, seed=None):
        """
        Parameters:
        - model: Model function to explain
        - masker: Masker object for feature perturbation
        - link: Link function for output transformation
        - feature_names: List of feature names (optional)
        - seed: Random seed for reproducibility
        """

    def __call__(self, *args, max_evals=500, main_effects=False, error_bounds=False,
                 batch_size="auto", outputs=None, silent=False):
        """Compute SHAP values using permutation sampling."""

class SamplingExplainer(KernelExplainer):
    """
    Extension of Shapley sampling (IME) method.

    Assumes feature independence and works well with large background datasets.
    """
    def __init__(self, model, data, **kwargs):
        """Initialize sampling explainer with feature independence assumption."""

    def __call__(self, X, y=None, nsamples=2000):
        """Compute SHAP values under feature independence."""
```
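
To make the idea of permutation sampling concrete, here is a minimal from-scratch sketch. It is not the library's implementation: it uses a single baseline vector in place of a masker, and `permutation_shapley` and `toy_model` are hypothetical names introduced only for illustration. It shows why local accuracy holds: within each permutation the marginal contributions telescope to `f(x) - f(baseline)`.

```python
import random


def permutation_shapley(f, x, baseline, n_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over random feature orderings (illustrative sketch only)."""
    rng = random.Random(seed)
    n = len(x)
    phi = [0.0] * n
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)      # start with every feature "masked"
        prev = f(current)
        for i in order:
            current[i] = x[i]         # reveal feature i
            new = f(current)
            phi[i] += new - prev      # marginal contribution of feature i
            prev = new
    return [p / n_permutations for p in phi]


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = permutation_shapley(toy_model, x, baseline)
gap = toy_model(x) - toy_model(baseline)
print(phi, sum(phi), gap)
```

The attributions for the two interacting features are only estimates and depend on the sampled orderings, while their total is fixed by the telescoping sum.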

### Deep Learning Explainers

Specialized explainers for neural networks using gradient-based and compositional approaches.

```python { .api }
class DeepExplainer:
    """
    Deep learning explainer using compositional rules from DeepLIFT.

    Supports TensorFlow and PyTorch models with automatic framework detection.
    Uses backpropagation for efficient computation.
    """
    def __init__(self, model, data, session=None, learning_phase_flags=None):
        """
        Parameters:
        - model: Neural network model
          - TensorFlow: (input_tensors, output_tensor) or tf.keras.Model
          - PyTorch: nn.Module or (model, layer) tuple
        - data: Background dataset matching model input format
        - session: TensorFlow session (optional)
        - learning_phase_flags: Custom learning phase flags (TensorFlow)
        """

    def __call__(self, X) -> Explanation:
        """Compute SHAP values using compositional rules."""

    def shap_values(self, X, ranked_outputs=None, output_rank_order="max", check_additivity=True):
        """
        Legacy interface with output ranking.

        Parameters:
        - ranked_outputs: Number of top outputs to explain
        - output_rank_order: Ranking method ("max", "min", "max_abs")
        - check_additivity: Verify SHAP values sum correctly
        """

class GradientExplainer:
    """
    Gradient-based explainer for neural networks.

    Uses integration over straight-line paths in input space.
    Supports both TensorFlow and PyTorch.
    """
    def __init__(self, model, data, session=None, batch_size=50, local_smoothing=0):
        """
        Parameters:
        - model: Neural network model (TensorFlow or PyTorch)
        - data: Background dataset for integration
        - batch_size: Batch size for gradient computation
        - local_smoothing: Local smoothing parameter
        """

    def __call__(self, X, nsamples=200) -> Explanation:
        """
        Compute SHAP values using gradient integration.

        Parameters:
        - X: Input samples (framework-specific tensor format)
        - nsamples: Number of background samples for integration
        """

    def shap_values(self, X, nsamples=200, ranked_outputs=None,
                    output_rank_order="max", rseed=None, return_variances=False):
        """
        Legacy interface with variance estimation.

        Parameters:
        - return_variances: Return variance estimates along with SHAP values
        - rseed: Random seed for reproducibility
        """
```
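
The phrase "integration over straight-line paths" can be illustrated with a small path-integral sketch in plain Python. This is a simplification under stated assumptions: it uses one fixed baseline rather than a background dataset, an analytic gradient rather than a framework's autodiff, and the names `integrated_gradients`, `f`, and `grad_f` are hypothetical. It is not how `GradientExplainer` is implemented.

```python
def integrated_gradients(grad_f, x, baseline, steps=1000):
    """Attribute f(x) - f(baseline) to features by integrating the gradient
    along the straight line from baseline to x (midpoint rule)."""
    n = len(x)
    totals = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(n):
            totals[i] += g[i]
    # Scale the average gradient by the displacement of each feature
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def f(z):
    # Hypothetical differentiable model
    return z[0] ** 2 + 3.0 * z[0] * z[1]


def grad_f(z):
    # Analytic gradient of f
    return [2.0 * z[0] + 3.0 * z[1], 3.0 * z[0]]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attr = integrated_gradients(grad_f, x, baseline)
print(attr, sum(attr), f(x) - f(baseline))
```

As with SHAP values, the attributions sum to the difference between the model output at `x` and at the baseline; averaging such attributions over many background samples is the step this sketch leaves out.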

### Linear Model Explainers

Optimized explainers for linear models with correlation handling.

```python { .api }
class LinearExplainer:
    """
    Explainer for linear models with feature correlation support.

    Handles sklearn linear models or (coefficients, intercept) tuples
    with efficient computation and correlation-aware masking.
    """
    def __init__(self, model, masker, link="identity", nsamples=1000, feature_perturbation=None):
        """
        Parameters:
        - model: Linear model (sklearn model or (coef, intercept) tuple)
        - masker: Masker object, data matrix, or (mean, covariance) tuple
        - link: Link function for output transformation
        - nsamples: Samples for correlation estimation
        - feature_perturbation: "interventional" or "correlation_dependent" (deprecated)
        """

    def shap_values(self, X):
        """
        Compute SHAP values for linear model.

        Parameters:
        - X: Input samples (array, DataFrame, or sparse matrix)

        Returns:
        Array of SHAP values matching input shape
        """

    @property
    def expected_value(self):
        """Expected value of model output."""

    @property
    def coef(self):
        """Model coefficients."""

    @property
    def intercept(self):
        """Model intercept."""

class AdditiveExplainer:
    """
    Explainer for generalized additive models.

    Optimized for models with only first-order effects (no interactions).
    Assumes additive structure for efficient computation.
    """
    def __init__(self, model, masker, link=None, feature_names=None):
        """Initialize explainer for additive models without interactions."""

    def __call__(self, *args, max_evals=None, silent=False):
        """Compute SHAP values assuming additive model structure."""
```
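
For a linear model with features treated as independent, SHAP values have a simple closed form: each attribution is the coefficient times the feature's deviation from its background mean. The sketch below works this out in plain Python; `linear_shap` and the sample numbers are hypothetical, and the correlation-aware variant mentioned above is not covered.

```python
def linear_shap(coef, intercept, background, x):
    """Closed-form SHAP values for a linear model, assuming independent
    features: phi_i = coef_i * (x_i - mean_i)."""
    n = len(coef)
    means = [sum(row[i] for row in background) / len(background) for i in range(n)]
    expected_value = intercept + sum(w * m for w, m in zip(coef, means))
    phi = [w * (xi - m) for w, xi, m in zip(coef, x, means)]
    return expected_value, phi


coef = [2.0, -1.0, 0.5]
intercept = 1.0
background = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]]  # feature means: 1, 2, 3
x = [3.0, 1.0, 5.0]

expected_value, phi = linear_shap(coef, intercept, background, x)
prediction = intercept + sum(w * xi for w, xi in zip(coef, x))
print(expected_value, phi, prediction)
```

The baseline plus the attributions equals the model's prediction for `x`, which is the same additivity property the other explainers guarantee.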

### Exact and Advanced Explainers

Specialized explainers for specific use cases and mathematical guarantees.

```python { .api }
class ExactExplainer:
    """
    Exact SHAP computation via optimized enumeration.

    Computes exact SHAP values for models with small feature sets (<15 features).
    Uses gray codes for efficient evaluation ordering.
    """
    def __init__(self, model, masker, link="identity", linearize_link=True, feature_names=None):
        """Initialize exact explainer for small feature sets."""

    def __call__(self, *args, max_evals=100000, main_effects=False,
                 error_bounds=False, batch_size="auto", interactions=1, silent=False):
        """
        Compute exact SHAP values.

        Parameters:
        - max_evals: Maximum model evaluations before stopping
        - main_effects: Compute main effects separately
        - error_bounds: Compute confidence bounds
        - interactions: Interaction order to compute (1 for main effects only)
        """

class PartitionExplainer:
    """
    Partition SHAP using hierarchical Owen values.

    Computes Owen values through feature hierarchy with quadratic runtime.
    Handles correlated features via hierarchical clustering.
    """
    def __init__(self, model, masker, output_names=None, link="identity",
                 linearize_link=True, feature_names=None):
        """Initialize partition explainer with hierarchical feature grouping."""

    def __call__(self, *args, max_evals=500, fixed_context=None, main_effects=False,
                 error_bounds=False, batch_size="auto", outputs=None, silent=False):
        """Compute Owen values through feature partitioning."""

class CoalitionExplainer:
    """
    Coalition-based explanations using Winter values.

    Recursive Owen values for predefined feature coalitions.
    """
    def __init__(self, model, masker, output_names=None, link="identity",
                 linearize_link=True, feature_names=None, partition_tree=None):
        """
        Initialize coalition explainer.

        Parameters:
        - partition_tree: Dictionary defining hierarchical feature groups
        """
```
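
Exact enumeration can be sketched directly from the Shapley value definition. The version below is a plain-Python illustration, not the library's algorithm: it evaluates every feature subset against a single baseline, skips the gray-code ordering mentioned above, and uses the hypothetical names `exact_shapley` and `toy_model`.

```python
from itertools import combinations
from math import factorial


def exact_shapley(f, x, baseline):
    """Exact Shapley values by enumerating all feature subsets.
    Cost grows as 2**n model evaluations per feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
print(phi)
```

The exponential number of subsets is why exact methods are practical only for small feature counts, and why the interaction between the second and third features is split evenly between them.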

## Usage Patterns

### Choosing the Right Explainer

- **TreeExplainer**: Use for XGBoost, LightGBM, CatBoost, sklearn tree models (fastest, exact)
- **KernelExplainer**: Universal fallback for any model (slower, model-agnostic)
- **DeepExplainer**: Neural networks with TensorFlow/PyTorch (fast, compositional rules)
- **GradientExplainer**: Neural networks requiring gradient information
- **LinearExplainer**: Linear models with correlation handling
- **ExactExplainer**: Small feature sets requiring mathematical guarantees
- **PartitionExplainer**: Correlated features requiring hierarchical explanations
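
The guidance above can be read as a simple decision procedure. The helper below is one possible encoding of it, offered as a sketch: `suggest_explainer`, its arguments, and the order in which the rules are checked are all assumptions made for illustration, and the 15-feature cutoff is borrowed from the `ExactExplainer` description earlier in this document.

```python
def suggest_explainer(model_family, n_features=None, correlated_features=False):
    """Return the name of a plausible explainer class for a model.

    model_family is a hypothetical label: "tree", "linear", "deep", or
    anything else for an arbitrary black-box model.
    """
    specialized = {
        "tree": "TreeExplainer",
        "linear": "LinearExplainer",
        "deep": "DeepExplainer",  # GradientExplainer is the gradient-based alternative
    }
    if model_family in specialized:
        return specialized[model_family]
    if n_features is not None and n_features < 15:
        return "ExactExplainer"        # small feature sets allow exact enumeration
    if correlated_features:
        return "PartitionExplainer"    # hierarchical handling of correlated features
    return "KernelExplainer"           # universal fallback


print(suggest_explainer("tree"))
print(suggest_explainer("black_box", n_features=8))
print(suggest_explainer("black_box", n_features=100, correlated_features=True))
print(suggest_explainer("black_box"))
```

The returned string can be looked up on the `shap` module with `getattr(shap, name)` once the choice is made.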

### Common Interface Pattern

All explainers follow consistent patterns:

```python
# Modern interface (recommended)
explainer = shap.TreeExplainer(model)
explanation = explainer(X)  # Returns Explanation object

# Legacy interface (backward compatibility)
shap_values = explainer.shap_values(X)  # Returns numpy array

# Access baseline
baseline = explainer.expected_value
```

### Alternative and Benchmark Explainers

Additional explainers for specialized use cases, benchmarking, and integration with other explanation libraries.

```python { .api }
class Coefficient:
    """
    Returns model coefficients as feature attributions.

    Benchmark explainer that simply returns model coefficients for each
    sample. Only works with linear models having a coef_ attribute.
    """
    def __init__(self, model):
        """
        Parameters:
        - model: Linear model with coef_ attribute (sklearn linear models)
        """

    def attributions(self, X):
        """
        Return tiled coefficients as attributions.

        Parameters:
        - X: Input samples (array-like)

        Returns:
        numpy.ndarray: Coefficients tiled for each sample
        """

class LimeTabular:
    """
    LIME integration wrapper for tabular data explanations.

    Wraps lime.lime_tabular.LimeTabularExplainer into SHAP interface.
    Requires lime package installation.
    """
    def __init__(self, model, data, mode="classification"):
        """
        Parameters:
        - model: Model function taking samples and returning predictions
        - data: Background dataset for LIME (array-like or DataFrame)
        - mode: "classification" or "regression"
        """

    def attributions(self, X, nsamples=5000, num_features=None):
        """
        Compute LIME explanations through SHAP interface.

        Parameters:
        - X: Input samples to explain
        - nsamples: Number of samples for LIME perturbation
        - num_features: Number of features to include in explanation

        Returns:
        Attributions array(s) for each output dimension
        """

class Maple:
    """
    Model-Agnostic Locally-Accurate Explanations (MAPLE).

    Local linear approximation method that builds decision trees
    around query points for explanations.
    """
    def __init__(self, model, data, verbose=False):
        """
        Parameters:
        - model: Model function to explain
        - data: Training data for building local models
        - verbose: Print debugging information
        """

    def attributions(self, X):
        """Compute MAPLE attributions using local linear models."""

class TreeMaple(Maple):
    """
    Tree-based variant of MAPLE explainer.

    Uses tree ensemble models as local approximators instead
    of linear models for complex decision boundaries.
    """
    def __init__(self, model, data, verbose=False):
        """Initialize TreeMaple with tree-based local models."""

class Random:
    """
    Random baseline explainer for benchmarking.

    Returns random attributions for comparison with actual explainers.
    Used to establish baseline performance in evaluation studies.
    """
    def __init__(self, model, data, seed=None):
        """
        Parameters:
        - model: Model to explain (used for output dimensionality)
        - data: Background data (used for feature dimensionality)
        - seed: Random seed for reproducibility
        """

    def attributions(self, X):
        """
        Generate random attributions.

        Returns:
        Random attributions matching input shape
        """

class TreeGain:
    """
    Tree gain-based feature importance as explanations.

    Uses feature importance from tree models as local attributions.
    Benchmark method that doesn't provide true SHAP values.
    """
    def __init__(self, model, data=None):
        """
        Parameters:
        - model: Tree-based model with feature_importances_ attribute
        - data: Background data (optional, used for baseline estimation)
        """

    def attributions(self, X):
        """
        Return tree feature importances as attributions.

        Returns:
        Feature importances tiled for each sample
        """
```

**Alternative Explainers Usage:**

```python
import shap
from sklearn.linear_model import LogisticRegression

# Assumes X_train, y_train, and X_test are already defined

# Coefficient explainer for linear models
model = LogisticRegression().fit(X_train, y_train)
explainer = shap.explainers.other.Coefficient(model)
attributions = explainer.attributions(X_test)

# LIME integration (requires: pip install lime)
explainer = shap.explainers.other.LimeTabular(model, X_train, mode="classification")
attributions = explainer.attributions(X_test, nsamples=1000)

# MAPLE local linear explanations
explainer = shap.explainers.other.Maple(model, X_train)
attributions = explainer.attributions(X_test)

# Random baseline for benchmarking
explainer = shap.explainers.other.Random(model, X_train, seed=42)
random_attributions = explainer.attributions(X_test)
```

### Error Handling

Common exceptions and error conditions:

- **InvalidModelError**: Unsupported model type for specific explainer
- **DimensionError**: Input dimension mismatch with training data
- **ConvergenceError**: Optimization failed to converge (exact explainers)
- **ImportError**: Missing optional dependencies (TensorFlow, PyTorch, lime)
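
Since several explainers depend on optional packages, it can help to check for them before constructing an explainer. The following is a small standard-library sketch; `missing_optional_dependencies` is a hypothetical helper, and the module names `tensorflow`, `torch`, and `lime` are the usual import names for the dependencies listed above.

```python
from importlib.util import find_spec


def missing_optional_dependencies(required):
    """Return the top-level module names in `required` that are not installed.

    find_spec locates a module without importing it, so the check is cheap.
    """
    return sorted(name for name in required if find_spec(name) is None)


missing = missing_optional_dependencies(["tensorflow", "torch", "lime"])
if missing:
    print("Optional dependencies not installed: " + ", ".join(missing))
else:
    print("All optional dependencies are available.")
```

Running this up front turns a late `ImportError` inside an explainer call into an early, readable message.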
