# Fairness Assessment

Comprehensive tools for measuring fairness through metrics disaggregated across sensitive groups. The assessment module provides the MetricFrame class for computing any metric per subgroup, along with specialized functions for specific fairness criteria such as demographic parity, equalized odds, and equal opportunity.

## Capabilities

### MetricFrame

The central class for fairness assessment. It computes metrics across subgroups defined by sensitive features, provides a disaggregated view of any metric function, and offers comparison methods for fairness evaluation.

```python { .api }
class MetricFrame:
    def __init__(self, *, metrics, y_true, y_pred,
                 sensitive_features, control_features=None,
                 sample_params=None, n_boot=None, ci_quantiles=None,
                 random_state=None):
        """
        Collection of disaggregated metric values.

        Parameters:
        - metrics: callable or dict, metric functions to compute
        - y_true: array-like, true target values
        - y_pred: array-like, predicted values
        - sensitive_features: array-like, sensitive feature values for grouping
        - control_features: array-like, optional control feature values
        - sample_params: dict, optional parameters for metric functions
        - n_boot: int, number of bootstrap samples for confidence intervals
        - ci_quantiles: list[float], quantiles for confidence intervals
        - random_state: int or RandomState, controls bootstrap sample generation
        """

    @property
    def overall(self):
        """Overall metrics computed on entire dataset."""

    @property
    def by_group(self):
        """Metrics computed for each sensitive feature group."""

    def group_max(self):
        """Maximum metric value across groups."""

    def group_min(self):
        """Minimum metric value across groups."""

    def difference(self, method="between_groups"):
        """Difference between group metrics."""

    def ratio(self, method="between_groups"):
        """Ratio between group metrics."""

    @property
    def overall_ci(self):
        """Confidence intervals for overall metrics."""

    @property
    def by_group_ci(self):
        """Confidence intervals for group metrics."""

    def group_max_ci(self):
        """Confidence intervals for maximum metric values."""

    def group_min_ci(self):
        """Confidence intervals for minimum metric values."""

    def difference_ci(self, method="between_groups"):
        """Confidence intervals for differences between groups."""

    def ratio_ci(self, method="between_groups"):
        """Confidence intervals for ratios between groups."""
```

#### Usage Example

```python
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, precision_score

# Define multiple metrics
metrics = {
    'accuracy': accuracy_score,
    'precision': precision_score
}

# Create MetricFrame
mf = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_features
)

# Access results
print(mf.overall)       # Overall metrics
print(mf.by_group)      # Metrics by group
print(mf.difference())  # Differences between groups
print(mf.ratio())       # Ratios between groups
```
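
If `n_boot` and `ci_quantiles` are set, bootstrap confidence intervals become available through the `_ci` accessors. A minimal sketch, reusing the variables from the example above (100 resamples and the 2.5%/97.5% quantiles are illustrative choices):

```python
# Same data as above, but with bootstrap confidence intervals enabled
mf_ci = MetricFrame(
    metrics=metrics,
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_features,
    n_boot=100,                   # number of bootstrap resamples
    ci_quantiles=[0.025, 0.975],  # 95% confidence interval
    random_state=42
)

print(mf_ci.by_group_ci)      # confidence intervals for each group metric
print(mf_ci.difference_ci())  # confidence intervals for group differences
```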

### Demographic Parity Metrics

Functions for measuring demographic parity, which requires equal positive prediction rates across groups.

```python { .api }
def demographic_parity_difference(y_true, y_pred, *, sensitive_features,
                                  method="between_groups", sample_weight=None):
    """
    Calculate difference in selection rates between groups.

    Parameters:
    - y_true: array-like, true target values (ignored for selection rate)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights

    Returns:
    float: Maximum difference in selection rates between any two groups
    """

def demographic_parity_ratio(y_true, y_pred, *, sensitive_features,
                             method="between_groups", sample_weight=None):
    """
    Calculate ratio of selection rates between groups.

    Parameters:
    - y_true: array-like, true target values (ignored for selection rate)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights

    Returns:
    float: Minimum ratio of selection rates between any two groups
    """
```
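
A short usage sketch, reusing `y_test`, `y_pred`, and `sensitive_features` from the MetricFrame example above:

```python
from fairlearn.metrics import demographic_parity_difference, demographic_parity_ratio

# A difference of 0.0 (ratio of 1.0) means all groups are selected at the same rate
dp_diff = demographic_parity_difference(
    y_test, y_pred, sensitive_features=sensitive_features
)
dp_ratio = demographic_parity_ratio(
    y_test, y_pred, sensitive_features=sensitive_features
)
print(dp_diff, dp_ratio)
```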

### Equalized Odds Metrics

Functions for measuring equalized odds, which requires equal true positive and false positive rates across groups.

```python { .api }
def equalized_odds_difference(y_true, y_pred, *, sensitive_features,
                              method="between_groups", sample_weight=None,
                              agg="worst_case"):
    """
    Calculate the difference in true positive and false positive rates across groups.

    Parameters:
    - y_true: array-like, true target values (binary)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights
    - agg: str, aggregation method ("worst_case" or "mean")

    Returns:
    float: The larger of the TPR and FPR differences between groups
           (their mean when agg="mean")
    """

def equalized_odds_ratio(y_true, y_pred, *, sensitive_features,
                         method="between_groups", sample_weight=None,
                         agg="worst_case"):
    """
    Calculate the ratio of true positive and false positive rates across groups.

    Parameters:
    - y_true: array-like, true target values (binary)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights
    - agg: str, aggregation method ("worst_case" or "mean")

    Returns:
    float: The smaller of the TPR and FPR ratios between groups
           (their mean when agg="mean")
    """
```
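
A brief sketch, again reusing the variables from the earlier example; the `agg` keyword switches between the worst-case and mean aggregations documented above:

```python
from fairlearn.metrics import equalized_odds_difference

# Worst-case gap across TPR and FPR (0.0 means equalized odds is satisfied)
eo_worst = equalized_odds_difference(
    y_test, y_pred, sensitive_features=sensitive_features
)

# Mean of the TPR and FPR gaps instead of the worst case
eo_mean = equalized_odds_difference(
    y_test, y_pred, sensitive_features=sensitive_features, agg="mean"
)
print(eo_worst, eo_mean)
```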

### Equal Opportunity Metrics

Functions for measuring equal opportunity, which requires equal true positive rates across groups.

```python { .api }
def equal_opportunity_difference(y_true, y_pred, *, sensitive_features,
                                 method="between_groups", sample_weight=None):
    """
    Calculate difference in true positive rates between groups.

    Parameters:
    - y_true: array-like, true target values (binary)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights

    Returns:
    float: Maximum difference in TPR between any two groups
    """

def equal_opportunity_ratio(y_true, y_pred, *, sensitive_features,
                            method="between_groups", sample_weight=None):
    """
    Calculate ratio of true positive rates between groups.

    Parameters:
    - y_true: array-like, true target values (binary)
    - y_pred: array-like, predicted values (binary)
    - sensitive_features: array-like, sensitive feature values
    - method: str, comparison method ("between_groups" or "to_overall")
    - sample_weight: array-like, optional sample weights

    Returns:
    float: Minimum ratio of TPR between any two groups
    """
```
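
A minimal sketch following the signatures documented above (assuming these helpers are available in the installed version; variables are reused from the earlier example):

```python
from fairlearn.metrics import equal_opportunity_difference

# Gap in true positive rate (recall) between groups; 0.0 satisfies equal opportunity
eopp_diff = equal_opportunity_difference(
    y_test, y_pred, sensitive_features=sensitive_features
)
print(eopp_diff)
```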

### Base Metrics

Fundamental metric functions that can be used with MetricFrame or independently.

```python { .api }
def true_positive_rate(y_true, y_pred, *, sample_weight=None, pos_label=1):
    """
    Calculate true positive rate (sensitivity/recall).

    Parameters:
    - y_true: array-like, true target values
    - y_pred: array-like, predicted values
    - sample_weight: array-like, optional sample weights
    - pos_label: label considered as positive

    Returns:
    float: True positive rate
    """

def false_positive_rate(y_true, y_pred, *, sample_weight=None, pos_label=1):
    """Calculate false positive rate."""

def true_negative_rate(y_true, y_pred, *, sample_weight=None, pos_label=1):
    """Calculate true negative rate (specificity)."""

def false_negative_rate(y_true, y_pred, *, sample_weight=None, pos_label=1):
    """Calculate false negative rate."""

def selection_rate(y_true, y_pred, *, sample_weight=None, pos_label=1):
    """Calculate selection rate (positive prediction rate)."""

def mean_prediction(y_true, y_pred, *, sample_weight=None):
    """Calculate mean of predictions."""

def count(y_true, y_pred, *, sample_weight=None):
    """Count number of samples."""
```
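
These functions plug directly into MetricFrame for per-group reporting. A brief sketch, reusing the variables from the earlier example:

```python
from fairlearn.metrics import (
    MetricFrame, selection_rate, true_positive_rate, false_positive_rate, count
)

rates = MetricFrame(
    metrics={
        'selection_rate': selection_rate,
        'tpr': true_positive_rate,
        'fpr': false_positive_rate,
        'count': count,
    },
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_features
)
print(rates.by_group)  # per-group selection rate, TPR, FPR, and sample count
```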

### Derived Metrics

Create new fairness metrics from existing metric functions.

```python { .api }
def make_derived_metric(*, metric, transform, sample_weight_names=None):
    """
    Create a derived metric with specified aggregation method.

    Parameters:
    - metric: callable, base metric function
    - transform: str, aggregation method ('difference', 'ratio', 'group_min', 'group_max')
    - sample_weight_names: list, parameter names for sample weights

    Returns:
    callable: New derived metric function
    """
```
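
For example, a recall-difference metric can be derived from scikit-learn's `recall_score`; the derived function accepts the base metric's arguments plus a `sensitive_features` keyword (a brief sketch, reusing variables from the earlier example):

```python
from fairlearn.metrics import make_derived_metric
from sklearn.metrics import recall_score

# Derived metric: largest recall gap between groups
recall_score_difference = make_derived_metric(
    metric=recall_score, transform='difference'
)
print(recall_score_difference(
    y_test, y_pred, sensitive_features=sensitive_features
))
```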

### Visualization

Plotting functions for visualizing model comparison across fairness metrics.

```python { .api }
def plot_model_comparison(dashboard_predicted, *,
                          sensitive_features,
                          conf_intervals=False):
    """
    Plot radar chart comparing multiple models across fairness and performance metrics.

    Parameters:
    - dashboard_predicted: dict, mapping of model names to prediction dictionaries
    - sensitive_features: array-like, sensitive feature values
    - conf_intervals: bool, whether to show confidence intervals

    Returns:
    matplotlib figure object
    """
```

## Generated Metrics

The metrics module dynamically generates additional fairness metrics for many base metrics using the pattern `<metric>_{difference,ratio,group_min,group_max}`. For example:

- `accuracy_score_difference`
- `precision_score_ratio`
- `recall_score_group_min`
- `f1_score_group_max`

These generated metrics provide convenient access to common fairness assessments without manually using `make_derived_metric`.
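
For instance, the difference variant of accuracy can be imported and called directly (a brief sketch; the available names follow the pattern above, and the variables are reused from the earlier example):

```python
from fairlearn.metrics import accuracy_score_difference

# Largest accuracy gap between any two sensitive-feature groups
acc_gap = accuracy_score_difference(
    y_test, y_pred, sensitive_features=sensitive_features
)
print(acc_gap)
```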