# Adversarial Training

Adversarial training uses neural networks to learn fair representations while maintaining predictive utility. These methods train an adversary network alongside the predictor and use it to remove sensitive information from the learned representations.

## Capabilities

### AdversarialFairnessClassifier

Implements adversarial fairness for classification tasks using neural networks. It trains a predictor network alongside an adversary network that tries to predict the sensitive attributes from the predictor's internal representations.
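To make the role of `alpha` concrete, here is one common formulation of adversarial debiasing, following Zhang, Lemoine, and Mitchell (2018). The symbols are introduced here for illustration and are not part of the API: $W$ and $U$ are the predictor and adversary weights, $L_P$ and $L_A$ their losses, and $\eta$ a learning rate. Treat this as a sketch of the idea; check the library documentation for whether a given fairlearn version follows exactly this rule.

```latex
\begin{aligned}
U &\leftarrow U - \eta \, \nabla_U L_A \\
W &\leftarrow W - \eta \left( \nabla_W L_P
      - \operatorname{proj}_{\nabla_W L_A} \nabla_W L_P
      - \alpha \, \nabla_W L_A \right)
\end{aligned}
```

The projection term removes the part of the predictor's gradient that would also help the adversary, and the $\alpha$ term actively moves the predictor in the direction that increases the adversary's loss. A larger $\alpha$ therefore puts more weight on fairness relative to predictive accuracy.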

```python { .api }
class AdversarialFairnessClassifier:
    def __init__(self, backend="torch", *, predictor_model=None, adversary_model=None,
                 alpha=1.0, epochs=1, batch_size=32, shuffle=True, progress_updates=None,
                 skip_validation=False, callbacks=None, random_state=None):
        """
        Adversarial fairness classifier using neural networks.

        Parameters:
        - backend: str, neural network backend ("torch" or "tensorflow")
        - predictor_model: neural network model for prediction task
        - adversary_model: neural network model for adversary task
        - alpha: float, strength of adversarial training (higher = more fairness emphasis)
        - epochs: int, number of training epochs
        - batch_size: int, batch size for training
        - shuffle: bool, whether to shuffle training data
        - progress_updates: number, seconds between printed training progress updates
        - skip_validation: bool, whether to skip input validation
        - callbacks: callable or list of callables, invoked during training
        - random_state: int, random seed for reproducibility
        """

    def fit(self, X, y, *, sensitive_features, sample_weight=None):
        """
        Fit the adversarial fairness classifier.

        Parameters:
        - X: array-like, feature matrix
        - y: array-like, target values
        - sensitive_features: array-like, sensitive feature values
        - sample_weight: array-like, optional sample weights

        Returns:
        self
        """

    def predict(self, X):
        """
        Make predictions using the trained fair classifier.

        Parameters:
        - X: array-like, feature matrix

        Returns:
        array-like: Predicted class labels
        """

    def predict_proba(self, X):
        """
        Predict class probabilities.

        Parameters:
        - X: array-like, feature matrix

        Returns:
        array-like: Predicted class probabilities, shape (n_samples, n_classes)
        """
```

#### Usage Example

```python
from fairlearn.adversarial import AdversarialFairnessClassifier
import numpy as np

# Synthetic placeholder data for illustration; replace with your own dataset
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
A = rng.integers(0, 2, size=200)  # binary sensitive feature
X_train, X_test = X[:150], X[150:]
y_train, A_train = y[:150], A[:150]

# Create adversarial fairness classifier
afc = AdversarialFairnessClassifier(
    backend="torch",      # or "tensorflow"
    alpha=1.0,            # Fairness strength
    epochs=50,            # Training epochs
    batch_size=64,
    random_state=42
)

# Fit the model
afc.fit(X_train, y_train, sensitive_features=A_train)

# Make predictions
predictions = afc.predict(X_test)
probabilities = afc.predict_proba(X_test)
```

### AdversarialFairnessRegressor

Implements adversarial fairness for regression tasks, training a predictor to minimize prediction error while preventing an adversary from predicting sensitive attributes.

```python { .api }
class AdversarialFairnessRegressor:
    def __init__(self, backend="torch", *, predictor_model=None, adversary_model=None,
                 alpha=1.0, epochs=1, batch_size=32, shuffle=True, progress_updates=None,
                 skip_validation=False, callbacks=None, random_state=None):
        """
        Adversarial fairness regressor using neural networks.

        Parameters:
        - backend: str, neural network backend ("torch" or "tensorflow")
        - predictor_model: neural network model for regression task
        - adversary_model: neural network model for adversary task
        - alpha: float, strength of adversarial training
        - epochs: int, number of training epochs
        - batch_size: int, batch size for training
        - shuffle: bool, whether to shuffle training data
        - progress_updates: number, seconds between printed training progress updates
        - skip_validation: bool, whether to skip input validation
        - callbacks: callable or list of callables, invoked during training
        - random_state: int, random seed for reproducibility
        """

    def fit(self, X, y, *, sensitive_features, sample_weight=None):
        """
        Fit the adversarial fairness regressor.

        Parameters:
        - X: array-like, feature matrix
        - y: array-like, continuous target values
        - sensitive_features: array-like, sensitive feature values
        - sample_weight: array-like, optional sample weights

        Returns:
        self
        """

    def predict(self, X):
        """
        Make regression predictions.

        Parameters:
        - X: array-like, feature matrix

        Returns:
        array-like: Predicted continuous values
        """
```

## Backend Support

### PyTorch Backend

The default backend uses PyTorch for neural network implementation:

```python
# Using PyTorch backend (default)
classifier = AdversarialFairnessClassifier(
    backend="torch",
    epochs=100,
    batch_size=128
)
```

### TensorFlow Backend

TensorFlow is available as an alternative backend:

```python
# Using TensorFlow backend
classifier = AdversarialFairnessClassifier(
    backend="tensorflow",
    epochs=100,
    batch_size=128
)
```

## Custom Neural Network Models

### Custom Predictor Models

You can provide custom neural network architectures:

```python
import torch
import torch.nn as nn
from fairlearn.adversarial import AdversarialFairnessClassifier

# Define custom predictor model
class CustomPredictor(nn.Module):
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.layers(x)

# Use custom model
predictor = CustomPredictor(input_dim=X_train.shape[1])

classifier = AdversarialFairnessClassifier(
    backend="torch",
    predictor_model=predictor,
    alpha=2.0,
    epochs=200
)
```

### Custom Adversary Models

Customize the adversary network architecture:

```python
class CustomAdversary(nn.Module):
    def __init__(self, input_dim, n_sensitive_classes):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 16),
            nn.ReLU(),
            nn.Linear(16, n_sensitive_classes),
            nn.Softmax(dim=1)
        )

    def forward(self, x):
        return self.layers(x)

# Create adversary for binary sensitive attribute
adversary = CustomAdversary(
    input_dim=64,  # Should match predictor's representation size
    n_sensitive_classes=2
)

classifier = AdversarialFairnessClassifier(
    backend="torch",
    predictor_model=predictor,
    adversary_model=adversary,
    alpha=1.5
)
```

## Training Configuration

### Hyperparameter Tuning

Key hyperparameters to tune for adversarial training:

```python
# Alpha controls fairness-accuracy trade-off
alphas = [0.1, 0.5, 1.0, 2.0, 5.0]

results = {}
for alpha in alphas:
    classifier = AdversarialFairnessClassifier(
        alpha=alpha,
        epochs=100,
        batch_size=64,
        random_state=42
    )

    classifier.fit(X_train, y_train, sensitive_features=A_train)
    predictions = classifier.predict(X_test)

    # Evaluate fairness and accuracy
    # (evaluate_model is a user-defined helper, not a fairlearn function)
    results[alpha] = evaluate_model(predictions, y_test, A_test)
```
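The loop above calls `evaluate_model`, which this guide does not define. A minimal sketch of what such a helper might look like follows, assuming binary 0/1 labels and predictions. The function name matches the call above, but its body and the returned keys (`accuracy`, `demographic_parity_difference`) are assumptions made for illustration. It is written in plain Python so the arithmetic is visible.

```python
def evaluate_model(predictions, y_true, sensitive):
    """Return accuracy and the demographic parity difference.

    Hypothetical helper (not part of fairlearn). Assumes binary 0/1
    predictions and labels.
    """
    predictions, y_true, sensitive = list(predictions), list(y_true), list(sensitive)
    accuracy = sum(p == t for p, t in zip(predictions, y_true)) / len(y_true)

    # Selection rate (share of positive predictions) per sensitive group
    rates = {}
    for group in set(sensitive):
        group_preds = [p for p, s in zip(predictions, sensitive) if s == group]
        rates[group] = sum(group_preds) / len(group_preds)

    return {
        "accuracy": accuracy,
        "demographic_parity_difference": max(rates.values()) - min(rates.values()),
    }


# Tiny hand-made example: two groups, three rows each
example = evaluate_model(
    predictions=[1, 0, 1, 1, 0, 0],
    y_true=[1, 0, 0, 1, 0, 1],
    sensitive=["a", "a", "a", "b", "b", "b"],
)
print(example)
```

In practice the same quantities could be taken from `fairlearn.metrics`. The hand-rolled version is only meant to show what is being compared across values of `alpha`.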

### Training Callbacks

Monitor training progress with custom callbacks:

```python
def progress_callback(estimator, step):
    """Callback to monitor training progress.

    Assumes fairlearn calls each callback as callback(estimator, step)
    after every batch; check this against your installed version.
    """
    if step % 100 == 0:
        print(f"Completed training step {step}")

classifier = AdversarialFairnessClassifier(
    callbacks=[progress_callback],
    progress_updates=10,  # assumed: print built-in loss updates every 10 seconds
    epochs=200
)
```

## Advanced Usage

### Multi-class Sensitive Features

Handle sensitive attributes with multiple categories:

```python
# Sensitive feature with 3 categories, one label per training row
groups = ['group_A', 'group_B', 'group_C']
sensitive_features = [groups[i % len(groups)] for i in range(len(X_train))]

classifier = AdversarialFairnessClassifier(
    alpha=1.0,
    epochs=150
)

classifier.fit(X_train, y_train, sensitive_features=sensitive_features)
```

### Batch Size Selection

Choose appropriate batch sizes based on dataset size:

```python
# For small datasets
small_classifier = AdversarialFairnessClassifier(batch_size=16)

# For large datasets
large_classifier = AdversarialFairnessClassifier(batch_size=256)

# Adaptive batch size based on data size (never below 1)
batch_size = max(1, min(128, len(X_train) // 10))
adaptive_classifier = AdversarialFairnessClassifier(batch_size=batch_size)
```
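Batch size also determines how many update steps one epoch performs, which matters when reasoning about training time or about anything counted in steps. A small sketch of that arithmetic follows, assuming the final partial batch is kept. `steps_per_epoch` is a hypothetical helper name, not a fairlearn function.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches in one pass over the data, keeping the final partial batch."""
    return math.ceil(n_samples / batch_size)


# Compare the batch sizes used above on a dataset of 1000 rows
for batch_size in (16, 64, 256):
    print(batch_size, steps_per_epoch(1000, batch_size))
```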

### Early Stopping

Implement custom early stopping:

```python
class EarlyStoppingCallback:
    """Stop training once a monitored loss stops improving.

    Assumes fairlearn calls callbacks as callback(estimator, step) after
    every batch and stops training when a callback returns True.
    """

    def __init__(self, loss_fn, patience=10, min_delta=0.001):
        self.loss_fn = loss_fn  # user-supplied: estimator -> loss to minimize
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float('inf')
        self.wait = 0

    def __call__(self, estimator, step):
        loss = self.loss_fn(estimator)
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss
            self.wait = 0
        else:
            self.wait += 1

        if self.wait >= self.patience:
            print(f"Early stopping at step {step}")
            return True  # Signal to stop training
        return False

def validation_error(estimator):
    # Hypothetical helper: error rate on a held-out split (X_val, y_val)
    return (estimator.predict(X_val) != y_val).mean()

early_stopping = EarlyStoppingCallback(validation_error, patience=15)

classifier = AdversarialFairnessClassifier(
    callbacks=[early_stopping],
    epochs=1000  # Large number, early stopping will control actual epochs
)
```

## Integration with Evaluation

Combine with fairness assessment tools:

```python
from fairlearn.adversarial import AdversarialFairnessClassifier
from fairlearn.metrics import MetricFrame, equalized_odds_difference

# Train adversarial model
afc = AdversarialFairnessClassifier(alpha=1.0, epochs=100)
afc.fit(X_train, y_train, sensitive_features=A_train)

# Get predictions and evaluate
predictions = afc.predict(X_test)
probabilities = afc.predict_proba(X_test)

# Assess fairness
fairness_metrics = MetricFrame(
    metrics={
        'accuracy': lambda y_true, y_pred: (y_true == y_pred).mean(),
        'selection_rate': lambda y_true, y_pred: y_pred.mean()
    },
    y_true=y_test,
    y_pred=predictions,
    sensitive_features=A_test
)

print("Adversarial fairness results:")
print(fairness_metrics.by_group)
print(f"Equalized odds difference: {equalized_odds_difference(y_test, predictions, sensitive_features=A_test)}")
```
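To read the last number printed above, it helps to see what an equalized odds gap measures. The sketch below recomputes it by hand in plain Python on a tiny made-up example. It assumes the metric is the larger of the between-group differences in true positive rate and false positive rate, which is my understanding of fairlearn's default behaviour. `rate` and `equalized_odds_gap` are hypothetical names used only here.

```python
def rate(preds, mask):
    """Share of positive predictions among the rows selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected)


def equalized_odds_gap(y_true, y_pred, sensitive):
    """Largest between-group gap in TPR or FPR (binary 0/1 labels).

    Assumes every group contains both positive and negative true labels.
    """
    tprs, fprs = [], []
    for group in sorted(set(sensitive)):
        in_group = [s == group for s in sensitive]
        tprs.append(rate(y_pred, [g and t == 1 for g, t in zip(in_group, y_true)]))
        fprs.append(rate(y_pred, [g and t == 0 for g, t in zip(in_group, y_true)]))
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Made-up labels: group "a" is predicted perfectly, group "b" is not
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = equalized_odds_gap(y_true, y_pred, groups)
print(gap)  # prints 0.5
```

A value of 0 would mean both groups have identical true and false positive rates. Here group "b" has a true positive rate of 0.5 against 1.0 for group "a", so the gap is 0.5.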

## Best Practices

### Model Architecture

1. **Predictor Complexity**: Use sufficiently complex predictors for your task
2. **Adversary Simplicity**: Keep adversary simpler than predictor to avoid overfitting
3. **Representation Size**: Choose appropriate intermediate representation dimensions

### Training Strategy

1. **Learning Rates**: Use different learning rates for predictor and adversary
2. **Training Balance**: Monitor that neither predictor nor adversary dominates
3. **Convergence**: Look for stable oscillation rather than monotonic convergence
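
The convergence advice above can be turned into a rough check on a recorded loss history. The sketch below is one possible heuristic, not an established criterion: it compares the mean loss over the two most recent windows and treats a small difference as "oscillating but no longer drifting". The function name, window size, and tolerance are all assumptions chosen for illustration.

```python
def is_stably_oscillating(losses, window=20, tolerance=0.05):
    """Heuristic: compare the mean of the last two windows of a loss history.

    Returns True when the two means differ by less than `tolerance`,
    i.e. the loss may still bounce around but has stopped drifting.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to judge
    recent = losses[-window:]
    previous = losses[-2 * window:-window]
    recent_mean = sum(recent) / window
    previous_mean = sum(previous) / window
    return abs(recent_mean - previous_mean) < tolerance


# Synthetic histories for illustration only
drifting = [1.0 - 0.02 * i for i in range(40)]                      # still decreasing
oscillating = [0.7 + (0.1 if i % 2 else -0.1) for i in range(40)]   # bouncing around 0.7

print(is_stably_oscillating(drifting))     # prints False
print(is_stably_oscillating(oscillating))  # prints True
```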

### Hyperparameter Guidelines

- **Alpha = 0.1-0.5**: Mild fairness emphasis, preserves most accuracy
- **Alpha = 1.0-2.0**: Balanced fairness-accuracy trade-off
- **Alpha = 5.0+**: Strong fairness emphasis, may sacrifice accuracy

```python
# Recommended starting configuration
classifier = AdversarialFairnessClassifier(
    backend="torch",
    alpha=1.0,        # Balanced trade-off
    epochs=100,       # Starting point; adjust after monitoring the losses
    batch_size=64,    # Good balance for most datasets
    shuffle=True,     # Important for training stability
    random_state=42   # For reproducibility
)
```
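
The alpha ranges above are rules of thumb, so the final value is usually picked from a sweep. One simple selection rule is to set an accuracy floor and take the fairest configuration that stays above it. The sketch below assumes a `results` dictionary keyed by alpha with `accuracy` and `fairness_gap` entries. That layout, the helper name `pick_alpha`, and the numbers are made up for illustration.

```python
def pick_alpha(results, min_accuracy):
    """Return the alpha with the smallest fairness gap among those that
    keep accuracy at or above `min_accuracy`; None if none qualifies."""
    eligible = {a: r for a, r in results.items() if r["accuracy"] >= min_accuracy}
    if not eligible:
        return None
    return min(eligible, key=lambda a: eligible[a]["fairness_gap"])


# Made-up numbers for illustration; real values come from your own sweep
results = {
    0.1: {"accuracy": 0.86, "fairness_gap": 0.21},
    1.0: {"accuracy": 0.84, "fairness_gap": 0.09},
    5.0: {"accuracy": 0.78, "fairness_gap": 0.03},
}
print(pick_alpha(results, min_accuracy=0.80))  # prints 1.0
```

Other rules are equally defensible, for example fixing a maximum fairness gap and maximizing accuracy. The right one depends on which constraint is fixed by the application.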