# Regularizers

Regularization techniques for preventing overfitting in neural networks. Regularizers add penalty terms to the loss function based on layer weights, encouraging simpler models that generalize better. Keras provides standard regularization methods including L1, L2, and orthogonal regularization.

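To make the penalty mechanism concrete, here is a minimal NumPy sketch (illustrative only, not Keras internals) of how a weight penalty is folded into the training loss:

```python
import numpy as np

# Illustration: the optimizer minimizes data loss plus the weight penalty.
# Keras adds the penalty automatically when a layer has a regularizer.
weights = np.array([0.5, -1.5, 2.0])
data_loss = 0.25                        # e.g., a mean-squared-error value
l2_penalty = 0.01 * np.sum(weights ** 2)
total_loss = data_loss + l2_penalty     # larger weights -> larger total loss
```

Because the penalty grows with the weights, minimizing `total_loss` trades goodness of fit against weight magnitude.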
## Capabilities

### L1 and L2 Regularization

Standard weight decay regularization techniques that penalize large weights.

```python { .api }
class L1:
    """L1 regularization (Lasso)."""
    def __init__(self, l1=0.01): ...

class L2:
    """L2 regularization (Ridge)."""
    def __init__(self, l2=0.01): ...

class L1L2:
    """Combined L1 and L2 regularization (Elastic Net)."""
    def __init__(self, l1=0.0, l2=0.0): ...
```
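Numerically, the penalties these classes compute come out to `l1 * sum(|w|)`, `l2 * sum(w**2)`, and their sum for `L1L2` (shown here with NumPy for illustration; Keras computes the same quantities with backend ops):

```python
import numpy as np

w = np.array([1.0, -2.0, 3.0])

l1_penalty = 0.01 * np.sum(np.abs(w))      # 0.01 * (1 + 2 + 3) = 0.06
l2_penalty = 0.01 * np.sum(np.square(w))   # 0.01 * (1 + 4 + 9) = 0.14
l1l2_penalty = l1_penalty + l2_penalty     # elastic net: L1 + L2
```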

### Specialized Regularization

Advanced regularization techniques for specific architectural needs.

```python { .api }
class OrthogonalRegularizer:
    """Orthogonal regularization for weight matrices."""
    def __init__(self, factor=0.01, mode='rows'): ...
```
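In `'rows'` mode the penalty encourages the rows of the weight matrix to be mutually orthogonal. A simplified NumPy sketch of the idea (Keras's exact scaling and normalization may differ):

```python
import numpy as np

def orthogonal_penalty(w, factor=0.01):
    # Normalize rows, then penalize off-diagonal entries of the Gram
    # matrix: these are pairwise cosine similarities between rows,
    # and they are all zero when the rows are mutually orthogonal.
    rows = w / np.linalg.norm(w, axis=1, keepdims=True)
    gram = rows @ rows.T
    off_diag = gram - np.diag(np.diag(gram))
    return factor * np.sum(np.abs(off_diag))

penalty_orthogonal = orthogonal_penalty(np.eye(3))       # orthogonal rows
penalty_collinear = orthogonal_penalty(np.ones((2, 3)))  # identical rows
```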

### Base Classes and Utilities

Base classes and utility functions for working with regularizers.

```python { .api }
class Regularizer:
    """Base class for all regularizers."""
    def __call__(self, weights): ...
    def get_config(self): ...

def get(identifier):
    """Retrieve a regularizer by name or instance."""

def serialize(regularizer):
    """Serialize a regularizer to configuration."""

def deserialize(config, custom_objects=None):
    """Deserialize a regularizer from configuration."""
```
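The contract the base class defines is small: a regularizer is a callable mapping a weight tensor to a scalar penalty, plus `get_config()` so it can be serialized and rebuilt. A standalone sketch of that contract, using NumPy in place of backend ops (`SumAbsRegularizer` is a hypothetical name for illustration, not part of Keras):

```python
import numpy as np

class SumAbsRegularizer:
    """Minimal object satisfying the Regularizer contract:
    callable on a weight tensor, returning a scalar penalty,
    plus get_config() for serialization."""

    def __init__(self, strength=0.01):
        self.strength = strength

    def __call__(self, weights):
        return self.strength * np.sum(np.abs(weights))

    def get_config(self):
        return {'strength': self.strength}

reg = SumAbsRegularizer(0.02)
penalty = reg(np.array([1.0, -3.0]))              # 0.02 * (1 + 3)
restored = SumAbsRegularizer(**reg.get_config())  # config round-trip
```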

### Function Aliases

Convenient function aliases for creating regularizers.

```python { .api }
def l1(l1=0.01):
    """Create L1 regularizer."""

def l2(l2=0.01):
    """Create L2 regularizer."""

def l1_l2(l1=0.0, l2=0.0):
    """Create L1L2 regularizer."""

def orthogonal_regularizer(factor=0.01, mode='rows'):
    """Create orthogonal regularizer."""
```

## Usage Examples

### Basic Regularization

```python
from keras import layers, regularizers

# L2 regularization on a Dense layer
dense_layer = layers.Dense(
    64,
    kernel_regularizer=regularizers.L2(0.01),
    bias_regularizer=regularizers.L2(0.01),
)

# Using a string identifier (default strength)
dense_layer = layers.Dense(64, kernel_regularizer='l2')

# Using the function form
dense_layer = layers.Dense(64, kernel_regularizer=regularizers.l2(0.001))
```

### Combined Regularization

```python
from keras import layers, regularizers

# L1 + L2 regularization (Elastic Net)
dense_layer = layers.Dense(
    64,
    kernel_regularizer=regularizers.L1L2(l1=0.001, l2=0.01),
    activity_regularizer=regularizers.L1(0.01),
)

# Using the function form
dense_layer = layers.Dense(
    64,
    kernel_regularizer=regularizers.l1_l2(l1=0.001, l2=0.01),
)
```

### Convolutional Layer Regularization

```python
from keras import layers, regularizers

# Regularized convolutional layer
conv_layer = layers.Conv2D(
    32, (3, 3),
    kernel_regularizer=regularizers.L2(0.001),
    bias_regularizer=regularizers.L1(0.001),
)

# Activity regularization (penalizes layer outputs rather than weights)
conv_layer = layers.Conv2D(
    32, (3, 3),
    activity_regularizer=regularizers.L1(0.01),
)
```

### Orthogonal Regularization

```python
from keras import layers, regularizers

# Orthogonal regularization for recurrent layers
lstm_layer = layers.LSTM(
    128,
    kernel_regularizer=regularizers.OrthogonalRegularizer(factor=0.01),
    recurrent_regularizer=regularizers.orthogonal_regularizer(0.01),
)

# Different modes for orthogonal regularization
dense_layer = layers.Dense(
    64,
    kernel_regularizer=regularizers.OrthogonalRegularizer(
        factor=0.01, mode='columns'),
)
```

### Custom Regularizer

```python
import keras
from keras import layers, regularizers

class CustomRegularizer(regularizers.Regularizer):
    def __init__(self, strength=0.01):
        self.strength = strength

    def __call__(self, weights):
        # Custom regularization logic (e.g., group sparsity:
        # the L2 norm of each column, summed across columns)
        return self.strength * keras.ops.sum(keras.ops.sqrt(
            keras.ops.sum(keras.ops.square(weights), axis=0)))

    def get_config(self):
        return {'strength': self.strength}

# Use the custom regularizer
dense_layer = layers.Dense(64, kernel_regularizer=CustomRegularizer(0.01))
```

### Model-wide Regularization

```python
from keras import layers, regularizers, models

def add_regularization(model, regularizer):
    """Add regularization to all layers in a model.

    Note: assigning a regularizer after a layer has been built may not
    take effect, because regularization losses are attached when the
    weights are created. Apply this before the model is built, or
    rebuild the model (e.g., from its config) afterwards.
    """
    for layer in model.layers:
        if hasattr(layer, 'kernel_regularizer'):
            layer.kernel_regularizer = regularizer
    return model

# Apply regularization to an existing (not yet built) model
base_model = models.Sequential([
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax'),
])

regularized_model = add_regularization(base_model, regularizers.L2(0.01))
```

### Regularization Scheduling

```python
from keras import callbacks, regularizers

class RegularizationScheduler(callbacks.Callback):
    def __init__(self, layer_name, initial_strength=0.01, decay_rate=0.1):
        super().__init__()
        self.layer_name = layer_name
        self.initial_strength = initial_strength
        self.decay_rate = decay_rate

    def on_epoch_begin(self, epoch, logs=None):
        # Decay regularization strength over time
        strength = self.initial_strength * (self.decay_rate ** epoch)
        layer = self.model.get_layer(self.layer_name)
        # Note: reassigning the regularizer on an already-built layer may
        # not take effect on all backends; verify that the regularization
        # loss actually changes between epochs.
        layer.kernel_regularizer = regularizers.L2(strength)

# Usage in training
model.fit(x_train, y_train,
          callbacks=[RegularizationScheduler('dense_1', 0.01, 0.9)])
```

### Regularization Comparison

```python
from keras import layers, regularizers, models

# Compare different regularization types
def create_model(regularizer):
    return models.Sequential([
        layers.Dense(128, activation='relu', kernel_regularizer=regularizer),
        layers.Dense(64, activation='relu', kernel_regularizer=regularizer),
        layers.Dense(10, activation='softmax'),
    ])

# Different regularization approaches
l1_model = create_model(regularizers.L1(0.01))
l2_model = create_model(regularizers.L2(0.01))
l1l2_model = create_model(regularizers.L1L2(l1=0.01, l2=0.01))
```

## Regularization Guidelines

### Choosing Regularization Type

- **L1 (Lasso)**: Promotes sparsity, good for feature selection
- **L2 (Ridge)**: Shrinks weights uniformly, good for general overfitting
- **L1L2 (Elastic Net)**: Combines benefits of L1 and L2
- **Orthogonal**: Maintains orthogonality in weight matrices
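The sparsity claim for L1 can be seen from the gradients alone (a NumPy sketch, not Keras's optimizer behavior): the L2 gradient `2 * l2 * w` shrinks a weight proportionally, so it decays toward zero but never reaches it, while the L1 pull is constant in magnitude and, with a soft-threshold step, drives small weights exactly to zero.

```python
import numpy as np

w_l2, w_l1 = 1.0, 1.0
lr, strength = 0.1, 0.5

for _ in range(100):
    # L2: proportional shrinkage (gradient of strength * w**2 is 2*strength*w)
    w_l2 -= lr * 2 * strength * w_l2
    # L1: soft-thresholding step (proximal update for strength * |w|)
    w_l1 = np.sign(w_l1) * max(abs(w_l1) - lr * strength, 0.0)
```

After 100 steps, `w_l2` is tiny but still positive, while `w_l1` is exactly zero: this is why L1 produces sparse weight vectors and L2 does not.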

### Regularization Strength

- **Start small**: Begin with 0.001 to 0.01
- **Monitor validation**: Increase if overfitting, decrease if underfitting
- **Layer-specific**: Different layers may need different strengths
- **Dataset-dependent**: Larger datasets typically need less regularization

### Best Practices

- Apply to both kernel and bias weights when needed
- Use activity regularization sparingly (can harm learning)
- Combine with other techniques (dropout, batch normalization)
- Consider regularization scheduling for fine-tuning
- Monitor regularization loss separately from main loss

### Common Patterns

```python
from keras import regularizers

# Typical CNN regularization
conv_reg = regularizers.L2(0.0001)
dense_reg = regularizers.L2(0.001)

# Typical RNN regularization
kernel_reg = regularizers.L2(0.001)
recurrent_reg = regularizers.OrthogonalRegularizer(0.01)

# Strong regularization for small datasets
strong_reg = regularizers.L1L2(l1=0.01, l2=0.01)
```