# Regularizers


Regularizers apply penalties to layer parameters during training to reduce overfitting by constraining the complexity of the model. They add terms to the loss function that penalize large weights.
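Conceptually, the training objective becomes `total_loss = base_loss + penalty`. A minimal numpy sketch of this idea (an illustration of the math, not the Keras internals):

```python
import numpy as np

def l2_penalty(weights, factor=0.01):
    # Sum of squared weights, scaled by the regularization factor
    return factor * float(np.sum(np.square(weights)))

w = np.array([[1.0, -2.0], [3.0, 0.5]])
base_loss = 0.8                          # e.g. a cross-entropy value
total_loss = base_loss + l2_penalty(w)   # 0.8 + 0.1425 = 0.9425
```

Because the penalty grows with the weights, gradient descent on `total_loss` trades data fit against model complexity.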


## Capabilities


### Base Regularizer


The abstract base class for all regularizers, providing the interface for regularization penalties.

```python { .api }
class Regularizer:
    """
    Base class for weight regularizers.

    All regularizers should inherit from this class and implement the
    __call__ method.
    """

    def __call__(self, x):
        """
        Compute the regularization penalty.

        Parameters:
        - x: Weight tensor to regularize

        Returns:
        Scalar tensor representing the regularization penalty
        """
```
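A custom regularizer only needs to follow this interface: a callable that maps a weight tensor to a scalar penalty. A dependency-free sketch with a hypothetical class name (numpy stands in for backend tensors):

```python
import numpy as np

class MaxAbsRegularizer:
    """Hypothetical regularizer: penalizes the largest-magnitude weight."""

    def __init__(self, factor=0.01):
        self.factor = factor

    def __call__(self, x):
        # Returns a scalar penalty, as the Regularizer interface requires
        return self.factor * float(np.max(np.abs(x)))

reg = MaxAbsRegularizer(factor=0.1)
penalty = reg(np.array([1.0, -5.0, 2.0]))  # 0.1 * 5.0 = 0.5
```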


### L1 Regularization


L1 regularization adds a penalty term proportional to the sum of the absolute values of the weights, promoting sparsity.

```python { .api }
class L1(Regularizer):
    """
    L1 regularization penalty.

    Adds a penalty term proportional to the sum of absolute values of
    weights. Promotes sparsity by driving some weights to exactly zero.

    Usage:
        layer = Dense(10, kernel_regularizer=L1(0.01))
        # or
        layer = Dense(10, kernel_regularizer='l1')
    """

    def __init__(self, l1=0.01):
        """
        Initialize the L1 regularizer.

        Parameters:
        - l1: L1 regularization factor (default: 0.01)
        """

def l1(l1=0.01):
    """
    Create an L1 regularizer.

    Parameters:
    - l1: L1 regularization factor (default: 0.01)

    Returns:
    L1 regularizer instance
    """
```
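In formula terms, the L1 penalty is `l1 * sum(|w|)`. A numpy sketch of the computation (not the backend implementation):

```python
import numpy as np

def l1_penalty(w, l1=0.01):
    # l1 * sum of absolute weight values
    return l1 * float(np.sum(np.abs(w)))

w = np.array([0.5, -1.5, 2.0])
penalty = l1_penalty(w)  # 0.01 * 4.0 = 0.04
```

Because the absolute value has a constant-magnitude gradient, L1 pushes small weights all the way to zero rather than merely shrinking them.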


### L2 Regularization


L2 regularization adds a penalty term proportional to the sum of squares of the weights, promoting small weights.

```python { .api }
class L2(Regularizer):
    """
    L2 regularization penalty.

    Adds a penalty term proportional to the sum of squares of weights.
    Promotes small weights and smooth solutions.

    Usage:
        layer = Dense(10, kernel_regularizer=L2(0.01))
        # or
        layer = Dense(10, kernel_regularizer='l2')
    """

    def __init__(self, l2=0.01):
        """
        Initialize the L2 regularizer.

        Parameters:
        - l2: L2 regularization factor (default: 0.01)
        """

def l2(l2=0.01):
    """
    Create an L2 regularizer.

    Parameters:
    - l2: L2 regularization factor (default: 0.01)

    Returns:
    L2 regularizer instance
    """
```
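The L2 penalty is `l2 * sum(w**2)`, and its gradient `2 * l2 * w` is what shrinks each weight in proportion to its size (the "weight decay" effect). A numpy sketch:

```python
import numpy as np

def l2_penalty(w, l2=0.01):
    # l2 * sum of squared weights
    return l2 * float(np.sum(np.square(w)))

def l2_grad(w, l2=0.01):
    # Gradient of the penalty: each weight decays in proportion to its size
    return 2.0 * l2 * w

w = np.array([0.5, -1.5, 2.0])
penalty = l2_penalty(w)  # 0.01 * 6.5 = 0.065
decay = l2_grad(w)       # [0.01, -0.03, 0.04]
```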


### Combined L1L2 Regularization


Combines both L1 and L2 regularization penalties, providing benefits of both sparsity and weight decay.

```python { .api }
class L1L2(Regularizer):
    """
    Combined L1 and L2 regularization penalty.

    Combines both L1 and L2 penalties, providing both sparsity (L1) and
    weight decay (L2) effects.

    Usage:
        layer = Dense(10, kernel_regularizer=L1L2(l1=0.01, l2=0.01))
        # or
        layer = Dense(10, kernel_regularizer='l1_l2')
    """

    def __init__(self, l1=0.0, l2=0.0):
        """
        Initialize the L1L2 regularizer.

        Parameters:
        - l1: L1 regularization factor (default: 0.0)
        - l2: L2 regularization factor (default: 0.0)
        """

def l1_l2(l1=0.01, l2=0.01):
    """
    Create a combined L1L2 regularizer.

    Parameters:
    - l1: L1 regularization factor (default: 0.01)
    - l2: L2 regularization factor (default: 0.01)

    Returns:
    L1L2 regularizer instance
    """
```
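The combined penalty is simply the sum of the two terms, `l1 * sum(|w|) + l2 * sum(w**2)`. A numpy sketch:

```python
import numpy as np

def l1_l2_penalty(w, l1=0.0, l2=0.0):
    # Sum of the L1 and L2 penalty terms
    return l1 * float(np.sum(np.abs(w))) + l2 * float(np.sum(np.square(w)))

w = np.array([0.5, -1.5, 2.0])
penalty = l1_l2_penalty(w, l1=0.01, l2=0.01)  # 0.04 + 0.065 = 0.105
```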


### Orthogonal Regularization


Encourages weight matrices to be orthogonal, which can help with gradient flow and representational diversity.

```python { .api }
class OrthogonalRegularizer(Regularizer):
    """
    Orthogonal regularization penalty.

    Encourages weight matrices to be orthogonal by penalizing the deviation
    from orthogonality. Useful for maintaining diverse representations and
    improving gradient flow.

    Usage:
        layer = Dense(10, kernel_regularizer=OrthogonalRegularizer(factor=0.01))
    """

    def __init__(self, factor=0.01, mode='rows'):
        """
        Initialize the OrthogonalRegularizer.

        Parameters:
        - factor: Regularization strength (default: 0.01)
        - mode: 'rows' or 'columns' - which dimension to orthogonalize (default: 'rows')
        """

def orthogonal_regularizer(factor=0.01, mode='rows'):
    """
    Create an orthogonal regularizer.

    Parameters:
    - factor: Regularization strength (default: 0.01)
    - mode: 'rows' or 'columns' - which dimension to orthogonalize (default: 'rows')

    Returns:
    OrthogonalRegularizer instance
    """
```
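One common way to measure deviation from orthogonality (the exact formula used by the library may differ) is to normalize the rows, form the Gram matrix, and penalize its off-diagonal entries, since orthonormal rows yield exactly the identity. A numpy sketch of the `'rows'` mode:

```python
import numpy as np

def orthogonal_penalty(w, factor=0.01):
    # Normalize rows, then measure how far the Gram matrix is from identity;
    # orthonormal rows give exactly the identity, so the penalty is zero.
    rows = w / np.linalg.norm(w, axis=1, keepdims=True)
    gram = rows @ rows.T
    off_diag = gram - np.eye(gram.shape[0])
    return factor * float(np.sum(np.abs(off_diag)))

orthogonal_penalty(np.eye(3))                       # 0.0 for orthogonal rows
duplicated = np.array([[1.0, 0.0], [1.0, 0.0]])
redundant = orthogonal_penalty(duplicated)          # identical rows are penalized
```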


### Utility Functions


Helper functions for regularizer management and serialization.

```python { .api }
def serialize(regularizer):
    """
    Serialize a regularizer to a string or config dict.

    Parameters:
    - regularizer: Regularizer to serialize

    Returns:
    String identifier or config dictionary
    """

def deserialize(config, custom_objects=None):
    """
    Deserialize a regularizer from a string or config dict.

    Parameters:
    - config: String identifier or config dictionary
    - custom_objects: Optional dict mapping names to custom objects

    Returns:
    Regularizer instance
    """

def get(identifier):
    """
    Retrieve a regularizer by string identifier.

    Parameters:
    - identifier: String name or regularizer instance

    Returns:
    Regularizer instance
    """
```
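To illustrate the resolution pattern behind `get()` (strings resolve to default-configured instances, existing instances pass through), here is a toy registry sketch in plain Python, not the Keras implementation:

```python
class L1:
    def __init__(self, l1=0.01):
        self.l1 = l1

class L2:
    def __init__(self, l2=0.01):
        self.l2 = l2

_REGISTRY = {'l1': L1, 'l2': L2}

def get(identifier):
    # Strings resolve through a registry; instances pass through unchanged
    if identifier is None:
        return None
    if isinstance(identifier, str):
        return _REGISTRY[identifier.lower()]()
    return identifier

reg = get('l2')   # a default-configured L2 instance
same = get(reg)   # already an instance: returned as-is
```

This is why `kernel_regularizer='l2'` and `kernel_regularizer=L2(0.01)` are interchangeable in layer constructors.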


## Usage Examples


### Basic Regularization

```python
import keras
from keras import regularizers

# Using string identifiers
model = keras.Sequential([
    keras.layers.Dense(64, kernel_regularizer='l2', activation='relu'),
    keras.layers.Dense(32, kernel_regularizer='l1', activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

# Using regularizer classes directly
model = keras.Sequential([
    keras.layers.Dense(64,
                       kernel_regularizer=regularizers.L2(0.01),
                       bias_regularizer=regularizers.L1(0.01),
                       activation='relu'),
    keras.layers.Dense(32,
                       kernel_regularizer=regularizers.L1L2(l1=0.01, l2=0.01),
                       activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
```


### Advanced Regularization

```python
import keras
from keras import regularizers

# Orthogonal regularization for maintaining diverse representations
layer = keras.layers.Dense(
    128,
    kernel_regularizer=regularizers.OrthogonalRegularizer(factor=0.01),
    activation='relu'
)

# Different regularizers for different parts of the layer
layer = keras.layers.Dense(
    64,
    kernel_regularizer=regularizers.L2(0.01),    # Weight regularization
    bias_regularizer=regularizers.L1(0.01),      # Bias regularization
    activity_regularizer=regularizers.L1(0.01),  # Output regularization
    activation='relu'
)

# Custom regularization strength
strong_l2 = regularizers.L2(0.1)   # Strong regularization
weak_l1 = regularizers.L1(0.001)   # Weak regularization

model = keras.Sequential([
    keras.layers.Dense(128, kernel_regularizer=strong_l2, activation='relu'),
    keras.layers.Dropout(0.5),  # Combine with dropout for better regularization
    keras.layers.Dense(64, kernel_regularizer=weak_l1, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
```


### Regularization in Different Layer Types

```python
import keras
from keras import regularizers

# Convolutional layers
conv_model = keras.Sequential([
    keras.layers.Conv2D(32, 3,
                        kernel_regularizer=regularizers.L2(0.01),
                        activation='relu'),
    keras.layers.Conv2D(64, 3,
                        kernel_regularizer=regularizers.L1L2(l1=0.01, l2=0.01),
                        activation='relu'),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(10, activation='softmax')
])

# Recurrent layers
rnn_model = keras.Sequential([
    keras.layers.LSTM(64,
                      kernel_regularizer=regularizers.L2(0.01),
                      recurrent_regularizer=regularizers.L1(0.01),
                      return_sequences=True),
    keras.layers.LSTM(32,
                      kernel_regularizer=regularizers.OrthogonalRegularizer(0.01)),
    keras.layers.Dense(10, activation='softmax')
])
```


## Regularization Guidelines

### When to Use Each Type:

- **L1 Regularization**: Use when you want sparse weights (feature selection)
- **L2 Regularization**: Use for general overfitting prevention and smooth solutions
- **L1L2 Regularization**: Use when you want both sparsity and weight decay
- **Orthogonal Regularization**: Use when you want diverse, uncorrelated representations

### Typical Regularization Strengths:

- **Light regularization**: 0.001 - 0.01
- **Moderate regularization**: 0.01 - 0.1
- **Strong regularization**: 0.1 - 1.0

### Best Practices:

1. Start with L2 regularization (0.01) as a baseline
2. Combine with dropout for better regularization
3. Use different strengths for different layers
4. Monitor validation loss to tune regularization strength
5. Apply regularization primarily to dense layers rather than convolutional layers