# Activation Functions

Activation functions define the output of neural network layers and introduce non-linearity to enable learning complex patterns. Keras provides a comprehensive set of activation functions for various use cases.

## Capabilities

### Standard Activation Functions

Core activation functions commonly used in neural networks for introducing non-linearity and controlling gradient flow.
```python { .api }
def relu(x, negative_slope=0.0, max_value=None, threshold=0.0):
    """
    Rectified Linear Unit activation function.

    Parameters:
    - x: Input tensor
    - negative_slope: Slope for values below the threshold (default: 0.0)
    - max_value: Maximum value at which the output saturates (default: None)
    - threshold: Value below which inputs are damped or zeroed (default: 0.0)

    Returns:
    Tensor with same shape and dtype as input
    """

def sigmoid(x):
    """
    Sigmoid activation function: 1 / (1 + exp(-x)).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with values between 0 and 1
    """

def tanh(x):
    """
    Hyperbolic tangent activation function: (exp(x) - exp(-x)) / (exp(x) + exp(-x)).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with values between -1 and 1
    """

def softmax(x, axis=-1):
    """
    Softmax activation function that normalizes the input into a probability distribution.

    Parameters:
    - x: Input tensor
    - axis: Axis along which to apply softmax (default: -1)

    Returns:
    Tensor with values summing to 1 along the specified axis
    """

def linear(x):
    """
    Linear (identity) activation function: returns the input unchanged.

    Parameters:
    - x: Input tensor

    Returns:
    Input tensor unchanged
    """
```
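The formulas above can be sketched in plain NumPy (illustrative only — the real Keras implementations are backend-dispatched and numerically hardened):

```python
import numpy as np

def relu(x, negative_slope=0.0, max_value=None, threshold=0.0):
    # Pass values at or above the threshold through; scale the rest by
    # negative_slope, then optionally cap the result at max_value.
    y = np.where(x >= threshold, x, negative_slope * (x - threshold))
    if max_value is not None:
        y = np.minimum(y, max_value)
    return y

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    # Subtract the per-axis max before exponentiating for numerical stability.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(relu(x))                       # [0. 0. 0. 1. 2.]
print(relu(x, negative_slope=0.1))   # leaky variant
```

Note how `relu` with a nonzero `negative_slope` subsumes leaky ReLU, and with `max_value=6` it becomes ReLU6.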
### Advanced Activation Functions

Modern activation functions that provide improved gradient properties and performance characteristics.
```python { .api }
def gelu(x, approximate=False):
    """
    Gaussian Error Linear Unit activation function.

    Parameters:
    - x: Input tensor
    - approximate: Whether to use the faster tanh approximation (default: False)

    Returns:
    Tensor with GELU activation applied
    """

def silu(x):
    """
    Swish/SiLU activation function: x * sigmoid(x).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with SiLU activation applied
    """

def swish(x):
    """
    Alias for the silu activation function.

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with Swish activation applied
    """

def mish(x):
    """
    Mish activation function: x * tanh(softplus(x)).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with Mish activation applied
    """

def selu(x):
    """
    Scaled Exponential Linear Unit activation function.
    Self-normalizing when used with the lecun_normal initializer.

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with SELU activation applied
    """

def elu(x, alpha=1.0):
    """
    Exponential Linear Unit activation function.

    Parameters:
    - x: Input tensor
    - alpha: Scale factor for negative inputs (default: 1.0)

    Returns:
    Tensor with ELU activation applied
    """

def leaky_relu(x, negative_slope=0.01):
    """
    Leaky ReLU activation function with a small negative slope.

    Parameters:
    - x: Input tensor
    - negative_slope: Slope for negative values (default: 0.01)

    Returns:
    Tensor with Leaky ReLU activation applied
    """
```
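As a reference for the definitions above, here are minimal NumPy versions of a few of them (a sketch, not the backend implementations; the GELU shown is the tanh approximation used when `approximate=True`):

```python
import numpy as np

def silu(x):
    # x * sigmoid(x), algebraically equal to x / (1 + exp(-x))
    return x / (1.0 + np.exp(-x))

def elu(x, alpha=1.0):
    # Identity for positive inputs; saturates toward -alpha for negative ones.
    return np.where(x > 0, x, alpha * np.expm1(x))

def leaky_relu(x, negative_slope=0.01):
    return np.where(x >= 0, x, negative_slope * x)

def gelu_tanh(x):
    # tanh approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))
```

`np.expm1` computes `exp(x) - 1` accurately for small `x`, which keeps the ELU sketch well-behaved near zero.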
### Specialized Activation Functions

Activation functions tailored to specific use cases and architectures, including cheap piecewise-linear approximations used in efficient models.
```python { .api }
def softplus(x):
    """
    Softplus activation function: log(1 + exp(x)).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with Softplus activation applied
    """

def softsign(x):
    """
    Softsign activation function: x / (1 + |x|).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with Softsign activation applied
    """

def exponential(x):
    """
    Exponential activation function: exp(x).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with exponential activation applied
    """

def hard_sigmoid(x):
    """
    Hard sigmoid activation function (piecewise-linear approximation of sigmoid).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with hard sigmoid activation applied
    """

def hard_silu(x):
    """
    Hard SiLU activation function (computationally efficient approximation of SiLU).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with hard SiLU activation applied
    """

def hard_swish(x):
    """
    Alias for the hard_silu activation function.

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with hard Swish activation applied
    """

def hard_tanh(x):
    """
    Hard tanh activation function (piecewise-linear approximation of tanh).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with hard tanh activation applied
    """

def relu6(x):
    """
    ReLU activation capped at 6: min(max(x, 0), 6).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with ReLU6 activation applied
    """
```
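The "hard" variants trade smoothness for cheap piecewise-linear arithmetic. A sketch of one common formulation (the exact constants vary between libraries, so treat these as illustrative rather than Keras's exact definitions):

```python
import numpy as np

def relu6(x):
    # min(max(x, 0), 6)
    return np.minimum(np.maximum(x, 0.0), 6.0)

def hard_sigmoid(x):
    # Piecewise-linear sigmoid approximation: clip(x/6 + 0.5, 0, 1)
    return np.clip(x / 6.0 + 0.5, 0.0, 1.0)

def hard_silu(x):
    # x * hard_sigmoid(x), avoiding the exponential in true SiLU
    return x * hard_sigmoid(x)

def hard_tanh(x):
    # clip(x, -1, 1)
    return np.clip(x, -1.0, 1.0)
```

These approximations exist because `clip` and a multiply are much cheaper than `exp` on constrained hardware, which is why they show up in mobile-oriented architectures.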
### Shrinkage Functions

Shrinkage functions that apply thresholding operations for sparse representations.
```python { .api }
def hard_shrink(x, lambd=0.5):
    """
    Hard shrinkage function that zeros values whose magnitude is at or below the threshold.

    Parameters:
    - x: Input tensor
    - lambd: Threshold value (default: 0.5)

    Returns:
    Tensor with hard shrinkage applied
    """

def soft_shrink(x, lambd=0.5):
    """
    Soft shrinkage function that shrinks values toward zero by the threshold,
    zeroing those within [-lambd, lambd].

    Parameters:
    - x: Input tensor
    - lambd: Threshold value (default: 0.5)

    Returns:
    Tensor with soft shrinkage applied
    """

def tanh_shrink(x):
    """
    Tanh shrinkage function: x - tanh(x).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with tanh shrinkage applied
    """

def threshold(x, value=0):
    """
    Threshold activation function that zeroes inputs at or below the threshold.

    Parameters:
    - x: Input tensor
    - value: Threshold below which inputs are set to zero (default: 0)

    Returns:
    Tensor with thresholding applied
    """
```
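The shrinkage operations differ mainly in how they treat values inside the band `[-lambd, lambd]`; a NumPy sketch makes the distinction concrete (illustrative, not the Keras source):

```python
import numpy as np

def hard_shrink(x, lambd=0.5):
    # Keep values whose magnitude exceeds lambd; zero everything else.
    return np.where(np.abs(x) > lambd, x, 0.0)

def soft_shrink(x, lambd=0.5):
    # Shrink magnitudes by lambd, zeroing the band [-lambd, lambd].
    return np.sign(x) * np.maximum(np.abs(x) - lambd, 0.0)

def tanh_shrink(x):
    # x - tanh(x): near-zero for small inputs, near-identity for large ones.
    return x - np.tanh(x)
```

Hard shrink leaves surviving values untouched (a discontinuous jump at the threshold), while soft shrink pulls every value toward zero by `lambd`, keeping the output continuous.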
### Sparse Activation Functions

Specialized activation functions for sparse representations and attention mechanisms.
```python { .api }
def sparsemax(x, axis=-1):
    """
    Sparsemax activation function that produces sparse probability distributions.

    Parameters:
    - x: Input tensor
    - axis: Axis along which to apply sparsemax (default: -1)

    Returns:
    Tensor with sparse probability distribution
    """

def sparse_plus(x):
    """
    Sparse plus activation function for sparse representations.

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with sparse plus activation applied
    """

def sparse_sigmoid(x):
    """
    Sparse sigmoid activation function.

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with sparse sigmoid activation applied
    """

def squareplus(x, b=4):
    """
    Squareplus activation function: (x + sqrt(x^2 + b)) / 2.

    Parameters:
    - x: Input tensor
    - b: Smoothness parameter (default: 4)

    Returns:
    Tensor with squareplus activation applied
    """
```
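Squareplus follows directly from its formula; sparsemax is the Euclidean projection of the logits onto the probability simplex (Martins & Astudillo, 2016), which, unlike softmax, can assign exactly zero probability. A 1-D NumPy sketch of both (illustrative; the Keras versions are batched and backend-aware):

```python
import numpy as np

def squareplus(x, b=4):
    # Smooth ReLU-like curve: (x + sqrt(x^2 + b)) / 2
    return (x + np.sqrt(x * x + b)) / 2.0

def sparsemax_1d(z):
    # Project z onto the simplex: find the support size k_max, derive the
    # threshold tau, then clip everything below tau to exactly zero.
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1.0 + k * z_sorted > cumsum
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max
    return np.maximum(z - tau, 0.0)

p = sparsemax_1d(np.array([2.0, 1.0, 0.1]))
print(p)   # [1. 0. 0.]  -- exactly sparse, unlike softmax
```

The hard zeros are why sparsemax appears in attention mechanisms that benefit from ignoring irrelevant positions entirely.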
### Advanced Functions

Additional specialized activation functions for specific neural network architectures.
```python { .api }
def glu(x, axis=-1):
    """
    Gated Linear Unit activation function: splits the input in two along the
    given axis and multiplies one half by the sigmoid of the other.

    Parameters:
    - x: Input tensor
    - axis: Axis to split for gating (default: -1)

    Returns:
    Tensor with GLU activation applied (half the input size along axis)
    """

def celu(x, alpha=1.0):
    """
    Continuously differentiable exponential linear unit.

    Parameters:
    - x: Input tensor
    - alpha: Scale parameter (default: 1.0)

    Returns:
    Tensor with CELU activation applied
    """

def log_sigmoid(x):
    """
    Logarithm of the sigmoid function: log(sigmoid(x)).

    Parameters:
    - x: Input tensor

    Returns:
    Tensor with log-sigmoid activation applied
    """

def log_softmax(x, axis=-1):
    """
    Logarithm of the softmax function: log(softmax(x)).

    Parameters:
    - x: Input tensor
    - axis: Axis along which to apply log-softmax (default: -1)

    Returns:
    Tensor with log-softmax activation applied
    """
```
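Two of these are worth seeing in code: GLU halves the feature dimension by gating, and log-softmax uses the log-sum-exp trick to avoid the overflow of composing `log(softmax(x))` naively. A NumPy sketch (illustrative only):

```python
import numpy as np

def glu(x, axis=-1):
    # Split the input in two along `axis`; gate one half with the
    # sigmoid of the other. The output is half the input size.
    a, b = np.split(x, 2, axis=axis)
    return a / (1.0 + np.exp(-b))

def log_softmax(x, axis=-1):
    # Stable log(softmax(x)) via log-sum-exp: shift by the max first.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))

x = np.array([1.0, 2.0, 3.0, 4.0])
print(glu(x).shape)   # (2,)
```

Because GLU halves the split axis, the layer feeding it typically doubles its output width to compensate.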
### Utility Functions

Helper functions for activation function management and serialization.
```python { .api }
def serialize(activation):
    """
    Serialize an activation function to a string or config dict.

    Parameters:
    - activation: Activation function to serialize

    Returns:
    String identifier or config dictionary
    """

def deserialize(config, custom_objects=None):
    """
    Deserialize an activation function from a string or config dict.

    Parameters:
    - config: String identifier or config dictionary
    - custom_objects: Optional dict mapping names to custom objects

    Returns:
    Activation function
    """

def get(identifier):
    """
    Retrieve an activation function by identifier.

    Parameters:
    - identifier: String name of the activation function, a callable, or None

    Returns:
    Activation function
    """
```
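The resolution behavior of `get` can be pictured as a small registry lookup. This is a hypothetical sketch of the pattern, not Keras's actual implementation (in Keras, `get(None)` resolves to the linear activation and callables pass through unchanged):

```python
def linear(x):
    return x

def relu(x):
    return max(x, 0)

# Hypothetical registry standing in for Keras's internal name table.
_ACTIVATIONS = {"linear": linear, "relu": relu}

def get(identifier):
    # None -> linear; callable -> passthrough; string -> registry lookup.
    if identifier is None:
        return linear
    if callable(identifier):
        return identifier
    try:
        return _ACTIVATIONS[identifier]
    except KeyError:
        raise ValueError(f"Unknown activation: {identifier!r}")

print(get("relu")(-3))   # 0
print(get(None)(5))      # 5
```

This passthrough behavior is what lets `Dense(64, activation='relu')` and `Dense(64, activation=activations.relu)` behave identically.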
## Usage Examples
```python
import keras
from keras import activations

# Use activation functions directly on tensors
x = keras.ops.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Apply different activations
relu_output = activations.relu(x)
sigmoid_output = activations.sigmoid(x)
gelu_output = activations.gelu(x)

# Use string identifiers in layer definitions
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(32, activation='gelu'),
    keras.layers.Dense(10, activation='softmax')
])

# Or pass the activation callables themselves
model = keras.Sequential([
    keras.layers.Dense(64, activation=activations.relu),
    keras.layers.Dense(32, activation=activations.gelu),
    keras.layers.Dense(10, activation=activations.softmax)
])
```