# Activation Functions

Comprehensive collection of activation functions for neural networks. Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns and representations. Keras provides a wide range of activation functions, from traditional sigmoid and tanh to modern alternatives such as GELU and Swish.

## Capabilities

### Standard Activation Functions

Traditional activation functions commonly used in neural networks.

```python { .api }
def relu(x, negative_slope=0.0, max_value=None, threshold=0.0):
    """
    Rectified Linear Unit activation function.

    Args:
        x: Input tensor
        negative_slope: float, slope for values below threshold (default: 0.0)
        max_value: float, saturation threshold (default: None)
        threshold: float, threshold value below which values are damped (default: 0.0)

    Returns:
        Tensor with same shape and dtype as input
    """

def sigmoid(x):
    """
    Sigmoid activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with values in range (0, 1)
    """

def tanh(x):
    """
    Hyperbolic tangent activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with values in range (-1, 1)
    """

def softmax(x, axis=-1):
    """
    Softmax activation function.

    Args:
        x: Input tensor
        axis: int, axis along which to apply softmax (default: -1)

    Returns:
        Tensor with values summing to 1 along specified axis
    """

def linear(x):
    """
    Linear (identity) activation function.

    Args:
        x: Input tensor

    Returns:
        Input tensor unchanged
    """
```
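
The `relu` parameters above compose as simple elementwise operations. A minimal NumPy sketch of the math (an illustration, not the Keras implementation; `relu_sketch` and `softmax_sketch` are hypothetical names, and the threshold semantics are an assumption):

```python
import numpy as np

def relu_sketch(x, negative_slope=0.0, max_value=None, threshold=0.0):
    # Values at or above the threshold pass through; values below it are
    # scaled by negative_slope, measured relative to the threshold.
    x = np.asarray(x, dtype=float)
    out = np.where(x >= threshold, x, negative_slope * (x - threshold))
    if max_value is not None:
        out = np.minimum(out, max_value)  # saturate large activations
    return out

def softmax_sketch(x, axis=-1):
    # Shift by the max for numerical stability, then normalize so the
    # values along `axis` sum to 1.
    x = np.asarray(x, dtype=float)
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)
```

With `negative_slope=0.1` and `max_value=6.0`, for example, an input of `[-2.0, 1.0, 8.0]` maps to `[-0.2, 1.0, 6.0]`: the negative value is damped and the large one saturates.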

### Modern Activation Functions

Contemporary activation functions that often provide better performance than traditional alternatives.

```python { .api }
def gelu(x, approximate=False):
    """
    Gaussian Error Linear Unit activation function.

    Args:
        x: Input tensor
        approximate: bool, whether to use approximation (default: False)

    Returns:
        Tensor with GELU activation applied
    """

def silu(x):
    """
    Sigmoid Linear Unit (SiLU/Swish) activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with SiLU activation applied
    """

def swish(x):
    """
    Swish activation function (alias for SiLU).

    Args:
        x: Input tensor

    Returns:
        Tensor with Swish activation applied
    """

def mish(x):
    """
    Mish activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with Mish activation applied
    """
```
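
These functions are built from sigmoid, tanh, and softplus primitives. A NumPy sketch of the underlying formulas (hypothetical helper names; the tanh form shown for GELU corresponds to the `approximate=True` variant):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu_sketch(x):
    # SiLU / Swish: x * sigmoid(x)
    return x * sigmoid(x)

def gelu_tanh_sketch(x):
    # Tanh approximation of GELU:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mish_sketch(x):
    # Mish: x * tanh(softplus(x)), with softplus(x) = log(1 + exp(x))
    return x * np.tanh(np.log1p(np.exp(x)))
```

All three pass through zero and approach the identity for large positive inputs, which is why they behave like smooth, non-monotonic variants of ReLU.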

### Exponential and Logarithmic Functions

Activation functions based on exponential and logarithmic operations.

```python { .api }
def elu(x, alpha=1.0):
    """
    Exponential Linear Unit activation function.

    Args:
        x: Input tensor
        alpha: float, scale for negative part (default: 1.0)

    Returns:
        Tensor with ELU activation applied
    """

def selu(x):
    """
    Scaled Exponential Linear Unit activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with SELU activation applied
    """

def celu(x, alpha=1.0):
    """
    Continuously Differentiable Exponential Linear Unit.

    Args:
        x: Input tensor
        alpha: float, scale parameter (default: 1.0)

    Returns:
        Tensor with CELU activation applied
    """

def exponential(x):
    """
    Exponential activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with exponential activation applied
    """

def log_sigmoid(x):
    """
    Logarithm of sigmoid activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with log-sigmoid activation applied
    """

def log_softmax(x, axis=-1):
    """
    Log-softmax activation function.

    Args:
        x: Input tensor
        axis: int, axis along which to apply log-softmax (default: -1)

    Returns:
        Tensor with log-softmax activation applied
    """
```
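
A NumPy sketch of the core formulas in this group (hypothetical names, not the Keras source; the SELU constants are the fixed values from the SELU paper):

```python
import numpy as np

def elu_sketch(x, alpha=1.0):
    # ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise,
    # so negative outputs saturate at -alpha instead of growing unbounded.
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * np.expm1(x))

# SELU is a scaled ELU with fixed constants chosen so that activations
# self-normalize toward zero mean and unit variance.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu_sketch(x):
    return SELU_SCALE * elu_sketch(x, alpha=SELU_ALPHA)

def log_softmax_sketch(x, axis=-1):
    # log_softmax(x) = x - logsumexp(x), computed with a stability shift
    # so large inputs do not overflow the exponential.
    x = np.asarray(x, dtype=float)
    z = x - np.max(x, axis=axis, keepdims=True)
    return z - np.log(np.sum(np.exp(z), axis=axis, keepdims=True))
```

Exponentiating the output of `log_softmax_sketch` recovers the ordinary softmax, which is why the log form is preferred when the result feeds a cross-entropy loss.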

### Specialized and Experimental Functions

Less common activation functions for specific use cases and experimental applications.

```python { .api }
def relu6(x):
    """
    ReLU6 activation function (ReLU capped at 6).

    Args:
        x: Input tensor

    Returns:
        Tensor with ReLU6 activation applied
    """

def leaky_relu(x, negative_slope=0.3):
    """
    Leaky ReLU activation function.

    Args:
        x: Input tensor
        negative_slope: float, slope for negative values (default: 0.3)

    Returns:
        Tensor with Leaky ReLU activation applied
    """

def hard_sigmoid(x):
    """
    Hard sigmoid activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with hard sigmoid activation applied
    """

def hard_silu(x):
    """
    Hard SiLU activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with Hard SiLU activation applied
    """

def hard_swish(x):
    """
    Hard Swish activation function (alias for hard_silu).

    Args:
        x: Input tensor

    Returns:
        Tensor with Hard Swish activation applied
    """

def hard_tanh(x):
    """
    Hard hyperbolic tangent activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with hard tanh activation applied
    """

def softplus(x):
    """
    Softplus activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with softplus activation applied
    """

def softsign(x):
    """
    Softsign activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with softsign activation applied
    """

def glu(x, axis=-1):
    """
    Gated Linear Unit activation function.

    Args:
        x: Input tensor
        axis: int, axis along which to split input (default: -1)

    Returns:
        Tensor with GLU activation applied
    """
```
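
The `hard_*` variants replace smooth curves with piecewise-linear segments that are cheap to evaluate on mobile hardware, and GLU gates one half of the input with the other. A NumPy sketch (hypothetical names; the hard-sigmoid breakpoints assume the common `relu6(x + 3) / 6` formulation):

```python
import numpy as np

def hard_sigmoid_sketch(x):
    # Piecewise-linear sigmoid: 0 below -3, 1 above 3, linear in between
    return np.clip(np.asarray(x, dtype=float) / 6.0 + 0.5, 0.0, 1.0)

def hard_silu_sketch(x):
    # Hard SiLU / Hard Swish: x * hard_sigmoid(x)
    return x * hard_sigmoid_sketch(x)

def glu_sketch(x, axis=-1):
    # Split the input into two halves along `axis`: a value half `a`
    # and a gate half `b`; the gate is squashed through a sigmoid.
    a, b = np.split(np.asarray(x, dtype=float), 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))
```

Note that GLU halves the size of the gated axis, so the axis being split must have an even length.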

### Sparse and Shrinkage Functions

Specialized activation functions for sparsity and value shrinkage.

```python { .api }
def sparsemax(x, axis=-1):
    """
    Sparsemax activation function.

    Args:
        x: Input tensor
        axis: int, axis along which to apply sparsemax (default: -1)

    Returns:
        Tensor with sparsemax activation applied
    """

def sparse_plus(x):
    """
    Sparse plus activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with sparse plus activation applied
    """

def sparse_sigmoid(x):
    """
    Sparse sigmoid activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with sparse sigmoid activation applied
    """

def squareplus(x, b=4.0):
    """
    Squareplus activation function.

    Args:
        x: Input tensor
        b: float, smoothness parameter (default: 4.0)

    Returns:
        Tensor with squareplus activation applied
    """

def hard_shrink(x, lambd=0.5):
    """
    Hard shrinkage activation function.

    Args:
        x: Input tensor
        lambd: float, threshold parameter (default: 0.5)

    Returns:
        Tensor with hard shrinkage applied
    """

def soft_shrink(x, lambd=0.5):
    """
    Soft shrinkage activation function.

    Args:
        x: Input tensor
        lambd: float, threshold parameter (default: 0.5)

    Returns:
        Tensor with soft shrinkage applied
    """

def tanh_shrink(x):
    """
    Tanh shrinkage activation function.

    Args:
        x: Input tensor

    Returns:
        Tensor with tanh shrinkage applied
    """

def threshold(x, threshold_value=0.0, value=0.0):
    """
    Threshold activation function.

    Args:
        x: Input tensor
        threshold_value: float, threshold value (default: 0.0)
        value: float, replacement value below threshold (default: 0.0)

    Returns:
        Tensor with threshold activation applied
    """
```
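
The shrinkage functions appear mostly in sparse-coding and denoising settings; each maps small-magnitude values toward zero while leaving large values mostly intact. A NumPy sketch of the shrinkage formulas (hypothetical names):

```python
import numpy as np

def hard_shrink_sketch(x, lambd=0.5):
    # Zero out values with magnitude <= lambd; pass the rest unchanged
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > lambd, x, 0.0)

def soft_shrink_sketch(x, lambd=0.5):
    # Like hard shrinkage, but surviving values are also pulled toward
    # zero by lambd, so the output is continuous at the threshold.
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lambd, 0.0)

def tanh_shrink_sketch(x):
    # x - tanh(x): near zero for small inputs, grows like x for large ones
    x = np.asarray(x, dtype=float)
    return x - np.tanh(x)
```

Soft shrinkage is the proximal operator of the L1 penalty, which is why it shows up in sparse optimization contexts.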
403
404
### Utility Functions
405
406
Functions for serialization, deserialization, and retrieval of activation functions.
407
408

```python { .api }
def get(identifier):
    """
    Retrieve an activation function by name or function.

    Args:
        identifier: str or callable, activation function identifier

    Returns:
        Activation function
    """

def serialize(activation):
    """
    Serialize an activation function to a configuration.

    Args:
        activation: Activation function to serialize

    Returns:
        Configuration dictionary
    """

def deserialize(config, custom_objects=None):
    """
    Deserialize an activation function from configuration.

    Args:
        config: Configuration dictionary
        custom_objects: dict, custom objects for deserialization

    Returns:
        Activation function
    """
```
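
The `get`/`serialize`/`deserialize` trio follows a common name-to-callable registry pattern. A minimal self-contained sketch of that pattern (hypothetical code, not the Keras source):

```python
def linear(x):
    return x

def relu(x):
    return max(x, 0.0)

# Hypothetical registry mapping string names to callables
_ACTIVATIONS = {"linear": linear, "relu": relu}

def get(identifier):
    # Accept either a registered name or an already-callable function
    if callable(identifier):
        return identifier
    try:
        return _ACTIVATIONS[identifier]
    except KeyError:
        raise ValueError(f"Unknown activation: {identifier!r}")

def serialize(activation):
    # Record the function by name so it can be rebuilt later
    return {"class_name": "function", "config": activation.__name__}

def deserialize(config, custom_objects=None):
    # custom_objects lets callers supply functions outside the registry
    name = config["config"]
    if custom_objects and name in custom_objects:
        return custom_objects[name]
    return get(name)
```

This round-trip is what lets a saved model config refer to activations by plain strings such as `'relu'`.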

## Usage Examples

### Basic Usage

```python
import keras
from keras import activations

# Apply activation to tensor
x = keras.ops.convert_to_tensor([-2.0, -1.0, 0.0, 1.0, 2.0])

# ReLU activation
relu_output = activations.relu(x)  # [0., 0., 0., 1., 2.]

# Sigmoid activation
sigmoid_output = activations.sigmoid(x)  # [0.119, 0.269, 0.5, 0.731, 0.881]

# GELU activation
gelu_output = activations.gelu(x)  # [-0.045, -0.159, 0., 0.841, 1.955]
```

### In Layer Definitions

```python
from keras import activations, layers

# Using activation as string
dense_layer = layers.Dense(64, activation='relu')

# Using activation function directly
dense_layer = layers.Dense(64, activation=activations.gelu)

# Wrapping an activation to fix its parameters
def custom_relu(x):
    return activations.relu(x, negative_slope=0.1, max_value=6.0)

dense_layer = layers.Dense(64, activation=custom_relu)
```

### Comparing Activation Functions

```python
import numpy as np
import matplotlib.pyplot as plt

import keras
from keras import activations

x = np.linspace(-5, 5, 100)
x_tensor = keras.ops.convert_to_tensor(x)

# Compare different activations
relu_out = activations.relu(x_tensor)
gelu_out = activations.gelu(x_tensor)
silu_out = activations.silu(x_tensor)
tanh_out = activations.tanh(x_tensor)

# Plot the four curves on one set of axes
for name, out in [('relu', relu_out), ('gelu', gelu_out),
                  ('silu', silu_out), ('tanh', tanh_out)]:
    plt.plot(x, keras.ops.convert_to_numpy(out), label=name)
plt.legend()
plt.show()
```