# Keras High-Level API

High-level neural network building blocks including models, layers, optimizers, losses, and metrics for rapid prototyping and production. Keras provides an intuitive interface for building and training deep learning models.

## Capabilities

### Models

High-level model classes for building neural networks.

```python { .api }
class Sequential(Model):
    """
    Sequential groups a linear stack of layers into a tf.keras.Model.

    Methods:
    - add(layer): Adds a layer instance on top of the layer stack
    - pop(): Removes the last layer in the model
    - compile(optimizer, loss, metrics): Configures the model for training
    - fit(x, y, **kwargs): Trains the model for a fixed number of epochs
    - evaluate(x, y, **kwargs): Returns the loss value & metrics values for the model
    - predict(x, **kwargs): Generates output predictions for the input samples
    """

class Model:
    """
    Model groups layers into an object with training and inference features.

    Methods:
    - compile(optimizer, loss, metrics): Configures the model for training
    - fit(x, y, **kwargs): Trains the model for a fixed number of epochs
    - evaluate(x, y, **kwargs): Returns the loss value & metrics values for the model
    - predict(x, **kwargs): Generates output predictions for the input samples
    - save(filepath, **kwargs): Saves the model to TensorFlow SavedModel or a single HDF5 file
    - summary(): Prints a string summary of the network
    - get_weights(): Retrieves the weights of the model
    - set_weights(weights): Sets the weights of the model

    Note: models saved via save() are restored with the module-level
    load_model() function documented below; it is not a Model method.
    """

def load_model(filepath, custom_objects=None, compile=True, options=None):
    """
    Loads a model saved via model.save().

    Parameters:
    - filepath: String or pathlib.Path object, path to the saved model
    - custom_objects: Optional dictionary mapping names to custom classes or functions
    - compile: Boolean, whether to compile the model after loading
    - options: Optional tf.saved_model.LoadOptions object that specifies options for loading from SavedModel

    Returns:
    A Keras model instance
    """

def save_model(model, filepath, overwrite=True, include_optimizer=True, save_format=None,
               signatures=None, options=None, save_traces=True):
    """
    Saves a model as a TensorFlow SavedModel or HDF5 file.

    Parameters:
    - model: Keras model instance to be saved
    - filepath: String or pathlib.Path object, path where to save the model
    - overwrite: Whether to overwrite any existing model at the target location
    - include_optimizer: If True, save the optimizer's state together with the model
    - save_format: Either 'tf' or 'h5', indicating whether to save the model as a TensorFlow SavedModel or an HDF5 file
    - signatures: Signatures to save with the SavedModel
    - options: Optional tf.saved_model.SaveOptions object that specifies options for saving to SavedModel
    - save_traces: When enabled, the SavedModel will store the function traces for each layer
    """
```
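
A minimal end-to-end sketch of this workflow (assuming TensorFlow 2.x, where these classes live under `tf.keras`):

```python
import tensorflow as tf

# Build and configure a small Sequential model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Round-trip through save()/load_model(); the .h5 suffix selects HDF5.
model.save('tiny_model.h5')
restored = tf.keras.models.load_model('tiny_model.h5')
restored.summary()
```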

### Core Layers

Essential layer types for building neural networks.

```python { .api }
class Dense(Layer):
    """
    Just your regular densely-connected NN layer.

    Parameters:
    - units: Positive integer, dimensionality of the output space
    - activation: Activation function to use
    - use_bias: Boolean, whether the layer uses a bias vector
    - kernel_initializer: Initializer for the kernel weights matrix
    - bias_initializer: Initializer for the bias vector
    - kernel_regularizer: Regularizer function applied to the kernel weights matrix
    - bias_regularizer: Regularizer function applied to the bias vector
    - activity_regularizer: Regularizer function applied to the output of the layer
    - kernel_constraint: Constraint function applied to the kernel weights matrix
    - bias_constraint: Constraint function applied to the bias vector
    """

class Dropout(Layer):
    """
    Applies Dropout to the input.

    Parameters:
    - rate: Float between 0 and 1. Fraction of the input units to drop
    - noise_shape: 1D integer tensor representing the shape of the binary dropout mask
    - seed: A Python integer to use as random seed
    """

class Flatten(Layer):
    """
    Flattens the input. Does not affect the batch size.

    Parameters:
    - data_format: A string, one of channels_last (default) or channels_first
    """

class Reshape(Layer):
    """
    Reshapes an output to a certain shape.

    Parameters:
    - target_shape: Target shape. Tuple of integers, does not include the samples dimension (batch size)
    """

class Input:
    """
    Input() is used to instantiate a Keras tensor.

    Parameters:
    - shape: A shape tuple (integers), not including the batch size
    - batch_size: Optional static batch size (integer)
    - name: An optional name string for the layer
    - dtype: The data type expected by the input, as a string
    - sparse: A boolean specifying whether the placeholder to be created is sparse
    - tensor: Optional existing tensor to wrap into the Input layer
    - ragged: A boolean specifying whether the placeholder to be created is ragged
    """

class Lambda(Layer):
    """
    Wraps arbitrary expressions as a Layer object.

    Parameters:
    - function: The function to be evaluated. Takes input tensor as first argument
    - output_shape: Expected output shape from function
    - mask: Either None (no masking) or a callable with the same signature as the compute_mask layer method
    - arguments: Optional dictionary of keyword arguments to be passed to the function
    """
```
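
A short sketch wiring these core layers together with the functional API (the shapes here are arbitrary):

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dropout(0.5)(x)
x = layers.Reshape((8, 8))(x)                   # 64 units -> an 8x8 grid
x = layers.Flatten()(x)                         # back to a flat 64-vector
outputs = layers.Lambda(lambda t: t * 2.0)(x)   # arbitrary expression as a layer
model = tf.keras.Model(inputs, outputs)
model.summary()
```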

### Convolutional Layers

Layers for processing spatial data such as images.

```python { .api }
class Conv2D(Layer):
    """
    2D convolution layer (e.g. spatial convolution over images).

    Parameters:
    - filters: Integer, the dimensionality of the output space
    - kernel_size: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window
    - strides: An integer or tuple/list of 2 integers, specifying the strides of the convolution
    - padding: One of "valid" or "same" (case-insensitive)
    - data_format: A string, one of channels_last (default) or channels_first
    - dilation_rate: An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution
    - groups: A positive integer specifying the number of groups in which the input is split
    - activation: Activation function to use
    - use_bias: Boolean, whether the layer uses a bias vector
    - kernel_initializer: Initializer for the kernel weights matrix
    - bias_initializer: Initializer for the bias vector
    """

class Conv2DTranspose(Layer):
    """
    Transposed convolution layer (sometimes called Deconvolution).

    Parameters:
    - filters: Integer, the dimensionality of the output space
    - kernel_size: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window
    - strides: An integer or tuple/list of 2 integers, specifying the strides of the convolution
    - padding: One of "valid" or "same" (case-insensitive)
    - output_padding: An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width
    - data_format: A string, one of channels_last (default) or channels_first
    - dilation_rate: An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution
    - activation: Activation function to use
    - use_bias: Boolean, whether the layer uses a bias vector
    """

class MaxPooling2D(Layer):
    """
    Max pooling operation for 2D spatial data.

    Parameters:
    - pool_size: Integer or tuple of 2 integers, window size over which to take the maximum
    - strides: Integer, tuple of 2 integers, or None. Strides values
    - padding: One of "valid" or "same" (case-insensitive)
    - data_format: A string, one of channels_last (default) or channels_first
    """

class AveragePooling2D(Layer):
    """
    Average pooling operation for 2D spatial data.

    Parameters:
    - pool_size: Integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal)
    - strides: Integer, tuple of 2 integers, or None
    - padding: One of "valid" or "same" (case-insensitive)
    - data_format: A string, one of channels_last (default) or channels_first
    """
```
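
A minimal sketch pairing strided Conv2D downsampling with Conv2DTranspose upsampling (a toy autoencoder; the 28x28x1 input shape is an arbitrary choice):

```python
import tensorflow as tf
from tensorflow.keras import layers

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, strides=2, padding='same', activation='relu'),             # 28x28 -> 14x14
    layers.Conv2D(32, 3, strides=2, padding='same', activation='relu'),             # 14x14 -> 7x7
    layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu'),    # 7x7 -> 14x14
    layers.Conv2DTranspose(1, 3, strides=2, padding='same', activation='sigmoid'),  # 14x14 -> 28x28
])
autoencoder.summary()
```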

### Recurrent Layers

Layers for processing sequential data.

```python { .api }
class LSTM(Layer):
    """
    Long Short-Term Memory layer - Hochreiter 1997.

    Parameters:
    - units: Positive integer, dimensionality of the output space
    - activation: Activation function to use
    - recurrent_activation: Activation function to use for the recurrent step
    - use_bias: Boolean (default True), whether the layer uses a bias vector
    - kernel_initializer: Initializer for the kernel weights matrix
    - recurrent_initializer: Initializer for the recurrent_kernel weights matrix
    - bias_initializer: Initializer for the bias vector
    - unit_forget_bias: Boolean (default True). If True, add 1 to the bias of the forget gate at initialization
    - kernel_regularizer: Regularizer function applied to the kernel weights matrix
    - recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix
    - bias_regularizer: Regularizer function applied to the bias vector
    - activity_regularizer: Regularizer function applied to the output of the layer
    - kernel_constraint: Constraint function applied to the kernel weights matrix
    - recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix
    - bias_constraint: Constraint function applied to the bias vector
    - dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs
    - recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state
    - return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence
    - return_state: Boolean. Whether to return the last state in addition to the output
    - go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence
    - stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch
    - time_major: The shape format of the inputs and outputs. If True, the inputs and outputs will be in shape (timesteps, batch, ...), whereas in the False case, it will be (batch, timesteps, ...)
    - unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used
    """

class GRU(Layer):
    """
    Gated Recurrent Unit - Cho et al. 2014.

    Parameters:
    - units: Positive integer, dimensionality of the output space
    - activation: Activation function to use
    - recurrent_activation: Activation function to use for the recurrent step
    - use_bias: Boolean (default True), whether the layer uses a bias vector
    - kernel_initializer: Initializer for the kernel weights matrix
    - recurrent_initializer: Initializer for the recurrent_kernel weights matrix
    - bias_initializer: Initializer for the bias vector
    - kernel_regularizer: Regularizer function applied to the kernel weights matrix
    - recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix
    - bias_regularizer: Regularizer function applied to the bias vector
    - activity_regularizer: Regularizer function applied to the output of the layer
    - kernel_constraint: Constraint function applied to the kernel weights matrix
    - recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix
    - bias_constraint: Constraint function applied to the bias vector
    - dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs
    - recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state
    - return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence
    - return_state: Boolean. Whether to return the last state in addition to the output
    - go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence
    - stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch
    - unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used
    - time_major: The shape format of the inputs and outputs
    - reset_after: GRU convention (whether to apply reset gate after or before matrix multiplication)
    """

class SimpleRNN(Layer):
    """
    Fully-connected RNN where the output is to be fed back to input.

    Parameters:
    - units: Positive integer, dimensionality of the output space
    - activation: Activation function to use
    - use_bias: Boolean (default True), whether the layer uses a bias vector
    - kernel_initializer: Initializer for the kernel weights matrix
    - recurrent_initializer: Initializer for the recurrent_kernel weights matrix
    - bias_initializer: Initializer for the bias vector
    - kernel_regularizer: Regularizer function applied to the kernel weights matrix
    - recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix
    - bias_regularizer: Regularizer function applied to the bias vector
    - activity_regularizer: Regularizer function applied to the output of the layer
    - kernel_constraint: Constraint function applied to the kernel weights matrix
    - recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix
    - bias_constraint: Constraint function applied to the bias vector
    - dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs
    - recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state
    - return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence
    - return_state: Boolean. Whether to return the last state in addition to the output
    - go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence
    - stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch
    - unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used
    """
```
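
A minimal sketch of stacking these recurrent layers over a batch of sequences (the sequence length and feature size are arbitrary):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20, 8)),           # 20 timesteps, 8 features per step
    layers.GRU(32, return_sequences=True),   # full sequence out: (batch, 20, 32)
    layers.LSTM(16),                         # last output only: (batch, 16)
    layers.Dense(1),
])
out = model(np.random.random((4, 20, 8)).astype('float32'))
print(out.shape)  # (4, 1)
```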

### Optimizers

Optimization algorithms for training neural networks.

```python { .api }
class Adam(Optimizer):
    """
    Optimizer that implements the Adam algorithm.

    Parameters:
    - learning_rate: A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule
    - beta_1: A float value or a constant float tensor, or a callable that takes no arguments and returns the actual value to use
    - beta_2: A float value or a constant float tensor, or a callable that takes no arguments and returns the actual value to use
    - epsilon: A small constant for numerical stability
    - amsgrad: Boolean. Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond"
    - name: Optional name prefix for the operations created when applying gradients
    """

class SGD(Optimizer):
    """
    Gradient descent (with momentum) optimizer.

    Parameters:
    - learning_rate: A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule
    - momentum: Float hyperparameter >= 0 that accelerates gradient descent in the relevant direction and dampens oscillations
    - nesterov: Boolean. Whether to apply Nesterov momentum
    - name: Optional name prefix for the operations created when applying gradients
    """

class RMSprop(Optimizer):
    """
    Optimizer that implements the RMSprop algorithm.

    Parameters:
    - learning_rate: A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule
    - rho: Discounting factor for the history/coming gradient
    - momentum: A scalar or a scalar Tensor
    - epsilon: A small constant for numerical stability
    - centered: Boolean. If True, gradients are normalized by the estimated variance of the gradient
    - name: Optional name prefix for the operations created when applying gradients
    """
```
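
A brief sketch of passing a configured optimizer instance to compile() (the hyperparameter values here are arbitrary):

```python
import tensorflow as tf

# Optimizers can be passed to compile() either as a configured instance
# or by string name (e.g. 'adam') when the defaults are acceptable.
opt = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
model.compile(optimizer=opt, loss='mse')
```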

## Usage Examples

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Sequential model
model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.Dropout(0.2),
    layers.Dense(10, activation='softmax')
])

# Functional API model
inputs = keras.Input(shape=(784,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dropout(0.2)(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Compile model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train model (example with dummy data)
x_train = np.random.random((1000, 784))
y_train = np.random.randint(10, size=(1000,))

model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

# Evaluate and predict
x_test = np.random.random((100, 784))
y_test = np.random.randint(10, size=(100,))

loss, accuracy = model.evaluate(x_test, y_test)
predictions = model.predict(x_test)

# Save and load model
model.save('my_model.h5')
loaded_model = keras.models.load_model('my_model.h5')

# Convolutional model example
cnn_model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# LSTM model example
lstm_model = keras.Sequential([
    layers.LSTM(50, return_sequences=True, input_shape=(10, 1)),
    layers.LSTM(50, return_sequences=False),
    layers.Dense(25),
    layers.Dense(1)
])
```