Tessl Tile for pypi/keras@3.11.0

# Layers and Building Blocks

Keras provides a comprehensive set of layer types for building neural networks. Layers are the fundamental building blocks: each one transforms its inputs through learnable parameters and mathematical operations.

## Capabilities

### Base Layer Class

The foundational `Layer` class that all Keras layers inherit from. It provides core functionality for parameter management, computation, and serialization.

```python { .api }
class Layer:
    def __init__(self, trainable=True, name=None, dtype=None, **kwargs):
        """
        Base class for all neural network layers.

        Parameters:
        - trainable: Whether layer weights should be trainable
        - name: Name of the layer
        - dtype: Data type for layer computations
        """

    def call(self, inputs, **kwargs):
        """
        Forward pass computation logic.

        Parameters:
        - inputs: Input tensor(s)

        Returns:
        Output tensor(s)
        """

    def build(self, input_shape):
        """
        Create layer weights based on input shape.

        Parameters:
        - input_shape: Shape of input tensor
        """

    def get_config(self):
        """
        Get layer configuration for serialization.

        Returns:
        Dict containing layer configuration
        """
```

### Core Layers

Fundamental layers for basic neural network operations, including dense connections, embeddings, and utility layers.

```python { .api }
class Dense(Layer):
    def __init__(self, units, activation=None, use_bias=True,
                 kernel_initializer='glorot_uniform', bias_initializer='zeros',
                 kernel_regularizer=None, bias_regularizer=None,
                 activity_regularizer=None, kernel_constraint=None,
                 bias_constraint=None, lora_rank=None, lora_alpha=None, **kwargs):
        """
        Fully connected layer.

        Parameters:
        - units: Number of output units
        - activation: Activation function to use
        - use_bias: Whether to use bias vector
        - kernel_initializer: Initializer for weight matrix
        - bias_initializer: Initializer for bias vector
        - kernel_regularizer: Regularizer for weight matrix
        - bias_regularizer: Regularizer for bias vector
        - activity_regularizer: Regularizer for layer output
        - kernel_constraint: Constraint for weight matrix
        - bias_constraint: Constraint for bias vector
        - lora_rank: Rank for LoRA (Low-Rank Adaptation)
        - lora_alpha: Alpha parameter for LoRA scaling
        """

class Embedding(Layer):
    def __init__(self, input_dim, output_dim, embeddings_initializer='uniform',
                 embeddings_regularizer=None, mask_zero=False, **kwargs):
        """
        Embedding layer for discrete tokens.

        Parameters:
        - input_dim: Size of vocabulary
        - output_dim: Size of dense vector embeddings
        - embeddings_initializer: Initializer for embedding matrix
        - embeddings_regularizer: Regularizer for embedding matrix
        - mask_zero: Whether input value 0 is a special padding value that should be masked out
        """

class Flatten(Layer):
    def __init__(self, data_format=None, **kwargs):
        """
        Flatten input tensor to 1D (except batch dimension).

        Parameters:
        - data_format: Data format for input tensor
        """

class Reshape(Layer):
    def __init__(self, target_shape, **kwargs):
        """
        Reshape input tensor to target shape.

        Parameters:
        - target_shape: Target shape tuple (not including batch dimension)
        """

class Lambda(Layer):
    def __init__(self, function, output_shape=None, mask=None, **kwargs):
        """
        Wrap an arbitrary expression as a layer.

        Parameters:
        - function: Function to be evaluated
        - output_shape: Expected output shape from function
        - mask: Mask to be applied to output
        """
```
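
To make the shape behavior of these layers concrete, here is a small sketch that pushes a toy batch through `Embedding`, `Flatten`, `Reshape`, and `Dense`. The input values and sizes are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from keras import layers

# Toy batch of token ids: (batch=2, sequence_length=3); the values are arbitrary.
token_ids = np.array([[1, 2, 0], [3, 0, 0]])

# Each id is looked up in a (10, 4) table, adding a trailing embedding axis.
embedded = layers.Embedding(input_dim=10, output_dim=4)(token_ids)  # (2, 3, 4)

# Flatten collapses everything except the batch axis: 3 * 4 = 12.
flat = layers.Flatten()(embedded)  # (2, 12)

# Reshape takes a target shape that excludes the batch axis.
reshaped = layers.Reshape((6, 2))(flat)  # (2, 6, 2)

# Dense projects the last axis to `units` features.
projected = layers.Dense(5, activation="relu")(flat)  # (2, 5)
```

Layers called on concrete arrays run eagerly, so this kind of check works without building a full model.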

### Convolutional Layers

Layers for convolutional operations in 1D, 2D, and 3D, including standard, transposed, depthwise, and separable convolutions.

```python { .api }
class Conv2D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1), groups=1,
                 activation=None, use_bias=True, **kwargs):
        """
        2D convolution layer.

        Parameters:
        - filters: Number of output filters
        - kernel_size: Size of convolution window
        - strides: Stride of convolution
        - padding: Padding mode ('valid' or 'same')
        - data_format: Data format ('channels_last' or 'channels_first')
        - dilation_rate: Dilation rate for dilated convolution
        - groups: Number of groups for grouped convolution
        - activation: Activation function
        - use_bias: Whether to use bias
        """

class Conv1D(Layer):
    def __init__(self, filters, kernel_size, strides=1, padding='valid',
                 data_format=None, dilation_rate=1, groups=1,
                 activation=None, use_bias=True, **kwargs):
        """1D convolution layer."""

class Conv3D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1, 1), groups=1,
                 activation=None, use_bias=True, **kwargs):
        """3D convolution layer."""

class Conv2DTranspose(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 output_padding=None, data_format=None, dilation_rate=(1, 1),
                 activation=None, use_bias=True, **kwargs):
        """2D transposed convolution layer."""

class DepthwiseConv2D(Layer):
    def __init__(self, kernel_size, strides=(1, 1), padding='valid',
                 depth_multiplier=1, data_format=None, dilation_rate=(1, 1),
                 activation=None, use_bias=True, **kwargs):
        """2D depthwise convolution layer."""

class SeparableConv2D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1), depth_multiplier=1,
                 activation=None, use_bias=True, **kwargs):
        """2D separable convolution layer."""
```
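
The effect of `padding`, `strides`, and `depth_multiplier` on output shapes can be checked directly. This sketch assumes the default `channels_last` image data format and uses an arbitrary 28x28 RGB input.

```python
import numpy as np
from keras import layers

# One 28x28 RGB image in channels_last layout: (batch, height, width, channels).
images = np.zeros((1, 28, 28, 3), dtype="float32")

# 'valid' padding shrinks each spatial axis: 28 - 3 + 1 = 26.
valid = layers.Conv2D(8, 3, padding="valid")(images)  # (1, 26, 26, 8)

# 'same' padding with stride 2 gives ceil(28 / 2) = 14.
strided = layers.Conv2D(8, 3, strides=2, padding="same")(images)  # (1, 14, 14, 8)

# Depthwise convolution yields input_channels * depth_multiplier = 3 * 2 channels.
depthwise = layers.DepthwiseConv2D(3, depth_multiplier=2, padding="same")(images)

# A stride-2 transposed convolution with 'same' padding doubles the spatial size.
upsampled = layers.Conv2DTranspose(4, 3, strides=2, padding="same")(strided)
```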

### Pooling Layers

Pooling operations for downsampling feature maps using max pooling, average pooling, and global pooling variants.

```python { .api }
class MaxPooling2D(Layer):
    def __init__(self, pool_size=(2, 2), strides=None, padding='valid',
                 data_format=None, **kwargs):
        """
        2D max pooling layer.

        Parameters:
        - pool_size: Size of pooling window
        - strides: Stride of pooling operation (defaults to pool_size when None)
        - padding: Padding mode ('valid' or 'same')
        - data_format: Data format ('channels_last' or 'channels_first')
        """

class AveragePooling2D(Layer):
    def __init__(self, pool_size=(2, 2), strides=None, padding='valid',
                 data_format=None, **kwargs):
        """2D average pooling layer."""

class GlobalMaxPooling2D(Layer):
    def __init__(self, data_format=None, keepdims=False, **kwargs):
        """
        Global max pooling for 2D data.

        Parameters:
        - data_format: Data format ('channels_last' or 'channels_first')
        - keepdims: Whether to keep the spatial dimensions (with length 1)
        """

class GlobalAveragePooling2D(Layer):
    def __init__(self, data_format=None, keepdims=False, **kwargs):
        """Global average pooling for 2D data."""
```

### Recurrent Layers

Recurrent neural network layers, including LSTM, GRU, and simple RNN variants, for sequence processing.

```python { .api }
class LSTM(Layer):
    def __init__(self, units, activation='tanh', recurrent_activation='sigmoid',
                 use_bias=True, kernel_initializer='glorot_uniform',
                 recurrent_initializer='orthogonal', bias_initializer='zeros',
                 dropout=0.0, recurrent_dropout=0.0, return_sequences=False,
                 return_state=False, go_backwards=False, stateful=False,
                 unroll=False, **kwargs):
        """
        Long Short-Term Memory layer.

        Parameters:
        - units: Dimensionality of output space
        - activation: Activation function for the candidate cell state and the output
        - recurrent_activation: Activation function for the input, forget, and output gates
        - use_bias: Whether to use bias vectors
        - kernel_initializer: Initializer for input weights
        - recurrent_initializer: Initializer for recurrent weights
        - bias_initializer: Initializer for bias vectors
        - dropout: Dropout rate for input connections
        - recurrent_dropout: Dropout rate for recurrent connections
        - return_sequences: Whether to return full sequence or last output
        - return_state: Whether to return last state in addition to output
        - go_backwards: Whether to process sequence backwards
        - stateful: Whether to carry the last state of each sample over as the
          initial state of the sample at the same index in the next batch
        - unroll: Whether to unroll the recurrent loop
        """

class GRU(Layer):
    def __init__(self, units, activation='tanh', recurrent_activation='sigmoid',
                 use_bias=True, dropout=0.0, recurrent_dropout=0.0,
                 return_sequences=False, return_state=False, **kwargs):
        """Gated Recurrent Unit layer."""

class SimpleRNN(Layer):
    def __init__(self, units, activation='tanh', use_bias=True, dropout=0.0,
                 recurrent_dropout=0.0, return_sequences=False,
                 return_state=False, **kwargs):
        """Simple RNN layer."""

class Bidirectional(Layer):
    def __init__(self, layer, merge_mode='concat', weights=None, **kwargs):
        """
        Bidirectional wrapper for RNNs.

        Parameters:
        - layer: RNN layer to wrap
        - merge_mode: How to combine forward and backward outputs
        - weights: Initial weights
        """
```

### Attention Layers

Attention mechanisms for focusing on relevant parts of input sequences and implementing transformer-style architectures.

```python { .api }
class MultiHeadAttention(Layer):
    def __init__(self, num_heads, key_dim, value_dim=None, dropout=0.0,
                 use_bias=True, output_shape=None, **kwargs):
        """
        Multi-head attention layer.

        Parameters:
        - num_heads: Number of attention heads
        - key_dim: Size of each attention head for query and key
        - value_dim: Size of each attention head for value
        - dropout: Dropout probability for attention weights
        - use_bias: Whether to use bias in linear projections
        - output_shape: Expected shape of output tensor
        """

class Attention(Layer):
    def __init__(self, use_scale=False, score_mode='dot', **kwargs):
        """
        Dot-product attention layer.

        Parameters:
        - use_scale: Whether to create a learnable scalar that scales the attention scores
        - score_mode: Function used to compute attention scores ('dot' or 'concat')
        """
```

### Normalization Layers

Normalization techniques for stabilizing and accelerating training, including batch normalization, layer normalization, and group normalization.

```python { .api }
class BatchNormalization(Layer):
    def __init__(self, axis=-1, momentum=0.99, epsilon=1e-3, center=True,
                 scale=True, beta_initializer='zeros', gamma_initializer='ones',
                 **kwargs):
        """
        Batch normalization layer.

        Parameters:
        - axis: Axis to normalize along
        - momentum: Momentum for moving statistics
        - epsilon: Small constant for numerical stability
        - center: Whether to add learned offset parameter
        - scale: Whether to add learned scaling parameter
        - beta_initializer: Initializer for beta parameter
        - gamma_initializer: Initializer for gamma parameter
        """

class LayerNormalization(Layer):
    def __init__(self, axis=-1, epsilon=1e-3, center=True, scale=True,
                 beta_initializer='zeros', gamma_initializer='ones', **kwargs):
        """Layer normalization layer."""

class GroupNormalization(Layer):
    def __init__(self, groups=32, axis=-1, epsilon=1e-3, center=True,
                 scale=True, **kwargs):
        """
        Group normalization layer.

        Parameters:
        - groups: Number of groups for normalization
        - axis: Axis to normalize along
        - epsilon: Small constant for numerical stability
        - center: Whether to add learned offset parameter
        - scale: Whether to add learned scaling parameter
        """
```

### Regularization Layers

Regularization layers, including several dropout variants and noise injection, that help prevent overfitting. They are active only during training.

```python { .api }
class Dropout(Layer):
    def __init__(self, rate, noise_shape=None, seed=None, **kwargs):
        """
        Dropout layer for regularization.

        Parameters:
        - rate: Fraction of input units to drop
        - noise_shape: Shape of binary dropout mask
        - seed: Random seed for dropout
        """

class SpatialDropout2D(Layer):
    def __init__(self, rate, data_format=None, **kwargs):
        """
        2D spatial dropout layer (drops entire feature maps).

        Parameters:
        - rate: Fraction of feature maps to drop
        - data_format: Data format ('channels_last' or 'channels_first')
        """

class GaussianNoise(Layer):
    def __init__(self, stddev, **kwargs):
        """
        Gaussian noise regularization layer.

        Parameters:
        - stddev: Standard deviation of noise distribution
        """

class GaussianDropout(Layer):
    def __init__(self, rate, **kwargs):
        """
        Multiplicative Gaussian noise layer.

        Parameters:
        - rate: Drop probability as in Dropout
        """
```

### Activation Layers

Activation functions implemented as layers for explicit control and custom activation patterns.

```python { .api }
class Activation(Layer):
    def __init__(self, activation, **kwargs):
        """
        Activation layer.

        Parameters:
        - activation: Name of activation function or callable
        """

class ReLU(Layer):
    def __init__(self, max_value=None, negative_slope=0.0, threshold=0.0, **kwargs):
        """
        ReLU activation layer.

        Parameters:
        - max_value: Maximum activation value
        - negative_slope: Slope for negative values
        - threshold: Threshold value for activation
        """

class LeakyReLU(Layer):
    def __init__(self, negative_slope=0.3, **kwargs):
        """
        Leaky ReLU activation layer.

        Parameters:
        - negative_slope: Slope for negative values (the older `alpha` name is deprecated)
        """

class ELU(Layer):
    def __init__(self, alpha=1.0, **kwargs):
        """
        ELU activation layer.

        Parameters:
        - alpha: Scale for negative values
        """

class Softmax(Layer):
    def __init__(self, axis=-1, **kwargs):
        """
        Softmax activation layer.

        Parameters:
        - axis: Axis along which to apply softmax
        """
```

### Merging Layers

Layers for combining multiple input tensors through operations such as addition, concatenation, and other element-wise operations.

```python { .api }
class Add(Layer):
    def __init__(self, **kwargs):
        """Element-wise addition layer."""

class Concatenate(Layer):
    def __init__(self, axis=-1, **kwargs):
        """
        Concatenation layer.

        Parameters:
        - axis: Axis along which to concatenate
        """

class Multiply(Layer):
    def __init__(self, **kwargs):
        """Element-wise multiplication layer."""

class Average(Layer):
    def __init__(self, **kwargs):
        """Element-wise averaging layer."""

class Maximum(Layer):
    def __init__(self, **kwargs):
        """Element-wise maximum layer."""

class Minimum(Layer):
    def __init__(self, **kwargs):
        """Element-wise minimum layer."""

class Dot(Layer):
    def __init__(self, axes, normalize=False, **kwargs):
        """
        Dot product layer.

        Parameters:
        - axes: Axes to compute dot product over
        - normalize: Whether to normalize inputs
        """
```

## Usage Examples

### Building a CNN

```python
import keras
from keras import layers

model = keras.Sequential([
    # Keras 3 prefers an explicit Input over passing input_shape to the first layer
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
```

### Building an LSTM for Sequence Processing

```python
import keras
from keras import layers

model = keras.Sequential([
    # `input_length` is deprecated in Keras 3; declare the sequence length with Input
    keras.Input(shape=(100,), dtype='int32'),
    layers.Embedding(10000, 128),
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    layers.Dense(1, activation='sigmoid')
])
```

### Using Functional API for Complex Architecture

```python
import keras
from keras import layers

inputs = keras.Input(shape=(784,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dropout(0.2)(x)
branch1 = layers.Dense(64, activation='relu', name='branch1')(x)
branch2 = layers.Dense(64, activation='relu', name='branch2')(x)
merged = layers.Add()([branch1, branch2])
outputs = layers.Dense(10, activation='softmax')(merged)

model = keras.Model(inputs=inputs, outputs=outputs)
```
