# Layers and Building Blocks

Keras provides a comprehensive set of layer types for building neural networks. Layers are the fundamental building blocks: each one transforms its inputs through mathematical operations, often with learnable parameters.

## Capabilities

### Base Layer Class

The foundational Layer class that all Keras layers inherit from, providing core functionality for parameter management, computation, and serialization.

```python { .api }
class Layer:
    def __init__(self, trainable=True, name=None, dtype=None, **kwargs):
        """
        Base class for all neural network layers.

        Parameters:
        - trainable: Whether layer weights should be trainable
        - name: Name of the layer
        - dtype: Data type for layer computations
        """

    def call(self, inputs, **kwargs):
        """
        Forward pass computation logic.

        Parameters:
        - inputs: Input tensor(s)

        Returns:
        Output tensor(s)
        """

    def build(self, input_shape):
        """
        Create layer weights based on input shape.

        Parameters:
        - input_shape: Shape of input tensor
        """

    def get_config(self):
        """
        Get layer configuration for serialization.

        Returns:
        Dict containing layer configuration
        """
```
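The division of labour between `build`, `call`, and `get_config` can be illustrated without Keras. The following is a minimal pure-Python sketch: `TinyLayer` and `Scale` are hypothetical stand-ins invented for this illustration, not Keras classes, and the lazy-build logic in `__call__` is an assumption about the general pattern rather than a copy of the library's implementation.

```python
class TinyLayer:
    """Toy stand-in for a layer base class (hypothetical, not keras.layers.Layer)."""

    def __init__(self, trainable=True, name=None):
        self.trainable = trainable
        self.name = name or type(self).__name__.lower()
        self.built = False

    def build(self, input_shape):
        """Subclasses create weights here, once the input shape is known."""

    def call(self, inputs):
        """Subclasses implement the forward pass here."""
        raise NotImplementedError

    def __call__(self, inputs):
        # Build lazily on first use, then delegate to call().
        if not self.built:
            self.build((len(inputs),))
            self.built = True
        return self.call(inputs)

    def get_config(self):
        # Everything needed to re-create the layer (weights are not included).
        return {"name": self.name, "trainable": self.trainable}


class Scale(TinyLayer):
    """Multiplies every input element by a per-feature weight."""

    def __init__(self, initial_value=1.0, **kwargs):
        super().__init__(**kwargs)
        self.initial_value = initial_value

    def build(self, input_shape):
        # One weight per input feature, sized from the observed input.
        self.weights = [self.initial_value] * input_shape[0]

    def call(self, inputs):
        return [w * x for w, x in zip(self.weights, inputs)]

    def get_config(self):
        config = super().get_config()
        config["initial_value"] = self.initial_value
        return config


layer = Scale(initial_value=2.0, name="double")
outputs = layer([1.0, 2.0, 3.0])           # triggers build(), then call()
restored = Scale(**layer.get_config())     # round-trip through the config dict
```

The round trip through `get_config()` re-creates an unbuilt layer with the same constructor arguments, which is the role the configuration dict plays in serialization.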

### Core Layers

Fundamental layers for basic neural network operations including dense connections, embeddings, and utility layers.

```python { .api }
class Dense(Layer):
    def __init__(self, units, activation=None, use_bias=True,
                 kernel_initializer='glorot_uniform', bias_initializer='zeros',
                 kernel_regularizer=None, bias_regularizer=None,
                 activity_regularizer=None, kernel_constraint=None,
                 bias_constraint=None, lora_rank=None, lora_alpha=None, **kwargs):
        """
        Fully connected layer.

        Parameters:
        - units: Number of output units
        - activation: Activation function to use
        - use_bias: Whether to use bias vector
        - kernel_initializer: Initializer for weight matrix
        - bias_initializer: Initializer for bias vector
        - kernel_regularizer: Regularizer for weight matrix
        - bias_regularizer: Regularizer for bias vector
        - activity_regularizer: Regularizer for layer output
        - kernel_constraint: Constraint for weight matrix
        - bias_constraint: Constraint for bias vector
        - lora_rank: Rank for LoRA (Low-Rank Adaptation)
        - lora_alpha: Alpha parameter for LoRA scaling
        """

class Embedding(Layer):
    def __init__(self, input_dim, output_dim, embeddings_initializer='uniform',
                 embeddings_regularizer=None, mask_zero=False, **kwargs):
        """
        Embedding layer for discrete tokens.

        Parameters:
        - input_dim: Size of vocabulary
        - output_dim: Size of dense vector embeddings
        - embeddings_initializer: Initializer for embedding matrix
        - embeddings_regularizer: Regularizer for embedding matrix
        - mask_zero: Whether input value 0 is special "padding" value
        """

class Flatten(Layer):
    def __init__(self, data_format=None, **kwargs):
        """
        Flatten input tensor to 1D (except batch dimension).

        Parameters:
        - data_format: Data format for input tensor
        """

class Reshape(Layer):
    def __init__(self, target_shape, **kwargs):
        """
        Reshape input tensor to target shape.

        Parameters:
        - target_shape: Target shape tuple (not including batch dimension)
        """

class Lambda(Layer):
    def __init__(self, function, output_shape=None, mask=None, **kwargs):
        """
        Wrap arbitrary expression as layer.

        Parameters:
        - function: Function to be evaluated
        - output_shape: Expected output shape from function
        - mask: Mask to be applied to output
        """
```
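To make the computations behind these layers concrete, here is a small NumPy sketch of what a dense layer, an embedding lookup, and a flatten operation do to their inputs. The helper `dense_forward` and all array values are made up for illustration; this shows the arithmetic only and is not how Keras implements the layers.

```python
import numpy as np


def dense_forward(inputs, kernel, bias=None, activation=None):
    """Compute activation(inputs @ kernel + bias) for a batch of row vectors."""
    outputs = inputs @ kernel
    if bias is not None:
        outputs = outputs + bias
    if activation is not None:
        outputs = activation(outputs)
    return outputs


def relu(x):
    return np.maximum(x, 0.0)


batch = np.array([[1.0, -2.0, 3.0],
                  [0.5, 0.5, 0.5]])       # shape (2, 3): 2 samples, 3 features
kernel = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, -1.0]])          # shape (3, 2): input_dim x units
bias = np.array([0.0, 1.0])               # shape (2,): one value per unit

dense_out = dense_forward(batch, kernel, bias, activation=relu)

# Embedding lookup: integer ids index rows of an (input_dim, output_dim) table.
table = np.arange(12, dtype=float).reshape(4, 3)   # vocabulary of 4, 3-d vectors
token_ids = np.array([[0, 2], [3, 3]])             # shape (2, 2)
embedded = table[token_ids]                        # shape (2, 2, 3)

# Flatten keeps the batch axis and collapses every other axis into one.
flat = embedded.reshape(embedded.shape[0], -1)     # shape (2, 6)
```

With `use_bias=True`, a dense layer of this size holds `3 * 2 + 2 = 8` trainable values: the kernel plus the bias.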

### Convolutional Layers

Layers for convolutional operations in 1D, 2D, and 3D, including standard convolution, transposed convolution, depthwise, and separable convolutions.

```python { .api }
class Conv2D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1), groups=1,
                 activation=None, use_bias=True, **kwargs):
        """
        2D convolution layer.

        Parameters:
        - filters: Number of output filters
        - kernel_size: Size of convolution window
        - strides: Stride of convolution
        - padding: Padding mode ('valid' or 'same')
        - data_format: Data format ('channels_last' or 'channels_first')
        - dilation_rate: Dilation rate for dilated convolution
        - groups: Number of groups for grouped convolution
        - activation: Activation function
        - use_bias: Whether to use bias
        """

class Conv1D(Layer):
    def __init__(self, filters, kernel_size, strides=1, padding='valid',
                 data_format='channels_last', dilation_rate=1, groups=1,
                 activation=None, use_bias=True, **kwargs):
        """1D convolution layer."""

class Conv3D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1, 1), groups=1,
                 activation=None, use_bias=True, **kwargs):
        """3D convolution layer."""

class Conv2DTranspose(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 output_padding=None, data_format=None, dilation_rate=(1, 1),
                 activation=None, use_bias=True, **kwargs):
        """2D transposed convolution layer."""

class DepthwiseConv2D(Layer):
    def __init__(self, kernel_size, strides=(1, 1), padding='valid',
                 depth_multiplier=1, data_format=None, dilation_rate=(1, 1),
                 activation=None, use_bias=True, **kwargs):
        """2D depthwise convolution layer."""

class SeparableConv2D(Layer):
    def __init__(self, filters, kernel_size, strides=(1, 1), padding='valid',
                 data_format=None, dilation_rate=(1, 1), depth_multiplier=1,
                 activation=None, use_bias=True, **kwargs):
        """2D separable convolution layer."""
```
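The `kernel_size`, `strides`, `padding`, and `dilation_rate` arguments together determine the spatial size of a convolution's output. The sketch below applies the standard output-length formulas along a single axis; `conv_output_length` is a hypothetical helper written for this illustration, not a function exposed by the documented API.

```python
import math


def conv_output_length(input_length, kernel_size, stride=1,
                       padding="valid", dilation_rate=1):
    """Output length along one spatial axis of a convolution."""
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(input_length / stride)
    if padding == "valid":
        # Dilation spreads the kernel taps apart, widening its footprint.
        effective_kernel = dilation_rate * (kernel_size - 1) + 1
        return (input_length - effective_kernel) // stride + 1
    raise ValueError(f"unsupported padding: {padding!r}")


valid_len = conv_output_length(28, 3)                       # no padding
same_len = conv_output_length(28, 3, padding="same")        # size preserved
strided_len = conv_output_length(28, 3, stride=2)           # roughly halved
dilated_len = conv_output_length(28, 3, dilation_rate=2)    # 5-wide footprint
```

For a 2D or 3D convolution the same calculation applies independently to each spatial axis, and the channel axis of the output has `filters` entries.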

### Pooling Layers

Pooling operations for downsampling feature maps using max pooling, average pooling, and global pooling variants.

```python { .api }
class MaxPooling2D(Layer):
    def __init__(self, pool_size=(2, 2), strides=None, padding='valid',
                 data_format=None, **kwargs):
        """
        2D max pooling layer.

        Parameters:
        - pool_size: Size of pooling window
        - strides: Stride of pooling operation
        - padding: Padding mode
        - data_format: Data format
        """

class AveragePooling2D(Layer):
    def __init__(self, pool_size=(2, 2), strides=None, padding='valid',
                 data_format=None, **kwargs):
        """2D average pooling layer."""

class GlobalMaxPooling2D(Layer):
    def __init__(self, data_format=None, keepdims=False, **kwargs):
        """
        Global max pooling for 2D data.

        Parameters:
        - data_format: Data format
        - keepdims: Whether to keep spatial dimensions
        """

class GlobalAveragePooling2D(Layer):
    def __init__(self, data_format=None, keepdims=False, **kwargs):
        """Global average pooling for 2D data."""
```
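As a rough illustration of what these layers compute, the following NumPy sketch pools a single-channel 4x4 feature map with non-overlapping 2x2 windows and then reduces it globally. The helper names are hypothetical, and the reshape trick assumes even height and width; it is not the implementation used by Keras.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on a (height, width) array with even sides."""
    h, w = feature_map.shape
    # Split each axis into (blocks, 2) so every 2x2 window gets its own pair of axes.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


def avg_pool_2x2(feature_map):
    """Same windows as max_pool_2x2, reduced with the mean instead of the max."""
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


feature_map = np.array([[1.0, 2.0, 5.0, 6.0],
                        [3.0, 4.0, 7.0, 8.0],
                        [9.0, 10.0, 13.0, 14.0],
                        [11.0, 12.0, 15.0, 16.0]])

pooled_max = max_pool_2x2(feature_map)    # shape (2, 2)
pooled_avg = avg_pool_2x2(feature_map)    # shape (2, 2)

# Global pooling collapses the whole spatial extent to one value per channel.
global_max = feature_map.max()
global_avg = feature_map.mean()
```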

### Recurrent Layers

Recurrent neural network layers including LSTM, GRU, and simple RNN variants for sequence processing.

```python { .api }
class LSTM(Layer):
    def __init__(self, units, activation='tanh', recurrent_activation='sigmoid',
                 use_bias=True, kernel_initializer='glorot_uniform',
                 recurrent_initializer='orthogonal', bias_initializer='zeros',
                 dropout=0.0, recurrent_dropout=0.0, return_sequences=False,
                 return_state=False, go_backwards=False, stateful=False,
                 unroll=False, **kwargs):
        """
        Long Short-Term Memory layer.

        Parameters:
        - units: Dimensionality of output space
        - activation: Activation function for the candidate cell state and the output
        - recurrent_activation: Activation function for the recurrent step (the gates)
        - use_bias: Whether to use bias vectors
        - kernel_initializer: Initializer for input weights
        - recurrent_initializer: Initializer for recurrent weights
        - bias_initializer: Initializer for bias vectors
        - dropout: Dropout rate for input connections
        - recurrent_dropout: Dropout rate for recurrent connections
        - return_sequences: Whether to return the full output sequence instead of only the last output
        - return_state: Whether to return last state in addition to output
        - go_backwards: Whether to process sequence backwards
        - stateful: Whether to reuse the final state of each batch as the initial state of the next batch
        - unroll: Whether to unroll the recurrent loop
        """

class GRU(Layer):
    def __init__(self, units, activation='tanh', recurrent_activation='sigmoid',
                 use_bias=True, dropout=0.0, recurrent_dropout=0.0,
                 return_sequences=False, return_state=False, **kwargs):
        """Gated Recurrent Unit layer."""

class SimpleRNN(Layer):
    def __init__(self, units, activation='tanh', use_bias=True, dropout=0.0,
                 recurrent_dropout=0.0, return_sequences=False,
                 return_state=False, **kwargs):
        """Simple RNN layer."""

class Bidirectional(Layer):
    def __init__(self, layer, merge_mode='concat', weights=None, **kwargs):
        """
        Bidirectional wrapper for RNNs.

        Parameters:
        - layer: RNN layer to wrap
        - merge_mode: How to combine forward and backward outputs
        - weights: Initial weights
        """
```
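The `return_sequences`, `return_state`, and `merge_mode` arguments mainly change the shape of what a recurrent layer returns. The pure-Python sketch below works through those shapes; both helper functions are hypothetical and written only for this illustration. It assumes an LSTM carries two state tensors (hidden and cell) while GRU and SimpleRNN carry one.

```python
def rnn_output_shapes(batch, timesteps, units, return_sequences=False,
                      return_state=False, num_states=1):
    """Shapes returned by a recurrent layer (use num_states=2 for an LSTM)."""
    output = (batch, timesteps, units) if return_sequences else (batch, units)
    if not return_state:
        return output
    # Each state is the value at the final timestep, so it has no time axis.
    states = [(batch, units)] * num_states
    return [output] + states


def bidirectional_output_shape(forward_shape, merge_mode="concat"):
    """Shape after merging forward and backward outputs of equal shape."""
    if merge_mode is None:
        # No merging: both directions are returned separately.
        return [forward_shape, forward_shape]
    if merge_mode == "concat":
        return forward_shape[:-1] + (2 * forward_shape[-1],)
    if merge_mode in ("sum", "mul", "ave"):
        return forward_shape
    raise ValueError(f"unsupported merge_mode: {merge_mode!r}")


last_only = rnn_output_shapes(32, 100, 64)
full_sequence = rnn_output_shapes(32, 100, 64, return_sequences=True)
lstm_with_state = rnn_output_shapes(32, 100, 64, return_state=True, num_states=2)
bidirectional = bidirectional_output_shape(full_sequence)
```

Stacking recurrent layers normally requires `return_sequences=True` on every layer except the last, because the next recurrent layer expects a time axis in its input.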

### Attention Layers

Attention mechanisms for focusing on relevant parts of input sequences and implementing transformer-style architectures.

```python { .api }
class MultiHeadAttention(Layer):
    def __init__(self, num_heads, key_dim, value_dim=None, dropout=0.0,
                 use_bias=True, output_shape=None, **kwargs):
        """
        Multi-head attention layer.

        Parameters:
        - num_heads: Number of attention heads
        - key_dim: Size of each attention head for query and key
        - value_dim: Size of each attention head for value
        - dropout: Dropout probability for attention weights
        - use_bias: Whether to use bias in linear projections
        - output_shape: Expected shape of output tensor
        """

class Attention(Layer):
    def __init__(self, use_scale=False, score_mode='dot', **kwargs):
        """
        Attention layer for computing attention weights.

        Parameters:
        - use_scale: Whether to scale attention scores
        - score_mode: Type of attention score computation
        """
```
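The core operation behind both layers is dot-product attention: scores between queries and keys are turned into weights with a softmax, and the weights are used to average the values. The NumPy sketch below shows a single head without learned projections. The function names are hypothetical, and the fixed `1 / sqrt(d)` scaling is the transformer-style convention, an assumption of this sketch rather than a description of how either Keras layer scales its scores internally.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def dot_product_attention(query, key, value, scale=True):
    """query: (Tq, d), key: (Tv, d), value: (Tv, dv).

    Returns the attended output (Tq, dv) and the attention weights (Tq, Tv).
    """
    scores = query @ key.T                      # similarity of each query to each key
    if scale:
        scores = scores / np.sqrt(query.shape[-1])
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ value, weights


rng = np.random.default_rng(0)
query = rng.normal(size=(2, 4))    # 2 query positions, depth 4
key = rng.normal(size=(3, 4))      # 3 key/value positions
value = rng.normal(size=(3, 5))    # value depth 5

context, weights = dot_product_attention(query, key, value)

# Identical keys give uniform weights, so the output is the mean of the values.
uniform_context, uniform_weights = dot_product_attention(
    np.ones((1, 4)), np.ones((3, 4)), value)
```

A multi-head layer runs several such computations in parallel on different learned projections of the inputs and combines the results.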

### Normalization Layers

Normalization techniques for stabilizing and accelerating training, including batch normalization, layer normalization, and group normalization.

```python { .api }
class BatchNormalization(Layer):
    def __init__(self, axis=-1, momentum=0.99, epsilon=1e-3, center=True,
                 scale=True, beta_initializer='zeros', gamma_initializer='ones',
                 **kwargs):
        """
        Batch normalization layer.

        Parameters:
        - axis: Axis to normalize along
        - momentum: Momentum for moving statistics
        - epsilon: Small constant for numerical stability
        - center: Whether to add learned offset parameter
        - scale: Whether to add learned scaling parameter
        - beta_initializer: Initializer for beta parameter
        - gamma_initializer: Initializer for gamma parameter
        """

class LayerNormalization(Layer):
    def __init__(self, axis=-1, epsilon=1e-3, center=True, scale=True,
                 beta_initializer='zeros', gamma_initializer='ones', **kwargs):
        """Layer normalization layer."""

class GroupNormalization(Layer):
    def __init__(self, groups=32, axis=-1, epsilon=1e-3, center=True,
                 scale=True, **kwargs):
        """
        Group normalization layer.

        Parameters:
        - groups: Number of groups for normalization
        - axis: Axis to normalize along
        - epsilon: Small constant for numerical stability
        - center: Whether to add learned offset parameter
        - scale: Whether to add learned scaling parameter
        """
```
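The three normalization layers differ mainly in which values share a mean and variance. The NumPy sketch below makes that difference visible on a small `(batch, features)` array. It is a simplification under stated assumptions: the learned `gamma` and `beta` are left out (equivalent to their initial values of ones and zeros), batch normalization is shown with current-batch statistics only (no moving averages), and `normalize` is a hypothetical helper.

```python
import numpy as np


def normalize(x, axis, epsilon=1e-3):
    """Subtract the mean and divide by the standard deviation along `axis`."""
    mean = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)
    return (x - mean) / np.sqrt(var + epsilon)


x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0],
              [-1.0, 0.0, 1.0, 2.0]])     # shape (batch=3, features=4)

# Batch normalization: statistics per feature, computed across the batch.
batch_normed = normalize(x, axis=0)

# Layer normalization: statistics per sample, computed across its features.
layer_normed = normalize(x, axis=1)

# Group normalization with 2 groups: split the 4 features into 2 groups of 2
# and normalize each group of each sample separately.
groups = 2
grouped = normalize(x.reshape(3, groups, -1), axis=-1).reshape(x.shape)
```

Because layer normalization works per sample, rows 0 and 2 (which differ only by a constant shift) normalize to the same values, regardless of what else is in the batch.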

### Regularization Layers

Layers for regularization including various dropout techniques and noise injection to prevent overfitting.

```python { .api }
class Dropout(Layer):
    def __init__(self, rate, noise_shape=None, seed=None, **kwargs):
        """
        Dropout layer for regularization.

        Parameters:
        - rate: Fraction of input units to drop
        - noise_shape: Shape of binary dropout mask
        - seed: Random seed for dropout
        """

class SpatialDropout2D(Layer):
    def __init__(self, rate, data_format=None, **kwargs):
        """
        2D spatial dropout layer.

        Parameters:
        - rate: Fraction of input units to drop
        - data_format: Data format
        """

class GaussianNoise(Layer):
    def __init__(self, stddev, **kwargs):
        """
        Gaussian noise regularization layer.

        Parameters:
        - stddev: Standard deviation of noise distribution
        """

class GaussianDropout(Layer):
    def __init__(self, rate, **kwargs):
        """
        Multiplicative Gaussian noise layer.

        Parameters:
        - rate: Drop probability as in Dropout
        """
```
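Regularization layers of this kind are active only during training and pass inputs through unchanged at inference time. The NumPy sketch below illustrates that behaviour with "inverted" dropout, where surviving values are scaled by `1 / (1 - rate)` so the expected sum is unchanged, and with additive Gaussian noise. The helper functions and the explicit `training` flag are assumptions made for this illustration, not the Keras implementation.

```python
import numpy as np


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero roughly a `rate` fraction of entries, rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= rate
    return np.where(keep_mask, x / (1.0 - rate), 0.0)


def gaussian_noise(x, stddev, rng, training=True):
    """Add zero-mean Gaussian noise during training only."""
    if not training:
        return x
    return x + rng.normal(0.0, stddev, size=x.shape)


rng = np.random.default_rng(42)
x = np.ones((1000,))

dropped = dropout(x, rate=0.2, rng=rng)                     # training mode
inference = dropout(x, rate=0.2, rng=rng, training=False)   # identity
noisy = gaussian_noise(x, stddev=0.1, rng=rng)
```

Scaling at training time is what lets the layer be a no-op at inference: no compensation is needed once dropout is switched off.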

### Activation Layers

Activation functions implemented as layers for explicit control and custom activation patterns.

```python { .api }
class Activation(Layer):
    def __init__(self, activation, **kwargs):
        """
        Activation layer.

        Parameters:
        - activation: Name of activation function or callable
        """

class ReLU(Layer):
    def __init__(self, max_value=None, negative_slope=0.0, threshold=0.0, **kwargs):
        """
        ReLU activation layer.

        Parameters:
        - max_value: Maximum activation value
        - negative_slope: Slope for negative values
        - threshold: Threshold value for activation
        """

class LeakyReLU(Layer):
    def __init__(self, alpha=0.3, **kwargs):
        """
        Leaky ReLU activation layer.

        Parameters:
        - alpha: Slope for negative values
        """

class ELU(Layer):
    def __init__(self, alpha=1.0, **kwargs):
        """
        ELU activation layer.

        Parameters:
        - alpha: Scale for negative values
        """

class Softmax(Layer):
    def __init__(self, axis=-1, **kwargs):
        """
        Softmax activation layer.

        Parameters:
        - axis: Axis along which to apply softmax
        """
```
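The parameters above map onto simple scalar formulas. The pure-Python sketch below spells them out for one value at a time; it is meant to clarify what `max_value`, `negative_slope`, `threshold`, and `alpha` control, and is an illustrative reimplementation rather than the library's code.

```python
import math


def relu(x, max_value=None, negative_slope=0.0, threshold=0.0):
    """Generalized ReLU: clip at max_value, pass through at or above threshold,
    and apply a linear slope below it."""
    if max_value is not None and x >= max_value:
        return max_value
    if x >= threshold:
        return x
    return negative_slope * (x - threshold)


def elu(x, alpha=1.0):
    """Identity for positive inputs, smooth saturation towards -alpha otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(values)                        # subtract the max for stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


plain = relu(-2.0)                             # standard ReLU clamps to zero
capped = relu(8.0, max_value=6.0)              # ReLU6-style cap
leaky = relu(-2.0, negative_slope=0.3)         # leaky behaviour below zero
probabilities = softmax([2.0, 1.0, 0.1])
```

A leaky ReLU is the special case with a non-zero `negative_slope` and a zero `threshold`.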

### Merging Layers

Layers for combining multiple input tensors through various operations like addition, concatenation, and element-wise operations.

```python { .api }
class Add(Layer):
    def __init__(self, **kwargs):
        """Element-wise addition layer."""

class Concatenate(Layer):
    def __init__(self, axis=-1, **kwargs):
        """
        Concatenation layer.

        Parameters:
        - axis: Axis along which to concatenate
        """

class Multiply(Layer):
    def __init__(self, **kwargs):
        """Element-wise multiplication layer."""

class Average(Layer):
    def __init__(self, **kwargs):
        """Element-wise averaging layer."""

class Maximum(Layer):
    def __init__(self, **kwargs):
        """Element-wise maximum layer."""

class Minimum(Layer):
    def __init__(self, **kwargs):
        """Element-wise minimum layer."""

class Dot(Layer):
    def __init__(self, axes, normalize=False, **kwargs):
        """
        Dot product layer.

        Parameters:
        - axes: Axes to compute dot product over
        - normalize: Whether to normalize inputs
        """
```
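Each merging layer takes a list of tensors and combines them into one. The NumPy sketch below shows the equivalent array operations on two `(batch, features)` inputs. The `batch_dot` helper is hypothetical; it mirrors a dot product over the feature axis, with `normalize=True` interpreted as L2-normalizing each sample first so the result is a cosine similarity.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])        # shape (batch=2, features=2)
b = np.array([[10.0, 20.0], [30.0, 40.0]])    # same shape as a

# Element-wise merges keep the input shape.
added = a + b
multiplied = a * b
averaged = np.mean([a, b], axis=0)
maximum = np.maximum(a, b)
minimum = np.minimum(a, b)

# Concatenation grows the chosen axis instead.
concatenated = np.concatenate([a, b], axis=-1)   # shape (2, 4)


def batch_dot(x, y, normalize=False):
    """Per-sample dot product over the feature axis, keeping a trailing axis."""
    if normalize:
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
        y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return np.sum(x * y, axis=1, keepdims=True)  # shape (batch, 1)


dots = batch_dot(a, b)
cosines = batch_dot(a, b, normalize=True)
```

Element-wise merges require inputs of matching shape, while concatenation only requires the inputs to agree on every axis except the one being joined.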

## Usage Examples

### Building a CNN

```python
import keras
from keras import layers

model = keras.Sequential([
    # Declare the input shape explicitly rather than via input_shape= on a layer
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    # 10-way classification head
    layers.Dense(10, activation='softmax')
])
```
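It can help to trace how the tensor shape and parameter count evolve through this model by hand. The pure-Python sketch below does that arithmetic for the layer stack above, assuming `'valid'` padding, unit convolution strides, and pooling strides equal to the pool size (the defaults shown in the signatures). The helper functions are hypothetical and independent of Keras.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Weights in a 2D convolution: one kernel (plus bias) per output filter."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + (1 if use_bias else 0)) * filters


def dense_params(in_features, units, use_bias=True):
    """Weights in a fully connected layer."""
    return (in_features + (1 if use_bias else 0)) * units


def valid_out(size, window, stride=1):
    """Output length along one axis with 'valid' padding."""
    return (size - window) // stride + 1


size, channels, total = 28, 1, 0
trace = []
stack = [("conv", 32), ("pool", 2), ("conv", 64), ("pool", 2), ("conv", 64)]
for kind, arg in stack:
    if kind == "conv":
        total += conv2d_params((3, 3), channels, arg)
        size, channels = valid_out(size, 3), arg
    else:
        # Pooling has no weights; it only shrinks the spatial size.
        size = valid_out(size, arg, stride=arg)
    trace.append((size, size, channels))

flattened = size * size * channels
total += dense_params(flattened, 64) + dense_params(64, 10)
```

Under these assumptions the feature map shrinks from 28x28x1 to 3x3x64 before flattening, and most of the parameters sit in the last two convolutions and the first dense layer.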

### Building an LSTM for Sequence Processing

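```python
import keras
from keras import layers

model = keras.Sequential([
    # Sequences of 100 token ids; replaces the deprecated input_length argument
    keras.Input(shape=(100,)),
    # Vocabulary of 10,000 tokens mapped to 128-dimensional vectors
    layers.Embedding(10000, 128),
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    # Single sigmoid unit for binary classification
    layers.Dense(1, activation='sigmoid')
])
```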
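The size of this model is dominated by the embedding table, which the following back-of-the-envelope calculation makes explicit. The helper functions are hypothetical; the LSTM count assumes the standard formulation with four gate-like transforms, each having an input kernel, a recurrent kernel, and a bias.

```python
def embedding_params(input_dim, output_dim):
    """One output_dim-sized vector per vocabulary entry."""
    return input_dim * output_dim


def lstm_params(input_dim, units, use_bias=True):
    """Four transforms (input, forget, cell candidate, output), each with an
    input kernel, a recurrent kernel, and optionally a bias."""
    per_transform = input_dim * units + units * units + (units if use_bias else 0)
    return 4 * per_transform


def dense_params(in_features, units, use_bias=True):
    """Weights in a fully connected layer."""
    return (in_features + (1 if use_bias else 0)) * units


counts = {
    "embedding": embedding_params(10000, 128),
    "lstm": lstm_params(128, 64),
    "dense": dense_params(64, 1),
}
total = sum(counts.values())
```

Note that `dropout` and `recurrent_dropout` add no parameters; they only change how the existing weights are used during training.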

### Using Functional API for Complex Architecture

```python
import keras
from keras import layers

inputs = keras.Input(shape=(784,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dropout(0.2)(x)
branch1 = layers.Dense(64, activation='relu', name='branch1')(x)
branch2 = layers.Dense(64, activation='relu', name='branch2')(x)
merged = layers.Add()([branch1, branch2])
outputs = layers.Dense(10, activation='softmax')(merged)

model = keras.Model(inputs=inputs, outputs=outputs)
```