
tessl/pypi-keras-nightly

Multi-backend deep learning framework providing a unified API for building and training neural networks across JAX, TensorFlow, PyTorch, and OpenVINO backends


docs/initializers.md

Weight Initializers

Comprehensive collection of weight initialization strategies for neural network layers. Proper weight initialization is crucial for training stability and convergence speed. Keras provides various initializers from simple constant values to sophisticated variance-scaling methods based on layer characteristics.

Capabilities

Constant Initializers

Initializers that set weights to constant values or specific patterns.

class Zeros:
    """Initialize weights to zero."""
    def __init__(self): ...

class Ones:
    """Initialize weights to one."""
    def __init__(self): ...

class Constant:
    """Initialize weights to a constant value."""
    def __init__(self, value=0.0): ...

class Identity:
    """Initialize weights as identity matrix (for square matrices)."""
    def __init__(self, gain=1.0): ...

class STFT:
    """Short-Time Fourier Transform initializer."""
    def __init__(self, fft_length=128, window_length=128, window_step=32): ...

Random Initializers

Random initialization strategies with different distributions and scaling approaches.

class RandomNormal:
    """Initialize weights with normal distribution."""
    def __init__(self, mean=0.0, stddev=0.05, seed=None): ...

class RandomUniform:
    """Initialize weights with uniform distribution."""
    def __init__(self, minval=-0.05, maxval=0.05, seed=None): ...

class TruncatedNormal:
    """Initialize weights with truncated normal distribution."""
    def __init__(self, mean=0.0, stddev=0.05, seed=None): ...

class Orthogonal:
    """Initialize weights as orthogonal matrix."""
    def __init__(self, gain=1.0, seed=None): ...

class VarianceScaling:
    """Initialize weights with variance scaling."""
    def __init__(self, scale=1.0, mode='fan_in', distribution='truncated_normal', seed=None): ...

Xavier/Glorot Initializers

Xavier (Glorot) initialization methods that scale weights based on input and output dimensions.

class GlorotUniform:
    """Glorot uniform initializer (Xavier uniform)."""
    def __init__(self, seed=None): ...

class GlorotNormal:
    """Glorot normal initializer (Xavier normal)."""
    def __init__(self, seed=None): ...

He Initializers

He initialization methods optimized for ReLU activations.

class HeUniform:
    """He uniform initializer."""
    def __init__(self, seed=None): ...

class HeNormal:
    """He normal initializer."""
    def __init__(self, seed=None): ...

LeCun Initializers

LeCun initialization methods for SELU activations.

class LecunUniform:
    """LeCun uniform initializer."""
    def __init__(self, seed=None): ...

class LecunNormal:
    """LeCun normal initializer."""
    def __init__(self, seed=None): ...

Base Classes and Utilities

Base classes and utility functions for working with initializers.

class Initializer:
    """Base class for all initializers."""
    def __call__(self, shape, dtype=None, **kwargs): ...
    def get_config(self): ...
    
def get(identifier):
    """Retrieve an initializer by name or instance."""

def serialize(initializer):
    """Serialize an initializer to configuration."""

def deserialize(config, custom_objects=None):
    """Deserialize an initializer from configuration."""
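The utilities tie the string identifiers used throughout the layer API to the initializer classes; a short round-trip sketch:

```python
from keras import initializers

# Resolve an initializer from its string identifier
init = initializers.get('he_normal')

# Round-trip through the serialized config format
config = initializers.serialize(init)
restored = initializers.deserialize(config)

print(type(init).__name__, type(restored).__name__)
```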

Usage Examples

Basic Initialization

from keras import layers, initializers

# Using string identifiers
dense_layer = layers.Dense(64, kernel_initializer='he_normal')

# Using initializer classes
dense_layer = layers.Dense(64, 
                          kernel_initializer=initializers.HeNormal(),
                          bias_initializer=initializers.Zeros())

# Custom parameters
dense_layer = layers.Dense(64,
                          kernel_initializer=initializers.RandomNormal(mean=0.0, stddev=0.01),
                          bias_initializer=initializers.Constant(value=0.1))

Convolutional Layer Initialization

from keras import layers, initializers

# Convolutional layer with He initialization
conv_layer = layers.Conv2D(32, (3, 3),
                          kernel_initializer='he_uniform',
                          bias_initializer='zeros')

# With custom variance scaling
conv_layer = layers.Conv2D(32, (3, 3),
                          kernel_initializer=initializers.VarianceScaling(
                              scale=2.0, mode='fan_out', distribution='uniform'))

RNN Layer Initialization

from keras import layers, initializers

# LSTM with orthogonal recurrent weights
lstm_layer = layers.LSTM(128,
                        kernel_initializer='glorot_uniform',
                        recurrent_initializer='orthogonal',
                        bias_initializer='zeros')

# GRU with custom initialization
gru_layer = layers.GRU(64,
                      kernel_initializer=initializers.GlorotNormal(),
                      recurrent_initializer=initializers.Orthogonal(gain=1.0))

Custom Initializer

import keras
from keras import layers, initializers

class CustomInitializer(initializers.Initializer):
    def __init__(self, scale=1.0):
        self.scale = scale
    
    def __call__(self, shape, dtype=None, **kwargs):
        # Custom initialization logic
        values = keras.random.normal(shape, dtype=dtype) * self.scale
        return values
    
    def get_config(self):
        return {'scale': self.scale}

# Use custom initializer
dense_layer = layers.Dense(64, kernel_initializer=CustomInitializer(scale=0.5))

Initialization Comparison

import keras
from keras import initializers
import numpy as np

# Compare different initializers for same shape
shape = (100, 50)

# Glorot (Xavier) initialization
glorot_weights = initializers.GlorotNormal()(shape)
print(f"Glorot std: {keras.ops.std(glorot_weights):.4f}")

# He initialization  
he_weights = initializers.HeNormal()(shape)
print(f"He std: {keras.ops.std(he_weights):.4f}")

# LeCun initialization
lecun_weights = initializers.LecunNormal()(shape)
print(f"LeCun std: {keras.ops.std(lecun_weights):.4f}")
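The measured values should land near the theoretical target standard deviations, which for a (100, 50) kernel (fan_in=100, fan_out=50) work out as follows:

```python
import math

fan_in, fan_out = 100, 50
glorot_std = math.sqrt(2 / (fan_in + fan_out))  # sqrt(2 / (fan_in + fan_out))
he_std = math.sqrt(2 / fan_in)                  # sqrt(2 / fan_in)
lecun_std = math.sqrt(1 / fan_in)               # sqrt(1 / fan_in)

print(f"{glorot_std:.4f} {he_std:.4f} {lecun_std:.4f}")  # 0.1155 0.1414 0.1000
```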

Identity Initialization for Skip Connections

from keras import layers, initializers, models

# Identity initialization for residual connections
inputs = layers.Input(shape=(64,))
x = layers.Dense(64, kernel_initializer='he_normal')(inputs)
x = layers.ReLU()(x)

# Skip connection with identity initialization
skip = layers.Dense(64, kernel_initializer=initializers.Identity(gain=0.1))(inputs)
outputs = layers.Add()([x, skip])

model = models.Model(inputs, outputs)

Initialization Guidelines

By Activation Function

  • ReLU/Leaky ReLU: Use HeNormal or HeUniform
  • SELU: Use LecunNormal or LecunUniform
  • Tanh/Sigmoid: Use GlorotNormal or GlorotUniform
  • Linear: Use GlorotNormal or custom variance scaling
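Applying these pairings in code (layer widths are illustrative):

```python
from keras import layers

# Initializer chosen to match the activation, per the guidelines above
relu_dense = layers.Dense(64, activation='relu', kernel_initializer='he_normal')
selu_dense = layers.Dense(64, activation='selu', kernel_initializer='lecun_normal')
tanh_dense = layers.Dense(64, activation='tanh', kernel_initializer='glorot_normal')

# String identifiers are resolved to initializer instances at construction
print(type(relu_dense.kernel_initializer).__name__)
```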

By Layer Type

  • Dense layers: Glorot (balanced) or He (with ReLU)
  • Convolutional layers: He initialization is commonly used
  • Recurrent layers: Orthogonal for recurrent weights, Glorot for input weights
  • Batch normalization: Ones for gamma, Zeros for beta
  • Embeddings: Random uniform or normal with small variance
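A sketch of the batch-normalization and embedding cases from the list above (BatchNormalization already defaults to ones/zeros; they are passed explicitly here only for illustration):

```python
from keras import layers, initializers

# Batch normalization: ones for gamma (scale), zeros for beta (shift)
bn = layers.BatchNormalization(gamma_initializer='ones',
                               beta_initializer='zeros')

# Embedding with a small-variance uniform initializer
emb = layers.Embedding(input_dim=10000, output_dim=128,
                       embeddings_initializer=initializers.RandomUniform(-0.05, 0.05))

print(type(bn.gamma_initializer).__name__)
```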

General Principles

  • Avoid zero initialization for weights (except biases)
  • Consider activation function when choosing initializer
  • Use orthogonal initialization for recurrent connections
  • Adjust scale based on network depth and width

Install with Tessl CLI

npx tessl i tessl/pypi-keras-nightly
