or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

admin.mdadvanced.mdclient-server.mdcredentials-security.mderrors-status.mdhealth.mdindex.mdinterceptors.mdload-balancing.mdmetadata-context.mdname-resolution.mdobservability.mdreflection.mdstreaming.mdtesting.mdxds.md
tile.json

load-balancing.mddocs/

Load Balancing

This document covers load balancing in gRPC-Go, including built-in policies, custom balancer implementation, service configuration, and SubConn management.

Overview

The balancer package defines APIs for load balancing in gRPC. Load balancers manage SubConns (sub-connections) to backend servers and decide which SubConn to use for each RPC.

import "google.golang.org/grpc/balancer"

All APIs in the balancer package are experimental.

Service Configuration

Load balancing policies can be configured via service config:

import "google.golang.org/grpc"

// Configure via dial option
serviceConfig := `{
    "loadBalancingPolicy": "round_robin"
}`

conn, err := grpc.NewClient("dns:///example.com:8080",
    grpc.WithDefaultServiceConfig(serviceConfig),
    grpc.WithTransportCredentials(creds))

// Disable service config from resolver
conn, err := grpc.NewClient("example.com:8080",
    grpc.WithDisableServiceConfig(),
    grpc.WithTransportCredentials(creds))

Service Config Format

{
  "loadBalancingPolicy": "round_robin",
  "loadBalancingConfig": [
    {
      "round_robin": {}
    }
  ],
  "methodConfig": [
    {
      "name": [
        {
          "service": "myservice",
          "method": "MyMethod"
        }
      ],
      "waitForReady": true,
      "timeout": "30s",
      "maxRequestMessageBytes": 4194304,
      "maxResponseMessageBytes": 4194304
    }
  ]
}

Built-in Load Balancing Policies

Pick First

The default load balancing policy. Tries to connect to addresses in order and uses the first successful connection.

import "google.golang.org/grpc"

const PickFirstBalancerName = "pick_first"

// Pick first is used by default
conn, err := grpc.NewClient("example.com:8080",
    grpc.WithTransportCredentials(creds))

// Explicit configuration
serviceConfig := `{"loadBalancingPolicy": "pick_first"}`
conn, err := grpc.NewClient("example.com:8080",
    grpc.WithDefaultServiceConfig(serviceConfig),
    grpc.WithTransportCredentials(creds))

Round Robin

Distributes requests evenly across all available backends.

// Enable round robin load balancing
serviceConfig := `{"loadBalancingPolicy": "round_robin"}`
conn, err := grpc.NewClient("dns:///example.com:8080",
    grpc.WithDefaultServiceConfig(serviceConfig),
    grpc.WithTransportCredentials(creds))

Example with DNS resolver:

import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials"
)

// DNS resolver returns multiple IPs, round robin distributes across them
creds, _ := credentials.NewClientTLSFromFile("ca.pem", "")
serviceConfig := `{"loadBalancingPolicy": "round_robin"}`

conn, err := grpc.NewClient("dns:///myservice.example.com:443",
    grpc.WithDefaultServiceConfig(serviceConfig),
    grpc.WithTransportCredentials(creds))
if err != nil {
    log.Fatalf("failed to connect: %v", err)
}
defer conn.Close()

client := pb.NewMyServiceClient(conn)

Balancer Interface

Core Interface

type Balancer interface {
    // UpdateClientConnState is called when the state of ClientConn changes
    // Returns ErrBadResolverState to trigger ResolveNow with exponential backoff
    UpdateClientConnState(ClientConnState) error

    // ResolverError is called when the name resolver reports an error
    ResolverError(error)

    // UpdateSubConnState is called when the state of a SubConn changes
    // Deprecated: Use NewSubConnOptions.StateListener instead
    UpdateSubConnState(SubConn, SubConnState)

    // Close closes the balancer
    // Should call SubConn.Shutdown for existing SubConns
    Close()

    // ExitIdle instructs the LB policy to reconnect/exit IDLE state
    ExitIdle()
}

Builder Interface

type Builder interface {
    // Build creates a new balancer with the ClientConn
    Build(cc ClientConn, opts BuildOptions) Balancer

    // Name returns the name of balancers built by this builder
    // Used to pick balancers (e.g., in service config)
    Name() string
}

// Register registers the balancer builder to the balancer map
// Must only be called during initialization (e.g., in init())
// Not thread-safe
func Register(b Builder)

// Get returns the balancer builder registered with the given name
// Case-insensitive. Returns nil if not found.
func Get(name string) Builder

Build Options

type BuildOptions struct {
    // DialCreds is transport credentials for remote load balancer communication
    DialCreds credentials.TransportCredentials

    // CredsBundle is credentials bundle for remote load balancer communication
    CredsBundle credentials.Bundle

    // Dialer for remote load balancer communication
    Dialer func(context.Context, string) (net.Conn, error)

    // Authority for authentication handshake with remote load balancer
    Authority string

    // ChannelzParent is the parent ClientConn's channelz channel
    ChannelzParent channelz.Identifier

    // CustomUserAgent to set on ClientConn created by balancer
    CustomUserAgent string

    // Target contains the parsed dial target address info
    Target resolver.Target
}

Client Connection State

type ClientConnState struct {
    // ResolverState from the name resolver
    ResolverState resolver.State

    // BalancerConfig is the parsed load balancing configuration
    // Returned by the builder's ParseConfig method
    BalancerConfig serviceconfig.LoadBalancingConfig
}

ClientConn Interface

The ClientConn interface provided to balancers:

type ClientConn interface {
    // NewSubConn creates a new SubConn
    // Doesn't block; connection attempts start on Connect() call
    // Deprecated: SubConns will only support one address in future
    NewSubConn([]resolver.Address, NewSubConnOptions) (SubConn, error)

    // RemoveSubConn removes and shuts down the SubConn
    // Deprecated: use SubConn.Shutdown instead
    RemoveSubConn(SubConn)

    // UpdateAddresses updates addresses in the SubConn
    // Deprecated: create new SubConns for new addresses instead
    UpdateAddresses(SubConn, []resolver.Address)

    // UpdateState notifies gRPC that balancer's internal state has changed
    // gRPC will update ClientConn connectivity state and call Pick on new Picker
    UpdateState(State)

    // ResolveNow notifies gRPC to perform name resolution
    ResolveNow(resolver.ResolveNowOptions)

    // Target returns the dial target for this ClientConn
    // Deprecated: use Target field in BuildOptions instead
    Target() string

    // MetricsRecorder provides metrics recorder for balancers
    MetricsRecorder() estats.MetricsRecorder
}

Balancer State

type State struct {
    // ConnectivityState of the balancer
    // Used to determine the ClientConn state
    ConnectivityState connectivity.State

    // Picker is used to choose SubConns for RPCs
    Picker Picker
}

SubConn Management

SubConn Interface

type SubConn interface {
    // UpdateAddresses updates addresses used in this SubConn
    // Deprecated: create new SubConns for new addresses instead
    UpdateAddresses([]resolver.Address)

    // Connect starts connecting for this SubConn
    Connect()

    // GetOrBuildProducer returns a Producer for this SubConn
    // Creates new one if doesn't exist
    // Should only be called on SubConn in state Ready
    GetOrBuildProducer(ProducerBuilder) (p Producer, close func())

    // Shutdown shuts down the SubConn gracefully
    // Started RPCs allowed to complete
    // Final state update delivered with ConnectivityState Shutdown
    Shutdown()

    // RegisterHealthListener registers a health listener for Ready SubConn
    // Only one listener can be registered at a time
    // Register each time SubConn's connectivity state changes to READY
    RegisterHealthListener(func(SubConnState))
}

SubConn Lifecycle

All SubConns start in IDLE and require explicit Connect() call:

  1. IDLE → Call Connect() to trigger connection
  2. CONNECTING → Attempting to connect
  3. READY → Connection established successfully
  4. TRANSIENT_FAILURE → Connection failed, will backoff then return to IDLE
  5. SHUTDOWN → SubConn shut down
// Create SubConn with state listener
sc, err := cc.NewSubConn([]resolver.Address{{Addr: "example.com:443"}},
    balancer.NewSubConnOptions{
        StateListener: func(state balancer.SubConnState) {
            switch state.ConnectivityState {
            case connectivity.Ready:
                fmt.Println("SubConn ready")
            case connectivity.TransientFailure:
                fmt.Printf("SubConn failed: %v\n", state.ConnectionError)
            case connectivity.Idle:
                // Reconnect on idle
                sc.Connect()
            }
        },
    })
if err != nil {
    log.Fatal(err)
}

// Trigger connection
sc.Connect()

SubConn Options

type NewSubConnOptions struct {
    // CredsBundle for the SubConn
    // Deprecated: Use Attributes field in resolver.Address
    CredsBundle credentials.Bundle

    // HealthCheckEnabled enables health check service on this SubConn
    HealthCheckEnabled bool

    // StateListener is called when SubConn state changes
    // If nil, Balancer.UpdateSubConnState is called instead
    StateListener func(SubConnState)
}

SubConn State

type SubConnState struct {
    // ConnectivityState of the SubConn
    ConnectivityState connectivity.State

    // ConnectionError set if ConnectivityState is TransientFailure
    ConnectionError error
}

Picker Interface

Pickers select which SubConn to use for each RPC:

type Picker interface {
    // Pick returns the connection to use for this RPC
    // Should not block
    // Returns PickResult or error
    Pick(info PickInfo) (PickResult, error)
}

Pick Info

type PickInfo struct {
    // FullMethodName called with NewClientStream
    // Format: /service/Method
    FullMethodName string

    // Ctx is the RPC's context
    // May contain RPC-level information like outgoing headers
    Ctx context.Context
}

Pick Result

type PickResult struct {
    // SubConn to use for this pick (must be Ready)
    // Must be one returned by ClientConn.NewSubConn
    SubConn SubConn

    // Done is called when RPC completes
    // Called with nil if SubConn not ready
    // May be nil if balancer doesn't need notification
    Done func(DoneInfo)

    // Metadata to inject for this call
    // Merged with existing metadata from client application
    Metadata metadata.MD
}

Done Info

type DoneInfo struct {
    // Err is the RPC error (may be nil)
    Err error

    // Trailer contains metadata from RPC's trailer
    Trailer metadata.MD

    // BytesSent indicates if bytes were sent to server
    BytesSent bool

    // BytesReceived indicates if bytes received from server
    BytesReceived bool

    // ServerLoad is load received from server (usually trailing metadata)
    // Supported type: *orca_v3.LoadReport
    ServerLoad any
}

Pick Errors

var (
    // ErrNoSubConnAvailable: gRPC will block RPC until new picker available
    ErrNoSubConnAvailable = errors.New("no SubConn is available")

    // ErrTransientFailure: wait-for-ready RPCs block, others fail
    // Deprecated: return appropriate error instead
    ErrTransientFailure = errors.New("all SubConns are in TransientFailure")
)

// If Pick returns error:
// - ErrNoSubConnAvailable: gRPC blocks until UpdateState provides new picker
// - Status error: RPC terminated with that code and message
// - Other errors: wait-for-ready RPCs wait, others fail with Unavailable

Custom Balancer Implementation

Simple Custom Balancer

import (
    "sync"
    "google.golang.org/grpc/balancer"
    "google.golang.org/grpc/balancer/base"
    "google.golang.org/grpc/connectivity"
)

// Custom picker
type myPicker struct {
    subConns []balancer.SubConn
    mu       sync.Mutex
    next     int
}

func (p *myPicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
    p.mu.Lock()
    defer p.mu.Unlock()

    if len(p.subConns) == 0 {
        return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
    }

    // Simple round-robin
    sc := p.subConns[p.next]
    p.next = (p.next + 1) % len(p.subConns)

    return balancer.PickResult{
        SubConn: sc,
        Done: func(info balancer.DoneInfo) {
            // Track RPC completion if needed
        },
    }, nil
}

// Custom balancer
type myBalancer struct {
    cc       balancer.ClientConn
    subConns map[resolver.Address]balancer.SubConn
    mu       sync.Mutex
}

func (b *myBalancer) UpdateClientConnState(state balancer.ClientConnState) error {
    b.mu.Lock()
    defer b.mu.Unlock()

    // Create SubConns for new addresses
    addrs := state.ResolverState.Addresses
    for _, addr := range addrs {
        if _, ok := b.subConns[addr]; !ok {
            sc, err := b.cc.NewSubConn([]resolver.Address{addr},
                balancer.NewSubConnOptions{
                    StateListener: func(state balancer.SubConnState) {
                        b.updateState()
                    },
                })
            if err != nil {
                continue
            }
            b.subConns[addr] = sc
            sc.Connect()
        }
    }

    // Remove SubConns for removed addresses
    for addr, sc := range b.subConns {
        found := false
        for _, a := range addrs {
            if a == addr {
                found = true
                break
            }
        }
        if !found {
            sc.Shutdown()
            delete(b.subConns, addr)
        }
    }

    b.updateState()
    return nil
}

func (b *myBalancer) updateState() {
    b.mu.Lock()
    defer b.mu.Unlock()

    // Collect ready SubConns
    var readySCs []balancer.SubConn
    for _, sc := range b.subConns {
        // State tracking omitted for brevity
        readySCs = append(readySCs, sc)
    }

    if len(readySCs) > 0 {
        b.cc.UpdateState(balancer.State{
            ConnectivityState: connectivity.Ready,
            Picker: &myPicker{subConns: readySCs},
        })
    } else {
        b.cc.UpdateState(balancer.State{
            ConnectivityState: connectivity.Connecting,
            Picker: base.NewErrPicker(balancer.ErrNoSubConnAvailable),
        })
    }
}

func (b *myBalancer) ResolverError(err error) {
    // Handle resolver error
}

func (b *myBalancer) UpdateSubConnState(sc balancer.SubConn, state balancer.SubConnState) {
    // Deprecated: use StateListener in NewSubConnOptions
}

func (b *myBalancer) Close() {
    b.mu.Lock()
    defer b.mu.Unlock()

    for _, sc := range b.subConns {
        sc.Shutdown()
    }
}

func (b *myBalancer) ExitIdle() {
    b.mu.Lock()
    defer b.mu.Unlock()

    for _, sc := range b.subConns {
        sc.Connect()
    }
}

// Builder
type myBalancerBuilder struct{}

func (bb *myBalancerBuilder) Build(cc balancer.ClientConn, opts balancer.BuildOptions) balancer.Balancer {
    return &myBalancer{
        cc:       cc,
        subConns: make(map[resolver.Address]balancer.SubConn),
    }
}

func (bb *myBalancerBuilder) Name() string {
    return "my_balancer"
}

// Register in init
func init() {
    balancer.Register(&myBalancerBuilder{})
}

Base Balancer

The base package provides utilities for building simple balancers:

import "google.golang.org/grpc/balancer/base"

type PickerBuilder interface {
    // Build returns a picker that will be used by gRPC
    Build(info PickerBuildInfo) balancer.Picker
}

type PickerBuildInfo struct {
    // ReadySCs is a map from ready SubConns to their addresses
    ReadySCs map[balancer.SubConn]SubConnInfo
}

type SubConnInfo struct {
    // Address used to create this SubConn
    Address resolver.Address
}

type Config struct {
    // HealthCheck indicates whether health checking should be enabled
    HealthCheck bool
}

// NewBalancerBuilder returns a base balancer builder
func NewBalancerBuilder(name string, pb PickerBuilder, config Config) balancer.Builder

// NewErrPicker returns a Picker that always returns err
func NewErrPicker(err error) balancer.Picker

Example using base balancer:

import (
    "google.golang.org/grpc/balancer"
    "google.golang.org/grpc/balancer/base"
)

type myPickerBuilder struct{}

func (pb *myPickerBuilder) Build(info base.PickerBuildInfo) balancer.Picker {
    if len(info.ReadySCs) == 0 {
        return base.NewErrPicker(balancer.ErrNoSubConnAvailable)
    }

    var scs []balancer.SubConn
    for sc := range info.ReadySCs {
        scs = append(scs, sc)
    }

    return &myPicker{subConns: scs}
}

func init() {
    balancer.Register(base.NewBalancerBuilder("my_lb", &myPickerBuilder{}, base.Config{}))
}

Config Parser

Implement ConfigParser to parse custom load balancing config:

type ConfigParser interface {
    // ParseConfig parses JSON load balancer config
    // Unknown fields should be ignored for forward compatibility
    ParseConfig(json.RawMessage) (serviceconfig.LoadBalancingConfig, error)
}

// If Builder implements ConfigParser, ParseConfig is called when
// new service configs are received, and result is provided in UpdateClientConnState

Connectivity State Evaluator

Helper for aggregating SubConn connectivity states:

type ConnectivityStateEvaluator struct {
    // Has unexported fields
}

// CurrentState returns current aggregate connection state
func (cse *ConnectivityStateEvaluator) CurrentState() connectivity.State

// RecordTransition records state change and returns new aggregate state
// - If at least one SubConn Ready → aggregated state is Ready
// - Else if at least one SubConn Connecting → aggregated state is Connecting
// - Else if at least one SubConn Idle → aggregated state is Idle
// - Else if at least one SubConn TransientFailure or no SubConns → TransientFailure
// Shutdown is not considered
func (cse *ConnectivityStateEvaluator) RecordTransition(oldState, newState connectivity.State) connectivity.State

Producer Pattern

For advanced use cases requiring per-SubConn state:

type Producer any

type ProducerBuilder interface {
    // Build creates a Producer for the SubConn
    // First parameter is grpc.ClientConnInterface for creating RPCs/streams
    // Returns producer and close function
    Build(grpcClientConnInterface any) (p Producer, close func())
}

// GetOrBuildProducer on SubConn
producer, close := subConn.GetOrBuildProducer(myProducerBuilder)
defer close()

Best Practices

Balancer Implementation

  1. State management: Track SubConn states carefully, use StateListener
  2. Thread safety: Protect shared state with mutexes
  3. Graceful shutdown: Call SubConn.Shutdown() in Close()
  4. Error handling: Return ErrBadResolverState to trigger re-resolution
  5. ExitIdle: Implement to support connection on demand

Picker Implementation

  1. Don't block: Pick should return immediately
  2. Handle no backends: Return ErrNoSubConnAvailable if no SubConns ready
  3. Use Done callback: Track RPC outcomes for advanced policies
  4. Metadata injection: Add per-call metadata in PickResult if needed

Registration

  1. Register in init(): Call balancer.Register() in package init function
  2. Unique names: Use unique, descriptive names (lowercase by convention)
  3. Thread safety: Registration is not thread-safe, must be done at init time

Testing

import (
    "testing"
    "google.golang.org/grpc/balancer"
    "google.golang.org/grpc/resolver"
)

func TestMyBalancer(t *testing.T) {
    // Create test balancer
    builder := balancer.Get("my_balancer")
    if builder == nil {
        t.Fatal("balancer not registered")
    }

    // Test with mock ClientConn
    // See balancer/balancer_test.go for examples
}