
tessl/golang-github-com-go-co-op-gocron-v2

A Golang job scheduling library that lets you run Go functions at pre-determined intervals using cron expressions, fixed durations, daily, weekly, monthly, or one-time schedules with support for distributed deployments.


docs/guides/distributed/distributed-locking.md

Distributed Locking

Locker interface, implementations, and lock keys for per-job mutual exclusion.

Overview

Distributed locking provides per-job mutual exclusion across multiple instances. Unlike leader election (all-or-nothing), distributed locking allows different jobs to run on different instances while ensuring each job runs on only one instance at a time.

type Locker interface {
    Lock(ctx context.Context, key string) (Lock, error)
}

type Lock interface {
    Unlock(ctx context.Context) error
}

Interfaces

Locker

type Locker interface {
    Lock(ctx context.Context, key string) (Lock, error)
}

Lock attempts to acquire a distributed lock:

  • key: Job name (or qualified function name if unnamed)
  • Returns Lock if acquired successfully
  • Returns error if lock cannot be acquired (already held by another instance)

Lock

type Lock interface {
    Unlock(ctx context.Context) error
}

Unlock releases the lock after job completion.

Scheduler Options

Scheduler-Level Locker

func WithDistributedLocker(locker Locker) SchedulerOption

Applies to all jobs:

s, err := gocron.NewScheduler(
    gocron.WithDistributedLocker(locker),
)

Returns ErrWithDistributedLockerNil if locker is nil.

Per-Job Locker

func WithDistributedJobLocker(locker Locker) JobOption

Overrides the scheduler-level locker for a specific job:

j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc),
    gocron.WithDistributedJobLocker(customLocker), // Override
)

Returns ErrWithDistributedJobLockerNil if locker is nil.

Disable Locker for Job

func WithDisabledDistributedJobLocker(disabled bool) JobOption

Disables the scheduler-level locker for a specific job:

s, _ := gocron.NewScheduler(
    gocron.WithDistributedLocker(locker), // Applied to all jobs
)

// This job doesn't use distributed locking
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(localOnlyFunc),
    gocron.WithDisabledDistributedJobLocker(true),
)

How It Works

Job Execution Flow

  1. Job scheduled time arrives
  2. Scheduler calls locker.Lock(ctx, jobName)
  3. If lock acquired → job runs → lock.Unlock() called
  4. If lock fails → job skipped (another instance running it)

Example timeline:

Instance A:
  09:00 - Lock "job1" → success → Run job1 → Unlock
  09:01 - Lock "job2" → success → Run job2 → Unlock

Instance B:
  09:00 - Lock "job1" → fail (A holds it) → Skip
  09:01 - Lock "job2" → fail (A holds it) → Skip

Lock Keys

Lock key is the job's name (set via WithName):

// Lock key: "database-cleanup"
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc),
    gocron.WithName("database-cleanup"), // This becomes the lock key
)

If no name is set, the qualified function name is used:

// Lock key: "mypackage.myFunc"
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc), // No name → uses "mypackage.myFunc"
)

Important: Use WithName for stable, predictable lock keys.

When to Use Distributed Locking

Use distributed locking when:

  • Different jobs can run on different instances
  • Per-job mutual exclusion is sufficient
  • Clock synchronization is available (NTP)
  • You need flexible job distribution
  • You want to spread load across instances

Don't use distributed locking when:

  • Clock synchronization is unavailable
  • All jobs must run on the same instance
  • Simple all-or-nothing leadership is needed

Use Leader Election instead for all-or-nothing control.

Implementation: Redis

Basic Redis Locker

import (
    "context"
    "errors"
    "time"

    "github.com/go-co-op/gocron/v2"
    "github.com/redis/go-redis/v9"
)

type redisLocker struct {
    client *redis.Client
}

func NewRedisLocker(client *redis.Client) *redisLocker {
    return &redisLocker{client: client}
}

func (l *redisLocker) Lock(ctx context.Context, key string) (gocron.Lock, error) {
    // TTL should be longer than max expected job duration
    ttl := 5 * time.Minute
    lockKey := "gocron:lock:" + key

    // Try to acquire lock
    ok, err := l.client.SetNX(ctx, lockKey, "1", ttl).Result()
    if err != nil {
        return nil, err
    }

    if !ok {
        return nil, errors.New("lock already held")
    }

    return &redisLock{client: l.client, key: lockKey}, nil
}

type redisLock struct {
    client *redis.Client
    key    string
}

func (l *redisLock) Unlock(ctx context.Context) error {
    // Note: a plain DEL can delete another instance's lock if this one's
    // TTL already expired mid-job. Store a unique token as the value and
    // verify it before deleting (e.g. via a Lua script) if that matters.
    return l.client.Del(ctx, l.key).Err()
}

Usage

func main() {
    client := redis.NewClient(&redis.Options{
        Addr: "localhost:6379",
    })
    defer client.Close()

    locker := NewRedisLocker(client)

    s, _ := gocron.NewScheduler(
        gocron.WithDistributedLocker(locker),
    )
    defer s.Shutdown()

    s.NewJob(
        gocron.DurationJob(time.Minute),
        gocron.NewTask(doWork),
        gocron.WithName("my-job"), // Lock key
    )

    s.Start()
    select {}
}

Redis Locker with Renewal

For long-running jobs, renew the lock periodically:

type renewableRedisLock struct {
    client *redis.Client
    key    string
    ttl    time.Duration
    cancel context.CancelFunc
}

func (l *redisLocker) Lock(ctx context.Context, key string) (gocron.Lock, error) {
    ttl := 5 * time.Minute
    lockKey := "gocron:lock:" + key

    ok, err := l.client.SetNX(ctx, lockKey, "1", ttl).Result()
    if err != nil {
        return nil, err
    }

    if !ok {
        return nil, errors.New("lock already held")
    }

    lock := &renewableRedisLock{
        client: l.client,
        key:    lockKey,
        ttl:    ttl,
    }

    lock.startRenewal(ctx)
    return lock, nil
}

func (l *renewableRedisLock) startRenewal(ctx context.Context) {
    renewCtx, cancel := context.WithCancel(ctx)
    l.cancel = cancel

    ticker := time.NewTicker(l.ttl / 2) // Renew at half TTL
    go func() {
        defer ticker.Stop()
        for {
            select {
            case <-renewCtx.Done():
                return
            case <-ticker.C:
                if err := l.client.Expire(renewCtx, l.key, l.ttl).Err(); err != nil {
                    return // Stop renewing; the key will expire on its own
                }
            }
        }
    }()
}

func (l *renewableRedisLock) Unlock(ctx context.Context) error {
    l.cancel() // Stop renewal
    return l.client.Del(ctx, l.key).Err()
}

Implementation: PostgreSQL

Using Advisory Locks

import (
    "context"
    "database/sql"
    "errors"
    "hash/fnv"

    "github.com/go-co-op/gocron/v2"
    _ "github.com/lib/pq"
)

type postgresLocker struct {
    db *sql.DB
}

func NewPostgresLocker(db *sql.DB) *postgresLocker {
    return &postgresLocker{db: db}
}

func (l *postgresLocker) Lock(ctx context.Context, key string) (gocron.Lock, error) {
    // Hash the key to a 64-bit integer
    lockID := hashKeyToInt64(key)

    // Advisory locks are session-scoped, so pin a single connection from
    // the pool: lock and unlock must happen on the same session. With a
    // bare *sql.DB, the unlock could run on a different pooled connection
    // and silently fail.
    conn, err := l.db.Conn(ctx)
    if err != nil {
        return nil, err
    }

    // Try to acquire advisory lock
    var acquired bool
    err = conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock($1)", lockID).Scan(&acquired)
    if err != nil {
        conn.Close()
        return nil, err
    }

    if !acquired {
        conn.Close()
        return nil, errors.New("lock already held")
    }

    return &postgresLock{conn: conn, lockID: lockID}, nil
}

type postgresLock struct {
    conn   *sql.Conn
    lockID int64
}

func (l *postgresLock) Unlock(ctx context.Context) error {
    defer l.conn.Close() // Return the pinned connection to the pool

    var released bool
    err := l.conn.QueryRowContext(ctx, "SELECT pg_advisory_unlock($1)", l.lockID).Scan(&released)
    if err != nil {
        return err
    }

    if !released {
        return errors.New("lock was not held")
    }

    return nil
}

func hashKeyToInt64(key string) int64 {
    h := fnv.New64a()
    h.Write([]byte(key))
    return int64(h.Sum64())
}

Advantages:

  • No external dependencies (if already using PostgreSQL)
  • Automatic cleanup on connection close
  • No TTL management needed

Limitations:

  • Requires database connection per instance
  • Advisory locks are session-scoped

Implementation: Redlock (Multi-Redis)

Using Redsync for Redlock

import (
    "context"
    "time"

    "github.com/go-co-op/gocron/v2"
    "github.com/go-redsync/redsync/v4"
    redsyncredis "github.com/go-redsync/redsync/v4/redis"
    "github.com/go-redsync/redsync/v4/redis/goredis/v9"
    "github.com/redis/go-redis/v9"
)

type redlockLocker struct {
    rs *redsync.Redsync
}

func NewRedlockLocker(clients []*redis.Client) *redlockLocker {
    pools := make([]redsyncredis.Pool, len(clients))
    for i, client := range clients {
        pools[i] = goredis.NewPool(client)
    }

    rs := redsync.New(pools...)

    return &redlockLocker{rs: rs}
}

func (l *redlockLocker) Lock(ctx context.Context, key string) (gocron.Lock, error) {
    mutex := l.rs.NewMutex(
        "gocron:lock:"+key,
        redsync.WithExpiry(5*time.Minute),
        redsync.WithTries(3),
    )

    if err := mutex.LockContext(ctx); err != nil {
        return nil, err // Typically redsync.ErrFailed when another instance holds the lock
    }

    return &redlockLock{mutex: mutex}, nil
}

type redlockLock struct {
    mutex *redsync.Mutex
}

func (l *redlockLock) Unlock(ctx context.Context) error {
    _, err := l.mutex.UnlockContext(ctx)
    return err
}

Advantages:

  • Higher availability (multiple Redis instances)
  • Better fault tolerance

Best Practices

1. Use Stable Job Names

// GOOD: Stable, explicit name
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc),
    gocron.WithName("database-cleanup"), // Explicit lock key
)

// BAD: Function name may change if code refactored
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc), // Default key = "pkg.myFunc"
)

2. Set Appropriate Lock TTL

// TTL should be 2-3x max expected job duration
// Job takes 1-2 minutes → TTL = 5 minutes
ttl := 5 * time.Minute

  • Too short: lock expires during job execution, another instance acquires it
  • Too long: a failed instance holds the lock longer, delaying failover

3. Handle Lock Failures

j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(myFunc),
    gocron.WithName("my-job"),
    gocron.WithEventListeners(
        gocron.AfterLockError(func(jobID uuid.UUID, jobName string, err error) {
            log.Printf("Lock failed for %s: %v", jobName, err)
            metrics.IncrementLockFailure(jobName)
        }),
    ),
)

4. Synchronize Clocks

Use NTP to synchronize clocks:

# Install NTP
sudo apt-get install ntp

# Verify synchronization
ntpq -p

Clock skew impact:

  • < 1 second: usually acceptable
  • > 5 seconds: one instance consistently wins locks

5. Align Job Start Times

For distributed deployments, align jobs to time boundaries:

// Align to 5-minute boundaries
now := time.Now()
next5Min := now.Truncate(5*time.Minute).Add(5*time.Minute)

j, _ := s.NewJob(
    gocron.DurationJob(5*time.Minute),
    gocron.NewTask(myFunc),
    gocron.WithName("aligned-job"),
    gocron.WithStartAt(gocron.WithStartDateTime(next5Min)),
)

Reduces lock contention by ensuring all instances schedule at the same time.

6. Monitor Lock Contention

type monitoringLocker struct {
    locker gocron.Locker
}

func (l *monitoringLocker) Lock(ctx context.Context, key string) (gocron.Lock, error) {
    start := time.Now()
    lock, err := l.locker.Lock(ctx, key)
    duration := time.Since(start)

    if err != nil {
        metrics.IncrementLockFailure(key)
        log.Printf("Lock failed for %s after %v: %v", key, duration, err)
    } else {
        metrics.RecordLockAcquisition(key, duration.Seconds())
    }

    return lock, err
}

Advanced Patterns

Per-Job Locker Override

s, _ := gocron.NewScheduler(
    gocron.WithDistributedLocker(redisLocker),
)

// Most jobs use Redis locker
s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(normalJob),
    gocron.WithName("normal-job"),
)

// Critical job uses PostgreSQL locker for stronger guarantees
s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(criticalJob),
    gocron.WithName("critical-job"),
    gocron.WithDistributedJobLocker(postgresLocker), // Override
)

Lock-Free Local Jobs

s, _ := gocron.NewScheduler(
    gocron.WithDistributedLocker(locker),
)

// Distributed job
s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(distributedJob),
    gocron.WithName("distributed-job"),
)

// Local-only job (no locking)
s.NewJob(
    gocron.DurationJob(time.Second),
    gocron.NewTask(localMetrics),
    gocron.WithDisabledDistributedJobLocker(true),
)

Custom Lock Keys

// Group jobs by tenant/resource
j, _ := s.NewJob(
    gocron.DurationJob(time.Minute),
    gocron.NewTask(func(tenantID string) {
        processTenant(tenantID)
    }, "tenant-123"),
    gocron.WithName("process-tenant:tenant-123"), // Lock per tenant
)

Troubleshooting

One Instance Always Wins Locks

Symptom: Same instance consistently acquires locks.

Causes:

  1. Clock skew between instances
  2. One instance schedules jobs earlier

Solutions:

  • Synchronize clocks with NTP
  • Align job start times to boundaries
  • Verify time.Now() returns similar values across instances

Jobs Not Running on Any Instance

Symptom: All instances fail to acquire locks.

Causes:

  1. Lock coordinator unavailable (Redis, PostgreSQL down)
  2. Stale locks from crashed instances
  3. Lock TTL too long

Solutions:

  • Check coordinator service health
  • Implement lock expiry (TTL)
  • Reduce lock TTL
  • Add lock cleanup job

Lock Not Released

Symptom: Lock held indefinitely, jobs stop running.

Causes:

  1. Instance crashed without calling Unlock()
  2. Lock TTL not set
  3. Unlock() not called

Solutions:

  • Always set lock TTL
  • Implement automatic expiry
  • Use defer or finally for Unlock()
  • Add lock monitoring

High Lock Contention

Symptom: Many lock failures, jobs often skipped.

Causes:

  1. Too many instances for job frequency
  2. Clock skew causing simultaneous attempts
  3. Job takes longer than interval

Solutions:

  • Reduce number of instances
  • Increase job interval
  • Stagger job schedules across instances
  • Sync clocks with NTP

Related Documentation

Install with Tessl CLI

npx tessl i tessl/golang-github-com-go-co-op-gocron-v2
