dirhash - Directory Hashing for Module Verification

Package dirhash defines hashes over directory trees for verifying module content.

Import

import "golang.org/x/mod/sumdb/dirhash"

Overview

The dirhash package provides functions for computing cryptographic hashes of directory trees. These hashes are recorded in go.sum files and in the Go checksum database to allow verifying that a newly-downloaded module has the expected content.

The package supports hashing both file system directories and zip files, making it suitable for various module distribution scenarios.

Variables

var DefaultHash Hash = Hash1

DefaultHash is the default hash function used in new go.sum entries.

Types

Hash

type Hash func(files []string, open func(string) (io.ReadCloser, error)) (string, error)

A Hash is a directory hash function. It accepts a list of files along with a function that opens the content of each file. It opens, reads, hashes, and closes each file and returns the overall directory hash.
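
To make the contract concrete, the following is a minimal sketch of a function with the same signature and an open callback backed by an in-memory map. The names hashFunc, memOpen, and totalSize are hypothetical and exist only for illustration; hashFunc is a local stand-in for the Hash type so the sketch runs with the standard library alone.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// hashFunc mirrors the signature of dirhash.Hash. It is a local stand-in
// (an assumption for this sketch), not the package's own type.
type hashFunc func(files []string, open func(string) (io.ReadCloser, error)) (string, error)

// memOpen returns an open callback that serves file content from a map,
// which is convenient for tests that should not touch the file system.
func memOpen(content map[string]string) func(string) (io.ReadCloser, error) {
	return func(name string) (io.ReadCloser, error) {
		data, ok := content[name]
		if !ok {
			return nil, fmt.Errorf("no such file: %s", name)
		}
		return io.NopCloser(strings.NewReader(data)), nil
	}
}

// totalSize is a toy hashFunc: it opens, reads, and closes every file and
// reports the combined byte count instead of a cryptographic digest.
func totalSize(files []string, open func(string) (io.ReadCloser, error)) (string, error) {
	var total int64
	for _, name := range files {
		r, err := open(name)
		if err != nil {
			return "", err
		}
		n, err := io.Copy(io.Discard, r)
		r.Close()
		if err != nil {
			return "", err
		}
		total += n
	}
	return fmt.Sprintf("size:%d", total), nil
}

func main() {
	var h hashFunc = totalSize
	open := memOpen(map[string]string{
		"example.com/m@v1.0.0/go.mod":  "module example.com/m\n",
		"example.com/m@v1.0.0/main.go": "package main\n",
	})
	sum, err := h([]string{
		"example.com/m@v1.0.0/go.mod",
		"example.com/m@v1.0.0/main.go",
	}, open)
	if err != nil {
		panic(err)
	}
	fmt.Println(sum)
}
```

Because the callback decides where bytes come from, the same hash function can run over a directory, a zip archive, or in-memory fixtures.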

Functions

DirFiles

func DirFiles(dir, prefix string) ([]string, error)

DirFiles returns the list of files in the tree rooted at dir, replacing the directory name dir with prefix in each name. The resulting names always use forward slashes.

Example:

files, err := dirhash.DirFiles("./mymodule", "example.com/mymodule@v1.0.0")
if err != nil {
    log.Fatal(err)
}

for _, file := range files {
    fmt.Println(file)
}
// Output:
// example.com/mymodule@v1.0.0/go.mod
// example.com/mymodule@v1.0.0/main.go
// example.com/mymodule@v1.0.0/internal/helper.go

Hash1

func Hash1(files []string, open func(string) (io.ReadCloser, error)) (string, error)

Hash1 is the "h1:" directory hash function, using SHA-256.

Hash1 is "h1:" followed by the base64-encoded SHA-256 hash of a summary prepared as if by the Unix command:

sha256sum $(find . -type f | sort) | sha256sum

More precisely, the hashed summary contains a single line for each file in the list, ordered by slices.Sort applied to the file names, where each line consists of:

  • The hexadecimal SHA-256 hash of the file content
  • Two spaces (U+0020)
  • The file name
  • A newline (U+000A)

File names with newlines (U+000A) are disallowed.

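Example:

// File names as they should appear in the hash, including the module prefix.
prefix := "example.com/module@v1.0.0/"
files := []string{
    prefix + "go.mod",
    prefix + "main.go",
}

// Map each prefixed name back to its path on disk. The local directory
// "module" is an assumption made for this example.
open := func(name string) (io.ReadCloser, error) {
    return os.Open(filepath.Join("module", strings.TrimPrefix(name, prefix)))
}

hash, err := dirhash.Hash1(files, open)
if err != nil {
    log.Fatal(err)
}

fmt.Println(hash)
// Output: h1:...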

HashDir

func HashDir(dir, prefix string, hash Hash) (string, error)

HashDir returns the hash of the local file system directory dir, replacing the directory name itself with prefix in the file names used in the hash function.

Example:

hash, err := dirhash.HashDir("./mymodule", "example.com/mymodule@v1.0.0", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}

fmt.Printf("Directory hash: %s\n", hash)
// Output: Directory hash: h1:...

HashZip

func HashZip(zipfile string, hash Hash) (string, error)

HashZip returns the hash of the file content in the named zip file. Only the file names and their contents are included in the hash: the exact zip file format encoding, compression method, per-file modification times, and other metadata are ignored.

Example:

hash, err := dirhash.HashZip("module.zip", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}

fmt.Printf("Zip hash: %s\n", hash)
// Output: Zip hash: h1:...

Usage Examples

Computing Hash of a Directory

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func main() {
    // Compute hash of a module directory
    hash, err := dirhash.HashDir(
        "./path/to/module",
        "example.com/mymodule@v1.0.0",
        dirhash.DefaultHash,
    )
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Module hash: %s\n", hash)
}

Computing Hash of a Zip File

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func main() {
    // Compute hash of a module zip file
    hash, err := dirhash.HashZip("module.zip", dirhash.DefaultHash)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Zip file hash: %s\n", hash)
}

Verifying Directory Matches Expected Hash

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func verifyModule(dir, prefix, expectedHash string) error {
    actualHash, err := dirhash.HashDir(dir, prefix, dirhash.DefaultHash)
    if err != nil {
        return fmt.Errorf("failed to compute hash: %w", err)
    }

    if actualHash != expectedHash {
        return fmt.Errorf("hash mismatch: got %s, want %s", actualHash, expectedHash)
    }

    return nil
}

func main() {
    err := verifyModule(
        "./mymodule",
        "example.com/mymodule@v1.0.0",
        "h1:abc123...",
    )
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Module verified successfully")
}

Listing Files Before Hashing

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func main() {
    // List all files that will be included in the hash
    files, err := dirhash.DirFiles("./mymodule", "example.com/mymodule@v1.0.0")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Files to be hashed:")
    for _, file := range files {
        fmt.Printf("  %s\n", file)
    }

    // Now compute the hash
    hash, err := dirhash.HashDir(
        "./mymodule",
        "example.com/mymodule@v1.0.0",
        dirhash.DefaultHash,
    )
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("\nDirectory hash: %s\n", hash)
}

Custom Hash Function

package main

import (
    "crypto/sha256"
    "encoding/base64"
    "fmt"
    "io"
    "log"
    "sort"

    "golang.org/x/mod/sumdb/dirhash"
)

// CustomHash implements a custom hashing algorithm
func CustomHash(files []string, open func(string) (io.ReadCloser, error)) (string, error) {
    // Sort files for consistent ordering
    sorted := make([]string, len(files))
    copy(sorted, files)
    sort.Strings(sorted)

    // Create a hash of all file contents
    h := sha256.New()

    for _, file := range sorted {
        r, err := open(file)
        if err != nil {
            return "", err
        }

        fileHash := sha256.New()
        if _, err := io.Copy(fileHash, r); err != nil {
            r.Close()
            return "", err
        }
        r.Close()

        // Write filename and hash to combined hash
        fmt.Fprintf(h, "%x  %s\n", fileHash.Sum(nil), file)
    }

    // Return with custom prefix
    return "custom:" + base64.StdEncoding.EncodeToString(h.Sum(nil)), nil
}

func main() {
    hash, err := dirhash.HashDir(
        "./mymodule",
        "example.com/mymodule@v1.0.0",
        CustomHash,
    )
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Custom hash: %s\n", hash)
}

Comparing Directory and Zip Hashes

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func main() {
    // Hash the directory
    dirHash, err := dirhash.HashDir(
        "./mymodule",
        "example.com/mymodule@v1.0.0",
        dirhash.DefaultHash,
    )
    if err != nil {
        log.Fatal(err)
    }

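    // Hash the zip file created from the directory. Its entries must be named
    // with the same "example.com/mymodule@v1.0.0/" prefix for the hashes to match.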
    zipHash, err := dirhash.HashZip("mymodule.zip", dirhash.DefaultHash)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Directory hash: %s\n", dirHash)
    fmt.Printf("Zip hash:       %s\n", zipHash)

    if dirHash == zipHash {
        fmt.Println("✓ Hashes match - zip correctly represents directory")
    } else {
        fmt.Println("✗ Hash mismatch - zip differs from directory")
    }
}

Generating go.sum Entry

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/module"
    "golang.org/x/mod/sumdb/dirhash"
)

func generateGoSumEntry(mod module.Version, dir string) (string, error) {
    // Compute hash for module content
    prefix := fmt.Sprintf("%s@%s", mod.Path, mod.Version)
    hash, err := dirhash.HashDir(dir, prefix, dirhash.DefaultHash)
    if err != nil {
        return "", err
    }

    // Format as go.sum line
    return fmt.Sprintf("%s %s %s", mod.Path, mod.Version, hash), nil
}

func main() {
    mod := module.Version{
        Path:    "example.com/mymodule",
        Version: "v1.0.0",
    }

    entry, err := generateGoSumEntry(mod, "./mymodule")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("go.sum entry:")
    fmt.Println(entry)
}

Batch Hashing Multiple Directories

package main

import (
    "fmt"
    "log"
    "sync"

    "golang.org/x/mod/sumdb/dirhash"
)

type ModuleHash struct {
    Path    string
    Version string
    Hash    string
    Err     error
}

func hashModules(modules []struct{ dir, prefix string }) []ModuleHash {
    var wg sync.WaitGroup
    results := make([]ModuleHash, len(modules))

    for i, mod := range modules {
        wg.Add(1)
        go func(i int, dir, prefix string) {
            defer wg.Done()

            hash, err := dirhash.HashDir(dir, prefix, dirhash.DefaultHash)
            results[i] = ModuleHash{
                Path: prefix,
                Hash: hash,
                Err:  err,
            }
        }(i, mod.dir, mod.prefix)
    }

    wg.Wait()
    return results
}

func main() {
    modules := []struct{ dir, prefix string }{
        {"./module1", "example.com/module1@v1.0.0"},
        {"./module2", "example.com/module2@v2.0.0"},
        {"./module3", "example.com/module3@v1.5.0"},
    }

    results := hashModules(modules)

    for _, result := range results {
        if result.Err != nil {
            log.Printf("Error hashing %s: %v", result.Path, result.Err)
            continue
        }
        fmt.Printf("%s %s\n", result.Path, result.Hash)
    }
}

Validating Zip File Against Expected Hash

package main

import (
    "fmt"
    "log"

    "golang.org/x/mod/sumdb/dirhash"
)

func validateZip(zipPath, expectedHash string) error {
    actualHash, err := dirhash.HashZip(zipPath, dirhash.DefaultHash)
    if err != nil {
        return fmt.Errorf("failed to hash zip: %w", err)
    }

    if actualHash != expectedHash {
        return fmt.Errorf("hash mismatch:\n  got:  %s\n  want: %s", actualHash, expectedHash)
    }

    return nil
}

func main() {
    // Expected hash from go.sum
    expectedHash := "h1:abc123def456..."

    err := validateZip("downloaded-module.zip", expectedHash)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Zip file validated successfully")
}

Hash Format

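The "h1:" hash format produces a string of the following shape (the value shown is an illustrative placeholder, not a real digest):

h1:abcdefghijklmnopqrstuvwxyz0123456789ABCD=

Breaking this down:

  • Prefix: h1: indicates the Hash1 algorithm (SHA-256)
  • Hash: Base64-encoded (standard alphabet) SHA-256 hash of the file summary
  • Padding: A real value always ends in a single = because a 32-byte digest encodes to 44 base64 characters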

What Gets Hashed

Included

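  • File contents (binary data)
  • File names as passed to the hash function (for HashDir, the path with the directory name replaced by the supplied prefix)
  • Sorted order of the file names (files are sorted before hashing, so the order in which they are listed does not affect the result)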

Excluded

  • File modification times
  • File permissions
  • Directory entries (directories themselves)
  • Symbolic links
  • Zip file metadata (compression method, central directory format, etc.)

Use Cases

Module Verification

// Verify downloaded module matches expected hash
hash, err := dirhash.HashDir("./downloaded", "module@version", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}
if hash != expectedHash {
    log.Fatal("Module verification failed")
}

Build Reproducibility

// Ensure builds use identical source
beforeHash, err := dirhash.HashDir("./src", "build", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}
// ... build process ...
afterHash, err := dirhash.HashDir("./src", "build", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}
if beforeHash != afterHash {
    log.Fatal("Source modified during build")
}

Cache Invalidation

// Use hash as cache key
hash, err := dirhash.HashDir("./sources", "cache-key", dirhash.DefaultHash)
if err != nil {
    log.Fatal(err)
}
cacheKey := fmt.Sprintf("build-%s", hash)

Performance Considerations

  1. File I/O: Hashing reads all files, which can be slow for large modules
  2. Parallelization: Files are read sequentially; consider parallel hashing for multiple modules
  3. Caching: Cache hashes when possible to avoid recomputation
  4. Memory: Entire file list is kept in memory during hashing

Security Notes

  • Collision Resistance: SHA-256 provides strong collision resistance
  • Tamper Detection: Any change to file content or names changes the hash
  • Metadata Independence: Timestamps and permissions don't affect the hash
  • Zip Format Independence: Different zip encodings of same content produce identical hashes

See Also

  • zip - Module zip file creation and extraction
  • sumdb - Checksum database client and server
  • Go Modules Checksum Database