or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

apidiff.mdconstraints.mdebnf.mderrors.mdevent.mdgorelease.mdindex.mdio-i2c.mdio-spi.mdjsonrpc2.mdmaps.mdmmap.mdmodgraphviz.mdrand.mdshiny.mdslices.mdslog.mdstats.mdsumdb.mdtrace.mdtxtar.mdtypeparams.mdutf8string.md
tile.json

ebnf.mddocs/

Extended Backus-Naur Form (EBNF) Parser

The golang.org/x/exp/ebnf package provides a library for parsing and verifying Extended Backus-Naur Form (EBNF) grammar specifications. It allows you to parse EBNF productions from text input, manipulate the abstract syntax tree, and verify grammar consistency.

Package Information

  • Package Name: golang.org/x/exp/ebnf
  • Package Type: Go
  • Language: Go
  • Installation: go get golang.org/x/exp/ebnf
  • Documentation: golang.org/x/exp

Core Imports

import (
    "golang.org/x/exp/ebnf"
    "io"
)

Basic Usage

package main

import (
    "golang.org/x/exp/ebnf"
    "os"
    "log"
)

func main() {
    // Parse EBNF grammar from a file
    file, err := os.Open("grammar.ebnf")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    grammar, err := ebnf.Parse("grammar.ebnf", file)
    if err != nil {
        log.Fatal(err)
    }

    // Verify the grammar starting from "Start" production
    err = ebnf.Verify(grammar, "Start")
    if err != nil {
        log.Fatal(err)
    }

    log.Println("Grammar parsed and verified successfully")
}

EBNF Grammar Format

The package expects EBNF input following this formal grammar:

Production  = name "=" [ Expression ] "." .
Expression  = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term        = name | token [ "…" token ] | Group | Option | Repetition .
Group       = "(" Expression ")" .
Option      = "[" Expression "]" .
Repetition  = "{" Expression "}" .

Grammar Rules

  • Names are Go identifiers
  • Tokens are Go strings (e.g., "keyword", ";")
  • Comments and whitespace follow Go language rules
  • Production names starting with uppercase Unicode letters denote non-terminal productions (allow whitespace and comments between tokens)
  • Other production names denote lexical productions (no whitespace/comments between tokens)

Example EBNF Grammar

Start = Statement { Statement } .

Statement = Assignment ";" .

Assignment = identifier "=" Expression .

Expression = Term { "+" Term } .

Term = Factor { "*" Factor } .

Factor = number | identifier | "(" Expression ")" .

identifier = ? an identifier ? .
number = ? a number ? .

Capabilities

Parsing EBNF Grammars

Parse EBNF grammar productions from source code.

func Parse(filename string, src io.Reader) (Grammar, error)

Parses a set of EBNF productions from source src. It returns a Grammar containing all parsed productions. Errors are reported for incorrect syntax and if a production is declared more than once. The filename parameter is used only for error positions.

Parameters:

  • filename (string): Name of the source file, used only for error reporting
  • src (io.Reader): Source containing EBNF productions

Returns:

  • Grammar: Map of production name to Production pointers
  • error: Non-nil if parsing fails

Verifying Grammar Consistency

Verify that a grammar is consistent and complete.

func Verify(grammar Grammar, start string) error

Verify checks that:

  • All productions used are defined in the grammar
  • All productions defined are used when beginning at the start production
  • Lexical productions refer only to other lexical productions

Parameters:

  • grammar (Grammar): The parsed grammar to verify
  • start (string): Name of the start production

Returns:

  • error: Non-nil if verification fails with description of inconsistencies

Types

Grammar

type Grammar map[string]*Production

A Grammar is a set of EBNF productions indexed by production name (string key to Production pointer).

Usage Example:

grammar := ebnf.Grammar{}
// Grammar is populated by Parse()
for name, production := range grammar {
    println(name, "->", production)
}

Production

type Production struct {
    Name *Name       // The production name
    Expr Expression // The production expression
}

func (x *Production) Pos() scanner.Position

A Production node represents an EBNF production rule. The Name field identifies the production, and the Expr field contains the production expression.

Methods:

  • Pos() scanner.Position: Returns the position of the first character of the production

Expression

type Expression interface {
    Pos() scanner.Position
}

An Expression node represents a production expression. All expression types implement this interface:

  • Alternative
  • Bad
  • Group
  • Name
  • Option
  • Range
  • Repetition
  • Sequence
  • Token

Alternative

type Alternative []Expression

An Alternative node represents a non-empty list of alternative expressions, separated by the | operator (e.g., x | y | z).

Methods:

  • Pos() scanner.Position: Returns the position of the first alternative

Usage Example:

// Matches: Expression "=" ( "+" | "-" | "*" )
alt := ebnf.Alternative{
    /* expression 1 */,
    /* expression 2 */,
    /* expression 3 */,
}

Sequence

type Sequence []Expression

A Sequence node represents a non-empty list of sequential expressions (e.g., x y z). All expressions must occur in order.

Methods:

  • Pos() scanner.Position: Returns the position of the first element

Usage Example:

// Matches: A B C (in that order)
seq := ebnf.Sequence{exprA, exprB, exprC}

Group

type Group struct {
    Lparen scanner.Position // Position of "("
    Body   Expression       // The grouped expression
}

func (x *Group) Pos() scanner.Position

A Group node represents a grouped expression enclosed in parentheses. Used to control precedence in expressions.

Fields:

  • Lparen: Position of the opening parenthesis
  • Body: The expression inside the parentheses

Option

type Option struct {
    Lbrack scanner.Position // Position of "["
    Body   Expression       // The optional expression
}

func (x *Option) Pos() scanner.Position

An Option node represents an optional expression enclosed in square brackets (e.g., [expression]). The expression inside occurs zero or one time.

Fields:

  • Lbrack: Position of the opening bracket
  • Body: The optional expression

Repetition

type Repetition struct {
    Lbrace scanner.Position // Position of "{"
    Body   Expression       // The repeated expression
}

func (x *Repetition) Pos() scanner.Position

A Repetition node represents a repeated expression enclosed in curly braces (e.g., {expression}). The expression inside occurs zero or more times.

Fields:

  • Lbrace: Position of the opening brace
  • Body: The repeated expression

Name

type Name struct {
    StringPos scanner.Position // Position of the name
    String    string          // The name text
}

func (x *Name) Pos() scanner.Position

A Name node represents a production name (identifier). Names starting with uppercase letters denote non-terminal productions.

Fields:

  • StringPos: Position of the name in the source
  • String: The actual name text

Token

type Token struct {
    StringPos scanner.Position // Position of the token
    String    string          // The token text (without quotes)
}

func (x *Token) Pos() scanner.Position

A Token node represents a literal token or string terminal. Tokens are specified as Go strings in the EBNF grammar.

Fields:

  • StringPos: Position of the token in the source
  • String: The token text (quotes removed)

Range

type Range struct {
    Begin *Token // The beginning token
    End   *Token // The ending token
}

func (x *Range) Pos() scanner.Position

A Range node represents a range of characters or tokens, specified as begin … end (using the ellipsis operator).

Fields:

  • Begin: The starting token of the range
  • End: The ending token of the range

Bad

type Bad struct {
    TokPos scanner.Position // Position of the error
    Error  string          // Error message
}

func (x *Bad) Pos() scanner.Position

A Bad node stands for pieces of source code that lead to a parse error. It allows error recovery during parsing.

Fields:

  • TokPos: Position where the error occurred
  • Error: Description of the parse error

Ebnflint Command-Line Tool

The golang.org/x/exp/ebnflint tool verifies that EBNF productions are consistent and grammatically correct.

Purpose

Ebnflint reads EBNF productions from HTML documents (such as the Go specification) and verifies that:

  • All referenced productions are defined
  • All defined productions are used
  • Grammar is consistent and complete

Grammar in HTML

Grammar productions in HTML documents are grouped in boxes demarcated by:

<pre class="ebnf">
Production = ... .
</pre>

Usage

go tool ebnflint [--start production] [file]

Flags

  • --start production: Name of the start production for the grammar (defaults to "Start")
  • file: HTML file containing EBNF grammar. If omitted, reads from standard input.

Examples

Verify grammar starting from "Start" production:

go tool ebnflint grammar.html

Verify grammar with custom start production:

go tool ebnflint --start Program grammar.html

Verify grammar from standard input:

cat spec.html | go tool ebnflint --start Expression

Exit Codes

  • 0: Grammar is valid
  • 1: Grammar contains errors or inconsistencies

Advanced Usage Patterns

Walking the AST

// Type assertion to determine expression type
switch expr := expression.(type) {
case *ebnf.Alternative:
    println("Alternative expression")
case *ebnf.Sequence:
    println("Sequence expression")
case *ebnf.Group:
    println("Group expression")
case *ebnf.Option:
    println("Optional expression")
case *ebnf.Repetition:
    println("Repetition expression")
case *ebnf.Name:
    println("Reference to:", expr.String)
case *ebnf.Token:
    println("Literal token:", expr.String)
case *ebnf.Range:
    println("Character range")
case *ebnf.Bad:
    println("Parse error:", expr.Error)
}

Checking Production Names

import "unicode"

for name, prod := range grammar {
    // Check if non-terminal (name starts with uppercase)
    if unicode.IsUpper(rune(name[0])) {
        println(name, "is a non-terminal")
    } else {
        println(name, "is a lexical production")
    }
}

Error Handling

grammar, err := ebnf.Parse("test.ebnf", file)
if err != nil {
    // err contains parsing errors
    println("Parse error:", err)
}

err = ebnf.Verify(grammar, "Start")
if err != nil {
    // err contains verification errors
    // Could indicate undefined productions, unused productions,
    // or lexical productions referencing non-terminals
    println("Verification error:", err)
}

Related Types

The golang.org/x/exp/ebnf package uses types from the text/scanner package:

import "text/scanner"

type scanner.Position struct {
    Filename string // Filename, if any
    Offset   int    // Byte offset, starting at 0
    Line     int    // Line number, starting at 1
    Column   int    // Column number, starting at 0
}

All expression nodes include a Pos() method that returns a scanner.Position indicating where the expression appears in the source.

Use Cases

  1. Language Design: Parse and validate grammar specifications for domain-specific languages
  2. Compiler Development: Verify EBNF grammars during compiler frontend development
  3. Documentation: Validate EBNF grammars embedded in language specifications
  4. Grammar Analysis: Analyze and traverse grammar rules programmatically
  5. Error Detection: Identify undefined, unused, or inconsistent productions early

Key Differences Between Production Types

  • Non-terminal productions (names starting with uppercase): Allow whitespace and comments between tokens
  • Lexical productions (other names): No whitespace or comments between tokens allowed
  • This distinction affects verification; lexical productions can only reference other lexical productions