The golang.org/x/exp/ebnf package provides a library for parsing and verifying Extended Backus-Naur Form (EBNF) grammar specifications. It allows you to parse EBNF productions from text input, manipulate the abstract syntax tree, and verify grammar consistency.
go get golang.org/x/exp/ebnf

import (
	"io"

	"golang.org/x/exp/ebnf"
)

package main

import (
	"log"
	"os"

	"golang.org/x/exp/ebnf"
)

func main() {
	// Parse EBNF grammar from a file
	file, err := os.Open("grammar.ebnf")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	grammar, err := ebnf.Parse("grammar.ebnf", file)
	if err != nil {
		log.Fatal(err)
	}

	// Verify the grammar starting from "Start" production
	err = ebnf.Verify(grammar, "Start")
	if err != nil {
		log.Fatal(err)
	}
	log.Println("Grammar parsed and verified successfully")
}

The package expects EBNF input following this formal grammar:
Production = name "=" [ Expression ] "." .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = name | token [ "…" token ] | Group | Option | Repetition .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
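Repetition = "{" Expression "}" .

A name is a Go identifier and a token is a Go string (for example, "keyword", ";"). Comments and white space follow the same rules as in Go.

Example grammar:
Start = Statement { Statement } .
Statement = Assignment ";" .
Assignment = identifier "=" Expression .
Expression = Term { "+" Term } .
Term = Factor { "*" Factor } .
Factor = number | identifier | "(" Expression ")" .
identifier = letter { letter | digit } .
number = digit { digit } .
letter = "a" … "z" | "A" … "Z" .
digit = "0" … "9" .

Parse EBNF grammar productions from source code.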
func Parse(filename string, src io.Reader) (Grammar, error)

Parses a set of EBNF productions from source src. It returns a Grammar containing all parsed productions. Errors are reported for incorrect syntax and if a production is declared more than once. The filename parameter is used only for error positions.
Parameters:
filename (string): Name of the source file, used only for error reporting
src (io.Reader): Source containing EBNF productions

Returns:
Grammar: Map of production name to Production pointers
error: Non-nil if parsing fails

Verify that a grammar is consistent and complete.

func Verify(grammar Grammar, start string) error

Verify checks that:
all productions used are defined
all productions defined are used when beginning at start
lexical productions refer only to other lexical productions

Parameters:
grammar (Grammar): The parsed grammar to verify
start (string): Name of the start production

Returns:
error: Non-nil if verification fails, with a description of the inconsistencies

type Grammar map[string]*Production

A Grammar is a set of EBNF productions indexed by production name (string key to Production pointer).
Usage Example:
grammar := ebnf.Grammar{}
// Grammar is populated by Parse()
for name, production := range grammar {
	// Print each production name with its source position
	println(name, "->", production.Pos().String())
}

type Production struct {
	Name *Name      // The production name
	Expr Expression // The production expression
}

func (x *Production) Pos() scanner.Position

A Production node represents an EBNF production rule. The Name field identifies the production, and the Expr field contains the production expression.
Methods:
Pos() scanner.Position: Returns the position of the first character of the production

type Expression interface {
	Pos() scanner.Position
}

An Expression node represents a production expression. All of the expression types below implement this interface.

type Alternative []Expression

An Alternative node represents a non-empty list of alternative expressions, separated by the | operator (e.g., x | y | z).
Methods:
Pos() scanner.Position: Returns the position of the first alternative

Usage Example:
// Represents: "+" | "-" | "*"
alt := ebnf.Alternative{
	&ebnf.Token{String: "+"},
	&ebnf.Token{String: "-"},
	&ebnf.Token{String: "*"},
}

type Sequence []Expression

A Sequence node represents a non-empty list of sequential expressions (e.g., x y z). All expressions must occur in order.
Methods:
Pos() scanner.Position: Returns the position of the first element

Usage Example:
// Matches: A B C (in that order)
seq := ebnf.Sequence{exprA, exprB, exprC}

type Group struct {
	Lparen scanner.Position // Position of "("
	Body   Expression       // The grouped expression
}

func (x *Group) Pos() scanner.Position

A Group node represents a grouped expression enclosed in parentheses. Used to control precedence in expressions.

Fields:
Lparen: Position of the opening parenthesis
Body: The expression inside the parentheses

type Option struct {
	Lbrack scanner.Position // Position of "["
	Body   Expression       // The optional expression
}

func (x *Option) Pos() scanner.Position

An Option node represents an optional expression enclosed in square brackets (e.g., [expression]). The expression inside occurs zero or one time.

Fields:
Lbrack: Position of the opening bracket
Body: The optional expression

type Repetition struct {
	Lbrace scanner.Position // Position of "{"
	Body   Expression       // The repeated expression
}

func (x *Repetition) Pos() scanner.Position

A Repetition node represents a repeated expression enclosed in curly braces (e.g., {expression}). The expression inside occurs zero or more times.
Fields:
Lbrace: Position of the opening brace
Body: The repeated expression

type Name struct {
	StringPos scanner.Position // Position of the name
	String    string           // The name text
}

func (x *Name) Pos() scanner.Position

A Name node represents a production name (identifier). Names starting with an uppercase letter denote non-terminal productions; all other names denote lexical productions.

Fields:
StringPos: Position of the name in the source
String: The actual name text

type Token struct {
	StringPos scanner.Position // Position of the token
	String    string           // The token text (without quotes)
}

func (x *Token) Pos() scanner.Position

A Token node represents a literal token or string terminal. Tokens are specified as Go strings in the EBNF grammar.
Fields:
StringPos: Position of the token in the source
String: The token text (quotes removed)

type Range struct {
	Begin *Token // The beginning token
	End   *Token // The ending token
}

func (x *Range) Pos() scanner.Position

A Range node represents a range of characters or tokens, specified as begin … end (using the ellipsis operator).
Fields:
Begin: The starting token of the range
End: The ending token of the range

type Bad struct {
	TokPos scanner.Position // Position of the error
	Error  string           // Error message
}

func (x *Bad) Pos() scanner.Position

A Bad node stands for pieces of source code that lead to a parse error. It allows error recovery during parsing.
Fields:
TokPos: Position where the error occurred
Error: Description of the parse error

The golang.org/x/exp/ebnflint tool verifies that EBNF productions are consistent and grammatically correct.

Ebnflint reads EBNF productions from HTML documents (such as the Go specification) and verifies that:
all productions used are defined
all productions defined are used when beginning at the start production
lexical productions refer only to other lexical productions

Grammar productions in HTML documents are grouped in boxes demarcated by:
<pre class="ebnf">
Production = ... .
</pre>

go tool ebnflint [--start production] [file]

--start production: Name of the start production for the grammar (defaults to "Start")
file: HTML file containing EBNF grammar. If omitted, reads from standard input.

Verify grammar starting from "Start" production:
go tool ebnflint grammar.html

Verify grammar with custom start production:
go tool ebnflint --start Program grammar.html

Verify grammar from standard input:
cat spec.html | go tool ebnflint --start Expression

// Type switch to determine the expression type.
// Alternative and Sequence are slice types and are matched by value;
// the remaining node types are matched as pointers.
switch expr := expression.(type) {
case ebnf.Alternative:
	println("Alternative expression")
case ebnf.Sequence:
	println("Sequence expression")
case *ebnf.Group:
	println("Group expression")
case *ebnf.Option:
	println("Optional expression")
case *ebnf.Repetition:
	println("Repetition expression")
case *ebnf.Name:
	println("Reference to:", expr.String)
case *ebnf.Token:
	println("Literal token:", expr.String)
case *ebnf.Range:
	println("Character range")
case *ebnf.Bad:
	println("Parse error:", expr.Error)
}

import (
	"unicode"
	"unicode/utf8"
)

for name := range grammar {
	// Check if non-terminal (name starts with an uppercase letter).
	// Decode the first rune so that non-ASCII names are handled correctly.
	first, _ := utf8.DecodeRuneInString(name)
	if unicode.IsUpper(first) {
		println(name, "is a non-terminal")
	} else {
		println(name, "is a lexical production")
	}
}

grammar, err := ebnf.Parse("test.ebnf", file)
if err != nil {
	// err contains parsing errors
	log.Fatal("Parse error: ", err)
}
err = ebnf.Verify(grammar, "Start")
if err != nil {
	// err contains verification errors
	// Could indicate undefined productions, unused productions,
	// or lexical productions referencing non-terminals
	log.Fatal("Verification error: ", err)
}

The golang.org/x/exp/ebnf package uses types from the text/scanner package:
import "text/scanner"

// Declared in package text/scanner:
type Position struct {
	Filename string // Filename, if any
	Offset   int    // Byte offset, starting at 0
	Line     int    // Line number, starting at 1
	Column   int    // Column number, starting at 1 (character count per line)
}

All expression nodes include a Pos() method that returns a scanner.Position indicating where the expression appears in the source.
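To see what these positions look like in practice, the sketch below uses only the standard library's text/scanner package to print the position of every token in a small grammar, so the numbering of lines and columns can be observed directly. The helper name tokenPositions and the file label "grammar.ebnf" are hypothetical. This illustrates scanner.Position itself and makes no claim about how the ebnf parser is implemented.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tokenPositions is a hypothetical helper that scans src with the default
// text/scanner settings and returns one "file:line:column token" entry
// per token.
func tokenPositions(filename, src string) []string {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	s.Filename = filename // recorded in every reported Position

	var out []string
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		// s.Position holds the start of the most recently scanned token.
		out = append(out, s.Position.String()+" "+s.TokenText())
	}
	return out
}

func main() {
	src := "Start = Item .\nItem = \"x\" .\n"
	for _, entry := range tokenPositions("grammar.ebnf", src) {
		fmt.Println(entry)
	}
}
```

Position.String formats a valid position as file:line:column, which is a convenient form for error messages.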