tessl/npm-ohm-js

An object-oriented language for parsing and pattern matching based on parsing expression grammars

Parsing Expressions

Name: tessl/npm-ohm-js
Author: tessl

Low-level parsing expression classes that form the building blocks of Ohm grammars. These classes represent the fundamental parsing operations defined in parsing expression grammars (PEGs).

Imports

import { pexprs, grammar } from "ohm-js";

For TypeScript:

import { 
  pexprs, 
  grammar,
  Grammar,
  PExpr,
  Terminal,
  Range,
  Param,
  Alt,
  Extend,
  Splice,
  Seq,
  Iter,
  Star,
  Plus,
  Opt,
  Not,
  Lookahead,
  Lex,
  Apply,
  UnicodeChar,
  CaseInsensitiveTerminal
} from "ohm-js";

Capabilities

Parsing Expression Namespace

The pexprs namespace contains all parsing expression constructors and singleton instances.

/**
 * Constructors for parsing expressions (aka pexprs). (Except for `any`
 * and `end`, which are not constructors but singleton instances.)
 */
const pexprs: {
  PExpr: typeof PExpr;
  Terminal: typeof Terminal;
  Range: typeof Range;
  Param: typeof Param;
  Alt: typeof Alt;
  Extend: typeof Extend;
  Splice: typeof Splice;
  Seq: typeof Seq;
  Iter: typeof Iter;
  Star: typeof Star;
  Plus: typeof Plus;
  Opt: typeof Opt;
  Not: typeof Not;
  Lookahead: typeof Lookahead;
  Lex: typeof Lex;
  Apply: typeof Apply;
  UnicodeChar: typeof UnicodeChar;
  CaseInsensitiveTerminal: typeof CaseInsensitiveTerminal;
  any: PExpr;
  end: PExpr;
};

Base Parsing Expression

Abstract base class for all parsing expressions.

/**
 * Abstract base class for parsing expressions
 */
class PExpr {
  /** Returns the arity (number of children) this expression expects */
  getArity(): number;
  /** Returns true if this expression can match zero characters */
  isNullable(): boolean;
  /** Returns a string representation of this expression */
  toString(): string;
  /** Returns a human-readable display string */
  toDisplayString(): string;
}
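
As a rough illustration of what `isNullable()` and `getArity()` compute, the sketch below models a few expression kinds as plain objects. The node shapes and helper names (`term`, `seq`, `alt`, and so on) are hypothetical, and this is a toy model rather than Ohm's implementation. It assumes the usual convention that a sequence contributes one value per factor and a negative lookahead contributes none.

```javascript
// Toy PEG expression nodes as plain objects (hypothetical shapes, not Ohm's classes).
const term = (text) => ({ kind: "terminal", text });
const seq = (...factors) => ({ kind: "seq", factors });
const alt = (...terms) => ({ kind: "alt", terms });
const star = (expr) => ({ kind: "star", expr });
const plus = (expr) => ({ kind: "plus", expr });
const opt = (expr) => ({ kind: "opt", expr });
const not = (expr) => ({ kind: "not", expr });

// Can the expression succeed without consuming any input?
function isNullable(e) {
  switch (e.kind) {
    case "terminal": return e.text === "";
    case "seq": return e.factors.every(isNullable);
    case "alt": return e.terms.some(isNullable);
    case "star":
    case "opt":
    case "not": return true;
    case "plus": return isNullable(e.expr);
    default: throw new Error(`unknown kind: ${e.kind}`);
  }
}

// How many values does a successful match hand to its parent?
function getArity(e) {
  switch (e.kind) {
    case "terminal": return 1;
    case "seq": return e.factors.reduce((n, f) => n + getArity(f), 0);
    case "alt": return getArity(e.terms[0]); // alternatives are assumed to agree
    case "star":
    case "plus":
    case "opt": return getArity(e.expr);
    case "not": return 0; // negative lookahead produces no value
    default: throw new Error(`unknown kind: ${e.kind}`);
  }
}

const greeting = seq(term("hello"), opt(term(",")), plus(term(" ")), term("world"));
console.log(isNullable(greeting)); // false
console.log(getArity(greeting)); // 4
console.log(isNullable(seq(star(term("a")), opt(term("b"))))); // true
```

Nullability matters mostly for grammar analysis: repeating a nullable expression could loop forever without consuming input, so a parser generator needs to detect it.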

Terminal Expressions

Expressions that match literal text and character ranges.

/**
 * Matches literal terminal strings
 */
class Terminal extends PExpr {}

/**
 * Matches character ranges (e.g., "a".."z")
 */
class Range extends PExpr {}

/**
 * Case-insensitive terminal matching
 */
class CaseInsensitiveTerminal extends PExpr {}

/**
 * Matches Unicode character categories
 */
class UnicodeChar extends PExpr {}

Usage in Grammar:

// These expressions are typically created automatically by grammar compilation
// Terminal: "hello" in grammar source
// Range: "a".."z" in grammar source
// CaseInsensitiveTerminal: caseInsensitive<"Hello"> in grammar source
// UnicodeChar: backs built-in rules such as lower and upper (Unicode categories Ll and Lu)

Composite Expressions

Expressions that combine other expressions.

/**
 * Alternation (choice) - matches first successful alternative
 */
class Alt extends PExpr {
  /** Array of alternative expressions to try */
  terms: PExpr[];
}

/**
 * Sequence - matches all expressions in order
 */
class Seq extends PExpr {
  /** Array of expressions that must all match */
  factors: PExpr[];
}

/**
 * Grammar extension (used internally for rule extension)
 */
class Extend extends Alt {}

/**
 * Grammar splicing (used internally for rule override)
 */
class Splice extends Alt {}
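
The "first successful alternative" behavior of alternation is what separates PEG choice from regular-expression or context-free alternation. The following toy matcher is a sketch of that semantics using hypothetical combinator functions (`terminal`, `alt`, `seq`); it does not use or mirror Ohm's internal code.

```javascript
// A toy parser is a function (input, pos) => new position, or -1 on failure.
const terminal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

// Ordered choice: try alternatives left to right and commit to the first success.
const alt = (...terms) => (input, pos) => {
  for (const t of terms) {
    const next = t(input, pos);
    if (next !== -1) return next;
  }
  return -1;
};

// Sequence: every factor must match, each starting where the previous one ended.
const seq = (...factors) => (input, pos) => {
  let cur = pos;
  for (const f of factors) {
    cur = f(input, cur);
    if (cur === -1) return -1;
  }
  return cur;
};

const shortFirst = seq(alt(terminal("a"), terminal("ab")), terminal("c"));
const longFirst = seq(alt(terminal("ab"), terminal("a")), terminal("c"));

console.log(shortFirst("abc", 0)); // -1: "a" wins the choice, then "c" fails against "b"
console.log(longFirst("abc", 0)); // 3
console.log(longFirst("ac", 0)); // 2
```

Because the choice never revisits a later alternative once an earlier one has succeeded, longer alternatives that share a prefix with shorter ones generally need to be listed first.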

Iteration Expressions

Expressions for repetition patterns.

/**
 * Abstract base for iteration expressions
 */
class Iter extends PExpr {}

/**
 * Zero or more repetitions (*)
 */
class Star extends Iter {}

/**
 * One or more repetitions (+)
 */
class Plus extends Iter {}

/**
 * Optional (zero or one) (?)
 */
class Opt extends Iter {}

Grammar Examples:

// Star: rule* in grammar
// Plus: rule+ in grammar  
// Opt: rule? in grammar
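
In PEGs, repetition operators are greedy and do not give input back to what follows. The sketch below illustrates this with hypothetical combinator functions (`star`, `plus`, `opt`); it is a toy model of the semantics, not Ohm's implementation.

```javascript
// A toy parser is a function (input, pos) => new position, or -1 on failure.
const terminal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

const seq = (...factors) => (input, pos) => {
  let cur = pos;
  for (const f of factors) {
    cur = f(input, cur);
    if (cur === -1) return -1;
  }
  return cur;
};

// Zero or more: consume as many repetitions as possible; never fails.
const star = (expr) => (input, pos) => {
  let cur = pos;
  for (;;) {
    const next = expr(input, cur);
    if (next === -1 || next === cur) return cur; // stop on failure or no progress
    cur = next;
  }
};

// One or more: the first repetition is mandatory, the rest behave like star.
const plus = (expr) => (input, pos) => {
  const first = expr(input, pos);
  return first === -1 ? -1 : star(expr)(input, first);
};

// Optional: succeed without consuming anything if the expression fails.
const opt = (expr) => (input, pos) => {
  const next = expr(input, pos);
  return next === -1 ? pos : next;
};

const a = terminal("a");
console.log(star(a)("aaab", 0)); // 3
console.log(plus(a)("b", 0)); // -1
console.log(opt(a)("b", 0)); // 0
console.log(seq(star(a), a)("aaa", 0)); // -1: star already consumed every "a"
```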

Lookahead Expressions

Expressions that look ahead without consuming input.

/**
 * Negative lookahead (~expr) - succeeds if expr fails
 */
class Not extends PExpr {}

/**
 * Positive lookahead (&expr) - succeeds if expr succeeds
 */
class Lookahead extends PExpr {}

Grammar Examples:

// Not: ~"keyword" in grammar
// Lookahead: &"prefix" in grammar
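
Both lookahead forms test the input and then leave the position unchanged. The sketch below shows this with hypothetical combinator functions (`not`, `lookahead`); it is a toy model of the semantics rather than Ohm's own code.

```javascript
// A toy parser is a function (input, pos) => new position, or -1 on failure.
const terminal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

const seq = (...factors) => (input, pos) => {
  let cur = pos;
  for (const f of factors) {
    cur = f(input, cur);
    if (cur === -1) return -1;
  }
  return cur;
};

// Negative lookahead: succeed (at the same position) only if expr fails.
const not = (expr) => (input, pos) => (expr(input, pos) === -1 ? pos : -1);

// Positive lookahead: succeed (at the same position) only if expr succeeds.
const lookahead = (expr) => (input, pos) => (expr(input, pos) === -1 ? -1 : pos);

console.log(lookahead(terminal("pre"))("prefix", 0)); // 0: matched, nothing consumed
console.log(not(terminal("pre"))("prefix", 0)); // -1

// Match an "e" that does not start the word "end".
const eButNotEnd = seq(not(terminal("end")), terminal("e"));
console.log(eButNotEnd("echo", 0)); // 1
console.log(eButNotEnd("end", 0)); // -1
```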

Rule Application

Expression for applying grammar rules.

/**
 * Rule application - applies a named rule
 */
class Apply extends PExpr {}

Grammar Examples:

// Apply: ruleName or ruleName<arg1, arg2> in grammar

Special Expressions

Specialized parsing expressions for specific scenarios.

/**
 * Parameter reference (used in parameterized rules)
 */
class Param extends PExpr {}

/**
 * Lexification (#expr) - matches expr without implicit whitespace skipping
 */
class Lex extends PExpr {}

Singleton Instances

Pre-created instances for common patterns.

/**
 * Matches any single character (the built-in rule any)
 */
const any: PExpr;

/**
 * Matches end of input
 */
const end: PExpr;

Grammar Examples:

// any: the built-in rule any in grammar
// end: the built-in rule end; also used implicitly at end of top-level matches

Expression Hierarchy

The parsing expression classes form a hierarchy:

PExpr (abstract base)
├── Terminal
├── Range  
├── CaseInsensitiveTerminal
├── UnicodeChar
├── Param
├── Alt
│   ├── Extend
│   └── Splice
├── Seq
├── Iter (abstract)
│   ├── Star
│   ├── Plus
│   └── Opt
├── Not
├── Lookahead
├── Lex
└── Apply

Working with Parsing Expressions

While parsing expressions are typically created automatically during grammar compilation, they can be useful for advanced use cases:

Introspection

import { grammar, pexprs } from "ohm-js";

const g = grammar(`
  Example {
    rule = "hello" | "world"
    list = rule+
  }
`);

// Access rule bodies (which are parsing expressions)
const ruleBody = g.rules.rule.body;
console.log(ruleBody instanceof pexprs.Alt); // true
console.log(ruleBody.terms.length); // 2 (for "hello" | "world")

const listBody = g.rules.list.body;
console.log(listBody instanceof pexprs.Plus); // true

Custom Grammar Construction

Advanced users can construct grammars programmatically:

// This is advanced usage - most users should use grammar strings
import { pexprs } from "ohm-js";

// Create parsing expressions manually
const terminal = new pexprs.Terminal("hello");
const range = new pexprs.Range("a", "z");
const choice = new pexprs.Alt([terminal, range]);

Expression Properties

All parsing expressions support introspection:

const expr = g.rules.list.body; // any rule defined in the grammar works here

console.log("Arity:", expr.getArity()); // Number of child nodes produced
console.log("Nullable:", expr.isNullable()); // Can match empty string
console.log("String form:", expr.toString()); // Grammar source representation
console.log("Display:", expr.toDisplayString()); // Human-readable form

Built-in Rule Integration

Parsing expressions integrate with Ohm's built-in rules:

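// Built-in rules are available in all grammars. Note that match() is a method
// of a Grammar instance; its optional second argument names the start rule.
const identGrammar = grammar(`
  Ident {
    ident = alnum+
  }
`);
const match = identGrammar.match("abc123", "ident");
if (match.succeeded()) {
  // Built-in rules like 'alnum', 'letter', 'digit' are implemented
  // using parsing expressions under the hood
}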

Error Handling

Parsing expressions contribute to error reporting:

// g is a Grammar instance, e.g. the Example grammar from the Introspection snippet
const match = g.match("invalid input");
if (match.failed()) {
  // Error messages incorporate information from parsing expressions
  // to provide detailed failure information
  console.log(match.message);
}
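
One common way for a PEG matcher to produce such messages is to remember the furthest input position at which any expression failed, together with what was expected there. The sketch below illustrates that idea with hypothetical toy combinators and a hypothetical `explainFailure` helper. It is an assumption-laden model, not a description of Ohm's internals, and Ohm's real messages (which include line and column information) are more elaborate.

```javascript
// Toy parsers take (input, pos, tracker) and return the new position, or -1 on failure.
// The tracker remembers the furthest failure position and what was expected there.
const terminal = (text) => (input, pos, tracker) => {
  if (input.startsWith(text, pos)) return pos + text.length;
  if (pos > tracker.pos) {
    tracker.pos = pos; // a failure further right replaces earlier expectations
    tracker.expected = [];
  }
  if (pos === tracker.pos && !tracker.expected.includes(text)) {
    tracker.expected.push(text);
  }
  return -1;
};

const seq = (...factors) => (input, pos, tracker) => {
  let cur = pos;
  for (const f of factors) {
    cur = f(input, cur, tracker);
    if (cur === -1) return -1;
  }
  return cur;
};

const alt = (...terms) => (input, pos, tracker) => {
  for (const t of terms) {
    const next = t(input, pos, tracker);
    if (next !== -1) return next;
  }
  return -1;
};

// Returns null on a full match, otherwise a message built from the tracker.
function explainFailure(parser, input) {
  const tracker = { pos: -1, expected: [] };
  const end = parser(input, 0, tracker);
  if (end === input.length) return null;
  if (end !== -1) return `Position ${end}: expected end of input`;
  const wanted = tracker.expected.map((t) => JSON.stringify(t)).join(" or ");
  return `Position ${tracker.pos}: expected ${wanted}`;
}

const greeting = seq(
  terminal("hello"),
  terminal(" "),
  alt(terminal("world"), terminal("there"))
);
console.log(explainFailure(greeting, "hello moon")); // Position 6: expected "world" or "there"
console.log(explainFailure(greeting, "hello world")); // null
```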
