
tessl/npm-stylis

A lightweight CSS preprocessor that provides CSS parsing, AST manipulation, vendor prefixing, and serialization capabilities.


Tokenization

Low-level parsing utilities for character-by-character CSS analysis, token extraction, and custom parsing workflows.

Capabilities

High-level Tokenization

Tokenize Function

Converts CSS strings into arrays of tokens for analysis and custom processing.

/**
 * Convert CSS string into array of tokens
 * @param value - CSS string to tokenize
 * @returns Array of string tokens
 */
function tokenize(value: string): string[];

Usage Examples:

import { tokenize } from 'stylis';

// Basic tokenization (whitespace between tokens is emitted as its own ' ' token)
const tokens = tokenize('h1 h2 h3 [h4 h5] fn(args) "a b c"');
console.log(tokens);
// ['h1', ' ', 'h2', ' ', 'h3', ' ', '[h4 h5]', ' ', 'fn', '(args)', ' ', '"a b c"']

// CSS property tokenization
const propTokens = tokenize('margin: 10px 20px;');
console.log(propTokens);
// ['margin', ':', ' ', '10px', ' ', '20px', ';']

// Complex selector tokenization
const selectorTokens = tokenize('.class:hover > .child[attr="value"]');
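
If the token array contains whitespace-only entries and only the meaningful tokens matter, they can be filtered out afterwards. The helper below is a minimal sketch: `significantTokens` is a hypothetical name, not part of stylis, and the sample array is written by hand rather than produced by the library.

```javascript
// Hypothetical helper (not part of stylis): drop tokens that are only whitespace.
function significantTokens(tokens) {
  return tokens.filter((tok) => tok.trim() !== '');
}

// Hand-written sample shaped like a token array
const sample = ['margin', ':', ' ', '10px', ' ', '20px', ';'];
console.log(significantTokens(sample));
// ['margin', ':', '10px', '20px', ';']
```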

Parser State Management

State Variables

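Module-level variables that track the current parsing state during tokenization. Every function in this section reads and writes the same variables, so only one input string can be scanned at a time.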

let line: number;      // Current line number in parsing
let column: number;    // Current column number in parsing  
let length: number;    // Length of current input string
let position: number;  // Current position in input string
let character: number; // Current character code
let characters: string; // Current input string being parsed

Alloc Function

Initializes the tokenizer state with a new input string and resets parsing position.

/**
 * Initialize tokenizer state with input string
 * @param value - CSS string to prepare for parsing
 * @returns Empty array (parsing workspace)
 */
function alloc(value: string): any[];

Dealloc Function

Cleans up tokenizer state and returns the final value.

/**
 * Clean up tokenizer state and return value
 * @param value - Value to return after cleanup
 * @returns The passed value after state cleanup
 */
function dealloc(value: any): any;

Character Navigation

Character Reading Functions

Functions for moving through and examining characters in the input stream.

/**
 * Get the character code most recently read by next() or prev()
 * @returns Current character code (0 at end of input)
 */
function char(): number;

/**
 * Move to the previous character and return its character code
 * @returns Previous character code (0 at the start of input)
 */
function prev(): number;

/**
 * Read the character at the current position, then advance past it
 * @returns Character code that was read (0 at end of input)
 */
function next(): number;

/**
 * Look at the upcoming character, the one next() would read, without advancing
 * @returns Upcoming character code (0 at end of input)
 */
function peek(): number;

/**
 * Get the current position in the input string
 * @returns Index of the next unread character
 */
function caret(): number;
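
The difference between `char()`, `peek()`, and `caret()` is easiest to see on a tiny input. The block below is a simplified standalone model of the cursor functions, written for illustration only; it is not the stylis source, and it leaves out line and column tracking.

```javascript
// Simplified standalone model of the cursor functions (illustrative, not the stylis source).
let characters = '';
let length = 0;
let position = 0;
let character = 0;

function alloc(value) {
  characters = value;
  length = value.length;
  position = 0;
  character = 0;
  return [];
}

function dealloc(value) {
  characters = '';
  return value;
}

// Read the character at `position`, then advance; 0 signals end of input.
function next() {
  character = position < length ? characters.charCodeAt(position++) : 0;
  return character;
}

// Code of the upcoming, not yet consumed character; 0 past the end.
function peek() {
  return position < length ? characters.charCodeAt(position) : 0;
}

// Code of the character most recently returned by next()
function char() {
  return character;
}

// Index of the next unread character
function caret() {
  return position;
}

alloc('a:b');
next();               // consumes 'a'
console.log(char());  // 97 ('a', the character just read)
console.log(peek());  // 58 (':', the character next() would return)
console.log(caret()); // 1  (index of the next unread character)
dealloc(null);
```

Because `caret()` already points past the character just read, the index of that character is `caret() - 1`.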

String Extraction

/**
 * Extract substring from current parsing context
 * @param begin - Start position
 * @param end - End position  
 * @returns Extracted substring
 */
function slice(begin: number, end: number): string;

Token Type Classification

Token Function

Classifies character codes into token types for parsing decisions.

/**
 * Get token type for character code
 * @param type - Character code to classify
 * @returns Token type number (0-5)
 */
function token(type: number): number;

Token Type Classifications:

  • 5: Whitespace tokens (0, 9, 10, 13, 32) - \0, \t, \n, \r, space
  • 4: Isolate tokens (33, 43, 44, 47, 62, 64, 126, 59, 123, 125) - !, +, ,, /, >, @, ~, ;, {, }
  • 3: Accompanied tokens (58) - :
  • 2: Opening delimit tokens (34, 39, 40, 91) - ", ', (, [
  • 1: Closing delimit tokens (41, 93) - ), ]
  • 0: Default/identifier tokens
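
The table above can be written out as a lookup function. This is a standalone sketch that mirrors the listed codes; `classify` and `labels` are illustrative names, not stylis exports.

```javascript
// Standalone sketch of the classification table above (not the stylis source).
function classify(code) {
  switch (code) {
    case 0: case 9: case 10: case 13: case 32:
      return 5; // whitespace
    case 33: case 43: case 44: case 47: case 62: case 64: case 126:
    case 59: case 123: case 125:
      return 4; // isolate
    case 58:
      return 3; // accompanied
    case 34: case 39: case 40: case 91:
      return 2; // opening delimiter
    case 41: case 93:
      return 1; // closing delimiter
    default:
      return 0; // identifier
  }
}

// Index in this array equals the token type number
const labels = ['identifier', 'closing', 'opening', 'accompanied', 'isolate', 'whitespace'];

for (const ch of 'a{: (]') {
  console.log(JSON.stringify(ch), labels[classify(ch.charCodeAt(0))]);
}
// "a" identifier
// "{" isolate
// ":" accompanied
// " " whitespace
// "(" opening
// "]" closing
```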

Specialized Parsing Functions

Delimiter Handling

/**
 * Parse delimited content (quotes, brackets, parentheses)
 * @param type - Character code of the opening delimiter
 * @returns Delimited content as a string, including both delimiters
 */
function delimit(type: number): string;

/**
 * Advance to the matching closing delimiter
 * @param type - Character code of the closing delimiter to search for
 * @returns Position just after the closing delimiter, or the end of input if it is never found
 */
function delimiter(type: number): number;
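
The general idea behind delimiter matching can be sketched on a plain string. The function below is an illustrative approximation, not the stylis source: `findClosing` and `CLOSERS` are hypothetical names, and the nesting rules are simplified (quotes hide everything inside them, brackets and parentheses may nest, a backslash skips the following character).

```javascript
// Illustrative sketch (not the stylis source): return the index just past the
// delimiter that closes the one at `open`, or input.length if it is unmatched.
const CLOSERS = { '"': '"', "'": "'", '(': ')', '[': ']' };

function findClosing(input, open) {
  const opener = input[open];
  const closer = CLOSERS[opener];
  const inQuote = opener === '"' || opener === "'";
  let i = open + 1;

  while (i < input.length) {
    const ch = input[i];
    if (ch === '\\') {
      i += 2;                     // skip the escaped character
    } else if (ch === closer) {
      return i + 1;               // position after the closing delimiter
    } else if (!inQuote && ch in CLOSERS) {
      i = findClosing(input, i);  // nested string or bracket: jump past it
    } else {
      i += 1;
    }
  }
  return input.length;            // unmatched: run to end of input
}

const css = 'url("a)b") rest';
console.log(css.slice(3, findClosing(css, 3)));
// ("a)b")
```

The quoted `)` does not end the parenthesised group because the scan jumps over the whole string first.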

Whitespace Processing

/**
 * Consume a run of whitespace characters
 * @param type - Character code that preceded the whitespace
 * @returns A single space or an empty string, depending on the surrounding characters
 */
function whitespace(type: number): string;

Escape Sequence Handling

/**
 * Handle CSS escape sequences
 * @param index - Starting position of escape sequence
 * @param count - Maximum characters to process
 * @returns Processed escape sequence
 */
function escaping(index: number, count: number): string;
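
For context, a CSS escape is a backslash followed either by one to six hexadecimal digits (optionally terminated by a single space) or by one other character. The sketch below finds where such an escape ends in a plain string. It follows the CSS grammar rather than the stylis source, handles only a plain space as the terminator, and `escapeEnd` is a hypothetical name.

```javascript
// Illustrative sketch based on the CSS escape grammar (not the stylis source).
// `start` is the index of the backslash; returns the index just after the escape.
function escapeEnd(input, start) {
  let i = start + 1;
  let digits = 0;

  // Consume up to six hexadecimal digits
  while (digits < 6 && i < input.length && /[0-9a-fA-F]/.test(input[i])) {
    i++;
    digits++;
  }
  if (digits === 0) {
    // Not a hex escape, e.g. "\:" escapes exactly one character
    return Math.min(i + 1, input.length);
  }
  if (input[i] === ' ') {
    i++; // one trailing space terminates a hex escape and belongs to it
  }
  return i;
}

const selector = '.a\\31 0px';
console.log(JSON.stringify(selector.slice(2, escapeEnd(selector, 2))));
// "\\31 "
```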

Comment Processing

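/**
 * Parse a CSS comment, either a block comment or a `//` line comment
 * @param type - Character code that follows the opening slash (42 for `*`, 47 for `/`)
 * @param index - Starting position of the comment body
 * @returns Comment string in block-comment form, including its delimiters
 */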
function commenter(type: number, index: number): string;

Identifier Extraction

/**
 * Parse CSS identifiers (class names, property names, etc.)
 * @param index - Starting position of identifier
 * @returns Identifier string
 */
function identifier(index: number): string;

AST Node Management

Node Creation

/**
 * Create AST node object with metadata
 * @param value - Node value/content
 * @param root - Root node reference  
 * @param parent - Parent node reference
 * @param type - Node type string
 * @param props - Node properties
 * @param children - Child nodes
 * @param length - Character length
 * @param siblings - Sibling nodes array
 * @returns AST node object
 */
function node(
  value: string, 
  root: object | null, 
  parent: object | null, 
  type: string, 
  props: string[] | string, 
  children: object[] | string, 
  length: number, 
  siblings: object[]
): object;
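
To make the parameter list concrete, here is a plain-object sketch of what a declaration node could look like. `makeNode` is a hypothetical stand-in that copies only the documented parameters into an object, and the field values for `color:red;` are assumptions chosen for illustration, not output captured from stylis.

```javascript
// Illustrative sketch (not the stylis source): build a node-like object
// from the documented parameters.
function makeNode(value, root, parent, type, props, children, length, siblings) {
  return { value, root, parent, type, props, children, length, siblings };
}

// Assumed values for a `color:red;` declaration
const siblings = [];
const decl = makeNode('color:red;', null, null, 'decl', 'color', 'red', 5, siblings);
siblings.push(decl);

console.log(decl.type, decl.props, decl.children);
// decl color red
```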

Node Manipulation

/**
 * Copy AST node with modifications
 * @param root - Source node to copy
 * @param props - Properties to override
 * @returns New AST node with modifications
 */
function copy(root: object, props: object): object;

/**
 * Lift node to root level in AST hierarchy
 * @param root - Node to lift
 * @returns void (modifies node structure)
 */
function lift(root: object): void;

Custom Tokenization Examples

Token Analysis

import { alloc, next, char, token, caret, dealloc } from 'stylis';

// Analyze token types in CSS
function analyzeTokens(css) {
  alloc(css);
  const analysis = [];
  
  while (next()) {
    const charCode = char();
    const tokenType = token(charCode);
    const charStr = String.fromCharCode(charCode);
    
    analysis.push({
      char: charStr,
      code: charCode,
      type: tokenType,
      position: caret() - 1 // caret() already points past the character just read
    });
  }
  
  return dealloc(analysis);
}

Custom Parser

import { alloc, next, peek, char, slice, caret, dealloc } from 'stylis';

// Simple custom property parser
function parseCustomProperties(css) {
  alloc(css);
  const properties = [];
  
  while (next()) {
    if (char() === 45 && peek() === 45) { // --
      const start = caret() - 1;
      
      // Scan to ':' (58); stop if the input ends before one is found
      while (next() && char() !== 58) {}
      if (char() !== 58) break;
      const nameEnd = caret() - 1; // index of ':'
      
      // Scan to ';' (59) or to the end of input
      while (next() && char() !== 59) {}
      // Exclude the ';' itself when one was found
      const valueEnd = char() === 59 ? caret() - 1 : caret();
      
      properties.push({
        name: slice(start, nameEnd),
        value: slice(nameEnd + 1, valueEnd).trim()
      });
    }
  }
  
  return dealloc(properties);
}

Character-by-Character Processing

import { alloc, next, char, dealloc } from 'stylis';

// Count specific characters in CSS
function countCharacters(css, targetChar) {
  alloc(css);
  let count = 0;
  const targetCode = targetChar.charCodeAt(0);
  
  while (next()) {
    if (char() === targetCode) {
      count++;
    }
  }
  
  return dealloc(count);
}

// Usage
const braceCount = countCharacters('.class { color: red; }', '{'); // 1
const semicolonCount = countCharacters('a: 1; b: 2; c: 3;', ';'); // 3

Error Handling

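Tokenization functions do not throw on malformed CSS. Instead they fall back to the following behavior:

  • Unclassified Characters: Any character code that is not listed in the classification table gets token type 0 and is treated as part of an identifier
  • Unmatched Delimiters: Scanning continues to the end of input, so the delimited token covers the rest of the string
  • Escape Sequences: Invalid escapes are preserved as-is
  • End of Input: Character functions such as next() and peek() return 0, and extraction functions return whatever was consumed so far (possibly an empty string)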

The tokenizer maintains internal state consistency even when processing malformed input, allowing higher-level parsers to make recovery decisions.

Install with Tessl CLI

npx tessl i tessl/npm-stylis
