Tiny JavaScript tokenizer that never fails and is almost spec-compliant
npx @tessl/cli install tessl/npm-js-tokens@9.0.0

js-tokens is a tiny, regex-powered, lenient JavaScript tokenizer that never fails and is almost spec-compliant. It provides a generator function that turns JavaScript code strings into token objects, making it well suited to syntax highlighting, code formatting, linters, and other applications that need reliable JavaScript tokenization.
npm install js-tokens

const jsTokens = require("js-tokens");

For ES modules:

import jsTokens from "js-tokens";

const jsTokens = require("js-tokens");
// Basic tokenization
const code = 'JSON.stringify({k:3.14**2}, null /*replacer*/, "\\t")';
const tokens = Array.from(jsTokens(code));
// Extract token values
const tokenValues = tokens.map(token => token.value);
console.log(tokenValues.join("|"));
// Output: JSON|.|stringify|(|{|k|:|3.14|**|2|}|,| |null| |/*replacer*/|,| |"\t"|)
// Loop over tokens
for (const token of jsTokens("hello, !world")) {
  console.log(`${token.type}: ${token.value}`);
}
// JSX tokenization
const jsxCode = '<div>Hello {"world"}!</div>';
const jsxTokens = Array.from(jsTokens(jsxCode, { jsx: true }));

js-tokens is built around a single core function.
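Because every token is a plain object with type and value fields, ordinary array helpers are enough for most post-processing. The sketch below is illustrative only: the TRIVIA_TYPES grouping and the helper names significantTokens and countByType are assumptions made for this example, not part of js-tokens, and the sample array is hand-written in the documented token shape so the snippet runs without the package installed.

```javascript
// Token types that most consumers treat as non-significant.
// This grouping is an assumption for illustration, not a js-tokens concept.
const TRIVIA_TYPES = new Set([
  "WhiteSpace",
  "LineTerminatorSequence",
  "SingleLineComment",
  "MultiLineComment",
  "HashbangComment",
]);

// Returns only the tokens whose type is not in TRIVIA_TYPES.
// `tokens` can be any iterable of { type, value } objects.
function significantTokens(tokens) {
  return Array.from(tokens).filter((token) => !TRIVIA_TYPES.has(token.type));
}

// Counts how many tokens of each type appear in the iterable.
function countByType(tokens) {
  const counts = {};
  for (const token of tokens) {
    counts[token.type] = (counts[token.type] || 0) + 1;
  }
  return counts;
}

// Hand-written sample in the documented token shape (not tokenizer output).
const sampleTokens = [
  { type: "IdentifierName", value: "a" },
  { type: "WhiteSpace", value: " " },
  { type: "Punctuator", value: "=" },
  { type: "WhiteSpace", value: " " },
  { type: "NumericLiteral", value: "1" },
  { type: "WhiteSpace", value: " " },
  { type: "SingleLineComment", value: "// one" },
];

console.log(significantTokens(sampleTokens).map((t) => t.value).join(" "));
// prints "a = 1"
console.log(countByType(sampleTokens));
```

With the real tokenizer, passing jsTokens(code) in place of sampleTokens should work the same way, since both helpers accept any iterable.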
The core tokenization function converts a JavaScript code string into token objects, each tagged with its type.
/**
* Tokenizes JavaScript code into an iterable of token objects
* @param input - JavaScript code string to tokenize
* @param options - Optional configuration object
* @returns Iterable of Token objects for regular JavaScript
*/
function jsTokens(input: string, options?: { jsx?: boolean }): Iterable<Token>;
/**
* Tokenizes JavaScript code with JSX support
* @param input - JavaScript/JSX code string to tokenize
* @param options - Configuration object with jsx: true
* @returns Iterable of Token and JSXToken objects
*/
function jsTokens(
  input: string,
  options: { jsx: true }
): Iterable<Token | JSXToken>;

js-tokens recognizes 16 different token types for standard JavaScript code:
type Token =
| { type: "StringLiteral"; value: string; closed: boolean }
| { type: "NoSubstitutionTemplate"; value: string; closed: boolean }
| { type: "TemplateHead"; value: string }
| { type: "TemplateMiddle"; value: string }
| { type: "TemplateTail"; value: string; closed: boolean }
| { type: "RegularExpressionLiteral"; value: string; closed: boolean }
| { type: "MultiLineComment"; value: string; closed: boolean }
| { type: "SingleLineComment"; value: string }
| { type: "HashbangComment"; value: string }
| { type: "IdentifierName"; value: string }
| { type: "PrivateIdentifier"; value: string }
| { type: "NumericLiteral"; value: string }
| { type: "Punctuator"; value: string }
| { type: "WhiteSpace"; value: string }
| { type: "LineTerminatorSequence"; value: string }
| { type: "Invalid"; value: string };

Key Token Properties:
- type: Token classification (one of the 16 standard types)
- value: The actual text content of the token
- closed: Boolean property on certain tokens (StringLiteral, NoSubstitutionTemplate, TemplateTail, RegularExpressionLiteral, MultiLineComment, JSXString) indicating whether they are properly terminated

When JSX mode is enabled ({ jsx: true }), js-tokens additionally recognizes 5 JSX-specific token types:
type JSXToken =
| { type: "JSXString"; value: string; closed: boolean }
| { type: "JSXText"; value: string }
| { type: "JSXIdentifier"; value: string }
| { type: "JSXPunctuator"; value: string }
| { type: "JSXInvalid"; value: string };

JSX Mode Behavior:
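As an illustration of working with the two token families, the following sketch separates JSX-specific tokens from regular JavaScript tokens. It relies only on the observation that all five JSXToken types listed above start with "JSX". The helper name partitionJsx is hypothetical, and the sample array is hand-written for the example rather than captured from the tokenizer.

```javascript
// Splits tokens into JSX-specific tokens and regular JavaScript tokens.
// Assumes JSX token types are exactly those whose type starts with "JSX",
// which holds for the five JSXToken types documented above.
function partitionJsx(tokens) {
  const jsx = [];
  const js = [];
  for (const token of tokens) {
    (token.type.startsWith("JSX") ? jsx : js).push(token);
  }
  return { jsx, js };
}

// Hand-written sample mixing both families (illustrative, not tokenizer output).
const jsxSample = [
  { type: "JSXIdentifier", value: "div" },
  { type: "JSXText", value: "Hello " },
  { type: "StringLiteral", value: '"world"', closed: true },
  { type: "JSXText", value: "!" },
];

const parts = partitionJsx(jsxSample);
console.log(parts.jsx.map((t) => t.type));
console.log(parts.js.map((t) => t.type));
```

A check like this only makes sense when the tokens came from a call with { jsx: true }, since the JSX types are not produced otherwise.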
js-tokens never throws errors and always produces meaningful output:
- Incomplete tokens carry a closed: false property, which marks unterminated strings, templates, regular expressions, and similar constructs.

Example with incomplete tokens:
const tokens = Array.from(jsTokens('"unclosed string\n'));
// First token: { type: "StringLiteral", value: '"unclosed string', closed: false }
const regexTokens = Array.from(jsTokens('/unclosed regex\n'));
// First token: { type: "RegularExpressionLiteral", value: '/unclosed regex', closed: false }

interface TokenizeOptions {
  /** Enable JSX support (default: false) */
  jsx?: boolean;
}

All tokens include these base properties:
interface BaseToken {
  /** Token type classification */
  type: string;
  /** Original text content of the token */
  value: string;
}

Tokens that can be incomplete include a closed property:
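interface ClosedToken extends BaseToken {
  /** Whether the token is properly closed/terminated */
  closed: boolean;
}

Tokens with closed property: StringLiteral, NoSubstitutionTemplate, TemplateTail, RegularExpressionLiteral, MultiLineComment, and JSXString.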
Token Examples:
// Closed string: { type: "StringLiteral", value: '"hello"', closed: true }
// Unclosed string: { type: "StringLiteral", value: '"hello', closed: false }
// Closed regex: { type: "RegularExpressionLiteral", value: '/abc/g', closed: true }
// Unclosed regex: { type: "RegularExpressionLiteral", value: '/abc', closed: false }
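The closed property makes it straightforward to report unterminated constructs without a full parser. The following is a minimal sketch under two assumptions: the helper name findUnclosed is hypothetical, and start offsets are derived by summing value lengths, which is only accurate if the token values concatenate back to the original input with nothing skipped. The sample array is hand-written in the documented token shape.

```javascript
// Collects every token whose closed property is exactly false,
// together with its start offset in the input.
// Tokens without a closed property (e.g. IdentifierName) are never reported.
function findUnclosed(tokens) {
  const unclosed = [];
  let offset = 0; // running offset, assuming values cover the input contiguously
  for (const token of tokens) {
    if (token.closed === false) {
      unclosed.push({ type: token.type, value: token.value, start: offset });
    }
    offset += token.value.length;
  }
  return unclosed;
}

// Hand-written sample (illustrative, not tokenizer output).
const closedSample = [
  { type: "IdentifierName", value: "x" },
  { type: "Punctuator", value: "=" },
  { type: "StringLiteral", value: '"hello', closed: false },
];

for (const problem of findUnclosed(closedSample)) {
  console.log(`Unclosed ${problem.type} at offset ${problem.start}`);
}
// prints "Unclosed StringLiteral at offset 2"
```

Comparing against false rather than testing for falsiness is deliberate: most token types have no closed property at all, and they should not be reported as unclosed.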