tessl/npm-ret

Tokenizes a string that represents a regular expression.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: npmpkg:npm/ret@0.0.x

To install, run `npx @tessl/cli install tessl/npm-ret@0.0.0`

# Ret

Ret is a TypeScript library that tokenizes regular expression strings into structured, AST-like token trees and can reconstruct regex source from those trees. This makes it a building block for regex analysis, transformation, and validation tools.

## Package Information

- **Package Name**: ret
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install ret`

## Core Imports

```typescript
import ret, { tokenizer, reconstruct, types } from "ret";
// Character sets are not directly exported from main module
// Import them separately if needed:
// import { words, ints, whitespace } from "ret/dist/sets";
```

For CommonJS:

```javascript
const ret = require("ret");
const { types, reconstruct } = ret;
// ret is the tokenizer function
// ret.types and ret.reconstruct are also available

// For character set utilities:
const sets = require("ret/dist/sets");
const { words, ints, whitespace, notWords, notInts, notWhitespace, anyChar } = sets;
```

## Basic Usage

```typescript
import ret, { reconstruct, types } from "ret";
import { words, ints } from "ret/dist/sets";

// Tokenize a regular expression
const tokens = ret(/foo|bar/.source);
// or, using the named export: const tokens = tokenizer(/foo|bar/.source);

// Tokens structure:
// {
//   "type": types.ROOT,
//   "options": [
//     [ { "type": types.CHAR, "value": 102 },   // 'f'
//       { "type": types.CHAR, "value": 111 },   // 'o'
//       { "type": types.CHAR, "value": 111 } ], // 'o'
//     [ { "type": types.CHAR, "value": 98 },    // 'b'
//       { "type": types.CHAR, "value": 97 },    // 'a'
//       { "type": types.CHAR, "value": 114 } ]  // 'r'
//   ]
// }

// Reconstruct regex from tokens
const regexString = reconstruct(tokens); // "foo|bar"

// Working with character sets
const wordToken = words(); // Equivalent to \w
const digitToken = ints(); // Equivalent to \d

reconstruct(wordToken); // "\\w"
reconstruct(digitToken); // "\\d"
```
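The `value` of each `CHAR` token in the structure above is a character code, so the alternatives can be read back as plain text. The following is a minimal sketch of that idea, not part of the library: it does not import `ret`, and the `CharToken` and `RootToken` shapes plus the `literalAlternatives` helper are hypothetical local stand-ins that only mirror the fields used here.

```typescript
// Local stand-ins for the token shapes (assumption: they mirror the Core Types
// documented for ret; nothing in this sketch imports the real package).
enum types { ROOT, GROUP, POSITION, SET, RANGE, REPETITION, REFERENCE, CHAR }

interface CharToken { type: types.CHAR; value: number }
interface RootToken { type: types.ROOT; stack?: CharToken[]; options?: CharToken[][] }

// Hand-built copy of the token tree shown for "foo|bar".
const fooOrBar: RootToken = {
  type: types.ROOT,
  options: [
    [102, 111, 111].map((value): CharToken => ({ type: types.CHAR, value })),
    [98, 97, 114].map((value): CharToken => ({ type: types.CHAR, value })),
  ],
};

// Hypothetical helper: turn each alternative (or the single stack) back into text.
function literalAlternatives(root: RootToken): string[] {
  const branches = root.options ?? (root.stack ? [root.stack] : []);
  return branches.map((branch) =>
    branch.map((token) => String.fromCharCode(token.value)).join("")
  );
}

console.log(literalAlternatives(fooOrBar)); // [ 'foo', 'bar' ]
```

This only works for trees made purely of `CHAR` tokens; anything containing groups, sets, or repetitions needs a recursive walk.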

## Architecture

Ret is built around several key components:

- **Tokenizer**: Core parser that converts regex strings into structured token trees
- **Type System**: Comprehensive TypeScript types for all token variants (characters, groups, sets, repetitions, etc.)
- **Reconstruction**: Converts token structures back to valid regex strings
- **Character Sets**: Predefined character class utilities (digits, words, whitespace, etc.)
- **Utilities**: Helper functions for string processing and character class parsing

## Capabilities


### Regex Tokenization

Core tokenization functionality that parses a regular expression source string into a structured token tree. Handles regex features including groups, character classes, quantifiers, and lookarounds.

```typescript { .api }
function tokenizer(regexpStr: string): Root;
```

[Tokenization](./tokenization.md)
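Because the result is a tree, most consumers end up walking it recursively. The sketch below illustrates one such walk: counting capturing groups. It is an illustration under assumptions: the token shapes are local stand-ins limited to three token kinds, the tree is hand-built rather than produced by `tokenizer`, and using `Infinity` for an unbounded `max` is assumed rather than taken from this document.

```typescript
// Local stand-ins (assumption: they mirror a subset of ret's Core Types).
enum types { ROOT, GROUP, POSITION, SET, RANGE, REPETITION, REFERENCE, CHAR }

interface CharToken { type: types.CHAR; value: number }
interface GroupToken {
  type: types.GROUP;
  remember: boolean;
  stack?: AnyToken[];
  options?: AnyToken[][];
}
interface RepetitionToken { type: types.REPETITION; min: number; max: number; value: AnyToken }
interface RootToken { type: types.ROOT; stack?: AnyToken[]; options?: AnyToken[][] }
type AnyToken = CharToken | GroupToken | RepetitionToken;

// Hypothetical helper: count groups whose `remember` flag is set.
function countCapturingGroups(node: RootToken | AnyToken): number {
  if (node.type === types.CHAR) return 0;
  if (node.type === types.REPETITION) return countCapturingGroups(node.value);

  // ROOT and GROUP hold children either in `options` (alternation) or in `stack`.
  const own: number = node.type === types.GROUP && node.remember ? 1 : 0;
  const children: AnyToken[] = node.options
    ? ([] as AnyToken[]).concat(...node.options)
    : (node.stack ?? []);
  return children.reduce((sum, child) => sum + countCapturingGroups(child), own);
}

const ch = (c: string): CharToken => ({ type: types.CHAR, value: c.charCodeAt(0) });

// Hand-built tree shaped like a pattern such as (a(?:b|c))+
const tree: RootToken = {
  type: types.ROOT,
  stack: [{
    type: types.REPETITION, min: 1, max: Infinity,
    value: {
      type: types.GROUP, remember: true,
      stack: [ch("a"), { type: types.GROUP, remember: false, options: [[ch("b")], [ch("c")]] }],
    },
  }],
};

console.log(countCapturingGroups(tree)); // 1
```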

### Token Reconstruction

Converts token structures back into valid regular expression strings, enabling regex transformation and analysis workflows.

```typescript { .api }
function reconstruct(token: Tokens): string;
```

[Reconstruction](./reconstruction.md)
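To make the direction of the conversion concrete, here is a deliberately simplified serializer for three token kinds. It is not the library's `reconstruct` and makes no claim about how that function is implemented: `toSource` and `quantifier` are hypothetical names, the token shapes are local stand-ins, special characters are not escaped, and lookarounds, named groups, sets, positions, and references are ignored.

```typescript
// Local stand-ins (assumption: they mirror a subset of ret's Core Types).
enum types { ROOT, GROUP, POSITION, SET, RANGE, REPETITION, REFERENCE, CHAR }

interface CharToken { type: types.CHAR; value: number }
interface GroupToken {
  type: types.GROUP;
  remember: boolean;
  stack?: AnyToken[];
  options?: AnyToken[][];
}
interface RepetitionToken { type: types.REPETITION; min: number; max: number; value: AnyToken }
interface RootToken { type: types.ROOT; stack?: AnyToken[]; options?: AnyToken[][] }
type AnyToken = CharToken | GroupToken | RepetitionToken;

// Map a min/max pair to quantifier syntax (assumes Infinity means "unbounded").
function quantifier(min: number, max: number): string {
  if (min === 0 && max === 1) return "?";
  if (min === 0 && max === Infinity) return "*";
  if (min === 1 && max === Infinity) return "+";
  if (min === max) return `{${min}}`;
  return max === Infinity ? `{${min},}` : `{${min},${max}}`;
}

// Serialize a token tree back to regex source (no escaping, subset only).
function toSource(node: RootToken | AnyToken): string {
  if (node.type === types.CHAR) return String.fromCharCode(node.value);
  if (node.type === types.REPETITION) {
    return toSource(node.value) + quantifier(node.min, node.max);
  }
  const body = node.options
    ? node.options.map((seq) => seq.map((t) => toSource(t)).join("")).join("|")
    : (node.stack ?? []).map((t) => toSource(t)).join("");
  if (node.type === types.ROOT) return body;
  return (node.remember ? "(" : "(?:") + body + ")";
}

const ch = (c: string): CharToken => ({ type: types.CHAR, value: c.charCodeAt(0) });

const tree: RootToken = {
  type: types.ROOT,
  stack: [
    {
      type: types.REPETITION, min: 1, max: Infinity,
      value: { type: types.GROUP, remember: false, options: [[ch("a"), ch("b")], [ch("c")]] },
    },
    ch("d"),
  ],
};

console.log(toSource(tree)); // (?:ab|c)+d
```

In a transformation workflow the tree would be edited between tokenizing and serializing, for example by changing a group's `remember` flag before turning it back into source.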

### Character Set Utilities

Predefined character class utilities for generating common regex character sets programmatically.

```typescript { .api }
function words(): Set;
function notWords(): Set;
function ints(): Set;
function notInts(): Set;
function whitespace(): Set;
function notWhitespace(): Set;
function anyChar(): Set;
```

[Character Sets](./character-sets.md)
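A `Set` token is data rather than a matcher, so a tool that wants to test characters against it has to interpret the `set` members and the `not` flag itself. The sketch below shows one way to do that. Everything in it is an assumption for illustration: the `SetToken`, `RangeToken`, and `CharToken` names are local stand-ins (renamed to avoid clashing with the built-in `Set`), `setMatches` is hypothetical, and the digit sets are hand-built rather than taken from `ints()` or `notInts()`.

```typescript
// Local stand-ins (assumption: they mirror the Set, Range and Char shapes).
enum types { ROOT, GROUP, POSITION, SET, RANGE, REPETITION, REFERENCE, CHAR }

interface CharToken { type: types.CHAR; value: number }
interface RangeToken { type: types.RANGE; from: number; to: number }
interface SetToken {
  type: types.SET;
  set: (RangeToken | CharToken | SetToken)[];
  not: boolean;
}

// Hypothetical helper: does the character code fall inside the set?
function setMatches(token: SetToken, code: number): boolean {
  const hit = token.set.some((member) => {
    if (member.type === types.CHAR) return member.value === code;
    if (member.type === types.RANGE) return member.from <= code && code <= member.to;
    return setMatches(member, code); // nested set
  });
  // A negated set matches exactly the codes its members do not cover.
  return token.not ? !hit : hit;
}

// Hand-built sets covering "0" (48) through "9" (57).
const digits: SetToken = {
  type: types.SET,
  set: [{ type: types.RANGE, from: 48, to: 57 }],
  not: false,
};
const notDigits: SetToken = { ...digits, not: true };

console.log(setMatches(digits, "7".charCodeAt(0)));    // true
console.log(setMatches(digits, "x".charCodeAt(0)));    // false
console.log(setMatches(notDigits, "x".charCodeAt(0))); // true
```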

## Core Types

```typescript { .api }
enum types {
  ROOT,
  GROUP,
  POSITION,
  SET,
  RANGE,
  REPETITION,
  REFERENCE,
  CHAR
}

interface Root {
  type: types.ROOT;
  stack?: Token[];
  options?: Token[][];
  flags?: string[];
}

interface Group {
  type: types.GROUP;
  stack?: Token[];
  options?: Token[][];
  remember: boolean;
  followedBy?: boolean;
  notFollowedBy?: boolean;
  lookBehind?: boolean;
  name?: string;
}

interface Set {
  type: types.SET;
  set: SetTokens;
  not: boolean;
}

interface Repetition {
  type: types.REPETITION;
  min: number;
  max: number;
  value: Token;
}

interface Position {
  type: types.POSITION;
  value: '$' | '^' | 'b' | 'B';
}

interface Reference {
  type: types.REFERENCE;
  value: number;
}

interface Char {
  type: types.CHAR;
  value: number;
}

interface Range {
  type: types.RANGE;
  from: number;
  to: number;
}

type Token = Group | Position | Set | Range | Repetition | Reference | Char;
type Tokens = Root | Token;
type SetTokens = (Range | Char | Set)[];
```
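Since every interface carries a distinct `type` member, the `Tokens` union behaves as a discriminated union, and a `switch` on `type` narrows each branch to the matching interface. The sketch below shows that pattern with an exhaustiveness check. The `*Token` interfaces are local copies with adjusted names (to avoid clashing with built-ins such as `Set` and `Range`), and `describe` is a hypothetical helper, not something the package exports.

```typescript
// Local copies of the documented shapes (assumption: trimmed to the fields used here).
enum types { ROOT, GROUP, POSITION, SET, RANGE, REPETITION, REFERENCE, CHAR }

interface RootToken { type: types.ROOT; stack?: AnyToken[]; options?: AnyToken[][] }
interface GroupToken { type: types.GROUP; remember: boolean; stack?: AnyToken[]; options?: AnyToken[][] }
interface PositionToken { type: types.POSITION; value: "$" | "^" | "b" | "B" }
interface SetToken { type: types.SET; set: (RangeToken | CharToken | SetToken)[]; not: boolean }
interface RangeToken { type: types.RANGE; from: number; to: number }
interface RepetitionToken { type: types.REPETITION; min: number; max: number; value: AnyToken }
interface ReferenceToken { type: types.REFERENCE; value: number }
interface CharToken { type: types.CHAR; value: number }

type AnyToken =
  | GroupToken | PositionToken | SetToken | RangeToken
  | RepetitionToken | ReferenceToken | CharToken;

// Hypothetical helper: a one-line description of any token.
function describe(token: RootToken | AnyToken): string {
  switch (token.type) {
    case types.ROOT:
      return "root";
    case types.GROUP:
      return token.remember ? "capturing group" : "non-capturing group";
    case types.POSITION:
      return `position ${token.value}`;
    case types.SET:
      return token.not ? "negated set" : "set";
    case types.RANGE:
      return `range ${String.fromCharCode(token.from)}-${String.fromCharCode(token.to)}`;
    case types.REPETITION:
      return `repeat ${token.min}..${token.max} of ${describe(token.value)}`;
    case types.REFERENCE:
      return `backreference ${token.value}`;
    case types.CHAR:
      return `char ${String.fromCharCode(token.value)}`;
    default: {
      // Compile-time guard: fails to type-check if a token kind is left unhandled.
      const unreachable: never = token;
      return unreachable;
    }
  }
}

const optionalA: RepetitionToken = {
  type: types.REPETITION, min: 0, max: 1,
  value: { type: types.CHAR, value: 97 },
};

console.log(describe(optionalA)); // repeat 0..1 of char a
```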