# String and Character Parsing

Built-in parsers for common string and character matching patterns. These provide convenient shortcuts for frequently used parsing operations.

## Capabilities

### Pre-built Character Parsers

Ready-to-use parsers for common character types and patterns.

```javascript { .api }
// Single character parsers
Parsimmon.any; // Matches any single character
Parsimmon.digit; // Matches single digit [0-9]
Parsimmon.letter; // Matches single letter [a-zA-Z]

// Multiple character parsers
Parsimmon.digits; // Matches zero or more digits
Parsimmon.letters; // Matches zero or more letters
Parsimmon.whitespace; // Matches one or more whitespace characters
Parsimmon.optWhitespace; // Matches zero or more whitespace characters
```

**Usage Examples:**

```javascript
// Parse single characters
Parsimmon.digit.parse("5"); // { status: true, value: "5" }
Parsimmon.letter.parse("a"); // { status: true, value: "a" }
Parsimmon.any.parse("@"); // { status: true, value: "@" }

// Parse multiple characters
Parsimmon.digits.parse("123"); // { status: true, value: "123" }
Parsimmon.letters.parse("hello"); // { status: true, value: "hello" }
Parsimmon.whitespace.parse(" \t"); // { status: true, value: " \t" }

// Combining with other parsers
const word = Parsimmon.letters.skip(Parsimmon.optWhitespace);
const number = Parsimmon.digits.map(Number);
```

### Line Ending Parsers

Parsers for different line ending formats across platforms.

```javascript { .api }
Parsimmon.cr; // Matches carriage return (\r)
Parsimmon.lf; // Matches line feed (\n)
Parsimmon.crlf; // Matches Windows line ending (\r\n)
Parsimmon.newline; // Matches any line ending (CR, LF, or CRLF)
Parsimmon.end; // Matches newline or end of input
```

**Usage Examples:**

```javascript
// Parse different line endings
Parsimmon.newline.parse("\n"); // { status: true, value: "\n" }
Parsimmon.newline.parse("\r\n"); // { status: true, value: "\r\n" }
Parsimmon.crlf.parse("\r\n"); // { status: true, value: "\r\n" }

// Use in line-based parsing (every line must end with a newline)
const line = Parsimmon.regexp(/[^\r\n]*/).skip(Parsimmon.newline);
const lines = line.many();
lines.parse("line1\nline2\nline3\n"); // { status: true, value: ["line1", "line2", "line3"] }
```

### Special Position Parsers

Parsers for special positions and conditions in the input.

```javascript { .api }
Parsimmon.eof; // Matches end of input
Parsimmon.all; // Consumes all remaining input
Parsimmon.index; // Yields the current position as { offset, line, column }
```

**Usage Examples:**

```javascript
// Ensure complete parsing
const completeNumber = Parsimmon.digits.skip(Parsimmon.eof);
completeNumber.parse("123"); // Success
completeNumber.parse("123abc"); // Fails - not at end

// Get all remaining input
const remaining = Parsimmon.string("start:").then(Parsimmon.all);
remaining.parse("start:everything else"); // { status: true, value: "everything else" }

// Track position
const withPosition = Parsimmon.seq(
  Parsimmon.index,
  Parsimmon.string("hello"),
  Parsimmon.index
).map(([start, value, end]) => ({ value, start, end }));
```

### Character Set Parsers

Parsers for matching characters from specific sets or ranges.

```javascript { .api }
/**
 * Matches any character from the given string
 * @param {string} str - String containing valid characters
 * @returns {Parser} Parser that matches any character in str
 */
Parsimmon.oneOf(str);

/**
 * Matches any character NOT in the given string
 * @param {string} str - String containing invalid characters
 * @returns {Parser} Parser that matches any character not in str
 */
Parsimmon.noneOf(str);

/**
 * Matches any character in the given range (inclusive)
 * @param {string} begin - Start character of range
 * @param {string} end - End character of range
 * @returns {Parser} Parser that matches characters in range
 */
Parsimmon.range(begin, end);
```

**Usage Examples:**

```javascript
// Parse vowels
const vowel = Parsimmon.oneOf("aeiouAEIOU");
vowel.parse("a"); // { status: true, value: "a" }
vowel.parse("x"); // { status: false, ... }

// Parse any character that is not a vowel, digit, space, tab, or newline
// (a rough "consonant" check: punctuation such as "!" also matches)
const consonant = Parsimmon.noneOf("aeiouAEIOU0123456789 \t\n");
consonant.parse("b"); // { status: true, value: "b" }

// Parse letter ranges
const lowercase = Parsimmon.range("a", "z");
const uppercase = Parsimmon.range("A", "Z");
const hexDigit = Parsimmon.oneOf("0123456789abcdefABCDEF");

// Combine ranges
const alphanumeric = Parsimmon.alt(
  Parsimmon.range("a", "z"),
  Parsimmon.range("A", "Z"),
  Parsimmon.range("0", "9")
);
```

### Predicate-based Parsers

Parsers that use custom functions to test characters.

```javascript { .api }
/**
 * Matches a single character/byte that satisfies the predicate
 * @param {Function} predicate - Function that tests a character
 * @returns {Parser} Parser that matches when predicate returns true
 */
Parsimmon.test(predicate);

/**
 * Consumes characters while predicate is true
 * @param {Function} predicate - Function that tests each character
 * @returns {Parser} Parser that consumes matching characters
 */
Parsimmon.takeWhile(predicate);
```

**Usage Examples:**

```javascript
// Parse uppercase letters
const isUppercase = (ch) => ch >= "A" && ch <= "Z";
const uppercase = Parsimmon.test(isUppercase);
uppercase.parse("H"); // { status: true, value: "H" }

// Parse identifiers
const isAlphaNum = (ch) => /[a-zA-Z0-9_]/.test(ch);
// seqMap passes both results to the callback; .then() would keep only the second
const identifier = Parsimmon.seqMap(
  Parsimmon.test((ch) => /[a-zA-Z_]/.test(ch)),
  Parsimmon.takeWhile(isAlphaNum),
  (first, rest) => first + rest
);

// Parse numbers with custom logic
const isDigit = (ch) => ch >= "0" && ch <= "9";
const naturalNumber = Parsimmon.takeWhile(isDigit)
  .assert((str) => str.length > 0, "at least one digit")
  .map(Number);

// Parse until delimiter
const isNotComma = (ch) => ch !== ",";
const csvField = Parsimmon.takeWhile(isNotComma);
```

### Lookahead Parsers

Parsers that check ahead without consuming input.

```javascript { .api }
/**
 * Succeeds if parser would succeed, but consumes no input
 * @param {Parser|string|RegExp} x - Parser, string, or regex to look ahead for
 * @returns {Parser} Parser that looks ahead without consuming
 */
Parsimmon.lookahead(x);

/**
 * Succeeds if parser would fail, consumes no input
 * @param {Parser} parser - Parser that should not match
 * @returns {Parser} Parser that succeeds when parser fails
 */
Parsimmon.notFollowedBy(parser);
```

**Usage Examples:**

```javascript
// Parse numbers not followed by letters
const numberNotInWord = Parsimmon.digits
  .skip(Parsimmon.notFollowedBy(Parsimmon.letter));

numberNotInWord.parse("123"); // Success
numberNotInWord.parse("123abc"); // Fails

// Look ahead for keywords
const keywordIf = Parsimmon.string("if")
  .skip(Parsimmon.lookahead(Parsimmon.regexp(/\s/)));

// lookahead consumes nothing and .parse() requires all input to be consumed,
// so consume the remainder explicitly when testing the parser on its own
keywordIf.skip(Parsimmon.all).parse("if x"); // { status: true, value: "if" }
keywordIf.skip(Parsimmon.all).parse("ifelse"); // Fails

// Conditional parsing based on lookahead
const stringOrNumber = Parsimmon.lookahead(Parsimmon.digit)
  .then(Parsimmon.digits.map(Number))
  .or(Parsimmon.letters);
```