# Text Prediction

The Text Prediction system provides core prediction capabilities for single tokens, token sequences, and multiple completion alternatives. It uses n-gram analysis combined with high-dimensional embeddings to generate contextually appropriate text completions.

## Capabilities

### Token Prediction

Predicts the next single token based on input context using n-gram lookup and similarity analysis.
```javascript { .api }
/**
 * Predict the next single token based on input context
 * @param {string} token - Input token or phrase to predict from
 * @returns {TokenPredictionResult} Prediction with token and ranked alternatives
 */
getTokenPrediction(token);

/**
 * Token prediction result structure
 */
interface TokenPredictionResult {
  token: string;             // Best predicted next token
  rankedTokenList: string[]; // Alternative tokens ranked by likelihood
  error?: {                  // Error information if prediction fails
    message: string;
  };
}
```
**Usage Examples:**

```javascript
// Simple token prediction
const prediction1 = model.getTokenPrediction('hello');
// Returns: { token: 'world', rankedTokenList: ['world', 'there', 'everyone', ...] }

// Phrase-based prediction
const prediction2 = model.getTokenPrediction('the weather is');
// Returns: { token: 'beautiful', rankedTokenList: ['beautiful', 'nice', 'sunny', ...] }

// Error handling
const prediction3 = model.getTokenPrediction('xyzunknowntoken');
// May return: { error: { message: 'Failed to look up n-gram.' }, token: '', rankedTokenList: [] }
```
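To make the lookup concrete, here is a minimal, self-contained sketch of frequency-based bigram prediction. The tiny corpus, the `buildBigrams` helper, and the greedy frequency ranking are illustrative assumptions, not the library's actual internals (which also incorporate embedding similarity):

```javascript
// Build a bigram table: token -> Map of next-token counts.
function buildBigrams(corpus) {
  const counts = new Map();
  for (const sentence of corpus) {
    const tokens = sentence.split(/\s+/);
    for (let i = 0; i < tokens.length - 1; i++) {
      const key = tokens[i];
      const next = tokens[i + 1];
      if (!counts.has(key)) counts.set(key, new Map());
      const nextCounts = counts.get(key);
      nextCounts.set(next, (nextCounts.get(next) || 0) + 1);
    }
  }
  return counts;
}

// Predict the next token from the last token of the input phrase,
// mirroring the TokenPredictionResult shape documented above.
function getTokenPrediction(bigrams, input) {
  const tokens = input.trim().split(/\s+/);
  const key = tokens[tokens.length - 1];
  const nextCounts = bigrams.get(key);
  if (!nextCounts) {
    return { error: { message: 'Failed to look up n-gram.' }, token: '', rankedTokenList: [] };
  }
  // Rank candidate next tokens by observed frequency, highest first.
  const ranked = [...nextCounts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([token]) => token);
  return { token: ranked[0], rankedTokenList: ranked };
}
```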
### Token Sequence Prediction

Predicts a sequence of multiple tokens by iteratively applying token prediction to build longer completions.
```javascript { .api }
/**
 * Predict a sequence of tokens extending the input
 * @param {string} input - Input text to extend
 * @param {number} [sequenceLength=2] - Number of tokens to predict in sequence
 * @returns {SequencePredictionResult} Sequence completion with metadata
 */
getTokenSequencePrediction(input, sequenceLength);

/**
 * Sequence prediction result structure
 */
interface SequencePredictionResult {
  completion: string;        // Complete predicted sequence
  sequenceLength: number;    // Number of tokens in sequence
  token: string;             // First predicted token
  rankedTokenList: string[]; // Alternative first tokens ranked by likelihood
}
```
**Usage Examples:**

```javascript
// Short sequence prediction
const sequence1 = model.getTokenSequencePrediction('JavaScript is', 3);
// Returns: {
//   completion: 'a programming language',
//   sequenceLength: 3,
//   token: 'a',
//   rankedTokenList: ['a', 'an', 'the', ...]
// }

// Single-token sequence (equivalent to getTokenPrediction, but with a richer result format)
const sequence2 = model.getTokenSequencePrediction('hello', 1);
// Returns: {
//   completion: 'world',
//   sequenceLength: 1,
//   token: 'world',
//   rankedTokenList: ['world', 'there', ...]
// }

// Longer sequence
const sequence3 = model.getTokenSequencePrediction('The quick brown', 5);
// Returns: {
//   completion: 'fox jumps over the lazy',
//   sequenceLength: 5,
//   token: 'fox',
//   rankedTokenList: ['fox', 'dog', 'cat', ...]
// }
```
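The iterative mechanism described above can be sketched as follows: feed the growing text back into single-token prediction until the requested length is reached. `predictNext` is a stand-in for `getTokenPrediction`, backed here by a hard-coded bigram table for illustration:

```javascript
// Illustrative lookup table: last token -> ranked next-token candidates.
const table = new Map([
  ['quick', ['brown']],
  ['brown', ['fox']],
  ['fox', ['jumps']],
  ['jumps', ['over']],
]);

// Stand-in for single-token prediction (see getTokenPrediction above).
function predictNext(input) {
  const tokens = input.trim().split(/\s+/);
  const ranked = table.get(tokens[tokens.length - 1]) || [];
  return { token: ranked[0] || '', rankedTokenList: ranked };
}

function getTokenSequencePrediction(input, sequenceLength = 2) {
  let text = input;
  const predicted = [];
  let first = null;
  for (let i = 0; i < sequenceLength; i++) {
    const result = predictNext(text);
    if (!result.token) break;         // stop at a natural end of sequence
    if (first === null) first = result;
    predicted.push(result.token);
    text += ' ' + result.token;       // feed the extended text back in
  }
  return {
    completion: predicted.join(' '),
    sequenceLength: predicted.length,
    token: first ? first.token : '',
    rankedTokenList: first ? first.rankedTokenList : [],
  };
}
```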
### Multiple Completions

Generates multiple alternative completions for an input, taking a top-k approach: several ranked candidates are returned rather than a single best guess.
```javascript { .api }
/**
 * Generate multiple completion alternatives with ranking
 * @param {string} input - Input text to complete
 * @returns {CompletionsResult} Multiple completions with ranking information
 */
getCompletions(input);

/**
 * Multiple completions result structure
 */
interface CompletionsResult {
  completion: string;        // Primary/best completion
  token: string;             // First token of primary completion
  rankedTokenList: string[]; // Alternative first tokens ranked by likelihood
  completions: string[];     // Array of alternative full completions
}
```
**Usage Examples:**

```javascript
// Get multiple completion options
const completions = model.getCompletions('The sun');
// Returns: {
//   completion: 'is shining brightly today',
//   token: 'is',
//   rankedTokenList: ['is', 'was', 'will', 'has', ...],
//   completions: [
//     'is shining brightly today',
//     'was setting behind the mountains',
//     'will rise tomorrow morning',
//     'has been hidden by clouds',
//     // ... more alternatives
//   ]
// }

// Use for autocomplete suggestions
const suggestions = model.getCompletions('I need to');
console.log('Completion options:');
suggestions.completions.forEach((completion, index) => {
  console.log(`${index + 1}. I need to ${completion}`);
});
```
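One plausible way to realize a top-k approach is to take the k best-ranked first tokens and extend each one into a full completion. The bigram table and the `greedyExtend` helper below are illustrative assumptions, not the actual internals of `getCompletions`:

```javascript
// Illustrative lookup table: token -> ranked next-token candidates.
const nextTokens = new Map([
  ['sun', ['is', 'was']],
  ['is', ['shining']],
  ['was', ['setting']],
]);

// Greedily extend a starting token by following the best candidate.
function greedyExtend(token, steps) {
  const out = [token];
  for (let i = 0; i < steps; i++) {
    const candidates = nextTokens.get(out[out.length - 1]);
    if (!candidates) break;
    out.push(candidates[0]);
  }
  return out.join(' ');
}

function getCompletions(input, k = 2, steps = 1) {
  const last = input.trim().split(/\s+/).pop();
  const ranked = nextTokens.get(last) || [];
  // One full completion per top-k first token.
  const completions = ranked.slice(0, k).map((t) => greedyExtend(t, steps));
  return {
    completion: completions[0] || '',
    token: ranked[0] || '',
    rankedTokenList: ranked,
    completions,
  };
}
```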
### Prediction Configuration

The prediction system respects several environment variables for customization:
```javascript { .api }
/**
 * Environment configuration affecting prediction behavior
 */
interface PredictionConfig {
  RANKING_BATCH_SIZE: number;  // Number of alternatives in rankedTokenList (default: 50)
  MAX_RESPONSE_LENGTH: number; // Maximum sequence length for predictions (default: 240)
  VARIANCE: number;            // Prediction randomization level (default: 0)
}
```
**Configuration Examples:**

```bash
# Increase number of alternatives returned
export RANKING_BATCH_SIZE=100

# Allow longer sequence predictions
export MAX_RESPONSE_LENGTH=500

# Add some randomization to predictions (experimental)
export VARIANCE=1
```
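A sketch of how such environment variables might be read with fallbacks to the documented defaults. `loadPredictionConfig` is a hypothetical helper, not part of the actual API:

```javascript
// Read prediction settings from the environment, falling back to the
// defaults documented above when a variable is unset or non-numeric.
function loadPredictionConfig(env = process.env) {
  const num = (value, fallback) => {
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    RANKING_BATCH_SIZE: num(env.RANKING_BATCH_SIZE, 50),
    MAX_RESPONSE_LENGTH: num(env.MAX_RESPONSE_LENGTH, 240),
    VARIANCE: num(env.VARIANCE, 0),
  };
}
```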
## Internal Prediction Methods

These methods are available on the transformer instance but are typically used internally:
```javascript { .api }
/**
 * Look up n-gram by token sequence
 * @param {string} input - Space-separated token sequence
 * @returns {Object} N-gram lookup result
 */
ngramSearch(input);

/**
 * Look up embedding vector for token pair
 * @param {string} prevToken - Previous token context
 * @param {string} token - Current token
 * @returns {number[]} Embedding vector or null vector
 */
embeddingSearch(prevToken, token);

/**
 * Calculate vector similarity between tokens
 * @param {string} prevToken - Previous token context
 * @param {string} token - Reference token
 * @returns {Object} Similar token with ranking data
 */
getSimilarToken(prevToken, token);
```
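A similarity method like `getSimilarToken` presumably compares embedding vectors; cosine similarity is the standard measure for this. The helper below is an illustrative assumption about how such a comparison could work, including the null-vector case mentioned for `embeddingSearch`:

```javascript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; a null (all-zero) vector yields 0.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // null vector: no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```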
## Error Handling

The prediction system handles several error conditions gracefully:

- **Missing N-grams**: When input tokens don't exist in the training data, returns empty predictions with an error message
- **Unknown Tokens**: Skips unrecognized tokens during processing
- **End of Sequence**: Gracefully handles completion at natural stopping points
- **Invalid Input**: Returns empty results for null or undefined inputs
**Error Response Format:**

```javascript
{
  error: { message: "Failed to look up n-gram." },
  token: "",
  rankedTokenList: []
}
```
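Because failures are reported through the `error` field rather than thrown, callers should check for it before using a result. The `safePredict` wrapper below is a hypothetical usage pattern, not part of the API:

```javascript
// Wrap getTokenPrediction: return the predicted token on success,
// or a caller-supplied fallback when the result carries an error.
function safePredict(model, input, fallbackToken = '') {
  const result = model.getTokenPrediction(input);
  if (result.error) {
    console.warn(`Prediction failed: ${result.error.message}`);
    return fallbackToken;
  }
  return result.token;
}
```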