
# Text Prediction

The Text Prediction system provides core prediction capabilities for single tokens, token sequences, and multiple completion alternatives. It uses n-gram analysis combined with high-dimensional embeddings to generate contextually appropriate text completions.

## Capabilities

### Token Prediction

Predicts the next single token based on input context using n-gram lookup and similarity analysis.

```javascript { .api }
/**
 * Predict the next single token based on input context
 * @param {string} token - Input token or phrase to predict from
 * @returns {TokenPredictionResult} Prediction with token and ranked alternatives
 */
getTokenPrediction(token);

/**
 * Token prediction result structure
 */
interface TokenPredictionResult {
  token: string;             // Best predicted next token
  rankedTokenList: string[]; // Alternative tokens ranked by likelihood
  error?: {                  // Error information if prediction fails
    message: string;
  };
}
```

**Usage Examples:**

```javascript
// Simple token prediction
const prediction1 = model.getTokenPrediction('hello');
// Returns: { token: 'world', rankedTokenList: ['world', 'there', 'everyone', ...] }

// Phrase-based prediction
const prediction2 = model.getTokenPrediction('the weather is');
// Returns: { token: 'beautiful', rankedTokenList: ['beautiful', 'nice', 'sunny', ...] }

// Error handling
const prediction3 = model.getTokenPrediction('xyzunknowntoken');
// May return: { error: { message: 'Failed to look up n-gram.' }, token: '', rankedTokenList: [] }
```

### Token Sequence Prediction

Predicts a sequence of multiple tokens by iteratively applying token prediction to build longer completions.

```javascript { .api }
/**
 * Predict a sequence of tokens extending the input
 * @param {string} input - Input text to extend
 * @param {number} [sequenceLength=2] - Number of tokens to predict in sequence
 * @returns {SequencePredictionResult} Sequence completion with metadata
 */
getTokenSequencePrediction(input, sequenceLength);

/**
 * Sequence prediction result structure
 */
interface SequencePredictionResult {
  completion: string;        // Complete predicted sequence
  sequenceLength: number;    // Number of tokens in sequence
  token: string;             // First predicted token
  rankedTokenList: string[]; // Alternative first tokens ranked by likelihood
}
```

**Usage Examples:**

```javascript
// Short sequence prediction
const sequence1 = model.getTokenSequencePrediction('JavaScript is', 3);
// Returns: {
//   completion: 'a programming language',
//   sequenceLength: 3,
//   token: 'a',
//   rankedTokenList: ['a', 'an', 'the', ...]
// }

// Single-token sequence (equivalent to getTokenPrediction, but in the sequence result format)
const sequence2 = model.getTokenSequencePrediction('hello', 1);
// Returns: {
//   completion: 'world',
//   sequenceLength: 1,
//   token: 'world',
//   rankedTokenList: ['world', 'there', ...]
// }

// Longer sequence
const sequence3 = model.getTokenSequencePrediction('The quick brown', 5);
// Returns: {
//   completion: 'fox jumps over the lazy',
//   sequenceLength: 5,
//   token: 'fox',
//   rankedTokenList: ['fox', 'dog', 'cat', ...]
// }
```
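The iterative build-up described above can be sketched in a few lines. Everything here is illustrative: the `bigrams` table, `getTokenPrediction` stand-in, and `predictSequence` are toy stand-ins for the real implementation, included only to show the feed-the-output-back-in loop.

```javascript
// Toy bigram table standing in for the real n-gram store (illustrative data only).
const bigrams = {
  hello: ['world', 'there'],
  world: ['peace'],
};

// Minimal stand-in for getTokenPrediction: look up continuations for the
// last token of the input, ranked by likelihood.
function getTokenPrediction(input) {
  const last = input.trim().split(/\s+/).pop();
  const candidates = bigrams[last];
  if (!candidates) {
    return { error: { message: 'Failed to look up n-gram.' }, token: '', rankedTokenList: [] };
  }
  return { token: candidates[0], rankedTokenList: candidates };
}

// Iterative sequence prediction: repeatedly append the best next token and
// feed the growing text back in, stopping early at a failed lookup.
function predictSequence(input, sequenceLength = 2) {
  let text = input;
  const predicted = [];
  let firstRanked = [];
  for (let i = 0; i < sequenceLength; i++) {
    const result = getTokenPrediction(text);
    if (result.error) break; // natural end of sequence
    if (i === 0) firstRanked = result.rankedTokenList;
    predicted.push(result.token);
    text += ' ' + result.token;
  }
  return {
    completion: predicted.join(' '),
    sequenceLength: predicted.length,
    token: predicted[0] || '',
    rankedTokenList: firstRanked,
  };
}

console.log(predictSequence('hello', 2)); // completion: 'world peace'
```

Note that `sequenceLength` in the result reflects how many tokens were actually predicted, which can be fewer than requested when the chain dead-ends.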

### Multiple Completions

Generates multiple alternative completions for comprehensive text prediction scenarios, providing a top-k sampling approach.

```javascript { .api }
/**
 * Generate multiple completion alternatives with ranking
 * @param {string} input - Input text to complete
 * @returns {CompletionsResult} Multiple completions with ranking information
 */
getCompletions(input);

/**
 * Multiple completions result structure
 */
interface CompletionsResult {
  completion: string;        // Primary/best completion
  token: string;             // First token of primary completion
  rankedTokenList: string[]; // Alternative first tokens ranked by likelihood
  completions: string[];     // Array of alternative full completions
}
```

**Usage Examples:**

```javascript
// Get multiple completion options
const completions = model.getCompletions('The sun');
// Returns: {
//   completion: 'is shining brightly today',
//   token: 'is',
//   rankedTokenList: ['is', 'was', 'will', 'has', ...],
//   completions: [
//     'is shining brightly today',
//     'was setting behind the mountains',
//     'will rise tomorrow morning',
//     'has been hidden by clouds',
//     // ... more alternatives
//   ]
// }

// Use for autocomplete suggestions
const suggestions = model.getCompletions('I need to');
console.log('Completion options:');
suggestions.completions.forEach((completion, index) => {
  console.log(`${index + 1}. I need to ${completion}`);
});
```
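One way to picture the top-k approach described above: each of the top-ranked first tokens is extended into its own full completion. The sketch below shows that idea only; the `next` table, `extend`, and `getCompletionsSketch` are hypothetical helpers, not part of the library's API.

```javascript
// Toy table of next-token candidates (illustrative data only).
const next = {
  sun: ['is', 'was'],
  is: ['shining'],
  was: ['setting'],
  shining: [],
  setting: [],
};

// Extend one starting token greedily until no continuation exists.
function extend(startToken) {
  const tokens = [startToken];
  let current = startToken;
  while ((next[current] || []).length > 0) {
    current = next[current][0];
    tokens.push(current);
  }
  return tokens.join(' ');
}

// Top-k expansion: one full completion per ranked first token.
function getCompletionsSketch(lastToken, k = 2) {
  const ranked = (next[lastToken] || []).slice(0, k);
  const completions = ranked.map(extend);
  return {
    completion: completions[0] || '',
    token: ranked[0] || '',
    rankedTokenList: ranked,
    completions,
  };
}

console.log(getCompletionsSketch('sun'));
// completions: ['is shining', 'was setting']
```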

### Prediction Configuration

The prediction system respects several environment variables for customization:

```javascript { .api }
/**
 * Environment configuration affecting prediction behavior
 */
interface PredictionConfig {
  RANKING_BATCH_SIZE: number;  // Number of alternatives in rankedTokenList (default: 50)
  MAX_RESPONSE_LENGTH: number; // Maximum sequence length for predictions (default: 240)
  VARIANCE: number;            // Prediction randomization level (default: 0)
}
```

**Configuration Examples:**

```bash
# Increase the number of alternatives returned
export RANKING_BATCH_SIZE=100

# Allow longer sequence predictions
export MAX_RESPONSE_LENGTH=500

# Add some randomization to predictions (experimental)
export VARIANCE=1
```
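One way such environment variables might be read at startup, falling back to the documented defaults when a variable is unset or not numeric. This is a sketch under that assumption; `readPredictionConfig` is a hypothetical helper, not the library's actual loading code.

```javascript
// Sketch: read prediction settings from the environment, using the
// documented defaults (50 / 240 / 0) when a variable is missing or invalid.
function readPredictionConfig(env = process.env) {
  const num = (value, fallback) => {
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    RANKING_BATCH_SIZE: num(env.RANKING_BATCH_SIZE, 50),
    MAX_RESPONSE_LENGTH: num(env.MAX_RESPONSE_LENGTH, 240),
    VARIANCE: num(env.VARIANCE, 0),
  };
}

console.log(readPredictionConfig({ RANKING_BATCH_SIZE: '100' }));
// { RANKING_BATCH_SIZE: 100, MAX_RESPONSE_LENGTH: 240, VARIANCE: 0 }
```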

## Internal Prediction Methods

These methods are available on the transformer instance but are typically used internally:

```javascript { .api }
/**
 * Look up an n-gram by token sequence
 * @param {string} input - Space-separated token sequence
 * @returns {Object} N-gram lookup result
 */
ngramSearch(input);

/**
 * Look up the embedding vector for a token pair
 * @param {string} prevToken - Previous token context
 * @param {string} token - Current token
 * @returns {number[]} Embedding vector, or a null vector if no embedding exists
 */
embeddingSearch(prevToken, token);

/**
 * Calculate vector similarity between tokens
 * @param {string} prevToken - Previous token context
 * @param {string} token - Reference token
 * @returns {Object} Similar token with ranking data
 */
getSimilarToken(prevToken, token);
```
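To illustrate how these pieces could fit together, here is one plausible composition: look up embeddings for candidate tokens in the same context and rank them by cosine similarity against a reference token. The `embeddings` data and the ranking logic are assumptions made for the sketch, not the library's actual internals.

```javascript
// Toy embedding table keyed by "prevToken token" pairs (illustrative data only).
const embeddings = {
  'the cat': [1, 0, 0],
  'the dog': [0.9, 0.1, 0],
  'the sky': [0, 0, 1],
};

// Stand-in for embeddingSearch: stored vector, or a null vector when absent.
function embeddingSearch(prevToken, token) {
  return embeddings[`${prevToken} ${token}`] || [0, 0, 0];
}

// Cosine similarity between two vectors (0 when either has zero magnitude).
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const mag = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const denom = mag(a) * mag(b);
  return denom ? dot / denom : 0;
}

// Sketch of getSimilarToken: rank other tokens seen in the same prevToken
// context by similarity to the reference token's embedding.
function getSimilarToken(prevToken, token) {
  const ref = embeddingSearch(prevToken, token);
  const ranked = Object.keys(embeddings)
    .filter((key) => key.startsWith(`${prevToken} `) && !key.endsWith(` ${token}`))
    .map((key) => key.split(' ')[1])
    .map((cand) => ({ token: cand, similarity: cosine(ref, embeddingSearch(prevToken, cand)) }))
    .sort((a, b) => b.similarity - a.similarity);
  return ranked[0] || null;
}

console.log(getSimilarToken('the', 'cat')); // most similar candidate: 'dog'
```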

## Error Handling

The prediction system handles several error conditions gracefully:

- **Missing n-grams**: When input tokens don't exist in the training data, returns empty predictions with an error message
- **Unknown tokens**: Skips unrecognized tokens during processing
- **End of sequence**: Gracefully handles completion at natural stopping points
- **Invalid input**: Returns empty results for null or undefined inputs

**Error Response Format:**

```javascript
{
  error: { message: "Failed to look up n-gram." },
  token: "",
  rankedTokenList: []
}
```
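Because a failed lookup returns an error object rather than throwing, callers can branch on the `error` field. A small defensive wrapper, sketched below, keeps that check in one place; `safePredict`, the `fallback` parameter, and the stub model are all illustrative, not part of the API.

```javascript
// Sketch: wrap a prediction call so callers get a fallback token instead of
// inspecting the error field at every call site.
function safePredict(model, input, fallback = '') {
  if (input == null || input === '') {
    return { token: fallback, rankedTokenList: [] }; // invalid input
  }
  const result = model.getTokenPrediction(input);
  if (result.error) {
    return { token: fallback, rankedTokenList: [] }; // e.g. missing n-gram
  }
  return result;
}

// Stub model used here only so the sketch is runnable on its own.
const stubModel = {
  getTokenPrediction(input) {
    return input === 'hello'
      ? { token: 'world', rankedTokenList: ['world', 'there'] }
      : { error: { message: 'Failed to look up n-gram.' }, token: '', rankedTokenList: [] };
  },
};

console.log(safePredict(stubModel, 'hello').token);           // 'world'
console.log(safePredict(stubModel, 'xyzunknown', '?').token); // '?'
```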