# Text Classification

Machine learning classifiers for assigning text to predefined classes. Natural provides Naive Bayes, logistic regression, and maximum entropy classifiers with support for training, persistence, and scoring.

## Capabilities

### Bayes Classifier

Naive Bayes classifier for text: a probabilistic model that scores each class from the words in a document, with Laplace smoothing for unseen words.

```javascript { .api }
/**
 * Naive Bayes text classifier
 * @param stemmer - Optional stemmer for text preprocessing
 * @param smoothing - Laplace smoothing parameter (default: 1)
 */
class BayesClassifier {
  constructor(stemmer?: object, smoothing?: number);

  /** Add a training document with its classification */
  addDocument(text: string, classification: string): void;

  /** Remove a training document */
  removeDocument(text: string, classification: string): void;

  /** Train the classifier on added documents */
  train(): void;

  /** Retrain from scratch */
  retrain(): void;

  /** Classify a text observation */
  classify(observation: string): string;

  /** Get all classification scores for an observation */
  getClassifications(observation: string): ClassificationResult[];

  /** Save classifier to file */
  save(filename: string, callback: (err?: Error) => void): void;

  /** Set classifier options */
  setOptions(options: object): void;
}

interface ClassificationResult {
  label: string;
  value: number;
}
```
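
To make the role of the `smoothing` parameter concrete, here is a minimal, self-contained sketch of multinomial Naive Bayes with additive (Laplace) smoothing. It does not use Natural and should not be read as Natural's implementation: `tokenize`, `trainNaiveBayes`, and `scoreClasses` are hypothetical names, and the whitespace-and-punctuation tokenizer is an assumption (no stemming or stopword removal). Scores are kept in log space, so a word that a class has never seen lowers that class's score without driving it to negative infinity, as it would with `smoothing` set to 0.

```javascript
// Illustrative only: a tiny multinomial Naive Bayes, not Natural's internals.
// All names here (tokenize, trainNaiveBayes, scoreClasses) are hypothetical.

function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

function trainNaiveBayes(docs, smoothing = 1) {
  const model = { smoothing, vocab: new Set(), classes: {}, totalDocs: docs.length };
  for (const { text, label } of docs) {
    const cls = model.classes[label] || (model.classes[label] = { docs: 0, tokens: 0, counts: {} });
    cls.docs += 1;
    for (const word of tokenize(text)) {
      model.vocab.add(word);
      cls.counts[word] = (cls.counts[word] || 0) + 1;
      cls.tokens += 1;
    }
  }
  return model;
}

// Returns [{ label, value }] sorted by descending log-score.
function scoreClasses(model, text) {
  const vocabSize = model.vocab.size;
  const results = Object.entries(model.classes).map(([label, cls]) => {
    // Start from the log prior, log P(class).
    let logProb = Math.log(cls.docs / model.totalDocs);
    for (const word of tokenize(text)) {
      const count = cls.counts[word] || 0;
      // Smoothed likelihood: unseen words get a small non-zero probability.
      logProb += Math.log(
        (count + model.smoothing) / (cls.tokens + model.smoothing * vocabSize)
      );
    }
    return { label, value: logProb };
  });
  return results.sort((a, b) => b.value - a.value);
}

const trainingDocs = [
  { text: 'I love this sandwich', label: 'positive' },
  { text: 'This is an amazing book', label: 'positive' },
  { text: 'The book is terrible', label: 'negative' },
  { text: 'I hate this movie', label: 'negative' },
];

const model = trainNaiveBayes(trainingDocs);
const ranked = scoreClasses(model, 'I love this book');
console.log(ranked[0].label); // 'positive'
```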

**Static Methods:**

```javascript { .api }
/** Restore classifier from serialized data */
static BayesClassifier.restore(classifier: object, stemmer?: object): BayesClassifier;

/** Load classifier from file (pass null as stemmer to use the default) */
static BayesClassifier.load(filename: string, stemmer: object | null, callback: (err?: Error, classifier?: BayesClassifier) => void): void;

/** Load from storage backend (pass null as stemmer to use the default) */
static BayesClassifier.loadFrom(key: string, stemmer: object | null, storageBackend: object): Promise<BayesClassifier>;
```

**Usage Examples:**

```javascript
const natural = require('natural');

// Create and train classifier
const classifier = new natural.BayesClassifier();

// Add training data
classifier.addDocument('I love this sandwich', 'positive');
classifier.addDocument('This is an amazing book', 'positive');
classifier.addDocument('The book is terrible', 'negative');
classifier.addDocument('I hate this movie', 'negative');

// Train the model
classifier.train();

// Classify new text
console.log(classifier.classify('I love this book')); // 'positive'
console.log(classifier.classify('This is terrible')); // 'negative'

// Get detailed scores: one { label, value } entry per class
const scores = classifier.getClassifications('This book is amazing');
console.log(scores);
// The values are relative scores and are not guaranteed to sum to 1.

// Save the classifier, then load it back once the write has finished
classifier.save('my-classifier.json', (err) => {
  if (err) {
    console.error(err);
    return;
  }
  console.log('Classifier saved');

  // Passing null as the stemmer uses the default
  natural.BayesClassifier.load('my-classifier.json', null, (loadErr, restored) => {
    if (loadErr) {
      console.error(loadErr);
      return;
    }
    console.log(restored.classify('This is great!'));
  });
});
```
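
The `value` fields returned by `getClassifications` are best treated as relative scores, so it can help to normalize them before comparing results or applying a confidence cutoff. The following is a minimal sketch that assumes only the `{ label, value }` shape documented above and non-negative values. `normalizeScores` and `classifyWithThreshold` are hypothetical helpers, not part of Natural, and the sample scores are made up for illustration.

```javascript
// Hypothetical helpers operating on the { label, value } result shape.

// Rescale non-negative scores so they sum to 1.
function normalizeScores(results) {
  const total = results.reduce((sum, r) => sum + r.value, 0);
  if (total === 0) {
    // Avoid division by zero: fall back to a uniform distribution.
    return results.map((r) => ({ label: r.label, value: 1 / results.length }));
  }
  return results.map((r) => ({ label: r.label, value: r.value / total }));
}

// Return the top label only when its normalized share reaches minShare.
function classifyWithThreshold(results, minShare, fallback = 'unknown') {
  const normalized = normalizeScores(results).sort((a, b) => b.value - a.value);
  if (normalized.length === 0) return fallback;
  return normalized[0].value >= minShare ? normalized[0].label : fallback;
}

// Made-up scores standing in for classifier.getClassifications(...) output.
const sampleScores = [
  { label: 'positive', value: 0.3 },
  { label: 'negative', value: 0.1 },
];

console.log(normalizeScores(sampleScores));
console.log(classifyWithThreshold(sampleScores, 0.7)); // 'positive' (share is about 0.75)
console.log(classifyWithThreshold(sampleScores, 0.9)); // 'unknown'
```

Returning a fallback label instead of the best guess is a design choice: it lets calling code route low-confidence inputs to a default path rather than acting on a weak prediction.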

### Logistic Regression Classifier

Logistic regression implementation for text classification using maximum likelihood estimation.

```javascript { .api }
/**
 * Logistic regression text classifier
 * @param stemmer - Optional stemmer for text preprocessing
 */
class LogisticRegressionClassifier {
  constructor(stemmer?: object);

  /** Add a training document with its classification */
  addDocument(text: string, classification: string): void;

  /** Remove a training document */
  removeDocument(text: string, classification: string): void;

  /** Train the classifier on added documents */
  train(): void;

  /** Classify a text observation */
  classify(observation: string): string;

  /** Get all classification scores for an observation */
  getClassifications(observation: string): ClassificationResult[];

  /** Save classifier to file */
  save(filename: string, callback: (err?: Error) => void): void;
}
```

**Static Methods:**

```javascript { .api }
/** Restore classifier from serialized data */
static LogisticRegressionClassifier.restore(classifier: object, stemmer?: object): LogisticRegressionClassifier;

/** Load classifier from file (pass null as stemmer to use the default) */
static LogisticRegressionClassifier.load(filename: string, stemmer: object | null, callback: (err?: Error, classifier?: LogisticRegressionClassifier) => void): void;
```

**Usage Examples:**

```javascript
const natural = require('natural');

// Create logistic regression classifier
const classifier = new natural.LogisticRegressionClassifier();

// Add training documents
classifier.addDocument('The movie was fantastic', 'positive');
classifier.addDocument('I really enjoyed it', 'positive');
classifier.addDocument('It was boring and slow', 'negative');
classifier.addDocument('Waste of time', 'negative');

// Train
classifier.train();

// Classify
console.log(classifier.classify('Great movie!')); // 'positive'
```
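
For readers who want to see what "maximum likelihood estimation" means here, the sketch below fits a binary logistic regression on bag-of-words features by batch gradient ascent on the log-likelihood. It is a plain-JavaScript illustration under stated assumptions, not Natural's implementation: every name (`buildVocabulary`, `vectorize`, `fit`, `predict`) is hypothetical, and the tokenizer, learning rate, and epoch count are arbitrary choices. Natural's own training procedure and its handling of more than two classes may differ.

```javascript
// Illustrative only: binary logistic regression fitted by gradient ascent.
// Not Natural's code; every name below is hypothetical.

const sigmoid = (z) => 1 / (1 + Math.exp(-z));

function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Map each distinct word to a column index.
function buildVocabulary(texts) {
  const vocab = new Map();
  for (const text of texts) {
    for (const word of tokenize(text)) {
      if (!vocab.has(word)) vocab.set(word, vocab.size);
    }
  }
  return vocab;
}

// Binary presence vector with a leading 1 for the bias term.
function vectorize(text, vocab) {
  const x = new Array(vocab.size + 1).fill(0);
  x[0] = 1;
  for (const word of tokenize(text)) {
    if (vocab.has(word)) x[vocab.get(word) + 1] = 1;
  }
  return x;
}

// P(label = 1 | x) under the current weights.
function predict(weights, x) {
  return sigmoid(x.reduce((sum, xi, i) => sum + xi * weights[i], 0));
}

function fit(rows, labels, { learningRate = 0.5, epochs = 200 } = {}) {
  const weights = new Array(rows[0].length).fill(0);
  for (let epoch = 0; epoch < epochs; epoch++) {
    // Gradient of the log-likelihood: sum over documents of (y - p) * x.
    const gradient = new Array(weights.length).fill(0);
    rows.forEach((x, n) => {
      const error = labels[n] - predict(weights, x);
      x.forEach((xi, i) => { gradient[i] += error * xi; });
    });
    gradient.forEach((g, i) => { weights[i] += learningRate * g; });
  }
  return weights;
}

const docs = [
  ['The movie was fantastic', 1],
  ['I really enjoyed it', 1],
  ['It was boring and slow', 0],
  ['Waste of time', 0],
];
const vocab = buildVocabulary(docs.map(([text]) => text));
const weights = fit(
  docs.map(([text]) => vectorize(text, vocab)),
  docs.map(([, label]) => label)
);

const pPositive = predict(weights, vectorize('fantastic movie', vocab));
console.log(pPositive > 0.5 ? 'positive' : 'negative'); // 'positive'
```

Words that never appeared in training are ignored by `vectorize`, so an input made only of unknown words is scored from the bias term alone.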

### Maximum Entropy Classifier

Maximum entropy classifier for complex feature-based classification tasks.

```javascript { .api }
/**
 * Maximum entropy classifier
 * @param features - FeatureSet for defining features (optional, creates new if not provided)
 * @param sample - Sample for training data (optional, creates new if not provided)
 */
class MaxEntClassifier {
  constructor(features?: FeatureSet, sample?: Sample);

  /** Add feature set with classification */
  addFeatureSet(features: FeatureSet, classification: string): void;

  /** Train the classifier */
  train(): void;

  /** Classify feature set */
  classify(features: FeatureSet): string;
}

/**
 * Feature set for MaxEnt classifier
 */
class FeatureSet {
  constructor();

  /** Add feature with value */
  addFeature(name: string, value: any): void;
}

/**
 * Individual feature
 */
class Feature {
  constructor(name: string, value: any);
}

/**
 * Training sample for MaxEnt
 */
class Sample {
  constructor(features: FeatureSet, classification: string);
}
```

**Usage Examples:**

```javascript
const natural = require('natural');

// Create MaxEnt classifier
const classifier = new natural.MaxEntClassifier();

// Create feature sets
const posFeatures = new natural.FeatureSet();
posFeatures.addFeature('contains_great', true);
posFeatures.addFeature('contains_amazing', true);
posFeatures.addFeature('word_count', 5);

const negFeatures = new natural.FeatureSet();
negFeatures.addFeature('contains_terrible', true);
negFeatures.addFeature('contains_awful', true);
negFeatures.addFeature('word_count', 4);

// Add training data
classifier.addFeatureSet(posFeatures, 'positive');
classifier.addFeatureSet(negFeatures, 'negative');

// Train
classifier.train();

// Classify new feature set
const testFeatures = new natural.FeatureSet();
testFeatures.addFeature('contains_great', true);
testFeatures.addFeature('word_count', 3);

console.log(classifier.classify(testFeatures)); // 'positive'
```
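
A maximum entropy model assigns each class a probability proportional to the exponentiated sum of the weights of the features that fire for that class and context. The sketch below shows only this scoring step in plain JavaScript. It does not use Natural's `MaxEntClassifier`; the feature functions, the weights, and the names `features` and `conditionalProbabilities` are assumptions invented for illustration, and weight estimation (for example by Generalized Iterative Scaling) is not shown.

```javascript
// Illustrative only:
//   p(label | context) = exp(sum_i w_i * f_i(context, label)) / Z(context)
// The features and weights are invented; training is not shown.

// Each feature is a binary function of (context, label) with a weight.
const features = [
  { weight: 1.2, fires: (ctx, label) => label === 'positive' && ctx.words.includes('great') },
  { weight: 0.8, fires: (ctx, label) => label === 'positive' && ctx.words.includes('amazing') },
  { weight: 1.5, fires: (ctx, label) => label === 'negative' && ctx.words.includes('terrible') },
  { weight: 0.9, fires: (ctx, label) => label === 'negative' && ctx.words.includes('awful') },
];

function conditionalProbabilities(context, labels, featureList) {
  const unnormalized = labels.map((label) => {
    // Sum the weights of the features that fire for this (context, label).
    const activation = featureList.reduce(
      (sum, f) => sum + (f.fires(context, label) ? f.weight : 0),
      0
    );
    return { label, value: Math.exp(activation) };
  });
  // Z normalizes the scores into a probability distribution over labels.
  const z = unnormalized.reduce((sum, r) => sum + r.value, 0);
  return unnormalized
    .map((r) => ({ label: r.label, value: r.value / z }))
    .sort((a, b) => b.value - a.value);
}

const context = { words: ['a', 'great', 'film'] };
const distribution = conditionalProbabilities(context, ['positive', 'negative'], features);
console.log(distribution[0].label); // 'positive'
```

When no feature fires for any label, every activation is 0 and the result is a uniform distribution, which is the maximum entropy answer in the absence of evidence.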

### Supporting Classes

**Advanced MaxEnt Supporting Classes:**

```javascript { .api }
/**
 * Context for MaxEnt elements containing feature information
 */
class Context {
  constructor(data?: object);
}

/**
 * Base element class for MaxEnt training samples
 */
class Element {
  constructor(classification: string, context: Context);

  /** Generate features for this element */
  generateFeatures(featureSet: FeatureSet): void;
}

/**
 * Simple example element for basic MaxEnt usage
 */
class SEElement extends Element {
  constructor(classification: string, context: Context);

  /** Generate features for simple examples */
  generateFeatures(featureSet: FeatureSet): void;
}

/**
 * POS (Part-of-Speech) specific element for tagging applications
 */
class POSElement extends Element {
  constructor(classification: string, context: Context);

  /** Generate POS-specific features including word and tag windows */
  generateFeatures(featureSet: FeatureSet): void;
}

/**
 * GIS (Generalized Iterative Scaling) scaler for MaxEnt parameter estimation
 */
class GISScaler {
  constructor(featureSet: FeatureSet, sample: Sample);

  /** Calculate maximum sum of features for normalization */
  calculateMaxSumOfFeatures(): boolean;
}

/**
 * MaxEnt sentence class for POS tagging
 */
class MESentence {
  constructor(taggedWords: Array<{token: string, tag: string}>);

  /** Generate sample elements from sentence for MaxEnt training */
  generateSampleElements(sample: Sample): void;
}

/**
 * MaxEnt corpus for POS tagging applications
 */
class MECorpus {
  constructor();

  /** Add tagged sentence to corpus */
  addSentence(sentence: MESentence): void;

  /** Generate training sample from corpus */
  generateSample(): Sample;
}