# Data Cleaning

Automatic data cleaning and normalization, including type conversion, string trimming, filtering of unknown properties, and automatic value assignment.

## Capabilities

### Clean Method

Primary method for cleaning and normalizing data according to schema rules.

```typescript { .api }
/**
 * Cleans and normalizes an object according to schema rules
 * @param doc - Document to clean
 * @param options - Cleaning options
 * @returns Cleaned document
 */
clean(doc: Record<string | number | symbol, unknown>, options?: CleanOptions): Record<string | number | symbol, unknown>;

interface CleanOptions {
  /** Whether to automatically convert types, e.g., string "123" to number 123 (default: true) */
  autoConvert?: boolean;
  /** Extended context object passed to autoValue functions */
  extendedAutoValueContext?: CustomAutoValueContext;
  /** Whether to remove properties not defined in schema (default: true) */
  filter?: boolean;
  /** Whether to run autoValue functions (default: true) */
  getAutoValues?: boolean;
  /** Whether the document being cleaned is a MongoDB modifier (default: false) */
  isModifier?: boolean;
  /** Whether this is an upsert operation */
  isUpsert?: boolean;
  /** MongoObject instance for modifier operations */
  mongoObject?: MongoObject;
  /** Whether to mutate the original document (default: false, returns copy) */
  mutate?: boolean;
  /** Whether to remove empty strings (default: true) */
  removeEmptyStrings?: boolean;
  /** Whether to remove null values from arrays (default: false) */
  removeNullsFromArrays?: boolean;
  /** Whether to trim whitespace from string values (default: true) */
  trimStrings?: boolean;
}
```

**Usage Examples:**

```typescript
import SimpleSchema from "simpl-schema";

const userSchema = new SimpleSchema({
  name: {
    type: String,
    trim: true
  },
  // Type conversion is controlled by the autoConvert clean option,
  // not by a field-level property
  age: Number,
  email: String,
  tags: [String],
  profile: {
    type: Object,
    optional: true
  },
  'profile.bio': {
    type: String,
    optional: true,
    trim: true
  }
});

// Basic cleaning
const dirtyData = {
  name: " John Doe ",
  age: "25", // String that should be a number
  email: "john@example.com",
  tags: ["tag1", "tag2", null, ""],
  unknownField: "should be removed"
};

const cleanData = userSchema.clean(dirtyData, {
  autoConvert: true,
  trimStrings: true,
  filter: true,
  removeEmptyStrings: true,
  removeNullsFromArrays: true
});
// Result: {
//   name: "John Doe",
//   age: 25,
//   email: "john@example.com",
//   tags: ["tag1", "tag2"]
// }
```

### Auto-conversion

Automatic type conversion during the cleaning process.

```typescript { .api }
interface CleanOptions {
  /** Whether to automatically convert types based on schema definitions */
  autoConvert?: boolean;
}

// Auto-conversion rules:
// - String numbers to Number type
// - String booleans ("true"/"false") to Boolean type
// - String dates to Date type
// - A non-array value to a one-item array when the schema expects an array
// - Objects to proper object format
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  count: Number,
  isActive: Boolean,
  createdAt: Date,
  tags: [String]
});

const data = {
  count: "42",
  isActive: "true",
  createdAt: "2023-01-01",
  tags: "tag1,tag2" // A string where an array is expected
};

const cleaned = schema.clean(data, { autoConvert: true });
// Result: {
//   count: 42,
//   isActive: true,
//   createdAt: new Date("2023-01-01"),
//   tags: ["tag1,tag2"] // Wrapped in an array; SimpleSchema doesn't auto-split strings
// }
```
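
The conversion rules listed above can be pictured with a small standalone sketch. It does not use or reproduce the library's internals; `TargetType` and `convertValue` are hypothetical names, and the choice to leave unconvertible values untouched (so that validation can report them later) is an assumption of this sketch.

```typescript
// Hypothetical names throughout; an illustration, not library code.
type TargetType = "number" | "boolean" | "date" | "array";

function convertValue(value: unknown, target: TargetType): unknown {
  if (target === "array") {
    // A lone value is wrapped in an array, never split
    return Array.isArray(value) ? value : [value];
  }
  // Only strings are candidates for the remaining conversions
  if (typeof value !== "string") return value;

  if (target === "number") {
    const parsed = Number(value);
    return value.length > 0 && !Number.isNaN(parsed) ? parsed : value;
  }
  if (target === "boolean") {
    if (value === "true") return true;
    if (value === "false") return false;
    return value;
  }
  // Remaining case: target === "date"
  const ms = Date.parse(value);
  return Number.isNaN(ms) ? value : new Date(ms);
}

const converted = {
  count: convertValue("42", "number"),
  isActive: convertValue("true", "boolean"),
  tags: convertValue("tag1,tag2", "array"),
  invalid: convertValue("abc", "number"), // not numeric, so returned unchanged
};
```

Returning the original value when conversion fails keeps cleaning and validation separate: cleaning never throws, and a later validation pass can flag the wrong type.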

### String Trimming

Automatic whitespace removal from string values.

```typescript { .api }
interface CleanOptions {
  /** Whether to trim whitespace from string values (default: true) */
  trimStrings?: boolean;
}

// Field-level trim option
interface SchemaKeyDefinitionBase {
  /** Set to false to exclude this specific field from trimming during cleaning */
  trim?: boolean;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  title: String, // Trimmed whenever trimStrings is enabled
  description: {
    type: String,
    trim: false // Field-level opt-out: never trimmed
  }
});

const data = {
  title: " My Title ",
  description: " My description "
};

// Clean with trimStrings enabled: all strings are trimmed except fields with trim: false
const cleaned1 = schema.clean(data, { trimStrings: true });
// Result: { title: "My Title", description: " My description " }

// Clean with trimStrings disabled: no strings are trimmed
const cleaned2 = schema.clean(data, { trimStrings: false });
// Result: { title: " My Title ", description: " My description " }
```

### Filtering

Remove properties not defined in the schema.

```typescript { .api }
interface CleanOptions {
  /** Whether to remove properties not defined in schema (default: true) */
  filter?: boolean;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  name: String,
  email: String
});

const data = {
  name: "John",
  email: "john@example.com",
  password: "secret123", // Not in schema
  adminField: true // Not in schema
};

const filtered = schema.clean(data, { filter: true });
// Result: { name: "John", email: "john@example.com" }

// filter is enabled by default, so it must be turned off explicitly to keep unknown properties
const unfiltered = schema.clean(data, { filter: false });
// Result: { name: "John", email: "john@example.com", password: "secret123", adminField: true }
```
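
As a rough mental model, filtering keeps only the keys the schema knows about. The sketch below is standalone and handles top-level keys only; `filterKeys` and its `schemaKeys` parameter are hypothetical names, and nested keys and modifiers are deliberately left out.

```typescript
// Hypothetical helper: keeps only the top-level keys listed in schemaKeys.
function filterKeys(
  doc: Record<string, unknown>,
  schemaKeys: string[]
): Record<string, unknown> {
  const allowed = new Set(schemaKeys);
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(doc)) {
    if (allowed.has(key)) result[key] = value;
  }
  return result;
}

const filteredSketch = filterKeys(
  { name: "John", email: "john@example.com", password: "secret123", adminField: true },
  ["name", "email"]
);
```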

### Empty Value Removal

Remove empty strings from the document and null values from arrays.

```typescript { .api }
interface CleanOptions {
  /** Whether to remove empty strings from the document */
  removeEmptyStrings?: boolean;
  /** Whether to remove null values from arrays */
  removeNullsFromArrays?: boolean;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  title: {
    type: String,
    optional: true
  },
  tags: [String],
  categories: [String]
});

const data = {
  title: "",
  tags: ["javascript", "", "react", null, "node"],
  categories: ["tech", null, "", "programming"]
};

const cleaned = schema.clean(data, {
  removeEmptyStrings: true,
  removeNullsFromArrays: true
});
// Result: {
//   tags: ["javascript", "react", "node"],
//   categories: ["tech", "programming"]
// }
// Note: title is removed because it's an empty string
```
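
The result shown above can be reproduced with a short standalone sketch. It only models a flat document with arrays of primitives; `RemovalOptions` and `removeEmptyValues` are hypothetical names, and the sketch makes no claim about how the library walks nested structures.

```typescript
// Hypothetical option bag mirroring the two clean options discussed here.
interface RemovalOptions {
  removeEmptyStrings?: boolean;
  removeNullsFromArrays?: boolean;
}

function removeEmptyValues(
  doc: Record<string, unknown>,
  options: RemovalOptions
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      // Drop array items that match an enabled removal rule
      result[key] = value.filter((item: unknown) => {
        if (options.removeNullsFromArrays && item === null) return false;
        if (options.removeEmptyStrings && item === "") return false;
        return true;
      });
    } else if (options.removeEmptyStrings && value === "") {
      continue; // The whole property is dropped
    } else {
      result[key] = value;
    }
  }
  return result;
}

const withoutEmpties = removeEmptyValues(
  { title: "", tags: ["javascript", "", "react", null, "node"] },
  { removeEmptyStrings: true, removeNullsFromArrays: true }
);
```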

### Auto Values

Automatically set field values using autoValue functions.

```typescript { .api }
interface CleanOptions {
  /** Whether to run autoValue functions during cleaning */
  getAutoValues?: boolean;
  /** Extended context passed to autoValue functions */
  extendedAutoValueContext?: CustomAutoValueContext;
}

// AutoValue function type
type AutoValueFunction = (this: AutoValueContext, obj: any) => any;

interface AutoValueContext {
  /** Current field key */
  key: string;
  /** Whether the field value is explicitly set */
  isSet: boolean;
  /** Current field value (if set) */
  value: any;
  /** MongoDB update operator being used (if any) */
  operator: string | null;
  /** Access to other field values */
  field(key: string): FieldInfo<any>;
  /** Access to sibling field values */
  siblingField(key: string): FieldInfo<any>;
  /** Parent document being cleaned */
  parentDoc: any;
  /** Whether this is an insert operation */
  isInsert: boolean;
  /** Whether this is an update operation */
  isUpdate: boolean;
  /** Whether this is an upsert operation */
  isUpsert: boolean;
  /** Whether this is a modifier document */
  isModifier: boolean;
  /** Current user ID (if available) */
  userId?: string;
  /** Whether field is inside an object that is an array item */
  isInArrayItemObject: boolean;
  /** Whether field is inside a nested object */
  isInSubObject: boolean;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  title: String,
  slug: {
    type: String,
    autoValue() {
      if (!this.isSet && this.field('title').isSet) {
        // Auto-generate slug from title
        return this.field('title').value.toLowerCase().replace(/\s+/g, '-');
      }
    }
  },
  createdAt: {
    type: Date,
    autoValue() {
      if (this.isInsert && !this.isSet) {
        return new Date();
      }
    }
  },
  updatedAt: {
    type: Date,
    autoValue() {
      if (this.isUpdate) {
        return new Date();
      }
    }
  },
  userId: {
    type: String,
    autoValue() {
      if (this.isInsert && !this.isSet && this.userId) {
        return this.userId;
      }
    }
  }
});

const data = { title: "My Article" };

const cleaned = schema.clean(data, {
  getAutoValues: true,
  // A standalone clean() call does not know the operation type or the user,
  // so both are supplied through the extended context
  extendedAutoValueContext: { isInsert: true, userId: "user123" }
});
// Result: {
//   title: "My Article",
//   slug: "my-article",
//   createdAt: new Date(),
//   userId: "user123"
// }
```
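
To make the role of `extendedAutoValueContext` concrete, here is a minimal standalone sketch of how extra context values could end up on `this` inside an autoValue function. `SketchContext`, `SketchAutoValue`, and `runAutoValue` are hypothetical names, and the merge order (built-in properties win over extended ones) is an assumption of this sketch rather than documented library behavior.

```typescript
// Hypothetical, simplified context: two built-in properties plus anything extended.
interface SketchContext {
  isSet: boolean;
  value: unknown;
  [extra: string]: unknown;
}

type SketchAutoValue = (this: SketchContext) => unknown;

function runAutoValue(
  fn: SketchAutoValue,
  doc: Record<string, unknown>,
  key: string,
  extended: Record<string, unknown> = {}
): unknown {
  const context: SketchContext = {
    ...extended, // e.g. { userId: "user123" }
    isSet: doc[key] !== undefined,
    value: doc[key],
  };
  // The function is invoked with the merged context as `this`
  return fn.call(context);
}

const ownerAutoValue: SketchAutoValue = function () {
  const userId = this["userId"];
  return !this.isSet && typeof userId === "string" ? userId : undefined;
};

const assignedOwner = runAutoValue(ownerAutoValue, { title: "My Article" }, "userId", {
  userId: "user123",
});
```

Because the context is bound through `this`, autoValue functions need to be regular functions or methods; an arrow function would not see it.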

### MongoDB Modifier Cleaning

Clean MongoDB update modifier documents.

```typescript { .api }
interface CleanOptions {
  /** Whether the document being cleaned is a MongoDB modifier */
  isModifier?: boolean;
  /** Whether this is an upsert operation */
  isUpsert?: boolean;
  /** MongoObject instance for advanced modifier handling */
  mongoObject?: MongoObject;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  // The parent object must be defined before its nested keys
  profile: Object,
  'profile.name': {
    type: String,
    trim: true
  },
  'profile.age': Number,
  tags: [String],
  updatedAt: {
    type: Date,
    autoValue() {
      if (this.isUpdate) {
        return new Date();
      }
    }
  }
});

// Clean $set modifier
const setModifier = {
  $set: {
    'profile.name': ' John Doe ',
    'profile.age': '30'
  }
};

const cleanedSet = schema.clean(setModifier, {
  isModifier: true,
  trimStrings: true,
  autoConvert: true,
  getAutoValues: true,
  // isUpdate is not set by clean() itself, so pass it in for the updatedAt autoValue
  extendedAutoValueContext: { isUpdate: true }
});
// Result: {
//   $set: {
//     'profile.name': 'John Doe',
//     'profile.age': 30,
//     updatedAt: new Date()
//   }
// }

// Clean $push modifier
const pushModifier = {
  $push: {
    tags: ' javascript '
  }
};

const cleanedPush = schema.clean(pushModifier, {
  isModifier: true,
  trimStrings: true
});
// Result: {
//   $push: {
//     tags: 'javascript'
//   }
// }
```
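
The key point in the modifier examples is that the operator structure (`$set`, `$push`) is preserved while the values underneath are cleaned. A minimal standalone sketch of that idea, limited to string trimming, might look like this; `Modifier` and `trimModifier` are hypothetical names, and schema lookups, type conversion, and auto values are left out.

```typescript
// Hypothetical shape: operator -> (field path -> value)
type Modifier = Record<string, Record<string, unknown>>;

function trimModifier(modifier: Modifier): Modifier {
  const result: Modifier = {};
  for (const [operator, fields] of Object.entries(modifier)) {
    const cleanedFields: Record<string, unknown> = {};
    for (const [path, value] of Object.entries(fields)) {
      // Only string values are touched; everything else passes through
      cleanedFields[path] = typeof value === "string" ? value.trim() : value;
    }
    result[operator] = cleanedFields;
  }
  return result;
}

const trimmedModifier = trimModifier({
  $set: { "profile.name": " John Doe ", "profile.age": 30 },
  $push: { tags: " javascript " },
});
```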

### Mutation Control

Control whether cleaning mutates the original object or returns a copy.

```typescript { .api }
interface CleanOptions {
  /** Whether to mutate the original document (default: false) */
  mutate?: boolean;
}
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  name: {
    type: String,
    trim: true
  }
});

const originalData = { name: " John " };

// Clean without mutation (default)
const cleaned1 = schema.clean(originalData, { trimStrings: true });
console.log(originalData); // { name: " John " } - unchanged
console.log(cleaned1); // { name: "John" } - cleaned copy

// Clean with mutation
const cleaned2 = schema.clean(originalData, {
  trimStrings: true,
  mutate: true
});
console.log(originalData); // { name: "John" } - mutated
console.log(cleaned2); // { name: "John" } - same reference as originalData
console.log(originalData === cleaned2); // true
```
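
The copy-versus-mutate distinction can be shown in isolation with a tiny standalone sketch. `trimAll` is a hypothetical helper, and it uses a shallow copy, which is enough for this flat example; a nested document would need a deep clone to leave the original fully untouched.

```typescript
// Hypothetical helper: trims top-level strings, optionally in place.
function trimAll(doc: Record<string, unknown>, mutate = false): Record<string, unknown> {
  // Work on the original when mutate is true, otherwise on a shallow copy
  const target = mutate ? doc : { ...doc };
  for (const key of Object.keys(target)) {
    const value = target[key];
    if (typeof value === "string") target[key] = value.trim();
  }
  return target;
}

const source: Record<string, unknown> = { name: " John " };

const copyResult = trimAll(source); // source keeps its whitespace
const sourceAfterCopy = source.name;

const inPlaceResult = trimAll(source, true); // source itself is trimmed
```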

### Default Value Assignment

Automatically assign default values during cleaning.

```typescript { .api }
interface SchemaKeyDefinitionBase {
  /** Default value to assign if field is not set */
  defaultValue?: any;
}

// Default values are assigned during cleaning when getAutoValues is true
```

**Usage Examples:**

```typescript
const schema = new SimpleSchema({
  name: String,
  status: {
    type: String,
    defaultValue: 'active'
  },
  settings: {
    type: Object,
    // A plain value, not a function; the empty object lets the nested defaults below apply
    defaultValue: {}
  },
  'settings.theme': {
    type: String,
    defaultValue: 'light'
  },
  'settings.lang': {
    type: String,
    defaultValue: 'en'
  }
});

const data = { name: "John" };

const cleaned = schema.clean(data, { getAutoValues: true });
// Result: {
//   name: "John",
//   status: "active",
//   settings: { theme: "light", lang: "en" }
// }
```
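
Conceptually, a default is applied only when the field has no value yet. The standalone sketch below shows that rule for top-level fields; `applyDefaults` is a hypothetical helper, and nested keys and modifiers are outside its scope.

```typescript
// Hypothetical helper: fills in defaults for top-level fields that are undefined.
function applyDefaults(
  doc: Record<string, unknown>,
  defaults: Record<string, unknown>
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...doc };
  for (const [key, defaultValue] of Object.entries(defaults)) {
    // Existing values, including falsy ones such as "" or 0, are kept
    if (result[key] === undefined) result[key] = defaultValue;
  }
  return result;
}

const withDefaults = applyDefaults({ name: "John" }, { status: "active" });
const keepsExisting = applyDefaults({ name: "John", status: "archived" }, { status: "active" });
```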

## Types

```typescript { .api }
interface CustomAutoValueContext {
  [key: string]: any;
}

interface MongoObject {
  [key: string]: any;
}

interface FieldInfo<ValueType> {
  value: ValueType;
  isSet: boolean;
  operator: string | null;
}
```