tessl/npm-simpl-schema

A schema validation package that supports direct validation of MongoDB update modifier objects.

Data Cleaning

Automatic data cleaning and normalization including type conversion, trimming, filtering, and automatic value assignment.

Capabilities

Clean Method

Primary method for cleaning and normalizing data according to schema rules.

/**
 * Cleans and normalizes an object according to schema rules
 * @param doc - Document to clean
 * @param options - Cleaning options
 * @returns Cleaned document
 */
clean(doc: Record<string | number | symbol, unknown>, options?: CleanOptions): Record<string | number | symbol, unknown>;

interface CleanOptions {
  /** Whether to automatically convert types, e.g. string "123" to number 123 (default: true) */
  autoConvert?: boolean;
  /** Extended context object passed to autoValue functions */
  extendedAutoValueContext?: CustomAutoValueContext;
  /** Whether to remove properties not defined in schema (default: true) */
  filter?: boolean;
  /** Whether to run autoValue functions (default: true) */
  getAutoValues?: boolean;
  /** Whether the document being cleaned is a MongoDB modifier */
  isModifier?: boolean;
  /** Whether this is an upsert operation */
  isUpsert?: boolean;
  /** MongoObject instance for modifier operations */
  mongoObject?: MongoObject;
  /** Whether to mutate the original document (default: false, returns a copy) */
  mutate?: boolean;
  /** Whether to remove empty strings (default: true) */
  removeEmptyStrings?: boolean;
  /** Whether to remove null values from arrays (default: false) */
  removeNullsFromArrays?: boolean;
  /** Whether to trim whitespace from string values (default: true) */
  trimStrings?: boolean;
}

Usage Examples:

import SimpleSchema from "simpl-schema";

const userSchema = new SimpleSchema({
  name: {
    type: String,
    trim: true
  },
  age: {
    type: Number,
    autoConvert: true
  },
  email: String,
  tags: [String],
  profile: {
    type: Object,
    optional: true
  },
  'profile.bio': {
    type: String,
    optional: true,
    trim: true
  }
});

// Basic cleaning
const dirtyData = {
  name: "  John Doe  ",
  age: "25",        // String that should be number
  email: "john@example.com",
  tags: ["tag1", "tag2", null, ""],
  unknownField: "should be removed"
};

const cleanData = userSchema.clean(dirtyData, {
  autoConvert: true,
  trimStrings: true,
  filter: true,
  removeEmptyStrings: true,
  removeNullsFromArrays: true
});
// Result: {
//   name: "John Doe",
//   age: 25,
//   email: "john@example.com", 
//   tags: ["tag1", "tag2"]
// }

Auto-conversion

Automatic type conversion during the cleaning process.

interface CleanOptions {
  /** Whether to automatically convert types based on schema definitions */
  autoConvert?: boolean;
}

// Auto-conversion rules:
// - String numbers to Number type
// - String booleans ("true"/"false") to Boolean type
// - String dates to Date type
// - A non-array value is wrapped in an array when the schema expects an array
// - Objects to proper object format

Usage Examples:

const schema = new SimpleSchema({
  count: Number,
  isActive: Boolean,
  createdAt: Date,
  tags: [String]
});

const data = {
  count: "42",
  isActive: "true", 
  createdAt: "2023-01-01",
  tags: "tag1,tag2" // A single string where the schema expects an array
};

const cleaned = schema.clean(data, { autoConvert: true });
// Result: {
//   count: 42,
//   isActive: true,
//   createdAt: new Date("2023-01-01"),
//   tags: ["tag1,tag2"] // Note: SimpleSchema doesn't auto-split strings
// }
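
The string conversion rules listed above can be sketched as a small standalone function. This is an illustration only, not simpl-schema's internal code: `convertValue` and `TargetType` are hypothetical names, and the sketch handles only string inputs for three target types.

```typescript
// Illustrative only: a hand-written converter, not simpl-schema's own code.
type TargetType = "number" | "boolean" | "date";

function convertValue(value: unknown, target: TargetType): unknown {
  // Only strings are converted in this sketch; other values pass through.
  if (typeof value !== "string") return value;

  if (target === "number") {
    const n = Number(value);
    // Leave blank or non-numeric strings untouched.
    return value.trim() !== "" && !isNaN(n) ? n : value;
  }

  if (target === "boolean") {
    const lower = value.toLowerCase();
    if (lower === "true") return true;
    if (lower === "false") return false;
    return value;
  }

  // target === "date": keep the string if it cannot be parsed.
  const ms = Date.parse(value);
  return isNaN(ms) ? value : new Date(ms);
}

const convertedCount = convertValue("42", "number");
const convertedFlag = convertValue("true", "boolean");
const convertedDate = convertValue("2023-01-01", "date");
const leftAlone = convertValue("abc", "number"); // stays "abc"
```

Values that cannot be converted are returned unchanged, so a later validation step can still report them.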

String Trimming

Automatic whitespace removal from string values.

interface CleanOptions {
  /** Whether to trim whitespace from string values */
  trimStrings?: boolean;
}

// Field-level trim option
interface SchemaKeyDefinitionBase {
  /** Set to false to exempt this field from trimming when trimStrings is on */
  trim?: boolean;
}

Usage Examples:

const schema = new SimpleSchema({
  title: String,  // Trimmed whenever trimStrings is on
  description: {
    type: String,
    trim: false  // Field-level opt-out: keep surrounding whitespace
  }
});

const data = {
  title: "  My Title  ",
  description: "  My description  "
};

// trimStrings is true by default, so this is equivalent to schema.clean(data)
const cleaned1 = schema.clean(data, { trimStrings: true });
// Result: { title: "My Title", description: "  My description  " }

// Disable trimming for the whole document
const cleaned2 = schema.clean(data, { trimStrings: false });
// Result: { title: "  My Title  ", description: "  My description  " }
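
How a document-wide flag and a per-field setting can combine is sketched below. It assumes the behavior described in the simpl-schema README, where strings are trimmed when `trimStrings` is on unless a field sets `trim: false`. `FieldRule`, `Rules`, and `trimDocument` are hypothetical names for this sketch, which covers top-level keys only.

```typescript
// Hypothetical helper types for this sketch; not part of simpl-schema.
interface FieldRule {
  trim?: boolean;
}
type Rules = Record<string, FieldRule>;

function trimDocument(
  doc: Record<string, unknown>,
  rules: Rules,
  trimStrings: boolean
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    const rule = rules[key];
    // A field opts out only by setting trim to false explicitly.
    const optedOut = rule !== undefined && rule.trim === false;
    out[key] =
      trimStrings && typeof value === "string" && !optedOut ? value.trim() : value;
  }
  return out;
}

const trimRules: Rules = { title: {}, description: { trim: false } };
const trimInput = { title: "  My Title  ", description: "  My description  " };

const trimmedOn = trimDocument(trimInput, trimRules, true);
const trimmedOff = trimDocument(trimInput, trimRules, false);
```

With the flag on, only `title` is trimmed; with the flag off, both values keep their whitespace.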

Filtering

Remove properties not defined in the schema.

interface CleanOptions {
  /** Whether to remove properties not defined in schema */
  filter?: boolean;
}

Usage Examples:

const schema = new SimpleSchema({
  name: String,
  email: String
});

const data = {
  name: "John",
  email: "john@example.com",
  password: "secret123",    // Not in schema
  adminField: true          // Not in schema  
};

const filtered = schema.clean(data, { filter: true });
// Result: { name: "John", email: "john@example.com" }

// filter defaults to true, so pass false explicitly to keep unknown properties
const unfiltered = schema.clean(data, { filter: false });
// Result: { name: "John", email: "john@example.com", password: "secret123", adminField: true }
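
The filtering step itself amounts to keeping only keys the schema knows about. A minimal sketch for top-level keys follows; `filterToSchemaKeys` is a hypothetical helper, and a real implementation would also have to handle nested paths such as `profile.bio`.

```typescript
// Keep only the keys listed in allowedKeys; everything else is dropped.
function filterToSchemaKeys(
  doc: Record<string, unknown>,
  allowedKeys: readonly string[]
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(doc)) {
    if (allowedKeys.indexOf(key) !== -1) {
      out[key] = doc[key];
    }
  }
  return out;
}

const filteredSketch = filterToSchemaKeys(
  { name: "John", email: "john@example.com", password: "secret123", adminField: true },
  ["name", "email"]
);
```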

Empty Value Removal

Remove empty strings from the document and null values from arrays.

interface CleanOptions {
  /** Whether to remove empty strings from the document */
  removeEmptyStrings?: boolean;
  /** Whether to remove null values from arrays */
  removeNullsFromArrays?: boolean;
}

Usage Examples:

const schema = new SimpleSchema({
  title: {
    type: String,
    optional: true
  },
  tags: [String],
  categories: [String]
});

const data = {
  title: "",
  tags: ["javascript", "", "react", null, "node"],
  categories: ["tech", null, "", "programming"]
};

const cleaned = schema.clean(data, {
  removeEmptyStrings: true,
  removeNullsFromArrays: true
});
// Result: {
//   tags: ["javascript", "react", "node"],
//   categories: ["tech", "programming"]
// }
// Note: title is removed because it's an empty string
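
The two options can be pictured as independent passes over the document. The sketch below reproduces the result shown above for a flat document. `EmptyValueOptions` and `removeEmptyValues` are hypothetical names, and dropping empty strings inside arrays simply mirrors that result, not a verified library rule.

```typescript
interface EmptyValueOptions {
  removeEmptyStrings: boolean;
  removeNullsFromArrays: boolean;
}

function removeEmptyValues(
  doc: Record<string, unknown>,
  opts: EmptyValueOptions
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    // An empty string at the top level removes the whole key.
    if (opts.removeEmptyStrings && value === "") continue;
    if (Array.isArray(value)) {
      out[key] = value.filter((item: unknown) => {
        if (opts.removeNullsFromArrays && item === null) return false;
        if (opts.removeEmptyStrings && item === "") return false;
        return true;
      });
    } else {
      out[key] = value;
    }
  }
  return out;
}

const withoutEmpties = removeEmptyValues(
  {
    title: "",
    tags: ["javascript", "", "react", null, "node"],
    categories: ["tech", null, "", "programming"]
  },
  { removeEmptyStrings: true, removeNullsFromArrays: true }
);
```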

Auto Values

Automatically set field values using autoValue functions.

interface CleanOptions {
  /** Whether to run autoValue functions during cleaning */
  getAutoValues?: boolean;
  /** Extended context passed to autoValue functions */
  extendedAutoValueContext?: CustomAutoValueContext;
}

// AutoValue function type
type AutoValueFunction = (this: AutoValueContext, obj: any) => any;

interface AutoValueContext {
  /** Current field key */
  key: string;
  /** Whether the field value is explicitly set */
  isSet: boolean;
  /** Current field value (if set) */
  value: any;
  /** MongoDB update operator being used (if any) */
  operator: string | null;
  /** Access to other field values */
  field(key: string): FieldInfo<any>;
  /** Access to sibling field values */
  siblingField(key: string): FieldInfo<any>;
  /** Parent document being cleaned */
  parentDoc: any;
  /** Whether this is an insert operation */
  isInsert: boolean;
  /** Whether this is an update operation */
  isUpdate: boolean;
  /** Whether this is an upsert operation */
  isUpsert: boolean;
  /** Whether this is a modifier document */
  isModifier: boolean;
  /** Current user ID (if available) */
  userId?: string;
  /** Whether the field is inside an object that is an array item */
  isInArrayItemObject: boolean;
  /** Whether the field is inside a nested object */
  isInSubObject: boolean;
}

Usage Examples:

const schema = new SimpleSchema({
  title: String,
  slug: {
    type: String,
    autoValue() {
      if (!this.isSet && this.field('title').isSet) {
        // Auto-generate slug from title
        return this.field('title').value.toLowerCase().replace(/\s+/g, '-');
      }
    }
  },
  createdAt: {
    type: Date,
    autoValue() {
      if (this.isInsert && !this.isSet) {
        return new Date();
      }
    }
  },
  updatedAt: {
    type: Date,
    autoValue() {
      if (this.isUpdate) {
        return new Date();
      }
    }
  },
  userId: {
    type: String,
    autoValue() {
      if (this.isInsert && !this.isSet && this.userId) {
        return this.userId;
      }
    }
  }
});

const data = { title: "My Article" };

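// isInsert and userId reach the autoValue functions through the extended context;
// without isInsert: true, createdAt and userId would not be set
const cleaned = schema.clean(data, {
  getAutoValues: true,
  extendedAutoValueContext: { isInsert: true, userId: "user123" }
});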
// Result: {
//   title: "My Article",
//   slug: "my-article", 
//   createdAt: new Date(),
//   userId: "user123"
// }
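
To make the `this` binding concrete, the sketch below runs a slug function like the one above against a hand-built context. `SketchFieldInfo`, `SketchContext`, and `contextFor` are hypothetical, simplified stand-ins for `FieldInfo` and `AutoValueContext`; the library builds its own context during cleaning.

```typescript
// Simplified stand-ins for FieldInfo and the autoValue context.
interface SketchFieldInfo {
  isSet: boolean;
  value: unknown;
}
interface SketchContext {
  isSet: boolean;
  field(key: string): SketchFieldInfo;
}

// Same logic as the slug autoValue above, with an explicit `this` type.
function slugAutoValue(this: SketchContext): string | undefined {
  const title = this.field("title");
  if (!this.isSet && title.isSet && typeof title.value === "string") {
    return title.value.toLowerCase().replace(/\s+/g, "-");
  }
  return undefined; // leave the field as it is
}

// Build a context for one key of a flat document.
function contextFor(doc: Record<string, unknown>, key: string): SketchContext {
  return {
    isSet: doc[key] !== undefined,
    field: (other: string) => ({ isSet: doc[other] !== undefined, value: doc[other] })
  };
}

const articleDoc: Record<string, unknown> = { title: "My Article" };
const generatedSlug = slugAutoValue.call(contextFor(articleDoc, "slug"));

const presetDoc: Record<string, unknown> = { title: "My Article", slug: "custom" };
const presetSlug = slugAutoValue.call(contextFor(presetDoc, "slug"));
```

Returning `undefined` signals that the function has no value to contribute, which is why an explicitly set slug is left alone.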

MongoDB Modifier Cleaning

Clean MongoDB update modifier documents.

interface CleanOptions {
  /** Whether the document being cleaned is a MongoDB modifier */
  isModifier?: boolean;
  /** Whether this is an upsert operation */
  isUpsert?: boolean;
  /** MongoObject instance for advanced modifier handling */
  mongoObject?: MongoObject;
}

Usage Examples:

const schema = new SimpleSchema({
  'profile.name': {
    type: String,
    trim: true
  },
  'profile.age': Number,
  tags: [String],
  updatedAt: {
    type: Date,
    autoValue() {
      if (this.isUpdate) {
        return new Date();
      }
    }
  }
});

// Clean $set modifier
const setModifier = {
  $set: {
    'profile.name': '  John Doe  ',
    'profile.age': '30'
  }
};

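// isUpdate is supplied through the extended context so the updatedAt autoValue runs
const cleanedSet = schema.clean(setModifier, {
  isModifier: true,
  trimStrings: true,
  autoConvert: true,
  getAutoValues: true,
  extendedAutoValueContext: { isUpdate: true }
});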
// Result: {
//   $set: {
//     'profile.name': 'John Doe',
//     'profile.age': 30,
//     updatedAt: new Date()
//   }
// }

// Clean $push modifier
const pushModifier = {
  $push: {
    tags: '  javascript  '
  }
};

const cleanedPush = schema.clean(pushModifier, {
  isModifier: true,
  trimStrings: true
});
// Result: {
//   $push: {
//     tags: 'javascript'
//   }
// }
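
Cleaning a modifier means applying the same per-field rules one level down, inside each operator. The sketch below shows that shape only. `Modifier`, `cleanModifier`, and `trimAndConvert` are hypothetical names, and the conversion rule for `profile.age` is hard-coded where a real cleaner would read it from a schema.

```typescript
type Modifier = Record<string, Record<string, unknown>>;

// Apply cleanValue to every field under every operator ($set, $push, ...).
function cleanModifier(
  modifier: Modifier,
  cleanValue: (key: string, value: unknown) => unknown
): Modifier {
  const out: Modifier = {};
  for (const operator of Object.keys(modifier)) {
    const fields = modifier[operator];
    const cleanedFields: Record<string, unknown> = {};
    for (const key of Object.keys(fields)) {
      cleanedFields[key] = cleanValue(key, fields[key]);
    }
    out[operator] = cleanedFields;
  }
  return out;
}

// Trim every string; additionally convert profile.age to a number.
const trimAndConvert = (key: string, value: unknown): unknown => {
  if (typeof value !== "string") return value;
  const trimmed = value.trim();
  return key === "profile.age" ? Number(trimmed) : trimmed;
};

const cleanedModifierSketch = cleanModifier(
  {
    $set: { "profile.name": "  John Doe  ", "profile.age": "30" },
    $push: { tags: "  javascript  " }
  },
  trimAndConvert
);
```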

Mutation Control

Control whether cleaning mutates the original object or returns a copy.

interface CleanOptions {
  /** Whether to mutate the original document (default: false) */
  mutate?: boolean;
}

Usage Examples:

const schema = new SimpleSchema({
  name: {
    type: String,
    trim: true
  }
});

const originalData = { name: "  John  " };

// Clean without mutation (default)
const cleaned1 = schema.clean(originalData, { trimStrings: true });
console.log(originalData); // { name: "  John  " } - unchanged
console.log(cleaned1);     // { name: "John" } - cleaned copy

// Clean with mutation
const cleaned2 = schema.clean(originalData, { 
  trimStrings: true, 
  mutate: true 
});
console.log(originalData); // { name: "John" } - mutated
console.log(cleaned2);     // { name: "John" } - same reference as originalData
console.log(originalData === cleaned2); // true
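
The difference between the two modes comes down to which object the cleaning steps write to. A minimal sketch, assuming a flat document: `trimAll` is a hypothetical helper, and the shallow copy used here would not protect nested objects the way a full clone does.

```typescript
function trimAll(doc: Record<string, unknown>, mutate = false): Record<string, unknown> {
  // Write to the caller's object only when mutate is true; otherwise to a shallow copy.
  const target: Record<string, unknown> = mutate ? doc : { ...doc };
  for (const key of Object.keys(target)) {
    const value = target[key];
    if (typeof value === "string") {
      target[key] = value.trim();
    }
  }
  return target;
}

const personDoc: Record<string, unknown> = { fullName: "  John  " };

const copied = trimAll(personDoc);
const valueAfterCopy = personDoc["fullName"]; // original is untouched at this point

const sameObject = trimAll(personDoc, true); // now the original is changed in place
```

Skipping the copy saves work on large documents, at the cost of changing data the caller may still be using.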

Default Value Assignment

Automatically assign default values during cleaning.

interface SchemaKeyDefinitionBase {
  /** Default value to assign if field is not set */
  defaultValue?: any;
}

// Default values are assigned during cleaning when getAutoValues is true

Usage Examples:

const schema = new SimpleSchema({
  name: String,
  status: {
    type: String,
    defaultValue: 'active'
  },
  settings: {
    type: Object,
    defaultValue: { theme: 'light', lang: 'en' }  // A plain value; a function would not be called
  },
  'settings.theme': {
    type: String,
    defaultValue: 'light'
  },
  'settings.lang': {
    type: String, 
    defaultValue: 'en'
  }
});

const data = { name: "John" };

const cleaned = schema.clean(data, { getAutoValues: true });
// Result: {
//   name: "John",
//   status: "active",
//   settings: { theme: "light", lang: "en" }
// }
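
Default assignment can be sketched as filling in only the keys that are still undefined. `applyDefaults` is a hypothetical helper for flat documents, and it assumes the defaults are JSON-serializable so each document gets its own copy of object defaults.

```typescript
function applyDefaults(
  doc: Record<string, unknown>,
  defaults: Record<string, unknown>
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...doc };
  for (const key of Object.keys(defaults)) {
    if (out[key] === undefined) {
      // Copy the default so documents never share one object instance.
      out[key] = JSON.parse(JSON.stringify(defaults[key]));
    }
  }
  return out;
}

const schemaDefaults: Record<string, unknown> = {
  status: "active",
  settings: { theme: "light", lang: "en" }
};

const withDefaults = applyDefaults({ fullName: "John" }, schemaDefaults);
const keptExplicit = applyDefaults({ fullName: "Ann", status: "archived" }, schemaDefaults);
```

Values the caller supplied are never overwritten, so `keptExplicit` keeps its `status` while still receiving the default `settings`.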

Types

interface CustomAutoValueContext {
  [key: string]: any;
}

interface MongoObject {
  [key: string]: any;
}

interface FieldInfo<ValueType> {
  value: ValueType;
  isSet: boolean;
  operator: string | null;
}
