tessl/npm-compromise

Modest natural language processing library for JavaScript that enables text parsing, analysis, and manipulation in browsers and Node.js.

Workspace: tessl
Visibility: Public
Created: 3 months ago
Last updated: 3 months ago
Describes: pkg:npm/compromise@14.14.x

To install, run

npx @tessl/cli install tessl/npm-compromise@14.14.0

Compromise

Compromise is a natural language processing library for JavaScript that parses, analyzes, and manipulates text in web browsers and Node.js. Its API covers common NLP tasks: part-of-speech tagging, verb conjugation, entity extraction, pattern matching with a flexible match syntax, and text transformation.

Package Information

Package Name: compromise
Package Type: npm
Language: JavaScript (ES modules)
Installation: npm install compromise

Core Imports

Default import (full functionality):

import nlp from "compromise";

Modular imports for smaller bundle sizes:

// Minimal tokenization and basic processing (~40kb)
import nlp from "compromise/one";

// Core NLP with tagging (~70kb)
import nlp from "compromise/two";

// Full library, same as the default import (~150kb)
import nlp from "compromise/three";

// Alias for compromise/one
import nlp from "compromise/tokenize";

For CommonJS:

const nlp = require("compromise");
// or, for the tokenizer-only build (use one require, not both):
// const nlp = require("compromise/one");

Basic Usage

import nlp from "compromise";

// Parse and analyze text
const doc = nlp("I walked to the store yesterday");

// Extract different types of information (out('array') returns string[])
const verbs = doc.verbs().out('array'); // ['walked']
const nouns = doc.nouns().out('array'); // noun phrases, including the one containing 'store'
const people = doc.people().out('array'); // []

// Transform text (clone first: conjugation mutates the document in place)
const pastTense = doc.clone().verbs().toPastTense().out('text'); // 'walked'
const presentTense = doc.clone().verbs().toPresentTense().out('text'); // 'walk'

// Pattern matching
const hasStore = doc.has('store'); // true
const beforeStore = doc.before('store').out('text'); // 'I walked to the'

// Text transformation (also on clones, so `doc` keeps its original text)
const normalized = doc.clone().normalize().out('text');
const titleCase = doc.clone().toTitleCase().out('text');

Architecture

Compromise is built around several key components:

Modular Design: Three levels of functionality (one, two, three) with increasing capabilities
View Objects: Fluent interface for chaining text analysis and transformation operations
Plugin System: Extensible architecture allowing custom functionality through plugins
Pattern Matching: Flexible syntax for finding and manipulating text patterns
Linguistic Models: Built-in models for part-of-speech tagging, verb conjugation, and entity recognition
Static Methods: Constructor-level utilities for lexicon management and optimization

Capabilities

Text Analysis and Processing

Core text analysis functionality including tokenization, pattern matching, and basic transformations. Available in all module levels.

function nlp(text: string, lexicon?: Lexicon): View;

See text-analysis.md for the full reference.
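As a rough illustration of the `nlp(text, lexicon)` signature, the sketch below parses text with a per-call lexicon and lists the terms that received a given tag. The helper name `findTagged` is hypothetical, and `nlp` is passed in as a parameter so the snippet does not assume ESM or CommonJS; in practice it would be the function imported from "compromise". Tag-based matching such as `#FirstName` relies on tagging being available in the build you import.

```javascript
// Hypothetical helper: parse `text` with a custom lexicon and return the
// matched terms for one tag as an array of strings.
// `nlp` is the function imported from "compromise" (injected here).
function findTagged(nlp, text, lexicon, tag) {
  // The second argument maps words to tag names for this call
  const doc = nlp(text, lexicon);
  // "#Tag" is the match syntax for "any term carrying this tag"
  return doc.match("#" + tag).out("array");
}

// Usage (assumes `import nlp from "compromise"`):
//   findTagged(nlp, "kermit sang a song", { kermit: "FirstName" }, "FirstName");
```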

Part-of-Speech Tagging and Linguistic Analysis

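Linguistic analysis including POS tagging, entity recognition, and grammatical parsing. POS tagging is available from compromise/two; the phrase-level selectors listed here (verbs(), nouns(), people(), and so on) are provided by compromise/three, which is also the default import.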

// Core linguistic methods on View objects
interface LinguisticView extends View {
  verbs(n?: number): Verbs;
  nouns(n?: number): Nouns; 
  adjectives(n?: number): Adjectives;
  adverbs(n?: number): Adverbs;
  people(n?: number): People;
  places(n?: number): View;
  organizations(n?: number): View;
}

See linguistic-analysis.md for the full reference.

Text Transformation and Conjugation

Text transformation capabilities including verb conjugation, noun pluralization, and tense conversion. Available in compromise/three.

// Text transformation methods
interface TransformationView extends View {
  toPastTense(): View;
  toPresentTense(): View;
  toFutureTense(): View;
  toPlural(): View;
  toSingular(): View;
  normalize(options?: object): View;
}

See text-transformation.md for the full reference.

Pattern Matching and Search

Advanced pattern matching with flexible syntax for finding and extracting specific text patterns.

// Pattern matching methods
interface PatternView extends View {
  match(pattern: string | Net, group?: string | number, options?: object): View;
  has(pattern: string | Net, group?: string | number, options?: object): boolean;
  before(pattern: string | Net, group?: string | number, options?: object): View;
  after(pattern: string | Net, group?: string | number, options?: object): View;
}

See pattern-matching.md for the full reference.
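The four methods above can be combined in a small helper. The sketch below is only an illustration: `splitAround` is a hypothetical name, and `nlp` is injected as a parameter rather than imported, so the snippet makes no assumption about module format.

```javascript
// Hypothetical helper: split a sentence around the first hit for a pattern.
// `nlp` is the function imported from "compromise" (injected here).
function splitAround(nlp, text, pattern) {
  const doc = nlp(text);
  // has() returns a boolean, so it works as a guard before extracting text
  if (!doc.has(pattern)) {
    return null;
  }
  return {
    before: doc.before(pattern).text(),
    match: doc.match(pattern).text(),
    after: doc.after(pattern).text(),
  };
}

// Usage (assumes `import nlp from "compromise"`):
//   splitAround(nlp, "I walked to the store yesterday", "store");
//   splitAround(nlp, "I walked to the store yesterday", "#Verb");
```

Returning `null` when nothing matches keeps the "not found" case distinct from a match that happens to have empty text on one side.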

Plugin System and Extension

Plugin system for extending compromise with custom functionality and linguistic models.

// Static plugin methods
interface PluginSystem {
  plugin(plugin: Plugin): any;
  extend(plugin: Plugin): any;
  addWords(words: Lexicon, isFrozen?: boolean): any;
  addTags(tags: object): any;
  parseMatch(match: string, opts?: object): ParsedMatch;
  buildTrie(words: string[]): object;
  lazy(text: string, match?: string): View;
}

See plugin-system.md for the full reference.

Types

interface View {
  found: boolean;
  docs: Document;
  document: Document;
  pointer: Pointer[] | null;
  fullPointer: Pointer[];
  methods: object;
  model: object;
  hooks: string[];
  length: number;
  isView: boolean;
  
  // Core methods available on all View objects
  clone(shallow?: boolean): View;
  text(options?: string | object): string;
  json(options?: JsonProps | string): any;
  out(format?: string): any;
  debug(): View;
  
  // Pattern matching
  match(pattern: string | Net, group?: string | number, options?: object): View;
  has(pattern: string | Net, group?: string | number, options?: object): boolean;
  
  // Utility
  compute(method: string | string[]): View;
  termList(): Term[];
}

interface Document extends Array<Term[]> {}

interface Pointer extends Array<number | string | undefined> {
  0?: number; // document index
  1?: number; // start term index
  2?: number; // end term index
  3?: string; // start term id
  4?: string; // end term id
}
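To make the Pointer layout concrete, here is a minimal sketch that resolves a pointer against a Document. It is an illustration rather than the library's own code: `resolvePointer` and `sampleDocument` are hypothetical names, and the sketch assumes the end index is exclusive and that a missing start or end selects the whole sentence.

```javascript
// Resolve a Pointer against a Document (an array of sentences, each an
// array of Terms).
// ASSUMPTIONS: end index is exclusive; missing start/end means "whole sentence".
function resolvePointer(doc, pointer) {
  const [n, start = 0, end] = pointer;
  const sentence = doc[n] || [];
  const stop = end === undefined ? sentence.length : end;
  return sentence.slice(start, stop);
}

// Two tiny sentences, using only the `text` field of Term for brevity
const sampleDocument = [
  [{ text: "I" }, { text: "walked" }, { text: "home" }],
  [{ text: "It" }, { text: "rained" }],
];

const picked = resolvePointer(sampleDocument, [0, 1, 3]);
console.log(picked.map((t) => t.text)); // [ 'walked', 'home' ]
```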

interface Term {
  text: string;
  pre: string;
  post: string;
  normal: string;
  tags?: Set<string>;
  index?: [number, number];
  id?: string;
  chunk?: string;
  dirty?: boolean;
  syllables?: string[];
}
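The `pre`, `text`, and `post` fields together carry the original surface form of each term. The sketch below rebuilds display text from a list of Term objects; `termsToText` and the sample data are hypothetical, and it assumes whitespace and punctuation are stored in `pre`/`post` so no separator is needed between terms.

```javascript
// Rebuild display text from Term objects by concatenating pre + text + post.
// ASSUMPTION: whitespace and punctuation live in `pre`/`post`, so terms are
// joined with an empty string.
function termsToText(terms) {
  return terms.map((t) => t.pre + t.text + t.post).join("");
}

const sampleTerms = [
  { text: "Hello", pre: "", post: ", ", normal: "hello" },
  { text: "World", pre: "", post: "!", normal: "world" },
];

console.log(termsToText(sampleTerms)); // Hello, World!
// `normal` holds a cleaned form that is handy for comparisons
console.log(sampleTerms.map((t) => t.normal).join(" ")); // hello world
```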

interface JsonProps {
  text?: boolean;
  normal?: boolean;
  reduced?: boolean;
  trim?: boolean;
  offset?: boolean;
  count?: boolean;
  unique?: boolean;
  index?: boolean;
  terms?: {
    text?: boolean;
    normal?: boolean;
    clean?: boolean;
    implicit?: boolean;
    tags?: boolean;
    whitespace?: boolean;
    id?: boolean;
    offset?: boolean;
    bestTag?: boolean;
  };
}
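As an example of how JsonProps might be used, the sketch below asks `json()` for per-term text and tags and flattens the result into one list. `termTags` is a hypothetical helper, `nlp` is injected rather than imported, and the shape of the returned objects (a `terms` array with `text` and `tags` on each entry) is an assumption based on the option names above.

```javascript
// Hypothetical helper: list every term with its tags via json().
// `nlp` is the function imported from "compromise" (injected here).
function termTags(nlp, text) {
  // Request only what is needed: term text and term tags
  const options = { text: false, terms: { text: true, tags: true } };
  const sentences = nlp(text).json(options);
  // ASSUMPTION: each result has a `terms` array of { text, tags } objects
  return sentences.flatMap((s) =>
    (s.terms || []).map((t) => ({ text: t.text, tags: t.tags || [] }))
  );
}

// Usage (assumes `import nlp from "compromise"`):
//   termTags(nlp, "I walked to the store");
```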

interface Lexicon {
  [key: string]: string;
}

interface Plugin {
  methods?: { [className: string]: { [methodName: string]: Function } };
  model?: { [category: string]: any };
  compute?: { [functionName: string]: Function };
  hooks?: string[];
  tags?: { [tagName: string]: TagDefinition };
  words?: { [word: string]: string };
  frozen?: { [word: string]: string };
  lib?: { [methodName: string]: Function };
  api?: (View: any) => void;
  mutate?: (world: object, nlp: any) => void;
}
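A plugin is a plain object using the fields above. The following sketch shows one possible shape; the `Color` tag, the word list, and the `colors()` method are all invented for illustration, and the tag definition follows the TagDefinition interface as given in this document.

```javascript
// A hypothetical plugin object using fields from the Plugin interface.
// The tag name, words, and method name are invented for illustration.
const colorPlugin = {
  // New tag, declared as a kind of Adjective
  tags: {
    Color: { isA: "Adjective" },
  },
  // Lexicon entries mapping words to the new tag
  words: {
    crimson: "Color",
    teal: "Color",
  },
  // `api` receives the View class and can add methods to its prototype
  api: (View) => {
    View.prototype.colors = function () {
      return this.match("#Color");
    };
  },
};

// Usage (assumes `import nlp from "compromise"`):
//   nlp.plugin(colorPlugin);
//   nlp("a teal door").colors().text();
```

Keeping tags, words, and methods in one object means the whole extension can be registered with a single call.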

interface TagDefinition {
  isA?: string;
  not?: string;
  [property: string]: any;
}
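The `isA` field expresses a parent relationship between tags. The sketch below walks that chain to list a tag's ancestors. It only illustrates the relationship and is not the library's internal resolution logic; `tagLineage` and `sampleTags` are hypothetical names.

```javascript
// Walk the isA chain of a tag to list its ancestors, nearest first.
// Illustration only: not the library's internal resolution logic.
function tagLineage(tags, name) {
  const lineage = [];
  const seen = new Set(); // guards against accidental cycles
  let current = tags[name];
  while (current && current.isA && !seen.has(current.isA)) {
    lineage.push(current.isA);
    seen.add(current.isA);
    current = tags[current.isA];
  }
  return lineage;
}

const sampleTags = {
  Noun: {},
  Person: { isA: "Noun" },
  Character: { isA: "Person", not: "Adjective" },
};

console.log(tagLineage(sampleTags, "Character")); // [ 'Person', 'Noun' ]
```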

interface Net {
  hooks: object;
  always?: any;
  isNet: boolean;
}

interface Match {
  match: string;
  tag?: string | string[];
  unTag?: string | string[];
  group?: string | number;
  reason?: string;
  freeze?: boolean;
}

type ParsedMatch = object[];