
tessl/npm-pinyin

Chinese character to Pinyin conversion with intelligent phrase matching and multiple pronunciation support


Core Conversion

The core pinyin function converts Chinese characters to Pinyin romanization, with configurable output styles, conversion modes, and phrase matching.

Capabilities

Main Conversion Function

The pinyin function takes the text to convert and an optional options object, and returns the Pinyin as a nested array.

/**
 * Convert Chinese characters to Pinyin
 * @param hans - Chinese text to convert
 * @param options - Conversion configuration options
 * @returns Array of arrays; each sub-array holds the Pinyin reading(s) for one character or phrase
 */
function pinyin(hans: string, options?: IPinyinOptions): string[][];

interface IPinyinOptions {
  /** Pinyin output style - controls tone marks, casing, etc. */
  style?: IPinyinStyle;
  /** Conversion mode - normal or surname-optimized */
  mode?: IPinyinMode;
  /** Segmenter used for phrase recognition; true enables segmentation with the default segmenter */
  segment?: IPinyinSegment | boolean;
  /** Enable multiple pronunciations for polyphonic characters */
  heteronym?: boolean;
  /** Group phrase Pinyin together instead of character-by-character */
  group?: boolean;
  /** Return all possible pronunciation combinations in compact form */
  compact?: boolean;
}

Usage Examples:

import pinyin from "pinyin";

// Basic character conversion
console.log(pinyin("你好"));
// Result: [["nǐ"], ["hǎo"]]

// Multiple pronunciation support
console.log(pinyin("中", { heteronym: true }));
// Result: [["zhōng", "zhòng"]]

// Different output styles
console.log(pinyin("你好", { style: "tone2" }));
// Result: [["ni3"], ["hao3"]]

console.log(pinyin("你好", { style: "normal" }));
// Result: [["ni"], ["hao"]]

// Segmentation for phrase accuracy
console.log(pinyin("北京大学", { segment: true }));
// Result: [["běi"], ["jīng"], ["dà"], ["xué"]]

// Grouped phrase output
console.log(pinyin("北京大学", { segment: true, group: true }));
// Result: [["běijīng"], ["dàxué"]]

// Compact pronunciation combinations
console.log(pinyin("你好吗", { heteronym: true, compact: true }));
// Result: [["nǐ", "hǎo", "ma"], ["nǐ", "hǎo", "má"], ["nǐ", "hǎo", "mǎ"], ["nǐ", "hào", "ma"], ...]
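
The nested string[][] result is often more structure than a caller needs. As a minimal sketch, the helper below flattens a result into one string by taking the first reading of each entry. The name toPlainString is hypothetical and not part of the package, and the sample array is copied from the first example above rather than produced by calling the library.

```typescript
// Hypothetical helper, not part of the pinyin package.
// Accepts output shaped like pinyin()'s default result and joins the
// first reading of every entry with a separator.
function toPlainString(result: string[][], separator: string = " "): string {
  return result
    .filter((readings) => readings.length > 0) // skip empty entries defensively
    .map((readings) => readings[0])            // keep only the first reading
    .join(separator);
}

// Sample data copied from the "你好" example; no library call is made here.
const basicSample: string[][] = [["nǐ"], ["hǎo"]];
console.log(toPlainString(basicSample)); // "nǐ hǎo"
```

When heteronym is enabled, this sketch silently discards the alternative readings, so it suits display purposes rather than disambiguation.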

Conversion Modes

The mode option controls which reading the converter prefers when a character has more than one pronunciation.

type IPinyinMode = "normal" | "surname" | "NORMAL" | "SURNAME";

Mode Details:

  • normal: Standard conversion using most common pronunciations
  • surname: Optimized for Chinese surnames and names, prioritizing surname pronunciations

Usage Examples:

// Normal mode (default)
console.log(pinyin("华夫人"));
// Result: [["huá"], ["fū"], ["rén"]]

// Surname mode - better for names
console.log(pinyin("华夫人", { mode: "surname" }));
// Result: [["huà"], ["fū"], ["rén"]]
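
The difference between the two modes can be illustrated with a toy lookup. This is only a sketch under stated assumptions, not the package's implementation: the two dictionaries are tiny hypothetical stand-ins, only the leading character is treated as a surname, and upper-case mode names are assumed to behave like their lower-case forms.

```typescript
type ToyMode = "normal" | "surname" | "NORMAL" | "SURNAME";

// Toy dictionaries for illustration only; the real package ships its own data.
const DEFAULT_READINGS: Record<string, string> = { "华": "huá", "夫": "fū", "人": "rén" };
const SURNAME_READINGS: Record<string, string> = { "华": "huà" };

function toyConvert(text: string, mode: ToyMode = "normal"): string[][] {
  // Assumption: "SURNAME" and "surname" are equivalent.
  const surnameMode = mode.toLowerCase() === "surname";
  const out: string[][] = [];
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    // Assumption: only the first character is looked up as a surname.
    const surname = surnameMode && i === 0 ? SURNAME_READINGS[ch] : undefined;
    const reading = surname !== undefined ? surname : DEFAULT_READINGS[ch];
    // Characters missing from the toy dictionary pass through unchanged.
    out.push([reading !== undefined ? reading : ch]);
  }
  return out;
}

console.log(toyConvert("华夫人"));            // [["huá"], ["fū"], ["rén"]]
console.log(toyConvert("华夫人", "surname")); // [["huà"], ["fū"], ["rén"]]
```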

Multi-pronunciation Handling

The heteronym option returns every known reading for polyphonic characters (characters with more than one valid pronunciation) instead of a single reading.

interface IPinyinOptions {
  /** Enable multiple pronunciations for polyphonic characters */
  heteronym?: boolean;
}

Usage Examples:

// Single pronunciation (default)
console.log(pinyin("中"));
// Result: [["zhōng"]]

// Multiple pronunciations
console.log(pinyin("中", { heteronym: true }));
// Result: [["zhōng", "zhòng"]]

// Multiple characters with heteronyms
console.log(pinyin("中心", { heteronym: true }));
// Result: [["zhōng", "zhòng"], ["xīn"]]
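
Once heteronym output is available, a caller may want to know which positions are actually ambiguous. The sketch below scans a result for entries with more than one reading. The helper findPolyphonic and the PolyphonicEntry type are hypothetical names introduced for illustration, and the sample data is copied from the example above.

```typescript
// Hypothetical helper, not part of the pinyin package.
interface PolyphonicEntry {
  index: number;      // position of the entry in the result array
  readings: string[]; // all readings reported for that entry
}

// Returns the entries that carry more than one reading.
function findPolyphonic(result: string[][]): PolyphonicEntry[] {
  const found: PolyphonicEntry[] = [];
  result.forEach((readings, index) => {
    if (readings.length > 1) {
      found.push({ index, readings });
    }
  });
  return found;
}

// Sample data copied from the "中心" example; no library call is made here.
const heteronymSample: string[][] = [["zhōng", "zhòng"], ["xīn"]];
console.log(findPolyphonic(heteronymSample));
// [{ index: 0, readings: ["zhōng", "zhòng"] }]
```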

Phrase Grouping

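The group option controls whether the Pinyin for a recognized phrase is joined into a single string or returned one character at a time. The examples below also enable segment so that phrase boundaries are detected.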

interface IPinyinOptions {
  /** Group phrase Pinyin together instead of character-by-character */
  group?: boolean;
}

Usage Examples:

// Character-by-character (default)
console.log(pinyin("我喜欢你", { segment: true }));
// Result: [["wǒ"], ["xǐ"], ["huān"], ["nǐ"]]

// Grouped phrases
console.log(pinyin("我喜欢你", { segment: true, group: true }));
// Result: [["wǒ"], ["xǐhuān"], ["nǐ"]]
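
Conceptually, grouping merges per-character readings along phrase boundaries. The sketch below shows that idea with a hypothetical helper, groupBySegments, which is not the package's implementation. It assumes the segment list is supplied by the caller, that each character has a single reading, and that every character occupies one UTF-16 code unit.

```typescript
// Hypothetical sketch: merge per-character readings into per-phrase readings.
// Assumes the segments, concatenated, cover the same characters in the same
// order as perChar, with one reading per character.
function groupBySegments(perChar: string[][], segments: string[]): string[][] {
  const grouped: string[][] = [];
  let cursor = 0;
  for (const segment of segments) {
    const size = segment.length; // assumes one UTF-16 unit per character
    const slice = perChar.slice(cursor, cursor + size);
    // Join the first reading of each character in this segment.
    grouped.push([slice.map((readings) => readings[0]).join("")]);
    cursor += size;
  }
  return grouped;
}

// Sample data copied from the "我喜欢你" example; the segment list is assumed.
const perCharSample: string[][] = [["wǒ"], ["xǐ"], ["huān"], ["nǐ"]];
console.log(groupBySegments(perCharSample, ["我", "喜欢", "你"]));
// [["wǒ"], ["xǐhuān"], ["nǐ"]]
```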

Compact Output

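With compact enabled alongside heteronym, the result changes shape: instead of one sub-array of alternative readings per character, each sub-array is one complete reading of the whole input, and the outer array lists every combination.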

interface IPinyinOptions {
  /** Return all possible pronunciation combinations in compact form */
  compact?: boolean;
}

Usage Examples:

// Normal heteronym output
console.log(pinyin("你好吗", { heteronym: true }));
// Result: [["nǐ"], ["hǎo", "hào"], ["ma", "má", "mǎ"]]

// Compact combinations
console.log(pinyin("你好吗", { heteronym: true, compact: true }));
// Result: [
//   ["nǐ", "hǎo", "ma"], ["nǐ", "hǎo", "má"], ["nǐ", "hǎo", "mǎ"],
//   ["nǐ", "hào", "ma"], ["nǐ", "hào", "má"], ["nǐ", "hào", "mǎ"]
// ]
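
The compact shape shown above is the Cartesian product of the per-character readings. The sketch below reproduces it from the heteronym output; combineReadings is a hypothetical helper written for illustration and is not claimed to be how the package computes the result.

```typescript
// Hypothetical helper: expand per-character readings into every full
// combination (a Cartesian product), preserving left-to-right order.
function combineReadings(perChar: string[][]): string[][] {
  if (perChar.length === 0) {
    return []; // nothing to combine
  }
  return perChar.reduce<string[][]>(
    (combos, readings) => {
      const next: string[][] = [];
      for (const combo of combos) {
        for (const reading of readings) {
          next.push(combo.concat([reading])); // extend each partial combination
        }
      }
      return next;
    },
    [[]] // start from a single empty combination
  );
}

// Sample data copied from the heteronym example for "你好吗".
const compactSample: string[][] = [["nǐ"], ["hǎo", "hào"], ["ma", "má", "mǎ"]];
console.log(combineReadings(compactSample).length); // 6
```

The number of combinations is the product of the reading counts (here 1 × 2 × 3 = 6), so the output can grow quickly for long inputs with many polyphonic characters.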

Error Handling

The function handles various input edge cases gracefully:

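  • Empty string: returns an empty array, [].
  • Non-string input: returns an empty array, [].
  • Non-Chinese characters: returned unchanged, with a consecutive run kept together as one entry (for example, "hello" becomes [["hello"]]).
  • Mixed Chinese and non-Chinese text: Chinese characters are converted and everything else is preserved in place.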

Examples:

console.log(pinyin(""));           // []
console.log(pinyin("hello"));      // [["hello"]]
console.log(pinyin("你好world"));   // [["nǐ"], ["hǎo"], ["world"]]
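
The output shapes above suggest that input is split into single Chinese characters and unbroken runs of everything else. The sketch below mimics that splitting step only. It is an illustration under assumptions: splitRuns and isHanChar are hypothetical names, and "Chinese" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), which may differ from the package's own detection.

```typescript
// Assumption: a character is "Chinese" if it falls in U+4E00..U+9FFF.
function isHanChar(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return code >= 0x4e00 && code <= 0x9fff;
}

// Hypothetical helper: split input into single Han characters and
// unbroken runs of non-Han text. Non-string or empty input yields [].
function splitRuns(input: unknown): string[] {
  if (typeof input !== "string" || input.length === 0) {
    return [];
  }
  const parts: string[] = [];
  let buffer = ""; // accumulates the current non-Han run
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (isHanChar(ch)) {
      if (buffer.length > 0) {
        parts.push(buffer); // flush the pending non-Han run
        buffer = "";
      }
      parts.push(ch);
    } else {
      buffer += ch;
    }
  }
  if (buffer.length > 0) {
    parts.push(buffer);
  }
  return parts;
}

console.log(splitRuns("你好world")); // ["你", "好", "world"]
console.log(splitRuns(42));          // []
```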

Install with Tessl CLI

npx tessl i tessl/npm-pinyin
