tessl/npm-natural

Comprehensive natural language processing library with tokenization, stemming, classification, sentiment analysis, phonetics, distance algorithms, and WordNet integration.

Text Transliteration

Transliterates Japanese Hiragana and Katakana characters into romanized text using the modified Hepburn system. This is useful for applications that need a Latin-script representation of kana. The output can contain macron vowels such as ō, so it is not strictly ASCII.

Capabilities

Japanese Transliteration

TransliterateJa converts Hiragana and Katakana text to roman characters. Its rules are based on the CLDR transform rule set, with fixes for missing characters and edge cases.

/**
 * Japanese text transliteration using modified Hepburn system
 * Converts Hiragana and Katakana to roman characters
 * Based on CLDR transform rule set with fixes for missing characters and edge cases
 */
class TransliterateJa {
  /** Transliterate Japanese text to romanized form */
  static transliterate(text: string): string;
}

Features:

  • Supports both Hiragana and Katakana character sets
  • Handles long vowel marks (ー) correctly
  • Processes small tsu (っ/ッ) for consonant doubling
  • Handles combination characters (for example きゃ, キャ) and diphthongs
  • Includes support for katakana middle dot (・)
  • Handles final small tsu and iteration marks
  • Supports modern extensions like small vowels and special combinations
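
The small tsu and long vowel mark features listed above can be illustrated with a minimal sketch. This is not the implementation used by natural: the function name sketchTransliterate and the tiny KANA table are hypothetical and cover only a handful of characters, and special cases such as っ before ち are left out.

```javascript
// Illustrative sketch only. KANA is a hypothetical, deliberately tiny table,
// not the rule set that natural ships with.
const KANA = {
  'か': 'ka', 'が': 'ga', 'こ': 'ko', 'た': 'ta', 'と': 'to',
  'カ': 'ka', 'コ': 'ko', 'タ': 'ta', 'ヒ': 'hi', 'プ': 'pu'
};
const MACRON = { a: 'ā', i: 'ī', u: 'ū', e: 'ē', o: 'ō' };

function sketchTransliterate(text) {
  let out = '';
  let doubleNext = false; // set when a small tsu was just seen
  for (const ch of text) {
    if (ch === 'っ' || ch === 'ッ') {
      doubleNext = true;
      continue;
    }
    if (ch === 'ー') {
      // Long vowel mark: replace the preceding vowel with its macron form.
      const last = out.slice(-1);
      if (MACRON[last]) out = out.slice(0, -1) + MACRON[last];
      continue;
    }
    const romaji = KANA[ch];
    if (romaji === undefined) {
      out += ch; // characters outside the table pass through unchanged
    } else {
      // Small tsu doubles the first consonant of the following syllable.
      out += doubleNext ? romaji[0] + romaji : romaji;
    }
    doubleNext = false;
  }
  return out;
}

console.log(sketchTransliterate('カッター')); // 'kattā'
console.log(sketchTransliterate('コーヒー')); // 'kōhī'
console.log(sketchTransliterate('コップ'));   // 'koppu'
```

A real rule set needs far more entries plus ordering between rules, which is why the library relies on a full transform table rather than a per-character loop like this one.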

Usage Examples:

const natural = require('natural');

// Basic transliteration
const result1 = natural.TransliterateJa.transliterate('こんにちは');
console.log(result1); // 'konnichiwa'

const result2 = natural.TransliterateJa.transliterate('カタカナ');
console.log(result2); // 'katakana'

// Complex examples with long vowels
const result3 = natural.TransliterateJa.transliterate('おとうさん');
console.log(result3); // 'otōsan'

const result4 = natural.TransliterateJa.transliterate('きょう');
console.log(result4); // 'kyō'

// Small tsu handling
const result5 = natural.TransliterateJa.transliterate('がっこう');
console.log(result5); // 'gakkō'

// Mixed Hiragana and Katakana (kana are romanized in sequence; no word spaces are inserted)
const result6 = natural.TransliterateJa.transliterate('ひらがなとカタカナ');
console.log(result6); // 'hiraganatokatakana'

// Modern extensions
const result7 = natural.TransliterateJa.transliterate('ファイル・システム');
console.log(result7); // 'fairu shisutemu'
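
Several of the results above contain macron vowels (ō), which are outside ASCII. If a strictly ASCII string is needed, one option is to fold the macrons away after transliteration. The sketch below uses only standard JavaScript string methods; the helper name stripMacrons is hypothetical and is not part of natural.

```javascript
// Hypothetical helper, not part of natural: folds macron vowels to plain ASCII.
function stripMacrons(romaji) {
  // NFD splits 'ō' into 'o' plus U+0304 (combining macron);
  // the regex then removes all combining diacritical marks.
  return romaji.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(stripMacrons('otōsan')); // 'otosan'
console.log(stripMacrons('gakkō'));  // 'gakko'
```

This folding is lossy: the distinction between long and short vowels disappears, so it suits search keys or slugs better than text shown to readers.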

Transliteration Rules:

The system follows the modified Hepburn romanization with these key features:

  • Long vowels: Marked with macrons (ā, ī, ū, ē, ō)
  • Double consonants: Small tsu creates doubled consonants (kk, ss, tt, etc.)
  • N handling: Special rules for the syllabic n (ん/ン) before vowels and certain consonants
  • Voiced marks: Properly handles dakuten (゛) and handakuten (゜)
  • Modern sounds: Supports katakana extensions for foreign words
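
To show why the syllabic n needs special rules, here is a minimal sketch of the usual modified Hepburn convention, in which ん/ン is written n' before a vowel or y so that it cannot be misread as the start of the next syllable. Whether natural emits the apostrophe in exactly these cases is an assumption, not something stated in this document. The helper joinWithSyllabicN and the 'N' placeholder are hypothetical.

```javascript
// Hypothetical helper: joins already-romanized syllables, where the
// placeholder 'N' stands for the syllabic n (ん/ン).
function joinWithSyllabicN(syllables) {
  let out = '';
  syllables.forEach((syl, i) => {
    if (syl !== 'N') {
      out += syl;
      return;
    }
    const next = syllables[i + 1] || '';
    // Before a vowel or y, an apostrophe keeps the n separate from the next syllable.
    out += /^[aiueoy]/.test(next) ? "n'" : 'n';
  });
  return out;
}

console.log(joinWithSyllabicN(['shi', 'N', 'ya'])); // "shin'ya" (しんや)
console.log(joinWithSyllabicN(['shi', 'nya']));     // 'shinya'  (しにゃ)
console.log(joinWithSyllabicN(['ka', 'N', 'i']));   // "kan'i"   (かんい)
console.log(joinWithSyllabicN(['ho', 'N']));        // 'hon'     (ほん)
```

Without the apostrophe, しんや and しにゃ would both romanize to shinya, which is the ambiguity the rule exists to remove.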

Character Support:

  • All standard Hiragana characters (あ-ん)
  • All standard Katakana characters (ア-ン)
  • Small characters (ぁぃぅぇぉ, ァィゥェォ, etc.)
  • Combination characters (きゃ, キャ, etc.)
  • Long vowel mark (ー)
  • Katakana middle dot (・)
  • Iteration marks and special symbols
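
Because the supported input is Hiragana and Katakana, it can help to check a string for other characters (Kanji, for example) before transliterating it. The sketch below relies only on the Unicode block ranges: Hiragana is U+3040 to U+309F and Katakana is U+30A0 to U+30FF, which also covers the middle dot (・), the long vowel mark (ー), and the iteration marks. The helper name findNonKana is hypothetical, and how natural treats characters outside these blocks is not covered here.

```javascript
// Hypothetical helper, not part of natural: lists the characters that fall
// outside the Hiragana (U+3040 to U+309F) and Katakana (U+30A0 to U+30FF) blocks.
function findNonKana(text) {
  return [...text].filter((ch) => !/[\u3040-\u30FF]/.test(ch));
}

console.log(findNonKana('ファイル・システム')); // []
console.log(findNonKana('日本のカタカナ'));     // ['日', '本']
```

An empty result means every character sits in one of the two kana blocks; anything returned would need separate handling, such as a Kanji reading lookup, before romanization.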
