tessl/npm-natural

Comprehensive natural language processing library with tokenization, stemming, classification, sentiment analysis, phonetics, distance algorithms, and WordNet integration.

Text Transliteration

Name: tessl/npm-natural
Author: tessl

Converts Japanese Hiragana and Katakana to romanized text using the modified Hepburn system. Useful for applications that need a Latin-script representation of Japanese text. Long vowels are written with macrons (ō, ū), so the output is not strictly ASCII.

Capabilities

Japanese Transliteration

Converts Hiragana and Katakana text to Latin characters. The rule set is based on the CLDR transform rules, with fixes for missing characters and edge cases.

/**
 * Japanese text transliteration using modified Hepburn system
 * Converts Hiragana and Katakana to roman characters
 * Based on CLDR transform rule set with fixes for missing characters and edge cases
 */
class TransliterateJa {
  /** Transliterate Japanese text to romanized form */
  static transliterate(text: string): string;
}

Features:

Supports both Hiragana and Katakana character sets
Handles long vowel marks (ー) correctly
Processes small tsu (っ/ッ) for consonant doubling
Manages complex character combinations and diphthongs
Includes support for katakana middle dot (・)
Handles final small tsu and iteration marks
Supports modern extensions like small vowels and special combinations
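
To make two of the features above concrete, here is a minimal sketch of how small tsu doubling and the long vowel mark can be handled in a character-by-character pass. It is not the library's implementation: `toyTransliterate`, `KANA`, and `MACRON` are hypothetical names, and the table covers only a handful of kana for illustration.

```javascript
// Hypothetical toy table: a few kana only, not the library's rule set.
const KANA = {
  'か': 'ka', 'が': 'ga', 'こ': 'ko', 'き': 'ki', 'て': 'te', 'ち': 'chi',
  'カ': 'ka', 'タ': 'ta', 'ナ': 'na', 'ラ': 'ra', 'メ': 'me', 'ン': 'n',
};

// Plain vowel -> macron vowel, used when a long vowel mark follows.
const MACRON = { a: 'ā', i: 'ī', u: 'ū', e: 'ē', o: 'ō' };

function toyTransliterate(text) {
  let out = '';
  let doubleNext = false; // set when a small tsu was just seen

  for (const ch of text) {
    if (ch === 'っ' || ch === 'ッ') {
      doubleNext = true;
      continue;
    }
    if (ch === 'ー') {
      // Lengthen the vowel that ends the output so far.
      const last = out.slice(-1);
      if (MACRON[last]) out = out.slice(0, -1) + MACRON[last];
      continue;
    }
    let roman = KANA[ch];
    if (roman === undefined) {
      // Unknown character: pass it through unchanged.
      out += ch;
      doubleNext = false;
      continue;
    }
    if (doubleNext) {
      // Hepburn writes っち as "tchi"; otherwise repeat the first consonant.
      roman = (roman.startsWith('ch') ? 't' : roman[0]) + roman;
      doubleNext = false;
    }
    out += roman;
  }
  return out;
}

console.log(toyTransliterate('ラーメン')); // 'rāmen'
console.log(toyTransliterate('カッター')); // 'kattā'
```

A real rule set also has to cover digraphs such as きょ, vowel sequences such as おう, and the iteration marks, which this sketch deliberately leaves out.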

Usage Examples:

const natural = require('natural');

// Basic transliteration
const result1 = natural.TransliterateJa.transliterate('こんにちは');
console.log(result1); // 'konnichiwa' as documented; kana-by-kana rules may instead render the final は as 'ha'

const result2 = natural.TransliterateJa.transliterate('カタカナ');
console.log(result2); // 'katakana'

// Complex examples with long vowels
const result3 = natural.TransliterateJa.transliterate('おとうさん');
console.log(result3); // 'otōsan'

const result4 = natural.TransliterateJa.transliterate('きょう');
console.log(result4); // 'kyō'

// Small tsu handling
const result5 = natural.TransliterateJa.transliterate('がっこう');
console.log(result5); // 'gakkō'

// Mixed Hiragana and Katakana
const result6 = natural.TransliterateJa.transliterate('ひらがなとカタカナ');
console.log(result6); // 'hiragana to katakana' as documented; the input has no word boundaries, so the spaces may be absent

// Modern extensions
const result7 = natural.TransliterateJa.transliterate('ファイル・システム');
console.log(result7); // 'fairu shisutemu'
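
The romanized results above contain macrons (ō), which fall outside ASCII. If an application needs strictly ASCII output, one option is to fold the macrons away after transliteration. The helper below is a sketch of that post-processing step using standard Unicode normalization. `toAscii` is a hypothetical name and not part of the library.

```javascript
// Hypothetical helper: strips combining diacritics from romanized text.
function toAscii(romanized) {
  return romanized
    .normalize('NFD')                 // split "ō" into "o" + combining macron
    .replace(/[\u0300-\u036f]/g, ''); // drop the combining marks
}

console.log(toAscii('gakkō'));  // 'gakko'
console.log(toAscii('otōsan')); // 'otosan'
```

Folding is lossy: the distinction between long and short vowels disappears, so it suits search keys and identifiers better than text shown to readers.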

Transliteration Rules:

The system follows the modified Hepburn romanization with these key features: