You are working with archived email messages from legacy systems that use ISO-2022 family encodings (ISO-2022-JP, ISO-2022-KR, ISO-2022-CN). These encodings use escape sequences to switch between character sets within a single document.
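
To make the role of escape sequences concrete, the sketch below scans raw bytes for a few well-known charset designators and counts them per variant. The table `DESIGNATORS` and the helper `countDesignators` are hypothetical names introduced only for illustration; the table is deliberately incomplete (for example, it omits `ESC ( B`, the return-to-ASCII sequence), and this is not how chardet is implemented.

```javascript
// Illustrative only: a partial table of ISO-2022 charset designators.
const ESC = 0x1b;
const DESIGNATORS = [
  { encoding: 'ISO-2022-JP', bytes: [ESC, 0x24, 0x42] },       // ESC $ B   (JIS X 0208-1983)
  { encoding: 'ISO-2022-JP', bytes: [ESC, 0x24, 0x40] },       // ESC $ @   (JIS X 0208-1978)
  { encoding: 'ISO-2022-JP', bytes: [ESC, 0x28, 0x4a] },       // ESC ( J   (JIS X 0201 Roman)
  { encoding: 'ISO-2022-KR', bytes: [ESC, 0x24, 0x29, 0x43] }, // ESC $ ) C (KS X 1001)
  { encoding: 'ISO-2022-CN', bytes: [ESC, 0x24, 0x29, 0x41] }, // ESC $ ) A (GB 2312)
  { encoding: 'ISO-2022-CN', bytes: [ESC, 0x24, 0x29, 0x47] }, // ESC $ ) G (CNS 11643 plane 1)
];

// Hypothetical helper: counts designator occurrences per encoding name.
function countDesignators(bytes) {
  const counts = {};
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] !== ESC) continue;
    for (const { encoding, bytes: seq } of DESIGNATORS) {
      // Reads past the end yield undefined, which never equals a byte value.
      if (seq.every((b, k) => bytes[i + k] === b)) {
        counts[encoding] = (counts[encoding] || 0) + 1;
        break;
      }
    }
  }
  return counts;
}
```

A buffer with no ESC bytes, such as plain UTF-8 text, produces an empty object, which is the situation in which `detectISO2022` is expected to return null.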

Your task is to build a utility that can:

- Detect whether email content uses ISO-2022 encoding
- Differentiate between ISO-2022-JP, ISO-2022-KR, and ISO-2022-CN variants
- Extract confidence scores for the encoding detection

Requirements

Create a module that exports the following functions:

detectISO2022(buffer) - Returns the detected ISO-2022 encoding name if found, or null otherwise
analyzeISO2022Confidence(buffer) - Returns an array of objects, one per detected ISO-2022 variant, each with the encoding name and its confidence score

Function Specifications

detectISO2022(buffer)

Input: A Buffer or Uint8Array containing the raw email bytes
Output: A string with the encoding name (e.g., 'ISO-2022-JP'), or null if no ISO-2022 encoding is detected
Should return only ISO-2022 family encodings; if the best match is any other encoding, the result is null

analyzeISO2022Confidence(buffer)

Input: A Buffer or Uint8Array containing the raw email bytes
Output: An array of objects, where each object has:
- name: The encoding name
- confidence: A number from 0 to 100
- lang: Optional language code
Should return only ISO-2022 variants in the results
Results should be sorted by confidence in descending order
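
The filtering and ordering rules above can be sketched independently of the detector. The helpers `keepISO2022` and `topISO2022` are hypothetical names, and the sketch assumes the input is already an array of `{name, confidence, lang?}` objects in the shape described above; how that array is obtained from chardet is left out.

```javascript
// Assumed set of names that count as ISO-2022 variants for this task.
const ISO2022_NAMES = new Set(['ISO-2022-JP', 'ISO-2022-KR', 'ISO-2022-CN']);

// Hypothetical helper: keeps only ISO-2022 entries, highest confidence first.
// filter() returns a new array, so the caller's input is not reordered.
function keepISO2022(results) {
  return results
    .filter((r) => ISO2022_NAMES.has(r.name))
    .sort((a, b) => b.confidence - a.confidence);
}

// Hypothetical helper: name of the best ISO-2022 match, or null if there is none.
function topISO2022(results) {
  const [best] = keepISO2022(results);
  return best ? best.name : null;
}
```

Keeping the filter in one place means both exported functions can apply the same definition of "ISO-2022 family", so they cannot disagree about which names qualify.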

Test Cases

When given a Buffer containing ISO-2022-JP encoded text with escape sequences, detectISO2022 returns 'ISO-2022-JP' @test
When given a Buffer containing UTF-8 encoded text, detectISO2022 returns null @test
When given a Buffer containing ISO-2022-KR encoded text, analyzeISO2022Confidence returns results including ISO-2022-KR with a confidence score @test
When given a Buffer containing mixed ISO-2022 variants, analyzeISO2022Confidence returns multiple ISO-2022 results sorted by confidence @test
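
Fixtures for these cases can be built from literal bytes rather than files. The sketch below is one way to do that; `makeIso2022JpSample` and `makeUtf8Sample` are hypothetical helper names. The ISO-2022-JP sample is the greeting こんにちは, wrapped in `ESC $ B` (switch to JIS X 0208) and `ESC ( B` (switch back to ASCII).

```javascript
// Hypothetical fixture: "こんにちは" encoded as ISO-2022-JP.
function makeIso2022JpSample() {
  return Buffer.from([
    0x1b, 0x24, 0x42, // ESC $ B: designate JIS X 0208
    0x24, 0x33,       // こ
    0x24, 0x73,       // ん
    0x24, 0x4b,       // に
    0x24, 0x41,       // ち
    0x24, 0x4f,       // は
    0x1b, 0x28, 0x42, // ESC ( B: return to ASCII
  ]);
}

// Hypothetical fixture: the same greeting as UTF-8, with no escape sequences.
function makeUtf8Sample() {
  return Buffer.from('こんにちは', 'utf8');
}
```

Every byte in the ISO-2022-JP sample is 7-bit, while the UTF-8 sample consists entirely of bytes at or above 0x80, so the two fixtures exercise clearly different inputs. A sample this short may still yield a low confidence score, so longer fixtures may be needed in practice.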

Implementation

@generates

API

/**
 * Detects if the input buffer contains ISO-2022 encoded text
 * @param {Buffer|Uint8Array} buffer - The raw bytes to analyze
 * @returns {string|null} The ISO-2022 encoding name or null
 */
function detectISO2022(buffer) {
  // IMPLEMENTATION HERE
}

/**
 * Analyzes confidence scores for ISO-2022 encoding variants
 * @param {Buffer|Uint8Array} buffer - The raw bytes to analyze
 * @returns {Array<{name: string, confidence: number, lang?: string}>} ISO-2022 variants with confidence scores
 */
function analyzeISO2022Confidence(buffer) {
  // IMPLEMENTATION HERE
}

module.exports = {
  detectISO2022,
  analyzeISO2022Confidence
};

Dependencies { .dependencies }

chardet { .dependency }

Provides character encoding detection with ISO-2022 escape sequence recognition support.


Legacy Email Parser

Problem