tessl/npm-linkify-it

Links recognition library with FULL unicode support for detecting high-quality link patterns in plain text

International Link Validator

Overview

Build a text processing utility that detects and validates links in multilingual content. The system should handle text containing links written in various international scripts (Latin, Cyrillic, Chinese, Arabic, etc.) and distinguish between valid linkable patterns and non-link text that may contain similar characters.

Requirements

Core Functionality

Implement a link detection system that:

  1. Detects links in international text: Process text containing links with international domain names and Unicode characters
  2. Validates links across character sets: Correctly identify links regardless of the script or character set used (Cyrillic, Chinese characters, emoji, etc.)
  3. Handles fuzzy link detection: Recognize URLs without explicit protocols (e.g., example.com or сайт.рф)
  4. Supports custom TLDs: Allow configuration to recognize domain-specific or regional top-level domains

Input/Output

  • Input: Text strings that may contain URLs with international characters
  • Output: Extracted link information including the detected URL, its position in the text, and normalized form
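One way to shape that output is sketched below. It assumes a linkify-it instance is passed in, and relies on the library's `match(text)` method, which returns an array of match objects (with `raw`, `index`, `lastIndex`, and `url` fields) or `null`. The helper name `extractLinks` and the output field names are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: `extractLinks` is not part of linkify-it.
// `linkify` is expected to be a linkify-it instance, or any object with a
// compatible `match(text)` method returning an array of matches or null.
function extractLinks(linkify, text) {
  const matches = linkify.match(text) || []; // match() yields null when nothing is found
  return matches.map((m) => ({
    detected: m.raw,    // the link exactly as it appears in the input
    start: m.index,     // offset of the first character of the link
    end: m.lastIndex,   // offset just past the last character of the link
    normalized: m.url,  // URL after the library's own normalization
  }));
}

// Usage with the real library (assumes `npm install linkify-it`):
//   const linkify = require('linkify-it')();
//   extractLinks(linkify, 'Visit our site at президент.рф for more info');
```

Passing the instance in as a parameter keeps the helper independent of how the library is configured, so the same function can serve both fuzzy and strict setups.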

Configuration

The system should support:

  • Enabling/disabling fuzzy link detection for URLs without protocols
  • Adding custom top-level domains (TLDs) for regional or specialized domains
  • Enabling/disabling IP address recognition in fuzzy mode
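The three configuration points above could be wired together roughly as follows. This is a sketch, not a prescribed design: `configureLinkify` and its `settings` shape are hypothetical, while `set({ fuzzyLink, fuzzyIP })` and `tlds(list, keepOld)` are methods of a linkify-it instance.

```javascript
// Hypothetical helper; the `settings` property names are this sketch's own.
function configureLinkify(linkify, settings = {}) {
  const { fuzzyLink = true, fuzzyIP = false, customTlds = [] } = settings;

  // fuzzyLink toggles detection of URLs without a protocol;
  // fuzzyIP toggles detection of bare IP addresses in fuzzy mode.
  linkify.set({ fuzzyLink, fuzzyIP });

  if (customTlds.length > 0) {
    // Passing `true` as the second argument merges the new TLDs into
    // the existing list instead of replacing it.
    linkify.tlds(customTlds, true);
  }
  return linkify;
}

// Usage with the real library (assumes `npm install linkify-it`):
//   const linkify = configureLinkify(require('linkify-it')(), {
//     fuzzyIP: true,
//     customTlds: ['local'],
//   });
```

Merging rather than replacing the TLD list is a deliberate choice here: replacing it would stop the instance from recognizing the domains it handled before the custom TLDs were added.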

Dependencies { .dependencies }

linkify-it { .dependency }

Provides link recognition with full Unicode support.

Test Cases

Test 1: Basic International Link Detection @test

File: test/international-links.test.js

Test: Detect a link with Cyrillic characters

const text = "Visit our site at президент.рф for more info";
// Should detect президент.рф as a valid link

Expected behavior: The system should identify президент.рф as a linkable pattern and return its position and normalized URL.

Test 2: Multiple Scripts @test

File: test/international-links.test.js

Test: Detect links in text containing multiple international scripts

const text = "Check 中文.cn and مثال.مصر and test.com";
// Should detect all three domains

Expected behavior: The system should detect all three links: 中文.cn, مثال.مصر, and test.com.
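When fewer than three links come back, a small diagnostic like the one below can show which domains were missed. The helper name `findUndetected` is hypothetical. A likely cause, to my knowledge, is the TLD list: linkify-it's built-in fuzzy list appears to cover two-letter ASCII country codes and a short set of common TLDs, so a TLD such as مصر may need to be registered with `tlds('مصر', true)`. Treat that as an assumption to verify against the installed version.

```javascript
// Hypothetical helper: reports which expected domains were NOT detected.
// `linkify` is a linkify-it instance (or any object with a compatible `match`).
function findUndetected(linkify, text, expected) {
  const matches = linkify.match(text) || []; // null means no links at all
  const found = new Set(matches.map((m) => m.raw));
  return expected.filter((domain) => !found.has(domain));
}

// Usage with the real library (assumes `npm install linkify-it`):
//   const linkify = require('linkify-it')();
//   findUndetected(
//     linkify,
//     'Check 中文.cn and مثال.مصر and test.com',
//     ['中文.cn', 'مثال.مصر', 'test.com'],
//   );
//   // An empty array means every expected domain was detected.
```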

Test 3: Custom TLD Support @test

File: test/international-links.test.js

Test: Recognize links with custom TLDs

const text = "Our internal site is intranet.local";
// After configuring 'local' as a valid TLD, should detect intranet.local

Expected behavior: After adding local as a custom TLD, the system should recognize intranet.local as a valid link.

Implementation Notes

  • Focus on correctly handling Unicode characters in domain names
  • Ensure proper normalization of international domain names
  • Consider character class boundaries when detecting link patterns
  • The solution should work with text containing mixed scripts and emoji
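On the normalization point: as far as I know, linkify-it's own normalization only prepends a schema such as `http://` to schema-less links and leaves the host in its original script. If an ASCII (Punycode) form is also wanted, one option is the standard WHATWG `URL` parser available in Node.js and browsers. The helper name `toAsciiUrl` below is hypothetical, and this is only a sketch of that extra step.

```javascript
// Hypothetical helper: converts an already schema-prefixed link to its
// ASCII form by letting the built-in URL parser encode the hostname.
function toAsciiUrl(normalizedUrl) {
  try {
    // The URL parser applies IDNA processing to the hostname, so
    // non-ASCII labels come back Punycode-encoded ("xn--...").
    return new URL(normalizedUrl).href;
  } catch {
    return null; // the input could not be parsed as an absolute URL
  }
}

// Example: the .рф TLD is encoded as "xn--p1ai".
//   toAsciiUrl('http://президент.рф');
```

Returning `null` instead of throwing lets callers keep the Unicode form from the match when conversion is not possible.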

Install with Tessl CLI

npx tessl i tessl/npm-linkify-it
