tessl/npm-linkinator

Find broken links, missing images, etc. in your HTML. Scurry around your site and find all those broken links.

Link Scanning

Name: tessl/npm-linkinator
Author: tessl

Core link checking functionality for validating URLs and local files. Provides both a promise-based batch API that resolves once the whole crawl has finished and event-driven scanning that reports each link as it is checked, with per-link failure details and optional retries.

Capabilities

Check Function

Convenience function that performs a complete link scan without manually instantiating the LinkChecker class.

/**
 * Crawl a given url or path, and return a list of visited links along with status codes
 * @param options - Configuration options for the link checking operation
 * @returns Promise resolving to crawl results with pass/fail status and individual link results
 */
function check(options: CheckOptions): Promise<CrawlResult>;

Usage Examples:

import { check } from "linkinator";

// Check a single URL
const result = await check({ path: "https://example.com" });
console.log(`Scan passed: ${result.passed}`);

// Check local directory with recursion
const localResult = await check({
  path: "./website/",
  recurse: true,
  markdown: true,
  concurrency: 50,
});

// Check multiple paths
const multiResult = await check({
  path: ["https://example.com", "https://api.example.com"],
  timeout: 10000,
});
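In a CI script, the resolved result usually has to become an exit code and a short report. The following is a minimal sketch of that step, not part of linkinator itself: `ScanOutcome`, `exitCodeFor`, and `brokenLinkLines` are hypothetical names, and `ScanOutcome` is a local stand-in that mirrors only the `passed` and `links` fields documented here, so the snippet runs without the package installed.

```typescript
// Local stand-in for the documented CrawlResult shape (assumption: only
// the fields used below are mirrored; this is not the library's type).
interface ScanOutcome {
  passed: boolean;
  links: { url: string; state: string; status?: number }[];
}

// Hypothetical helper: 0 when the scan passed, 1 otherwise.
function exitCodeFor(outcome: ScanOutcome): number {
  return outcome.passed ? 0 : 1;
}

// Hypothetical helper: one report line per broken link.
function brokenLinkLines(outcome: ScanOutcome): string[] {
  return outcome.links
    .filter((link) => link.state === "BROKEN")
    .map((link) => `${link.url} (${link.status ?? "no status"})`);
}

// Hand-written sample data standing in for a real crawl result.
const sampleOutcome: ScanOutcome = {
  passed: false,
  links: [
    { url: "https://example.com/", state: "OK", status: 200 },
    { url: "https://example.com/missing", state: "BROKEN", status: 404 },
    { url: "mailto:team@example.com", state: "SKIPPED" },
  ],
};

for (const line of brokenLinkLines(sampleOutcome)) {
  console.log(line);
}
console.log(`exit code: ${exitCodeFor(sampleOutcome)}`);
```

With a real scan, the value resolved by `check(...)` could be passed to both helpers in place of `sampleOutcome`, since the `LinkState` values are the strings `'OK'`, `'BROKEN'`, and `'SKIPPED'`.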

LinkChecker Class

Class providing event-driven link checking, with per-link progress events and finer control over the scanning process.

/**
 * Instance class used to perform a crawl job with event emission capabilities
 */
class LinkChecker extends EventEmitter {
  /**
   * Crawl given URLs or paths and return comprehensive results
   * @param options - Check options specifying what and how to scan
   * @returns Promise resolving to complete crawl results
   */
  check(options: CheckOptions): Promise<CrawlResult>;
  
  /**
   * Listen for individual link check results as they complete
   * @param event - 'link' event type
   * @param listener - Callback receiving LinkResult for each checked link
   */
  on(event: 'link', listener: (result: LinkResult) => void): this;
  
  /**
   * Listen for page scanning start events
   * @param event - 'pagestart' event type  
   * @param listener - Callback receiving URL of page being scanned
   */
  on(event: 'pagestart', listener: (link: string) => void): this;
  
  /**
   * Listen for retry attempts on failed requests
   * @param event - 'retry' event type
   * @param listener - Callback receiving retry details including timing
   */
  on(event: 'retry', listener: (details: RetryInfo) => void): this;
}

Usage Examples:

import { LinkChecker, LinkState } from "linkinator";

// Event-driven scanning with progress updates
const checker = new LinkChecker();
let checkedCount = 0;
let brokenCount = 0;

checker.on('link', (result) => {
  checkedCount++;
  if (result.state === LinkState.BROKEN) {
    brokenCount++;
    console.log(`Broken link found: ${result.url} (${result.status})`);
  }
  console.log(`Progress: ${checkedCount} links checked, ${brokenCount} broken`);
});

checker.on('pagestart', (url) => {
  console.log(`Scanning page: ${url}`);
});

checker.on('retry', (details) => {
  console.log(`Retrying ${details.url} in ${details.secondsUntilRetry} seconds`);
});

const result = await checker.check({
  path: "https://example.com",
  recurse: true,
  retry: true,
  retryErrors: true,
});
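The counters in the example above live in module-level variables. One way to keep that state contained is a small tracker object whose handler can be handed to `checker.on('link', ...)`. This is a sketch under assumptions: `createProgressTracker` and `LinkEvent` are hypothetical names, and `LinkEvent` mirrors only the `url`, `state`, and `status` fields of the documented `LinkResult`.

```typescript
// Local stand-in for the fields of LinkResult used here (assumption).
interface LinkEvent {
  url: string;
  state: string;
  status?: number;
}

// Hypothetical factory: returns a 'link' handler plus a summary accessor.
// The handler closes over its state instead of using `this`, so it can be
// passed to an event emitter without binding.
function createProgressTracker() {
  let checked = 0;
  const brokenUrls: string[] = [];

  return {
    onLink(result: LinkEvent): void {
      checked++;
      if (result.state === "BROKEN") {
        brokenUrls.push(result.url);
      }
    },
    summary(): { checked: number; broken: number; brokenUrls: string[] } {
      // Return a copy so callers cannot mutate the tracker's internal list.
      return { checked, broken: brokenUrls.length, brokenUrls: [...brokenUrls] };
    },
  };
}

// Simulated events standing in for what a checker would emit.
const tracker = createProgressTracker();
tracker.onLink({ url: "https://example.com/", state: "OK", status: 200 });
tracker.onLink({ url: "https://example.com/gone", state: "BROKEN", status: 410 });
console.log(tracker.summary());
```

Assuming a `LinkChecker` instance named `checker`, the wiring would be `checker.on('link', tracker.onLink)`, with `tracker.summary()` read after `checker.check(...)` resolves.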

Link State Management

Link checking results are categorized into distinct states for easy filtering and processing.

/**
 * Enumeration of possible link states after checking
 */
enum LinkState {
  /** Link is accessible and returned a successful response */
  OK = 'OK',
  /** Link is broken, inaccessible, or returned an error response */
  BROKEN = 'BROKEN',
  /** Link was skipped due to filtering rules or unsupported protocol */
  SKIPPED = 'SKIPPED',
}

Usage Examples:

import { check, LinkState } from "linkinator";

const result = await check({ path: "https://example.com" });

// Filter results by state
const okLinks = result.links.filter(link => link.state === LinkState.OK);
const brokenLinks = result.links.filter(link => link.state === LinkState.BROKEN);
const skippedLinks = result.links.filter(link => link.state === LinkState.SKIPPED);

console.log(`✓ ${okLinks.length} working links`);
console.log(`✗ ${brokenLinks.length} broken links`);
console.log(`⊘ ${skippedLinks.length} skipped links`);

// Process broken links
brokenLinks.forEach(link => {
  console.log(`${link.url} (${link.status}) - found in ${link.parent}`);
  if (link.failureDetails) {
    console.log(`  Error details: ${link.failureDetails.length} failures`);
  }
});
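Because each result carries an optional `parent`, broken links can also be grouped by the page that contains them, which tends to be the more useful view when fixing a site. A minimal sketch follows, assuming a local `CheckedLink` type that mirrors the documented `LinkResult` fields; `groupBrokenByParent` and the `"(no parent)"` bucket label are illustrative choices, not part of the library.

```typescript
// Local stand-in for the LinkResult fields used here (assumption).
interface CheckedLink {
  url: string;
  state: string;
  status?: number;
  parent?: string;
}

// Hypothetical helper: map each parent page to the broken links found on it.
function groupBrokenByParent(links: CheckedLink[]): Map<string, CheckedLink[]> {
  const byParent = new Map<string, CheckedLink[]>();
  for (const link of links) {
    if (link.state !== "BROKEN") continue;
    // Links without a parent (e.g. the starting URL) share one bucket.
    const key = link.parent ?? "(no parent)";
    const bucket = byParent.get(key) ?? [];
    bucket.push(link);
    byParent.set(key, bucket);
  }
  return byParent;
}

// Hand-written sample data standing in for result.links.
const sampleLinks: CheckedLink[] = [
  { url: "https://example.com/a", state: "BROKEN", status: 404, parent: "https://example.com/" },
  { url: "https://example.com/b", state: "OK", status: 200, parent: "https://example.com/" },
  { url: "https://example.com/c", state: "BROKEN", status: 500, parent: "https://example.com/docs" },
];

for (const [parent, broken] of groupBrokenByParent(sampleLinks)) {
  console.log(`${parent}: ${broken.map((link) => link.url).join(", ")}`);
}
```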

Types

interface CrawlResult {
  /** Whether the scan passed (no broken links found) */
  passed: boolean;
  /** Array of results for each link that was checked */
  links: LinkResult[];
}

interface LinkResult {
  /** The URL that was checked */
  url: string;
  /** HTTP status code if available */
  status?: number;
  /** Current state of the link (OK/BROKEN/SKIPPED) */
  state: LinkState;
  /** Parent URL that contained this link */
  parent?: string;
  /** Detailed error information for failed links */
  failureDetails?: Array<Error | GaxiosResponse>;
}

interface RetryInfo {
  /** URL being retried */
  url: string;
  /** Number of seconds until the retry attempt */
  secondsUntilRetry: number;
  /** HTTP status code that triggered the retry */
  status: number;
}

interface GaxiosResponse {
  status: number;
  statusText: string;
  headers: Record<string, string>;
  data: any;
  config: any;
  request?: any;
}

interface ParsedUrl {
  /** The original link string that was parsed */
  link: string;
  /** Any error that occurred during URL parsing */
  error?: Error;
  /** The successfully parsed URL object */
  url?: URL;
}
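Since `failureDetails` is typed as `Array<Error | GaxiosResponse>`, each entry needs to be narrowed before its fields are read. The sketch below shows one way to do that with `instanceof`; `ResponseLike` is a local stand-in carrying only `status` and `statusText` from the `GaxiosResponse` shape above, and `describeFailure` is a hypothetical helper name.

```typescript
// Local stand-in for the GaxiosResponse fields used here (assumption).
interface ResponseLike {
  status: number;
  statusText: string;
}

// Hypothetical helper: turn one failureDetails entry into a readable line.
function describeFailure(detail: Error | ResponseLike): string {
  if (detail instanceof Error) {
    // Network-level failures (DNS, refused connection, timeout) arrive as Errors.
    return `error: ${detail.message}`;
  }
  // Otherwise the entry is a response object carrying an HTTP status.
  return `HTTP ${detail.status} ${detail.statusText}`.trim();
}

// Hand-written sample entries standing in for link.failureDetails.
const sampleDetails: Array<Error | ResponseLike> = [
  new Error("connect ECONNREFUSED"),
  { status: 404, statusText: "Not Found" },
];

for (const detail of sampleDetails) {
  console.log(describeFailure(detail));
}
```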