
tessl/npm-sitemap

Sitemap-generating library and CLI tool for creating XML sitemaps that comply with the sitemaps.org protocol


Sitemap Index Generation

Streaming classes for creating sitemap index files and managing multiple sitemap files on large websites. SitemapIndexStream writes the index itself; SitemapAndIndexStream also writes the individual sitemap files, starts a new file whenever the per-file item limit is reached, and records each file in the index.

Capabilities

SitemapIndexStream

Transform stream for generating sitemap index XML that references multiple sitemap files.

/**
 * Transform stream that takes IndexItems or sitemap URL strings 
 * and outputs sitemap index XML
 * ⚠️ Must be read (piped) before 'finish' event will be emitted
 */
class SitemapIndexStream extends Transform {
  constructor(opts?: SitemapIndexStreamOptions);
  
  /** Whether to output only date portion of lastmod */
  lastmodDateOnly: boolean;
  
  /** Error handling level */
  level: ErrorLevel;
  
  /** XSL stylesheet URL */
  xslUrl?: string;
  
  /** Whether XML header has been output */
  private hasHeadOutput: boolean;
}

interface SitemapIndexStreamOptions extends TransformOptions {
  /** Whether to output the lastmod date only (no time) */
  lastmodDateOnly?: boolean;
  
  /** How to handle errors in passed in urls */
  level?: ErrorLevel;
  
  /** URL to an XSL stylesheet to include in the XML */
  xslUrl?: string;
}
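
As a rough illustration of what the lastmodDateOnly option implies for the output, the sketch below normalizes a lastmod value to an ISO 8601 string and optionally keeps only the date portion. The helper name formatLastmod is hypothetical, and the slicing approach is an assumption about the behavior described above, not the library's own code.

```typescript
// Hypothetical helper: models the effect of `lastmodDateOnly` on a lastmod value.
// Assumption: values are normalized to ISO 8601 (UTC) before optional truncation.
function formatLastmod(lastmod: string | Date, dateOnly: boolean): string {
  // toISOString() throws a RangeError if the input is not a valid date
  const iso = new Date(lastmod).toISOString();
  // The first 10 characters of an ISO string are the YYYY-MM-DD date
  return dateOnly ? iso.slice(0, 10) : iso;
}

console.log(formatLastmod("2023-01-01T12:30:00.000Z", false)); // 2023-01-01T12:30:00.000Z
console.log(formatLastmod("2023-01-01T12:30:00.000Z", true));  // 2023-01-01
```

Both forms are valid W3C Datetime values, so either can appear in a lastmod element.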

Usage Examples:

import { SitemapIndexStream } from "sitemap";
import { createWriteStream } from "fs";

// Create the index stream; keep full timestamps in lastmod
const sitemapIndex = new SitemapIndexStream({
  lastmodDateOnly: false
});

// Pipe to a file first so the stream is being read while entries are written
sitemapIndex.pipe(createWriteStream("sitemap-index.xml"));

// Add sitemap references, either as an IndexItem or as a plain URL string
sitemapIndex.write({
  url: "https://example.com/sitemap-1.xml",
  lastmod: "2023-01-01T00:00:00.000Z"
});

sitemapIndex.write("https://example.com/sitemap-2.xml");

// Signal that no more entries will be written
sitemapIndex.end();

SitemapAndIndexStream

Advanced stream that automatically creates multiple sitemap files and generates an index when URL limits are reached.

/**
 * Transform stream that takes sitemap items, writes them to sitemap files,
 * adds the sitemap files to a sitemap index, and creates new sitemap files
 * when the count limit is reached
 * ⚠️ Must be read (piped) before 'finish' event will be emitted
 */
class SitemapAndIndexStream extends SitemapIndexStream {
  constructor(opts: SitemapAndIndexStreamOptions);
  
  /** Number of items written to current sitemap */
  private itemsWritten: number;
  
  /** Function to get new sitemap streams */
  private getSitemapStream: getSitemapStreamFunc;
  
  /** Current sitemap being written to */
  private currentSitemap?: SitemapStream;
  
  /** Maximum items per sitemap file */
  private limit: number;
  
  /** Current sitemap write stream */
  private currentSitemapPipeline?: WriteStream;
}

interface SitemapAndIndexStreamOptions extends SitemapIndexStreamOptions {
  /** Max number of items in each sitemap XML file (1-50,000) */
  limit?: number;
  
  /** Callback that creates a new sitemap stream for a given sitemap index */
  getSitemapStream: getSitemapStreamFunc;
}

type getSitemapStreamFunc = (
  i: number
) => [IndexItem | string, SitemapStream, WriteStream];
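
To make the rotation easier to picture, here is a simplified, library-free model of how items could be grouped into files of at most limit entries, with a naming callback invoked once per file. The function planSitemaps and its nameFor parameter are hypothetical, and the zero-based index is an assumption taken from the sitemap-0.xml file name in the XML output example; the real class streams items rather than slicing an array.

```typescript
// Simplified model of sitemap rotation (not the library's implementation).
// `nameFor` plays the role of getSitemapStream's index argument.
function planSitemaps<T>(
  items: readonly T[],
  limit: number,
  nameFor: (i: number) => string
): Map<string, T[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const files = new Map<string, T[]>();
  // Start a new file every `limit` items; `i` counts files from 0 (assumed)
  for (let start = 0, i = 0; start < items.length; start += limit, i++) {
    files.set(nameFor(i), items.slice(start, start + limit));
  }
  return files;
}

const plan = planSitemaps(
  ["/a", "/b", "/c", "/d", "/e"],
  2,
  (i) => `sitemap-${i}.xml`
);
console.log(Array.from(plan.keys()));
// [ 'sitemap-0.xml', 'sitemap-1.xml', 'sitemap-2.xml' ]
```

With five items and a limit of two, the last file holds a single entry, which matches the general rule that ceil(total / limit) files are produced.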

Usage Examples:

import { SitemapAndIndexStream, SitemapStream } from "sitemap";
import { createWriteStream } from "fs";
import { Readable } from "stream";

const sitemapAndIndex = new SitemapAndIndexStream({
  limit: 50000,
  // Called with the index of each new sitemap file
  getSitemapStream: (i) => {
    const sitemapStream = new SitemapStream({
      hostname: "https://example.com"
    });
    
    const path = `./sitemap-${i}.xml`;
    const writeStream = createWriteStream(path);
    
    return [
      `https://example.com/sitemap-${i}.xml`, // URL recorded in the index
      sitemapStream,                          // stream that receives the items
      sitemapStream.pipe(writeStream)         // write stream for the file on disk
    ];
  }
});

// Input large number of URLs
const urls = Array.from({ length: 100000 }, (_, i) => ({
  url: `/page-${i}`,
  changefreq: "weekly" as const,
  priority: 0.5
}));

// Stream will automatically create multiple sitemaps
const urlStream = Readable.from(urls);

urlStream.pipe(sitemapAndIndex);
sitemapAndIndex.pipe(createWriteStream("sitemap-index.xml"));

Gzip Support Example

import { SitemapAndIndexStream, SitemapStream } from "sitemap";
import { createWriteStream } from "fs";
import { createGzip } from "zlib";

const sitemapAndIndex = new SitemapAndIndexStream({
  limit: 45000,
  getSitemapStream: (i) => {
    const sitemapStream = new SitemapStream({
      hostname: "https://example.com"
    });
    
    const path = `./sitemap-${i}.xml.gz`;
    const writeStream = sitemapStream
      .pipe(createGzip())
      .pipe(createWriteStream(path));
    
    return [
      `https://example.com/sitemap-${i}.xml.gz`,
      sitemapStream,
      writeStream
    ];
  }
});

Index Item Types

interface IndexItem {
  /** URL of the sitemap file */
  url: string;
  
  /** Last modification date of the sitemap file */
  lastmod?: string;
}

enum IndexTagNames {
  sitemap = 'sitemap',
  sitemapindex = 'sitemapindex',
  loc = 'loc',
  lastmod = 'lastmod'
}
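
Because the index stream accepts either an IndexItem or a bare URL string, it can help to normalize both forms into one shape before further processing. The sketch below shows one way to do that; toIndexItem is a hypothetical helper, and the interface is a local copy of the one above so the snippet stands alone.

```typescript
// Local copy of the IndexItem shape documented above
interface IndexItem {
  url: string;
  lastmod?: string;
}

// Hypothetical helper: accept either input form and always return an IndexItem
function toIndexItem(input: IndexItem | string): IndexItem {
  return typeof input === "string" ? { url: input } : input;
}

console.log(toIndexItem("https://example.com/sitemap-2.xml"));
// { url: 'https://example.com/sitemap-2.xml' }
console.log(
  toIndexItem({
    url: "https://example.com/sitemap-1.xml",
    lastmod: "2023-01-01T00:00:00.000Z"
  })
);
```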

Best Practices

Sitemap Limits

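  • Keep individual sitemaps at or under 50,000 URLs, the maximum allowed by the sitemaps.org protocol
  • Keep each sitemap file under 50MB uncompressed (10MB recommended for faster processing)
  • Use SitemapAndIndexStream for large sites whose URLs do not fit in a single sitemap
  • Consider gzip compression to reduce transfer size; the 50MB limit still applies to the uncompressed file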
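
The size guidance above can be checked with a few lines of Node.js. This sketch measures a sitemap's uncompressed and gzipped byte sizes and compares the uncompressed size with the 50MB limit; sitemapSizes and the sample data are hypothetical and only meant to show the measurement.

```typescript
import { gzipSync } from "zlib";

// 50MB as defined by the sitemaps.org protocol (52,428,800 bytes)
const MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: report raw and gzipped sizes of a sitemap document
function sitemapSizes(xml: string): {
  raw: number;
  gzipped: number;
  withinLimit: boolean;
} {
  const raw = Buffer.byteLength(xml, "utf8");
  const gzipped = gzipSync(xml).length;
  // The limit applies to the uncompressed size, not the gzipped size
  return { raw, gzipped, withinLimit: raw <= MAX_UNCOMPRESSED_BYTES };
}

// Build a small sample sitemap with 1,000 entries
const entries = Array.from(
  { length: 1000 },
  (_, i) => `<url><loc>https://example.com/page-${i}</loc></url>`
).join("");
const sample =
  `<?xml version="1.0" encoding="UTF-8"?>` +
  `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">${entries}</urlset>`;

const sizes = sitemapSizes(sample);
console.log(sizes);
```

Sitemap XML is highly repetitive, so the gzipped size is typically a small fraction of the raw size.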

Error Handling

import { SitemapAndIndexStream, ErrorLevel } from "sitemap";

const sitemapAndIndex = new SitemapAndIndexStream({
  limit: 50000,
  level: ErrorLevel.WARN, // Log warnings but continue processing
  getSitemapStream: (i) => {
    // ... stream creation logic
  }
});

// Handle stream errors
sitemapAndIndex.on('error', (error) => {
  console.error('Sitemap generation error:', error);
});
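
The level option decides what happens when an invalid entry is encountered. The sketch below models that decision with a local stand-in type rather than the library's ErrorLevel enum. It assumes three levels (silent, warn and throw); the string values and the reportInvalid helper are hypothetical.

```typescript
// Stand-in for the library's ErrorLevel; the three levels are an assumption
type Level = "silent" | "warn" | "throw";

// Hypothetical helper: react to a validation problem according to the level
function reportInvalid(message: string, level: Level, warnings: string[]): void {
  switch (level) {
    case "throw":
      // Abort processing by raising an error
      throw new Error(message);
    case "warn":
      // Record the problem and keep going
      warnings.push(message);
      break;
    case "silent":
      // Ignore the problem entirely
      break;
  }
}

const warnings: string[] = [];
reportInvalid("entry is missing a URL", "silent", warnings);
reportInvalid("entry is missing a URL", "warn", warnings);
console.log(warnings.length); // 1
```

A throw-style level suits build pipelines where a bad URL should fail the run, while a warn-style level suits best-effort generation from untrusted data.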

XML Output Structure

SitemapIndexStream generates XML with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-0.xml</loc>
    <lastmod>2023-01-01T00:00:00.000Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-1.xml</loc>
    <lastmod>2023-01-01T12:00:00.000Z</lastmod>
  </sitemap>
</sitemapindex>
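
For readers who want to see how that structure maps onto IndexItem values, here is a hand-rolled renderer that produces the same layout. It is a sketch, not the library's implementation: renderIndex and escapeXml are hypothetical names, and the whitespace emitted by the actual stream may differ.

```typescript
// Local copy of the IndexItem shape documented above
interface IndexItem {
  url: string;
  lastmod?: string;
}

// Escape the five characters that are special in XML text and attributes
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical renderer: builds a sitemap index document from IndexItems
function renderIndex(items: readonly IndexItem[]): string {
  const body = items
    .map(({ url, lastmod }) => {
      // lastmod is optional, so the tag is only emitted when a value exists
      const lastmodTag = lastmod
        ? `\n    <lastmod>${escapeXml(lastmod)}</lastmod>`
        : "";
      return `  <sitemap>\n    <loc>${escapeXml(url)}</loc>${lastmodTag}\n  </sitemap>`;
    })
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${body}\n</sitemapindex>`
  );
}

console.log(
  renderIndex([
    { url: "https://example.com/sitemap-0.xml", lastmod: "2023-01-01T00:00:00.000Z" },
    { url: "https://example.com/sitemap-1.xml?a=1&b=2" }
  ])
);
```

Escaping matters because sitemap URLs often contain ampersands in query strings, which must appear as entity references inside the loc element.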
