
tessl/npm-sitemap

Sitemap-generating library and CLI tool for creating XML sitemaps that comply with the sitemaps.org protocol


CLI Interface

The sitemap CLI generates, validates, and parses sitemaps from the command line, with no code required.

Installation

The CLI is automatically available when you install the sitemap package:

npm install -g sitemap

Or use with npx without global installation:

npx sitemap [options] [file]

Basic Usage

Generate Sitemap from URL List

The most common use case is generating a sitemap from a list of URLs:

# From file
npx sitemap urls.txt > sitemap.xml

# From stdin (printf interprets \n portably, unlike echo -e)
printf '/page1\n/page2\n/page3\n' | npx sitemap > sitemap.xml

# With a base URL prepended to each relative path
printf '/page1\n/page2\n' | npx sitemap --prepend "https://example.com" > sitemap.xml
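
URL lists exported from other tools often contain blank lines or repeated entries. The following is a minimal sketch of a pre-processing step that could run before the CLI. The helper name clean_urls is hypothetical and not part of the sitemap package; it relies only on POSIX awk.

```shell
# clean_urls: read URLs on stdin, drop blank lines and repeated entries,
# and keep the first occurrence of each URL in its original order.
# (Hypothetical helper, not provided by the sitemap package.)
clean_urls() {
  # NF is 0 for blank or whitespace-only lines; seen[] records lines already printed.
  awk 'NF && !seen[$0]++'
}

# Usage (assumes urls.txt exists in the current directory):
#   clean_urls < urls.txt | npx sitemap --prepend "https://example.com" > sitemap.xml
```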

Validate Existing Sitemap

Validate an existing sitemap against the XML schema (requires xmllint; see Requirements):

# Validate sitemap file
npx sitemap --validate sitemap.xml

# Validate from URL
curl -s https://example.com/sitemap.xml | npx sitemap --validate

Parse Sitemap to JSON

Convert XML sitemap to JSON format for analysis:

# Parse to JSON array
npx sitemap --parse sitemap.xml > sitemap.json

# Parse to line-separated JSON (one object per line)
npx sitemap --parse --single-line-json sitemap.xml > sitemap.jsonl
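
Line-separated output is convenient because ordinary line-oriented tools can process it. As a rough sketch, the hypothetical helper extract_urls below pulls the page URL out of each line with POSIX awk. It assumes one JSON object per line, that the page-level "url" key appears before any nested "url" keys (for example inside image entries), and that URLs contain no escaped double quotes. A real JSON tool such as jq is the more robust choice when it is available.

```shell
# extract_urls: print the first "url" value found on each input line.
# (Hypothetical helper; assumes one JSON object per line.)
extract_urls() {
  awk 'match($0, /"url": *"[^"]*"/) {
    entry = substr($0, RSTART, RLENGTH)   # the leftmost "url" key with its quoted value
    sub(/^"url": *"/, "", entry)          # strip the key and the opening quote
    sub(/"$/, "", entry)                  # strip the closing quote
    print entry
  }'
}

# Usage (assumes sitemap.jsonl was produced by the command above):
#   extract_urls < sitemap.jsonl | sort > all-urls.txt
```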

Command Line Options

npx sitemap [options] [input-file]

Options

interface CLIOptions {
  /** Show help information */
  '--help': boolean;
  '-h': boolean; // alias for --help
  
  /** Show version information */
  '--version': boolean;
  
  /** Validate XML sitemap against schema (requires xmllint) */
  '--validate': boolean;
  
  /** Parse existing sitemap to JSON format */
  '--parse': boolean;
  
  /** Output JSON as single-line format (one object per line) */
  '--single-line-json': boolean;
  
  /** Generate sitemap index with multiple sitemap files */
  '--index': boolean;
  
  /** Base URL for sitemap index */
  '--index-base-url': string;
  
  /** Maximum URLs per sitemap file (default: 50000) */
  '--limit': number;
  
  /** Prepend hostname to relative URLs */
  '--prepend': string;
  
  /** Compress output with gzip */
  '--gzip': boolean;
}

Advanced Usage Examples

Generate Large Sitemap with Index

The sitemaps.org protocol allows at most 50,000 URLs per sitemap file. For larger sites, generate multiple sitemap files plus an index that references them:

# Create sitemap index with max 10,000 URLs per file
npx sitemap --index --limit 10000 --index-base-url "https://example.com" large-urls.txt

# This creates:
# - sitemap-0.xml
# - sitemap-1.xml  
# - sitemap-2.xml
# - sitemap-index.xml
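
The number of sitemap files follows from the URL count and the --limit value: it is the URL count divided by the limit, rounded up. A small sketch of that arithmetic, using the hypothetical helpers count_urls and sitemap_file_count (neither is part of the sitemap package):

```shell
# count_urls FILE: number of non-blank lines, i.e. input entries.
count_urls() {
  grep -c '[^[:space:]]' "$1"
}

# sitemap_file_count URLS LIMIT: ceil(URLS / LIMIT) with integer arithmetic.
sitemap_file_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Usage (assumes large-urls.txt exists):
#   sitemap_file_count "$(count_urls large-urls.txt)" 10000
```

For instance, 25,000 URLs with a limit of 10,000 give three files, which would correspond to sitemap-0.xml through sitemap-2.xml in the listing above.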

Process JSON Input

Input lines can also be JSON objects that carry per-URL metadata such as changefreq and priority:

# Create JSON file with sitemap items
cat > urls.json << 'EOF'
{"url": "/page1", "changefreq": "daily", "priority": 0.8}
{"url": "/page2", "changefreq": "weekly", "priority": 0.6}
{"url": "/page3", "priority": 0.4}
EOF

# Generate sitemap from JSON
npx sitemap --prepend "https://example.com" urls.json > sitemap.xml
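
When the starting point is a plain list of paths, the JSON lines can be produced mechanically. The sketch below uses a hypothetical helper, urls_to_json_lines, that applies one changefreq and one priority to every path on stdin. It assumes the two arguments are simple values such as daily and 0.8 (no slashes or ampersands), and it escapes backslashes and double quotes in the paths.

```shell
# urls_to_json_lines CHANGEFREQ PRIORITY: wrap each path read from stdin
# in a one-line JSON object. (Hypothetical helper, not part of the package.)
urls_to_json_lines() {
  changefreq=$1
  priority=$2
  sed -e '/^$/d' \
      -e 's/\\/\\\\/g' \
      -e 's/"/\\"/g' \
      -e "s/.*/{\"url\": \"&\", \"changefreq\": \"$changefreq\", \"priority\": $priority}/"
}

# Usage (assumes urls.txt exists):
#   urls_to_json_lines daily 0.8 < urls.txt > urls.json
```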

Compressed Output

Generate compressed sitemaps to save bandwidth:

# Generate compressed sitemap
npx sitemap --gzip --prepend "https://example.com" urls.txt > sitemap.xml.gz

# Generate compressed index and sitemaps
npx sitemap --index --gzip --limit 25000 \
  --index-base-url "https://example.com" \
  --prepend "https://example.com" \
  large-urls.txt

Validation Workflows

# Validate multiple sitemaps
for sitemap in sitemap-*.xml; do
  echo "Validating $sitemap..."
  npx sitemap --validate "$sitemap" || echo "❌ $sitemap is invalid"
done

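# Parse, filter, and regenerate pipeline
# (jq -c emits one compact JSON object per line, the line-separated form the CLI reads)
npx sitemap --parse sitemap.xml | \
  jq -c '.[] | select(.priority > 0.5)' | \
  npx sitemap --prepend "https://example.com" > high-priority.xml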
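
The validation loop above prints a message for each invalid file but still exits successfully, which can hide failures in scripts. One possible variant, sketched here with the hypothetical helpers validate_one and validate_all, returns a non-zero status if any file fails. Only validate_one touches the CLI, so it can be swapped out.

```shell
# validate_one FILE: the single place that invokes the sitemap CLI.
validate_one() {
  npx sitemap --validate "$1"
}

# validate_all FILE...: validate every file, report each failure on stderr,
# and return non-zero if at least one file failed.
validate_all() {
  failed=0
  for file in "$@"; do
    if ! validate_one "$file"; then
      echo "invalid: $file" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# Usage (assumes sitemap-*.xml files exist in the current directory):
#   validate_all sitemap-*.xml || exit 1
```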

Input Formats

Plain Text URLs

/page1
/page2
/about
/contact

JSON Lines Format

{"url": "/page1", "changefreq": "daily", "priority": 0.8}
{"url": "/page2", "changefreq": "weekly", "priority": 0.6}
{"url": "/page3", "lastmod": "2023-01-15"}

JSON Array Format

[
  {"url": "/page1", "changefreq": "daily", "priority": 0.8},
  {"url": "/page2", "changefreq": "weekly", "priority": 0.6},
  {"url": "/page3", "lastmod": "2023-01-15"}
]

Output Examples

Standard Sitemap XML

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page1</loc>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/page2</loc>
    <changefreq>weekly</changefreq>
    <priority>0.6</priority>
  </url>
</urlset>

Sitemap Index XML

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-0.xml</loc>
    <lastmod>2023-01-15T10:30:00.000Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-1.xml</loc>
    <lastmod>2023-01-15T10:30:00.000Z</lastmod>
  </sitemap>
</sitemapindex>

Parsed JSON Output

[
  {
    "url": "https://example.com/page1",
    "changefreq": "daily", 
    "priority": 0.8,
    "img": [],
    "video": [],
    "links": []
  },
  {
    "url": "https://example.com/page2",
    "changefreq": "weekly",
    "priority": 0.6,
    "img": [],
    "video": [], 
    "links": []
  }
]

Error Handling

The CLI provides detailed error messages for common issues:

# Missing URL error
echo '{"priority": 0.8}' | npx sitemap
# Error: URL is required for sitemap items

# Invalid priority error  
echo '{"url": "/test", "priority": 2.0}' | npx sitemap
# Error: Priority must be between 0.0 and 1.0

# Validation error
npx sitemap --validate invalid-sitemap.xml
# Error: XML validation failed: element url: Schemas validity error
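
Some of these errors can be caught before the CLI runs. As an illustration of the priority rule (a value between 0.0 and 1.0), the hypothetical helper valid_priority below checks a single value with POSIX awk. It is a sketch of the range check only, not the CLI's own validation logic.

```shell
# valid_priority VALUE: succeed if VALUE is a plain decimal number in [0, 1].
# (Hypothetical helper; mirrors the documented range, not the CLI's code.)
valid_priority() {
  awk -v p="$1" 'BEGIN {
    ok = (p ~ /^[0-9]*[.]?[0-9]+$/) && (p + 0 >= 0) && (p + 0 <= 1)
    exit !ok   # exit status 0 means the value is acceptable
  }'
}

# Usage:
#   valid_priority 0.8 && echo "ok"
#   valid_priority 2.0 || echo "out of range"
```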

Integration Examples

CI/CD Pipeline

#!/bin/bash
# generate-sitemap.sh
set -euo pipefail  # stop if any command or pipeline stage fails

# Generate sitemap from database export
# Note: -p prompts for a password; for unattended runs, supply credentials
# through a MySQL option file instead
mysql -u user -p database -e "SELECT CONCAT('/product/', id) FROM products" \
  --skip-column-names --batch | \
  npx sitemap --prepend "https://shop.example.com" \
  --gzip > sitemap.xml.gz

# Validate the generated sitemap (decompress first so the validator receives plain XML)
if gunzip -c sitemap.xml.gz | npx sitemap --validate; then
  echo "✅ Sitemap generated and validated successfully"
  # Upload to CDN or web server
  aws s3 cp sitemap.xml.gz s3://my-bucket/sitemap.xml.gz
else
  echo "❌ Sitemap validation failed"
  exit 1
fi

Cron Job

# Daily sitemap generation
0 2 * * * /usr/local/bin/generate-sitemap.sh >> /var/log/sitemap.log 2>&1

Docker Usage

FROM node:18-alpine
RUN npm install -g sitemap
COPY urls.txt /app/
WORKDIR /app
CMD ["npx", "sitemap", "--prepend", "https://example.com", "urls.txt"]

Requirements

  • Node.js 14.0.0 or higher
  • NPM 6.0.0 or higher
  • For --validate option: xmllint (part of libxml2-utils package)

Installing xmllint

# Ubuntu/Debian
sudo apt-get install libxml2-utils

# macOS with Homebrew
brew install libxml2

# Alpine Linux (Docker)
apk add --no-cache libxml2-utils
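
Because --validate depends on xmllint, a script can check for it up front and fail with a clear message. A minimal sketch, using a hypothetical helper named require_cmd:

```shell
# require_cmd NAME: succeed if NAME is found on PATH, otherwise
# print a message to stderr and return 1. (Hypothetical helper.)
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "required command not found: $1" >&2
    return 1
  fi
}

# Usage:
#   require_cmd xmllint || exit 1
#   npx sitemap --validate sitemap.xml
```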

Install with Tessl CLI

npx tessl i tessl/npm-sitemap
