tessl/npm-link-preview-js

JavaScript module to extract and fetch HTTP link information from blocks of text via OpenGraph and meta tag parsing.

Security and Configuration

Configuration options for customizing request behavior and handling redirects, plus security features for preventing SSRF attacks through DNS resolution validation.

Capabilities

Configuration Options

Comprehensive configuration interface for customizing link preview behavior, security settings, and request parameters.

interface ILinkPreviewOptions {
  /** Custom HTTP headers for request */
  headers?: Record<string, string>;
  /** Property type for image meta tags (default: "og") */
  imagesPropertyType?: string;
  /** Proxy URL to prefix to the target URL */
  proxyUrl?: string;
  /** Request timeout in milliseconds (default: 3000) */
  timeout?: number;
  /** Redirect handling strategy (default: "error") */
  followRedirects?: "follow" | "error" | "manual";
  /** Function to resolve DNS for SSRF protection */
  resolveDNSHost?: (url: string) => Promise<string>;
  /** Function to validate redirects (required with followRedirects: "manual") */
  handleRedirects?: (baseURL: string, forwardedURL: string) => boolean;
  /** Callback to modify response object */
  onResponse?: (response: ILinkPreviewResponse, doc: cheerio.Root, url?: URL) => ILinkPreviewResponse;
}
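
The options above can be collected into one shared baseline so that every call starts from the same settings. The following is a minimal sketch, not part of the library: `PreviewOptions`, `BASE_OPTIONS`, and `buildOptions` are hypothetical names, and the type is a simplified local copy of the interface so the snippet stands alone.

```typescript
// Simplified local copy of the option shape shown above (illustration only;
// the library ships its own types).
type RedirectMode = "follow" | "error" | "manual";

interface PreviewOptions {
  headers?: Record<string, string>;
  timeout?: number;
  followRedirects?: RedirectMode;
  handleRedirects?: (baseURL: string, forwardedURL: string) => boolean;
}

// Hypothetical baseline that mirrors the documented defaults.
const BASE_OPTIONS: PreviewOptions = {
  timeout: 3000,
  followRedirects: "error",
};

// Merge caller overrides onto the baseline and reject the one combination
// the interface comments describe as incomplete.
function buildOptions(overrides: PreviewOptions = {}): PreviewOptions {
  const merged: PreviewOptions = {
    ...BASE_OPTIONS,
    ...overrides,
    headers: { ...BASE_OPTIONS.headers, ...overrides.headers },
  };
  if (merged.followRedirects === "manual" && !merged.handleRedirects) {
    throw new Error('followRedirects: "manual" requires handleRedirects');
  }
  return merged;
}
```

The result of `buildOptions({ timeout: 5000 })` could then be passed as the second argument to `getLinkPreview`.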

Custom Headers

Configure custom HTTP headers for requests to handle authentication, user agents, and other requirements.

Basic Header Configuration:

import { getLinkPreview } from "link-preview-js";

// Custom User-Agent
const preview = await getLinkPreview("https://example.com", {
  headers: {
    "User-Agent": "MyBot/1.0 (+https://mysite.com/bot)"
  }
});

// Multiple headers
const customPreview = await getLinkPreview("https://example.com", {
  headers: {
    "User-Agent": "Mozilla/5.0 (compatible; MyBot/1.0)",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Authorization": "Bearer your-token-here"
  }
});

Common Header Use Cases:

// Bot identification (recommended for crawlers)
const botHeaders = {
  "User-Agent": "GoogleBot/2.1 (+http://www.google.com/bot.html)"
};

// Language-specific content
const localizedHeaders = {
  "Accept-Language": "fr-FR,fr;q=0.8,en;q=0.6"
};

// CORS proxy requirements
const corsHeaders = {
  "Origin": "https://myapp.com",
  "X-Requested-With": "XMLHttpRequest"
};
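
Header sets like the ones above are often combined before a request. Because HTTP header names are case-insensitive, a plain object spread can leave both "User-Agent" and "user-agent" in the result. The sketch below shows one way to merge them so that later sets win regardless of casing; `mergeHeaders` is a hypothetical helper, not a library function.

```typescript
// Merge header objects, treating names case-insensitively.
// The most recent spelling and value of each header are kept.
function mergeHeaders(
  ...sets: Array<Record<string, string>>
): Record<string, string> {
  const byLowerName = new Map<string, [string, string]>();
  for (const set of sets) {
    for (const [name, value] of Object.entries(set)) {
      byLowerName.set(name.toLowerCase(), [name, value]);
    }
  }
  return Object.fromEntries(Array.from(byLowerName.values()));
}
```

For example, `mergeHeaders(botHeaders, localizedHeaders)` yields a single object suitable for the `headers` option.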

Timeout Configuration

Configure request timeout to prevent hanging requests and control response times.

import { getLinkPreview } from "link-preview-js";

// Short timeout for fast responses
const quickPreview = await getLinkPreview("https://fast-site.com", {
  timeout: 1000 // 1 second
});

// Longer timeout for slow sites
const patientPreview = await getLinkPreview("https://slow-site.com", {
  timeout: 10000 // 10 seconds
});

// Handle timeout errors
try {
  await getLinkPreview("https://very-slow-site.com", { timeout: 2000 });
} catch (error) {
  if (error instanceof Error && error.message === "Request timeout") {
    console.log("Site is taking too long to respond");
  }
}
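
One possible pattern on top of the timeout option is to try a short timeout first and fall back to a longer one only when the first attempt times out. This is a sketch under assumptions: `isTimeoutError` and `withEscalatingTimeout` are hypothetical helpers, and the "Request timeout" message is the one matched in the example above.

```typescript
// Narrow an unknown catch value to the timeout error matched above.
function isTimeoutError(error: unknown): boolean {
  return error instanceof Error && error.message === "Request timeout";
}

// Try each timeout in turn. Only a timeout moves on to the next attempt;
// any other error is rethrown immediately.
async function withEscalatingTimeout<T>(
  fetchPreview: (timeoutMs: number) => Promise<T>,
  timeoutsMs: number[] = [2000, 10000],
): Promise<T> {
  let lastError: unknown = new Error("No timeouts supplied");
  for (const timeoutMs of timeoutsMs) {
    try {
      return await fetchPreview(timeoutMs);
    } catch (error) {
      if (!isTimeoutError(error)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}
```

A call might look like `withEscalatingTimeout((timeout) => getLinkPreview(url, { timeout }))`. The fetch function is passed in, so the helper does not depend on the library.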

Image Property Type Configuration

Control which meta tag properties are used for image extraction.

import { getLinkPreview } from "link-preview-js";

// Use only OpenGraph images
const ogPreview = await getLinkPreview("https://example.com", {
  imagesPropertyType: "og"
});

// Use Twitter Card images
const twitterPreview = await getLinkPreview("https://example.com", {
  imagesPropertyType: "twitter"
});

// Use custom property type
const customPreview = await getLinkPreview("https://example.com", {
  imagesPropertyType: "custom"
});
// Looks for meta[property='custom:image'] or meta[name='custom:image']
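
The lookup pattern in the comment above can be written as a small function, which makes it easier to see what a given `imagesPropertyType` will match. This only illustrates the documented selector pattern and is not the library's internal code; `imageSelectors` is a hypothetical name.

```typescript
// Build the two selectors described above for a given property type.
function imageSelectors(propertyType: string = "og"): string[] {
  return [
    `meta[property='${propertyType}:image']`,
    `meta[name='${propertyType}:image']`,
  ];
}
```

With this sketch, `imageSelectors("twitter")` returns selectors for `twitter:image` in both the `property` and `name` attributes.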

Proxy Configuration

Configure a proxy URL, which is prefixed to the target URL, for CORS bypass or corporate network requirements.

import { getLinkPreview } from "link-preview-js";

// Using CORS proxy
const proxyPreview = await getLinkPreview("https://target-site.com", {
  proxyUrl: "https://cors-anywhere.herokuapp.com/",
  headers: {
    "Origin": "https://myapp.com"
  }
});

// Corporate proxy
const corpPreview = await getLinkPreview("https://external-site.com", {
  proxyUrl: "http://corporate-proxy.company.com:8080/",
  headers: {
    "Proxy-Authorization": "Basic " + btoa("username:password")
  }
});
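
Since `proxyUrl` is described as a prefix to the target URL, the address that is actually requested can be previewed with a one-line helper. This sketch assumes plain string concatenation, as the option's comment suggests; `buildProxiedUrl` is a hypothetical name.

```typescript
// Assumption: the proxy URL is prepended to the target URL unchanged.
function buildProxiedUrl(targetUrl: string, proxyUrl?: string): string {
  return proxyUrl ? proxyUrl + targetUrl : targetUrl;
}
```

Under that assumption, the trailing slash on the proxy URL matters: without it, the proxy host and the target scheme would run together.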

Redirect Handling

Configure how the library handles HTTP redirects with three strategies: follow, error, or manual.

Follow Redirects (Caution Required):

import { getLinkPreview } from "link-preview-js";

// Automatically follow redirects (security risk)
const preview = await getLinkPreview("http://shorturl.com/abc123", {
  followRedirects: "follow"
});

Error on Redirects (Default - Secure):

// Default behavior - throw error on redirects
try {
  const preview = await getLinkPreview("http://redirect-site.com", {
    followRedirects: "error" // Default value
  });
} catch (error) {
  console.log("Redirect detected and blocked for security");
}

Manual Redirect Handling (Recommended):

import { getLinkPreview } from "link-preview-js";

// Manual redirect validation
const preview = await getLinkPreview("https://short.ly/abc123", {
  followRedirects: "manual",
  handleRedirects: (baseURL: string, forwardedURL: string) => {
    const baseUrl = new URL(baseURL);
    const forwardedUrl = new URL(forwardedURL);
    
    // Block HTTPS to HTTP downgrades
    if (baseUrl.protocol === "https:" && forwardedUrl.protocol === "http:") {
      return false;
    }
    
    // Allow same-domain redirects, including HTTP to HTTPS upgrades
    if (forwardedUrl.hostname === baseUrl.hostname) {
      return true;
    }
    
    // Allow www subdomain redirects
    if (forwardedUrl.hostname === "www." + baseUrl.hostname ||
        "www." + forwardedUrl.hostname === baseUrl.hostname) {
      return true;
    }
    
    // Block all other redirects
    return false;
  }
});
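
The same validation can be packaged as a reusable policy that also accepts an explicit allowlist of destination hosts, which is useful for link shorteners. This is a sketch under assumptions: `makeRedirectPolicy` is a hypothetical helper, and it resolves the forwarded URL against the base URL in case a relative value is passed.

```typescript
// Returns a function with the handleRedirects signature shown above.
function makeRedirectPolicy(allowedHosts: string[] = []) {
  const allowed = new Set(allowedHosts.map((host) => host.toLowerCase()));
  const stripWww = (host: string) => host.replace(/^www\./, "");

  return (baseURL: string, forwardedURL: string): boolean => {
    let base: URL;
    let target: URL;
    try {
      base = new URL(baseURL);
      target = new URL(forwardedURL, baseURL); // also handles relative targets
    } catch {
      return false; // unparseable URLs are rejected
    }
    // Only plain web schemes are acceptable destinations.
    if (target.protocol !== "http:" && target.protocol !== "https:") return false;
    // Never follow a downgrade from HTTPS to HTTP.
    if (base.protocol === "https:" && target.protocol === "http:") return false;
    // Same host, ignoring a leading "www.".
    if (stripWww(target.hostname) === stripWww(base.hostname)) return true;
    // Otherwise the destination must be explicitly allowlisted.
    return allowed.has(target.hostname);
  };
}
```

The returned function can be passed as `handleRedirects` together with `followRedirects: "manual"`.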

SSRF Protection

Server-Side Request Forgery (SSRF) protection through DNS resolution validation and IP address filtering.

Built-in IP Filtering:

When a resolveDNSHost function is supplied, the library checks the address it returns and blocks requests to private network ranges:

  • Loopback: 127.0.0.0/8
  • Private Class A: 10.0.0.0/8
  • Private Class B: 172.16.0.0/12
  • Private Class C: 192.168.0.0/16
  • Link-local: 169.254.0.0/16
  • Documentation ranges: 192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24
  • Carrier-Grade NAT: 100.64.0.0/10
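
To make the ranges above concrete, the following sketch checks an IPv4 address against the same CIDR blocks. It is an independent illustration rather than the library's implementation; `BLOCKED_CIDRS`, `ipv4ToInt`, `inCidr`, and `isBlockedIPv4` are hypothetical names.

```typescript
// The ranges listed above, in CIDR notation.
const BLOCKED_CIDRS = [
  "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
  "169.254.0.0/16", "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",
  "100.64.0.0/10",
];

// Convert dotted-quad text to a number, or null if it is not valid IPv4.
function ipv4ToInt(address: string): number | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Two addresses share a block when they agree after dropping the host bits.
// Plain arithmetic avoids JavaScript's signed 32-bit bitwise operators.
function inCidr(address: string, cidr: string): boolean {
  const [network, prefix] = cidr.split("/");
  const addr = ipv4ToInt(address);
  const net = ipv4ToInt(network);
  if (addr === null || net === null) return false;
  const blockSize = 2 ** (32 - Number(prefix));
  return Math.floor(addr / blockSize) === Math.floor(net / blockSize);
}

// Returns false for anything that is not IPv4, so IPv6 needs separate handling.
function isBlockedIPv4(address: string): boolean {
  return BLOCKED_CIDRS.some((cidr) => inCidr(address, cidr));
}
```

A prefix comparison like this catches boundary cases that string checks such as `startsWith("172.")` would get wrong, for example 172.31.255.255 (private) versus 172.32.0.1 (public).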

DNS Resolution Validation:

import { getLinkPreview } from "link-preview-js";
import dns from "dns";

// DNS resolution for SSRF protection
const securePreview = await getLinkPreview("https://suspicious-domain.com", {
  resolveDNSHost: async (url: string) => {
    return new Promise((resolve, reject) => {
      const hostname = new URL(url).hostname;
      dns.lookup(hostname, (err, address, family) => {
        if (err) {
          reject(err);
          return;
        }
        
        // Additional custom validation
        if (address.startsWith("127.") || address.startsWith("192.168.")) {
          reject(new Error("Blocked private IP address"));
          return;
        }
        
        resolve(address);
      });
    });
  }
});

Advanced SSRF Protection:

import { getLinkPreview } from "link-preview-js";
import dns from "dns/promises";

async function advancedDnsCheck(url: string): Promise<string> {
  const hostname = new URL(url).hostname;
  
  // Resolve all A records; only lookup failures are reported as DNS errors
  let addresses: string[];
  try {
    addresses = await dns.resolve4(hostname);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(`DNS resolution failed: ${reason}`);
  }
  
  // Check each resolved address
  for (const address of addresses) {
    const parts = address.split('.').map(Number);
    
    // Block RFC 1918 private networks
    if (parts[0] === 10) throw new Error("Private network blocked");
    if (parts[0] === 172 && parts[1] >= 16 && parts[1] <= 31) {
      throw new Error("Private network blocked");
    }
    if (parts[0] === 192 && parts[1] === 168) {
      throw new Error("Private network blocked");
    }
    
    // Block loopback and the unspecified 0.0.0.0/8 range
    if (parts[0] === 127 || parts[0] === 0) throw new Error("Loopback blocked");
    
    // Block link-local (169.254.0.0/16)
    if (parts[0] === 169 && parts[1] === 254) throw new Error("Link-local blocked");
    
    // Block multicast and reserved ranges
    if (parts[0] >= 224) throw new Error("Reserved range blocked");
  }
  
  return addresses[0];
}

const securePreview = await getLinkPreview("https://example.com", {
  resolveDNSHost: advancedDnsCheck
});
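
A resolver like the one above is easier to test when the DNS lookup and the address check are passed in as parameters. The sketch below shows that shape; `makeResolveDNSHost` and `LookupAll` are hypothetical names, and the lookup function is assumed to return every address for the host.

```typescript
// Assumed shape: returns every address the hostname resolves to.
type LookupAll = (hostname: string) => Promise<string[]>;

// Build a resolveDNSHost-compatible function from a lookup and a predicate.
// The request is rejected if ANY resolved address is blocked, not just the first.
function makeResolveDNSHost(
  lookupAll: LookupAll,
  isBlocked: (address: string) => boolean,
) {
  return async (url: string): Promise<string> => {
    const hostname = new URL(url).hostname;
    const addresses = await lookupAll(hostname);
    if (addresses.length === 0) {
      throw new Error(`No addresses for ${hostname}`);
    }
    const blocked = addresses.find((address) => isBlocked(address));
    if (blocked !== undefined) {
      throw new Error(`Blocked address ${blocked} for ${hostname}`);
    }
    return addresses[0];
  };
}
```

In Node.js, the lookup could wrap `dns.promises.lookup(hostname, { all: true })` and map the results to their `address` fields. In tests, a stub lookup avoids touching the network.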

Response Processing Callbacks

Customize response objects with the onResponse callback for site-specific handling or data enhancement.

import { getLinkPreview } from "link-preview-js";

const customPreview = await getLinkPreview("https://example.com", {
  onResponse: (response, doc, url) => {
    // Site-specific customizations
    if (url?.hostname === "github.com") {
      // GitHub-specific enhancements
      const repoInfo = doc('meta[name="octolytics-dimension-repository_nwo"]').attr('content');
      if (repoInfo) {
        response.siteName = `GitHub - ${repoInfo}`;
      }
    }
    
    // Fallback description from first paragraph
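    if (!response.description) {
      const firstParagraph = doc('p').first().text().trim();
      if (firstParagraph) {
        // Append an ellipsis only when the text is actually truncated
        response.description = firstParagraph.length > 200
          ? firstParagraph.substring(0, 200) + "..."
          : firstParagraph;
      }
    }
    
    // Clean up image URLs (guard against responses without an images array)
    if (Array.isArray(response.images)) {
      response.images = response.images.filter(img => 
        img && !img.includes('tracking') && !img.includes('analytics')
      );
    }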
    
    // Add custom metadata
    const customData = doc('meta[name="custom-data"]').attr('content');
    if (customData) {
      (response as any).customField = customData;
    }
    
    return response;
  }
});
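
When a callback grows as large as the one above, it can help to split it into small steps and chain them. The following is a minimal sketch of such a combinator; `Transform` and `composeOnResponse` are hypothetical names, and the generic parameters stand in for the library's response and document types.

```typescript
// One step in an onResponse pipeline: takes a response and returns a response.
type Transform<R, D> = (response: R, doc: D, url?: URL) => R;

// Run the steps left to right, feeding each result into the next step.
function composeOnResponse<R, D>(
  ...steps: Array<Transform<R, D>>
): Transform<R, D> {
  return (response, doc, url) =>
    steps.reduce((current, step) => step(current, doc, url), response);
}
```

Each concern (site-specific tweaks, description fallback, image cleanup) can then live in its own function, with the composed result passed as `onResponse`.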

Structured Response Processing:

// Type-safe response enhancement
interface EnhancedResponse extends ILinkPreviewResponse {
  readingTime?: number;
  language?: string;
  keywords?: string[];
}

const enhancedPreview = await getLinkPreview("https://blog.example.com", {
  onResponse: (response, doc, url): EnhancedResponse => {
    const enhanced = response as EnhancedResponse;
    
    // Calculate reading time
    const textContent = doc('article, main, .content').text();
    const wordCount = textContent.trim().split(/\s+/).filter(Boolean).length;
    enhanced.readingTime = Math.ceil(wordCount / 200); // Avg 200 WPM
    
    // Extract language
    enhanced.language = doc('html').attr('lang') || 'en';
    
    // Extract keywords
    const keywordsMeta = doc('meta[name="keywords"]').attr('content');
    enhanced.keywords = keywordsMeta ? keywordsMeta.split(',').map(k => k.trim()) : [];
    
    return enhanced;
  }
}) as EnhancedResponse;

console.log(`Reading time: ${enhancedPreview.readingTime} minutes`);
