
tessl/npm-crawler

A ready-to-use web spider with proxy support, asynchronous crawling, rate limiting, configurable request pools, server-side jQuery parsing, and HTTP/2 support.


evals/scenario-3/task.md

Multi-Region Product Availability Checker

Build a web scraper that checks product availability across multiple regional e-commerce endpoints. Each region has different rate limiting requirements that must be respected to avoid being blocked.

Requirements

Your system must scrape product availability information from multiple regional endpoints (simulated as different URLs). Each region has its own rate limiting policy:

  • Region A: Maximum 1 request per 2 seconds
  • Region B: Maximum 1 request per 3 seconds
  • Region C: Maximum 1 request per 1 second

The scraper should:

  1. Accept a list of product URLs grouped by region
  2. Fetch product information from each URL while respecting per-region rate limits
  3. Allow multiple regions to be scraped concurrently (different regions should not block each other)
  4. Extract and return the product title from each page
  5. Handle the completion of all requests and report results
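The per-region behavior in steps 2 and 3 can be sketched with one serial queue per region: each queue waits out its region's rate limit between requests, while queues for different regions run independently. This is an illustrative sketch with assumed helper names (`makeRegionQueue`, `fakeFetch`), not the required API.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Returns an enqueue function for one region: requests run strictly one
// after another, separated by at least rateLimitMs.
function makeRegionQueue(rateLimitMs, fetchFn) {
  let chain = Promise.resolve();
  let first = true;
  return (url) => {
    const result = chain.then(async () => {
      if (!first) await sleep(rateLimitMs); // wait out the rate limit
      first = false;
      return fetchFn(url);
    });
    chain = result.catch(() => {}); // keep the queue alive on errors
    return result;
  };
}

// Stub fetch in place of real HTTP. Region A (2 s) and Region C (1 s)
// use separate queues, so neither blocks the other.
const fakeFetch = async (url) => `title of ${url}`;
const regionA = makeRegionQueue(2000, fakeFetch);
const regionC = makeRegionQueue(1000, fakeFetch);

Promise.all([
  regionA('https://a.example/p1'),
  regionA('https://a.example/p2'),
  regionC('https://c.example/p1'),
]).then(console.log); // resolves after ~2 s: region A's second request gates the total
```

Because each region owns its chain, the timing capabilities below fall out directly: two Region A URLs take a little over 2 seconds, while one Region A URL and one Region B URL finish in roughly 3 seconds rather than 5.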

Implementation

@generates

API

/**
 * Creates and configures a multi-region product scraper.
 *
 * @param {Object} config - Configuration for the scraper
 * @param {Array<Object>} config.regions - Array of region configurations
 * @param {string} config.regions[].name - Region identifier
 * @param {number} config.regions[].rateLimit - Minimum milliseconds between requests for this region
 * @param {string} [config.regions[].proxy] - Proxy URL for this region
 * @returns {Object} Scraper instance with methods to add tasks and handle completion
 */
function createScraper(config) {
  // Returns an object with:
  // - addTask(regionName, url, callback): adds a scraping task for a specific region
  // - onComplete(callback): registers a callback for when all tasks finish
  // - start(): begins processing the queue
}

module.exports = { createScraper };
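One possible in-memory shape of this contract, with a stubbed fetch step in place of real HTTP so the `addTask`/`onComplete`/`start` flow is concrete. This is a sketch under those assumptions, not the reference implementation (real code would fetch the URL and extract the title, and would need to handle tasks added after `start()`):

```javascript
function createScraper(config) {
  const queues = {}; // regionName -> { rateLimit, tasks, busy }
  for (const region of config.regions) {
    queues[region.name] = { rateLimit: region.rateLimit, tasks: [], busy: false };
  }
  let pending = 0;
  let completeCb = null;

  const fetchTitle = async (url) => `Title for ${url}`; // stub extraction

  // Drain one region's queue serially, pausing rateLimit ms between tasks.
  function drain(q) {
    if (q.busy || q.tasks.length === 0) return;
    q.busy = true;
    const { url, callback } = q.tasks.shift();
    fetchTitle(url).then((title) => {
      callback(null, title);
      if (--pending === 0 && completeCb) completeCb();
      setTimeout(() => { q.busy = false; drain(q); }, q.rateLimit);
    });
  }

  return {
    addTask(regionName, url, callback) {
      pending++;
      queues[regionName].tasks.push({ url, callback });
    },
    onComplete(callback) { completeCb = callback; },
    start() { Object.values(queues).forEach(drain); }, // regions drain concurrently
  };
}

// Usage mirroring the task: two regions scraped concurrently.
const scraper = createScraper({
  regions: [
    { name: 'A', rateLimit: 2000 },
    { name: 'B', rateLimit: 3000 },
  ],
});
scraper.addTask('A', 'https://a.example/p1', (err, title) => console.log(title));
scraper.addTask('B', 'https://b.example/p1', (err, title) => console.log(title));
scraper.onComplete(() => console.log('all done'));
scraper.start(); // logs the two titles, then 'all done'
```

Because `start()` kicks off every region's queue at once and each queue only serializes its own tasks, the per-region rate limits never stack across regions, which is exactly what the concurrency capability below checks.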

Capabilities

Rate limit enforcement per region

  • Given 2 URLs for Region A (2s rate limit), scraping both URLs takes at least 2 seconds but less than 5 seconds @test
  • When scraping 3 URLs from Region C (1s rate limit), the tasks complete in at least 2 seconds @test

Concurrent processing across regions

  • Given 1 URL for Region A (2s rate limit) and 1 URL for Region B (3s rate limit), scraping both concurrently takes approximately 3 seconds (not 5 seconds) @test

HTML content extraction

  • The scraper correctly extracts product titles from HTML responses @test

Dependencies { .dependencies }

crawler { .dependency }

Provides web scraping capabilities with rate limiting and queue management.

@satisfied-by

Install with Tessl CLI

npx tessl i tessl/npm-crawler
