tessl/npm-crawler

A ready-to-use web spider with support for proxies, asynchronous requests, rate limiting, configurable request pools, jQuery-style parsing, and HTTP/2.


evals/scenario-3/rubric.json

{
  "context": "This criteria evaluates how well the engineer uses the crawler package's advanced proxy management features, specifically the ability to create multiple independent rate limiters that can operate concurrently. The focus is on proper usage of rateLimiterId, rateLimit configuration, and understanding how the rate limiter cluster system enables parallel scraping of different regions.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Crawler instantiation",
      "description": "Creates a Crawler instance using the 'crawler' or 'node-crawler' package constructor (new Crawler() or require('crawler')).",
      "max_score": 10
    },
    {
      "name": "Rate limiter assignment",
      "description": "Uses the rateLimiterId option to assign tasks to different rate limiters, with one unique rateLimiterId per region (e.g., rateLimiterId: 0 for Region A, rateLimiterId: 1 for Region B, etc.).",
      "max_score": 30
    },
    {
      "name": "Rate limit configuration",
      "description": "Configures the rateLimit option (in milliseconds) for each task or globally, matching the specified per-region rate limits (2000ms for Region A, 3000ms for Region B, 1000ms for Region C).",
      "max_score": 25
    },
    {
      "name": "Task queueing",
      "description": "Uses crawler.add() or crawler.queue() method to queue scraping tasks with appropriate options including url, rateLimiterId, rateLimit, and callback.",
      "max_score": 15
    },
    {
      "name": "HTML parsing",
      "description": "Uses the provided Cheerio instance (res.$ or response.$) in the callback to extract product titles from HTML content.",
      "max_score": 10
    },
    {
      "name": "Completion handling",
      "description": "Uses the 'drain' event (crawler.on('drain', callback)) to detect when all queued tasks have completed, or properly calls done() in callbacks to release queue slots.",
      "max_score": 10
    }
  ]
}
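A minimal sketch of a solution this rubric rewards, assuming the crawler (node-crawler v2) package's `add()` API with per-task `rateLimiterId` and `rateLimit` options. The URLs and the `.product-title` selector are placeholders, not part of the scenario:

```javascript
import Crawler from "crawler";

const crawler = new Crawler({ maxConnections: 10 });

// One independent rate limiter per region: each rateLimiterId gets its own
// limiter in the cluster, so the regions are scraped in parallel while each
// respects its own per-request delay (in milliseconds).
const regions = [
  { id: 0, rateLimit: 2000, urls: ["https://example.com/region-a/products"] },
  { id: 1, rateLimit: 3000, urls: ["https://example.com/region-b/products"] },
  { id: 2, rateLimit: 1000, urls: ["https://example.com/region-c/products"] },
];

for (const region of regions) {
  for (const url of region.urls) {
    crawler.add({
      url,
      rateLimiterId: region.id,   // assign this task to the region's limiter
      rateLimit: region.rateLimit, // delay between requests on that limiter
      callback: (error, res, done) => {
        if (error) {
          console.error(error);
        } else {
          const $ = res.$; // Cheerio instance provided by the crawler
          $(".product-title").each((_, el) => {
            console.log($(el).text().trim());
          });
        }
        done(); // release the queue slot
      },
    });
  }
}

// "drain" fires once every queued task has completed.
crawler.on("drain", () => {
  console.log("All regions scraped.");
});
```

Because each region has a distinct `rateLimiterId`, the 2000 ms delay on Region A never throttles Regions B or C; a single shared limiter would serialize all three.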

Install with Tessl CLI

npx tessl i tessl/npm-crawler
