A ready-to-use web spider with support for proxies, asynchronous operation, rate limiting, configurable request pools, jQuery-style selectors, and HTTP/2.
```json
{
  "context": "This criterion evaluates how well the engineer uses the crawler package's automatic retry mechanism to build a resilient web scraper. The focus is on proper configuration of retry parameters, timeout handling, and failure detection.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Crawler instantiation",
      "description": "Creates a Crawler instance from the crawler package using 'new Crawler()' or 'import Crawler from \"crawler\"'",
      "max_score": 10
    },
    {
      "name": "Retry attempts configuration",
      "description": "Configures the 'retries' option to specify the number of retry attempts (e.g., retries: 2, retries: 3)",
      "max_score": 25
    },
    {
      "name": "Retry interval configuration",
      "description": "Configures the 'retryInterval' option to specify the delay in milliseconds between retry attempts (e.g., retryInterval: 3000)",
      "max_score": 25
    },
    {
      "name": "Timeout configuration",
      "description": "Configures the 'timeout' option to detect failing endpoints that don't respond within a reasonable time (e.g., timeout: 5000)",
      "max_score": 15
    },
    {
      "name": "Request queueing",
      "description": "Uses crawler.add() or crawler.queue() method to add URLs to the crawler queue for processing",
      "max_score": 10
    },
    {
      "name": "Callback handling",
      "description": "Implements a callback function with signature (error, response, done) and properly calls done() to release the queue slot",
      "max_score": 10
    },
    {
      "name": "Error detection",
      "description": "Checks for errors in the callback to distinguish successful requests from failed requests (e.g., if (error) { ... })",
      "max_score": 5
    }
  ]
}
```

Install with Tessl CLI

```shell
npx tessl i tessl/npm-crawlerevals
```
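A scraper that satisfies the checklist above might look like the following sketch. This is illustrative, not the graded solution: it assumes the npm `crawler` package is installed, and the target URL is a placeholder; the specific option values (`retries: 3`, `retryInterval: 3000`, `timeout: 5000`) are just the example values named in the checklist.

```javascript
// Sketch of a resilient scraper using the crawler package's retry options.
// Assumes the npm "crawler" package is available; the URL is a placeholder.
import Crawler from "crawler";

const crawler = new Crawler({
  retries: 3,          // retry each failed request up to 3 times
  retryInterval: 3000, // wait 3000 ms between retry attempts
  timeout: 5000,       // treat endpoints silent for >5 s as failed
  callback: (error, response, done) => {
    if (error) {
      // Request failed (even after the configured retries)
      console.error("Request failed:", error.message);
    } else {
      console.log("Fetched with status", response.statusCode);
    }
    done(); // release the queue slot so the next request can run
  },
});

// Queue a URL for processing (crawler.queue() in older releases)
crawler.add("https://example.com/");
```

Each checklist item maps onto one line of this sketch, so the weights effectively reward getting the retry and timeout options right over the boilerplate around them.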
Scenarios:

- scenario-1
- scenario-2
- scenario-3
- scenario-4
- scenario-5
- scenario-6
- scenario-7
- scenario-8
- scenario-9
- scenario-10