A ready-to-use web spider that supports proxies, asynchronous requests, rate limiting, configurable request pools, server-side jQuery (Cheerio), and HTTP/2.
{
  "context": "This evaluation assesses how effectively an engineer uses the crawler package to implement duplicate URL detection and HTML parsing. The focus is on proper utilization of the skipDuplicates option and Cheerio integration for extracting product data.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Crawler instantiation",
      "description": "The solution imports and creates a Crawler instance from the crawler package using 'new Crawler(options)' or 'const Crawler = require(\"crawler\")' followed by instantiation",
      "max_score": 10
    },
    {
      "name": "skipDuplicates enabled",
      "description": "The Crawler instance is configured with the skipDuplicates option set to true to enable automatic duplicate URL detection",
      "max_score": 35
    },
    {
      "name": "Cheerio title extraction",
      "description": "The solution uses the response.$ Cheerio instance to parse HTML and extract the product title from the .product-title selector",
      "max_score": 15
    },
    {
      "name": "Cheerio price extraction",
      "description": "The solution uses the response.$ Cheerio instance to extract the price from the .price selector",
      "max_score": 15
    },
    {
      "name": "Callback and done()",
      "description": "The crawler callback option follows the correct signature (error, response, done) and calls done() to release the queue slot and allow processing to continue",
      "max_score": 15
    },
    {
      "name": "drain event",
      "description": "The solution uses crawler.on('drain', callback) to detect when all tasks are complete and trigger the onComplete callback with collected data",
      "max_score": 10
    }
  ]
}

Install with Tessl CLI
npx tessl i tessl/npm-crawler

evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
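The checklist above maps onto the crawler package's callback API. A minimal sketch of a solution that would satisfy each item follows; the product URLs, the results array, and the onComplete handler are illustrative placeholders, not part of the evaluation itself:

```javascript
// Sketch only: assumes the `crawler` npm package (node-crawler).
// URLs, `results`, and `onComplete` are hypothetical placeholders.
const Crawler = require('crawler');

const results = [];
const onComplete = (data) => console.log(data); // placeholder completion handler

const c = new Crawler({
  maxConnections: 10,
  skipDuplicates: true, // skip URLs that have already been seen
  callback: (error, res, done) => {
    if (error) {
      console.error(error);
    } else {
      const $ = res.$; // Cheerio instance loaded with the response HTML
      results.push({
        title: $('.product-title').text().trim(),
        price: $('.price').text().trim(),
      });
    }
    done(); // release the pool slot so queued tasks can proceed
  },
});

// 'drain' fires once the task queue is empty and all requests have finished
c.on('drain', () => onComplete(results));

// With skipDuplicates enabled, queueing the same URL twice fetches it once
c.queue(['https://example.com/product/1', 'https://example.com/product/1']);
```

Note that `done()` must be called on every path through the callback (success or error); forgetting it leaks a pool slot and the `drain` event never fires.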