A ready-to-use web spider offering proxies, asynchrony, rate limiting, configurable request pools, server-side jQuery, and HTTP/2 support.
Build a news feed aggregator that collects article titles from multiple news websites with proper request prioritization and concurrency control.
Create a function that:
- Respects priority levels: when multiple requests are queued, higher priority requests (lower priority number) are processed before lower priority requests.
- Limits the number of simultaneous active requests to the specified maxConnections value, to avoid overwhelming servers.
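The two requirements above — priority ordering among queued requests, plus a cap on simultaneous work — can be sketched as a small scheduler. This is an illustrative sketch, not the spec's implementation: the `PriorityPool` name, its method names, and the task shape are all assumptions made here for clarity.

```javascript
// Sketch: runs at most `maxConnections` tasks at once and, among the
// tasks still waiting, always starts the one with the lowest priority
// number first. Names here are hypothetical, not from the spec.
class PriorityPool {
  constructor(maxConnections) {
    this.maxConnections = maxConnections;
    this.pending = []; // queued { priority, run, resolve, reject }
    this.active = 0;   // number of in-flight tasks
  }

  // Queue an async function with a priority (lower number = sooner).
  add(priority, run) {
    return new Promise((resolve, reject) => {
      this.pending.push({ priority, run, resolve, reject });
      this._drain();
    });
  }

  _drain() {
    while (this.active < this.maxConnections && this.pending.length > 0) {
      // Pick the lowest priority number among the queued tasks.
      this.pending.sort((a, b) => a.priority - b.priority);
      const task = this.pending.shift();
      this.active++;
      task.run()
        .then(task.resolve, task.reject)
        .finally(() => {
          this.active--;
          this._drain();
        });
    }
  }
}
```

Note that a task already in flight is never preempted: priority only decides which *queued* task starts next, which matches the "queued requests" wording above.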
For each website:
- Extract article titles from h2.article-title elements

Return results as an array of objects, where each object contains:
- source: the name of the news source
- articles: array of article title strings from that source

Given a crawler with maxConnections set to 1:
Given a crawler with maxConnections set to 2:
Given a mock HTML page with three article titles:
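Putting the requirements together, here is a hedged sketch of a function satisfying these scenarios. Two deviations from the spec signature are assumptions made purely so the sketch can run without the real crawler framework: a third `fetchPage(url) -> Promise<html>` parameter injects the network layer, and a regex stands in for a real HTML parser when pulling titles out of h2.article-title elements.

```javascript
// Illustrative sketch of the spec, with the network layer injected.
// A real implementation would delegate fetching, parsing, priority,
// and connection limiting to the crawler framework instead.
function aggregateNews(sources, maxConnections, fetchPage) {
  const pending = [];
  let active = 0;

  const drain = () => {
    while (active < maxConnections && pending.length > 0) {
      // Lower priority number wins among queued requests.
      pending.sort((a, b) => a.priority - b.priority);
      const job = pending.shift();
      active++;
      fetchPage(job.url)
        .then((html) => {
          // Crude stand-in for a real parser: pull the text out of
          // <h2 class="article-title">...</h2> elements.
          const re = /<h2 class="article-title">([^<]*)<\/h2>/g;
          const articles = [];
          let m;
          while ((m = re.exec(html)) !== null) articles.push(m[1]);
          job.resolve({ source: job.name, articles });
        }, job.reject)
        .finally(() => {
          active--;
          drain();
        });
    }
  };

  const results = sources.map(
    (s) =>
      new Promise((resolve, reject) => {
        pending.push({ ...s, resolve, reject });
        drain();
      })
  );
  // Promise.all preserves the order of `sources` in the result array.
  return Promise.all(results);
}
```

With maxConnections set to 1 this degrades to strictly sequential fetching, and with 2 it keeps at most two requests in flight, as the scenarios above require.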
Web crawling framework with priority queue support and concurrent request management.
@generates
```js
/**
 * Aggregates news from multiple sources with priority-based crawling
 *
 * @param {Array<{url: string, name: string, priority: number}>} sources - Array of news sources with URLs, names, and priorities
 * @param {number} maxConnections - Maximum number of concurrent requests
 * @returns {Promise<Array<{source: string, articles: Array<string>}>>} Promise that resolves with collected articles grouped by source
 */
function aggregateNews(sources, maxConnections) {
  // Implementation here
}

module.exports = { aggregateNews };
```

Install with Tessl CLI
```sh
npx tessl i tessl/npm-crawler
```

Evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10