
tessl/npm-crawler

A ready-to-use web spider with support for proxies, asynchronous requests, rate limiting, configurable request pools, jQuery, and HTTP/2.


evals/scenario-10/task.md

Session-Aware Web Scraper

Build a web scraper that maintains user session state across multiple page requests by persisting cookies between requests.

Problem Description

You need to create a scraper that can navigate through a multi-page website requiring authentication or session management. The scraper should maintain cookies throughout the crawling session, allowing it to access pages that depend on session state.

Implement a scraper that:

  1. Makes an initial request to establish a session (e.g., a login or homepage visit)
  2. Makes subsequent requests that rely on the session cookies from the first request
  3. Properly shares cookie state across all requests in the crawling session
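The three steps above can be illustrated without any network access. The following is a minimal sketch, not the crawler or tough-cookie API: `MiniJar`, `fakeServer`, and `request` are hypothetical stand-ins invented for this example, and the jar ignores cookie attributes such as Path, Domain, and Expires that a real cookie jar would honor.

```javascript
// Hypothetical stand-in for a cookie jar: stores name=value pairs per host.
// Real jars (e.g. tough-cookie) also honor Path, Domain, Expires, Secure, etc.
class MiniJar {
  constructor() {
    this.cookies = new Map(); // host -> Map(name -> value)
  }

  // Store one Set-Cookie header value received from `url`.
  setCookie(setCookieHeader, url) {
    const host = new URL(url).host;
    const pair = setCookieHeader.split(';')[0]; // drop attributes
    const eq = pair.indexOf('=');
    if (eq < 1) return; // ignore malformed cookies
    const name = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    if (!this.cookies.has(host)) this.cookies.set(host, new Map());
    this.cookies.get(host).set(name, value);
  }

  // Build the Cookie request header for `url` ('' if nothing is stored).
  cookieHeader(url) {
    const stored = this.cookies.get(new URL(url).host);
    if (!stored) return '';
    return [...stored].map(([n, v]) => `${n}=${v}`).join('; ');
  }
}

// Hypothetical server: /login issues a session cookie, /profile requires it.
function fakeServer(url, headers) {
  const { pathname } = new URL(url);
  if (pathname === '/login') {
    return { status: 200, setCookie: ['sid=abc123; Path=/; HttpOnly'] };
  }
  const authed = (headers.cookie || '').includes('sid=abc123');
  return { status: authed ? 200 : 401, setCookie: [] };
}

// One request: send stored cookies, then store whatever the response sets.
function request(url, jar) {
  const cookie = jar.cookieHeader(url);
  const res = fakeServer(url, cookie ? { cookie } : {});
  res.setCookie.forEach((c) => jar.setCookie(c, url));
  return res;
}

const jar = new MiniJar();
const first = request('https://example.test/login', jar);    // step 1: establish session
const second = request('https://example.test/profile', jar); // step 2: reuse its cookies
```

Because both calls receive the same `jar` object, the cookie set by the first response is sent with the second request. Handing each call a fresh jar would make the second request arrive without a session.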

Requirements

  • Create a module that exports a createSessionScraper function
  • The function should accept a configuration object with at least a callback function
  • All requests made by the scraper should share the same cookie jar
  • Cookies received from responses should be automatically stored and sent with subsequent requests
  • The scraper should queue multiple URLs and process them with shared session state
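One possible shape for these requirements is sketched below. It is an offline illustration under stated assumptions, not the intended implementation: the `transport` and `jar` options are hypothetical additions that let the example run without the network, requests are processed synchronously, and the jar is a plain `Map` that does not separate cookies by domain. A real solution would delegate fetching to the crawler package and cookie storage to a proper cookie jar. The callback is invoked as `(error, res, done)`, mirroring crawler's callback convention.

```javascript
// Sketch only: `transport` and `jar` are hypothetical options for offline use.
function createSessionScraper(options) {
  const { callback, transport } = options;
  const jar = options.jar || new Map(); // one jar shared by every request
  const queue = [];
  const listeners = { drain: [] };
  let running = false;

  function processQueue() {
    if (running) return; // a loop further up the stack is already draining
    running = true;
    while (queue.length > 0) {
      const url = queue.shift();
      // Send every cookie collected so far.
      const cookie = [...jar].map(([n, v]) => `${n}=${v}`).join('; ');
      let res = null;
      let error = null;
      try {
        res = transport(url, cookie);
        // Store cookies from this response for the requests that follow.
        for (const header of res.setCookie || []) {
          const [name, ...rest] = header.split(';')[0].split('=');
          jar.set(name.trim(), rest.join('=').trim());
        }
      } catch (err) {
        error = err;
      }
      callback(error, res, () => {}); // `done` is a no-op in this sketch
    }
    running = false;
    listeners.drain.forEach((fn) => fn()); // queue is empty
  }

  return {
    // Accepts a single URL or an array of URLs.
    add(urls) {
      queue.push(...[].concat(urls));
      processQueue();
    },
    on(event, fn) {
      (listeners[event] = listeners[event] || []).push(fn);
    },
  };
}

// Usage with a hypothetical transport that echoes the Cookie header it received.
const seen = [];
const scraper = createSessionScraper({
  transport: (url, cookie) => ({
    body: cookie,
    setCookie: url.endsWith('/login') ? ['sid=abc123; Path=/'] : [],
  }),
  callback: (error, res, done) => {
    seen.push(res.body);
    done();
  },
});
let drained = false;
scraper.on('drain', () => { drained = true; });
scraper.add(['https://example.test/login', 'https://example.test/profile']);
```

Since this sketch runs the queue synchronously inside `add()`, the `drain` listener has to be registered before URLs are queued. Passing a separate `jar` to each scraper keeps their sessions independent of one another.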

Test Cases

  • When the scraper makes two sequential requests to the same domain, cookies from the first response should be included in the second request @test
  • Multiple scrapers with different cookie jars should maintain independent session states @test

Implementation

@generates

API

/**
 * Creates a session-aware web scraper that maintains cookies across requests.
 *
 * @param {Object} options - Configuration options
 * @param {Function} options.callback - Callback function for processing responses
 * @returns {Object} Scraper instance exposing an add() method and a 'drain' event
 */
function createSessionScraper(options) {
  // Implementation
}

module.exports = { createSessionScraper };

Dependencies { .dependencies }

crawler { .dependency }

Provides web scraping functionality with cookie support.

@satisfied-by

tough-cookie { .dependency }

Provides cookie jar functionality for storing and managing cookies.

@satisfied-by

Install with Tessl CLI

npx tessl i tessl/npm-crawler
