tessl/npm-vega-datasets

Common repository for example datasets used by Vega related projects.

Vega Datasets

Vega Datasets provides 70 curated datasets commonly used in data visualization examples and documentation for Vega, Vega-Lite, Altair, and related projects. It offers programmatic access to datasets through TypeScript/JavaScript APIs and direct HTTP access via CDN, supporting JSON, CSV, TSV, Arrow, and Parquet formats.

Package Information

  • Package Name: vega-datasets
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install vega-datasets

Core Imports

import data from 'vega-datasets';

CommonJS:

const data = require('vega-datasets');

Basic Usage

import data from 'vega-datasets';

// Access dataset by calling the function
const cars = await data['cars.json']();
console.log(cars); // Array of car objects

// Access the CDN URL for a dataset
const carsUrl = data['cars.json'].url;
console.log(carsUrl); // "https://cdn.jsdelivr.net/npm/vega-datasets@3.2.1/data/cars.json"

// Get package version
console.log(data.version); // "3.2.1"
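
The URL printed above suggests a simple pattern: a jsDelivr base, the package version, and the file name under `data/`. The sketch below rebuilds that pattern. `cdnUrl` is a hypothetical helper, not part of the package, and the pattern is inferred from the single sample URL shown here.

```typescript
// Hypothetical helper: rebuilds the CDN URL pattern seen in the example above.
// The pattern is inferred from one sample URL, so treat it as an assumption.
function cdnUrl(datasetName: string, version: string): string {
  const base = 'https://cdn.jsdelivr.net/npm';
  return `${base}/vega-datasets@${version}/data/${encodeURIComponent(datasetName)}`;
}

const carsCdnUrl = cdnUrl('cars.json', '3.2.1');
console.log(carsCdnUrl);
```

In real code, reading `data['cars.json'].url` is preferable, since the package already knows its own version.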

Capabilities

Dataset Access

Access to 70 curated datasets for data visualization and analysis. Each dataset is available as both a callable function and a URL property.

/**
 * Main data object providing access to all datasets and package version
 */
type VegaDatasetsAPI = {
  /** Package version string */
  version: string;
} & {
  /** Dataset accessor functions - dynamically generated for each dataset, keyed by DatasetName (see Types) */
  [K in DatasetName]: DatasetAccessor;
};

/**
 * Dataset accessor function with URL property
 */
interface DatasetAccessor {
  /** 
   * Fetch and parse the dataset
   * @returns Promise resolving to parsed data (JSON object/array, CSV array, or raw string)
   */
  (): Promise<any | any[] | string>;
  
  /** CDN URL for direct HTTP access to the dataset */
  url: string;
}
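
To make the "function with a `url` property" shape concrete, here is a minimal sketch of how such an object could be assembled. `makeAccessor` and `fakeLoad` are hypothetical names used only for illustration; the package's actual generated code may look different.

```typescript
// Local copy of the accessor shape described above.
interface DatasetAccessor {
  (): Promise<unknown>;
  url: string;
}

// Hypothetical factory: attaches a `url` property to an async loader function.
function makeAccessor(
  url: string,
  load: (url: string) => Promise<unknown>,
): DatasetAccessor {
  return Object.assign(() => load(url), { url });
}

// Stand-in loader so the sketch runs without network access.
const fakeLoad = async (url: string): Promise<unknown> => ({ loadedFrom: url });

const accessor = makeAccessor('https://example.com/data/cars.json', fakeLoad);
console.log(typeof accessor, accessor.url);
```

Because functions are objects in JavaScript, `Object.assign` is enough to produce a value that is both callable and carries the URL.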

Usage Examples:

import data from 'vega-datasets';

// Fetch JSON dataset (returns parsed object/array)
const earthquakes = await data['earthquakes.json']();
const population = await data['population.json']();

// Fetch CSV dataset (returns parsed array with auto-typed columns)
const stocks = await data['stocks.csv']();
const weather = await data['weather.csv']();

// Fetch TSV dataset (returns raw text string)
const unemployment = await data['unemployment.tsv']();

// Access dataset URLs for direct fetching
const directUrl = data['cars.json'].url;
const response = await fetch(directUrl);
const carData = await response.json();
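
The `url` property is also useful when a visualization library should load the data itself. The sketch below builds a minimal Vega-Lite scatter plot spec around a dataset URL. `carsScatterSpec` is a hypothetical helper, and the field names and the v5 schema URL are assumptions about the cars dataset and the Vega-Lite version in use.

```typescript
// Builds a minimal Vega-Lite scatter plot spec that loads its data by URL.
// `Horsepower` and `Miles_per_Gallon` are assumed field names of the cars dataset.
function carsScatterSpec(dataUrl: string) {
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
    data: { url: dataUrl },
    mark: 'point',
    encoding: {
      x: { field: 'Horsepower', type: 'quantitative' },
      y: { field: 'Miles_per_Gallon', type: 'quantitative' },
    },
  };
}

// In real use, pass data['cars.json'].url; a literal keeps this sketch standalone.
const spec = carsScatterSpec(
  'https://cdn.jsdelivr.net/npm/vega-datasets@3.2.1/data/cars.json',
);
console.log(JSON.stringify(spec, null, 2));
```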

JSON Datasets (44 datasets)

Statistical, geographic, and domain-specific datasets in JSON format. A representative selection is listed here; the full set of names appears under Types.

// Geographic datasets
'world-110m.json': DatasetAccessor;        // World geographic topology
'us-10m.json': DatasetAccessor;            // US geographic topology
'countries.json': DatasetAccessor;         // Country data
'us-state-capitals.json': DatasetAccessor; // US state capitals

// Economic datasets
'budget.json': DatasetAccessor;            // Budget allocation data
'budgets.json': DatasetAccessor;           // Multi-year budget data
'income.json': DatasetAccessor;            // Income distribution

// Scientific datasets
'earthquakes.json': DatasetAccessor;       // Earthquake data
'annual-precip.json': DatasetAccessor;     // Precipitation data
'volcano.json': DatasetAccessor;           // Volcanic activity

// Statistical examples
'anscombe.json': DatasetAccessor;          // Anscombe's quartet
'normal-2d.json': DatasetAccessor;         // 2D normal distribution
'uniform-2d.json': DatasetAccessor;        // 2D uniform distribution

// Transportation datasets
'cars.json': DatasetAccessor;              // Automotive specifications
'flights-2k.json': DatasetAccessor;        // Flight data (2k records)
'flights-5k.json': DatasetAccessor;        // Flight data (5k records)
'flights-10k.json': DatasetAccessor;       // Flight data (10k records)
'flights-20k.json': DatasetAccessor;       // Flight data (20k records)
'flights-200k.json': DatasetAccessor;      // Flight data (200k records)

CSV Datasets (23 datasets)

Tabular datasets parsed with d3-dsv, which infers column types automatically. A selection is listed here; the full set of names appears under Types.

// Weather and climate
'seattle-weather.csv': DatasetAccessor;              // Seattle weather observations
'seattle-weather-hourly-normals.csv': DatasetAccessor; // Seattle weather normals
'weather.csv': DatasetAccessor;                      // General weather data
'co2-concentration.csv': DatasetAccessor;            // CO2 measurements
'global-temp.csv': DatasetAccessor;                  // Global temperature data

// Transportation and infrastructure
'airports.csv': DatasetAccessor;                     // Airport locations
'flights-airport.csv': DatasetAccessor;              // Flight/airport data
'birdstrikes.csv': DatasetAccessor;                  // Aviation bird strikes

// Economic and demographic
'gapminder-health-income.csv': DatasetAccessor;      // Health vs income
'us-employment.csv': DatasetAccessor;                // Employment statistics
'iowa-electricity.csv': DatasetAccessor;             // Electrical consumption
'zipcodes.csv': DatasetAccessor;                     // US ZIP codes

// Technology and finance
'github.csv': DatasetAccessor;                       // GitHub statistics
'sp500.csv': DatasetAccessor;                        // S&P 500 data
'sp500-2000.csv': DatasetAccessor;                   // S&P 500 year 2000

Other Format Datasets

One tab-separated dataset and two binary-format datasets (Apache Arrow and Parquet).

// Tab-separated values
'unemployment.tsv': DatasetAccessor;        // Unemployment data (TSV format)

// Binary formats for large datasets
'flights-200k.arrow': DatasetAccessor;      // Apache Arrow format
'flights-3m.parquet': DatasetAccessor;      // Parquet format (3M records)

Data Processing Behavior

JSON Files

  • Returns parsed JavaScript objects or arrays
  • Automatically handles nested structures
  • Preserves original data types

CSV Files

  • Parsed using d3-dsv with automatic type inference
  • Numeric strings converted to numbers
  • Date strings converted to Date objects
  • Returns array of objects with column headers as keys
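
To illustrate what "automatic type inference" means for a row, here is a deliberately simplified stand-in. It is not d3-dsv's implementation: `coerceCell` and `parseSimpleCsv` are hypothetical helpers, the parser ignores quoting, and the sample row is invented for illustration.

```typescript
type Cell = string | number | Date | null;

// Simplified stand-in for type inference; NOT d3-dsv's actual rules.
function coerceCell(raw: string): Cell {
  const value = raw.trim();
  if (value === '') return null;
  const asNumber = Number(value);
  if (!Number.isNaN(asNumber)) return asNumber;
  if (/^\d{4}-\d{2}-\d{2}$/.test(value)) return new Date(value);
  return value;
}

// Naive CSV parser: no quoted fields, no embedded commas or newlines.
function parseSimpleCsv(text: string): Array<Record<string, Cell>> {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = (headerLine ?? '').split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    const row: Record<string, Cell> = {};
    headers.forEach((header, i) => {
      row[header] = coerceCell(cells[i] ?? '');
    });
    return row;
  });
}

// Invented sample row: a string, an ISO date, and a number.
const rows = parseSimpleCsv('symbol,date,price\nMSFT,2000-01-01,39.81');
console.log(rows[0]);
```

Each row comes back as an object keyed by column header, with the price as a number and the date as a `Date`, which mirrors the behavior listed above.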

Other Formats

  • TSV files return raw text strings
  • Binary formats (Arrow, Parquet) return raw data
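
The rules above amount to a dispatch on file extension. A minimal sketch of that dispatch follows; `processingFor` and the `Processing` labels are hypothetical names chosen for this example.

```typescript
type Processing = 'parsed-json' | 'parsed-csv' | 'raw';

// Maps a dataset file name to the processing behavior described above.
function processingFor(datasetName: string): Processing {
  const extension = datasetName
    .slice(datasetName.lastIndexOf('.') + 1)
    .toLowerCase();
  switch (extension) {
    case 'json':
      return 'parsed-json';
    case 'csv':
      return 'parsed-csv';
    default:
      return 'raw'; // tsv, arrow, parquet
  }
}

for (const name of ['cars.json', 'stocks.csv', 'unemployment.tsv']) {
  console.log(name, '->', processingFor(name));
}
```

A check like this lets calling code decide up front whether it needs its own parser, for example for the TSV and binary files.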

Error Handling

Dataset functions can throw errors in these cases:

  • Network errors: When CDN is unreachable
  • Parsing errors: For corrupted JSON or CSV files
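  • HTTP errors: When dataset URLs return non-200 status codes

import data from 'vega-datasets';

try {
  // Await inside the try block so a rejected fetch or parse is caught here
  const cars = await data['cars.json']();
} catch (error) {
  console.error('Failed to load dataset:', error);
}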
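
When several datasets are loaded, repeating try/catch gets noisy. One option is a small wrapper that converts a rejected promise into a result value. `safeLoad` and `LoadResult` are hypothetical names, and the failing accessor is a stand-in that simulates an unreachable CDN.

```typescript
type LoadResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: Error };

// Hypothetical wrapper: returns a result value instead of throwing.
async function safeLoad<T>(accessor: () => Promise<T>): Promise<LoadResult<T>> {
  try {
    return { ok: true, data: await accessor() };
  } catch (error) {
    const wrapped = error instanceof Error ? error : new Error(String(error));
    return { ok: false, error: wrapped };
  }
}

// Stand-in accessor that always fails, simulating an unreachable CDN.
const failing = async (): Promise<unknown[]> => {
  throw new Error('network unreachable');
};

safeLoad(failing).then((result) => {
  console.log(result.ok ? 'loaded' : `failed: ${result.error.message}`);
});
```

With the real package, the call would look like `safeLoad(data['cars.json'])`, since each accessor is a zero-argument async function.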

Types

/**
 * Union type of all available dataset names
 */
type DatasetName = 
  | 'annual-precip.json' | 'anscombe.json' | 'barley.json' | 'budget.json'
  | 'budgets.json' | 'burtin.json' | 'cars.json' | 'countries.json'
  | 'crimea.json' | 'driving.json' | 'earthquakes.json' | 'flare-dependencies.json'
  | 'flare.json' | 'flights-10k.json' | 'flights-200k.json' | 'flights-20k.json'
  | 'flights-2k.json' | 'flights-5k.json' | 'football.json' | 'gapminder.json'
  | 'income.json' | 'jobs.json' | 'londonBoroughs.json' | 'londonCentroids.json'
  | 'londonTubeLines.json' | 'miserables.json' | 'monarchs.json' | 'movies.json'
  | 'normal-2d.json' | 'obesity.json' | 'ohlc.json' | 'penguins.json'
  | 'platformer-terrain.json' | 'political-contributions.json' | 'population.json'
  | 'udistrict.json' | 'unemployment-across-industries.json' | 'uniform-2d.json'
  | 'us-10m.json' | 'us-state-capitals.json' | 'volcano.json' | 'weekly-weather.json'
  | 'wheat.json' | 'world-110m.json' | 'airports.csv' | 'birdstrikes.csv'
  | 'co2-concentration.csv' | 'disasters.csv' | 'flights-airport.csv'
  | 'gapminder-health-income.csv' | 'github.csv' | 'global-temp.csv'
  | 'iowa-electricity.csv' | 'la-riots.csv' | 'lookup_groups.csv'
  | 'lookup_people.csv' | 'population_engineers_hurricanes.csv'
  | 'seattle-weather-hourly-normals.csv' | 'seattle-weather.csv' | 'sp500-2000.csv'
  | 'sp500.csv' | 'species.csv' | 'stocks.csv' | 'us-employment.csv'
  | 'weather.csv' | 'windvectors.csv' | 'zipcodes.csv' | 'unemployment.tsv'
  | 'flights-200k.arrow' | 'flights-3m.parquet';

/**
 * Return type for dataset accessor functions
 */
type DatasetResult = any | any[] | string;
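
Because `DatasetResult` is effectively `any`, callers usually narrow the value at runtime before using it. A minimal sketch of such narrowing follows; `resultKind` and `ResultKind` are hypothetical names.

```typescript
type ResultKind = 'text' | 'rows' | 'object';

// Classifies a loaded dataset value by its runtime shape.
function resultKind(result: unknown): ResultKind {
  if (typeof result === 'string') return 'text'; // e.g. raw TSV text
  if (Array.isArray(result)) return 'rows'; // e.g. parsed CSV or a JSON array
  return 'object'; // e.g. a JSON object such as a topology
}

console.log(resultKind('a\tb'), resultKind([{ a: 1 }]), resultKind({ type: 'Topology' }));
```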

Install with Tessl CLI

npx tessl i tessl/npm-vega-datasets

Describes: npmpkg:npm/vega-datasets@3.2.x