tessl/npm-vega-datasets

Common repository for example datasets used by Vega related projects.

Workspace: tessl
Visibility: Public
Describes: npmpkg:npm/vega-datasets@3.2.x

To install, run

npx @tessl/cli install tessl/npm-vega-datasets@3.2.0


Vega Datasets

Vega Datasets provides 70 curated datasets commonly used in data visualization examples and documentation for Vega, Vega-Lite, Altair, and related projects. It offers programmatic access to datasets through TypeScript/JavaScript APIs and direct HTTP access via CDN, supporting JSON, CSV, TSV, Arrow, and Parquet formats.

Package Information

  • Package Name: vega-datasets
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install vega-datasets

Core Imports

import data from 'vega-datasets';

CommonJS:

const data = require('vega-datasets');

Basic Usage

import data from 'vega-datasets';

// Access dataset by calling the function
const cars = await data['cars.json']();
console.log(cars); // Array of car objects

// Access the CDN URL for a dataset
const carsUrl = data['cars.json'].url;
console.log(carsUrl); // "https://cdn.jsdelivr.net/npm/vega-datasets@3.2.1/data/cars.json"

// Get package version
console.log(data.version); // "3.2.1"
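
The example URL above suggests that each dataset URL embeds the package version. As a minimal sketch, assuming that layout holds for every dataset (it is inferred from the single URL shown, not from the package source), a pinned URL could be built by hand. `cdnUrl` is a hypothetical helper and not part of vega-datasets:

```typescript
// Hypothetical helper, not part of vega-datasets.
// Assumption: the URL layout matches the example URL in Basic Usage.
function cdnUrl(datasetName: string, version: string): string {
  return `https://cdn.jsdelivr.net/npm/vega-datasets@${version}/data/${datasetName}`;
}

// Pin a dataset to an explicit version, e.g. for reproducible examples.
const pinnedCarsUrl = cdnUrl('cars.json', '3.2.1');
console.log(pinnedCarsUrl);
```

In most cases reading `data['cars.json'].url` is preferable, since it always matches the installed package version.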

Capabilities

Dataset Access

Access to 70 curated datasets for data visualization and analysis. Each dataset is available as both a callable function and a URL property.

/**
 * Main data object providing access to all datasets and package version.
 * DatasetName is the union of dataset names declared under Types.
 */
type VegaDatasetsAPI = {
  /** Package version string */
  version: string;
} & {
  /** Dataset accessor functions - dynamically generated for each dataset */
  [K in DatasetName]: DatasetAccessor;
};

/**
 * Dataset accessor function with URL property
 */
interface DatasetAccessor {
  /** 
   * Fetch and parse the dataset
   * @returns Promise resolving to parsed data (JSON object/array, CSV array, or raw string)
   */
  (): Promise<any | any[] | string>;
  
  /** CDN URL for direct HTTP access to the dataset */
  url: string;
}
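
A function that also carries a `url` property can look unusual at first. The following is a minimal sketch of how such a callable object might be assembled; it illustrates the pattern only and is not the package's actual implementation. `makeAccessor`, `TextFetcher`, and the stand-in fetcher are hypothetical names introduced here:

```typescript
// Hypothetical names throughout; this only illustrates the "function with a property" pattern.
type TextFetcher = (url: string) => Promise<string>;

interface Accessor<T> {
  (): Promise<T>;
  url: string;
}

function makeAccessor<T>(
  url: string,
  fetchText: TextFetcher,
  parse: (text: string) => T
): Accessor<T> {
  const load = async (): Promise<T> => parse(await fetchText(url));
  // Object.assign attaches `url` to the function object itself.
  return Object.assign(load, { url });
}

// In-memory stand-in for the network so the sketch runs offline.
const fakeFetchText: TextFetcher = async () => '[{"Name":"example car"}]';

const carsAccessor = makeAccessor(
  'https://example.com/cars.json',
  fakeFetchText,
  (text) => JSON.parse(text) as unknown[]
);

carsAccessor().then((cars) => console.log(carsAccessor.url, cars.length));
```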

Usage Examples:

import data from 'vega-datasets';

// Fetch JSON dataset (returns parsed object/array)
const earthquakes = await data['earthquakes.json']();
const population = await data['population.json']();

// Fetch CSV dataset (returns parsed array with auto-typed columns)
const stocks = await data['stocks.csv']();
const weather = await data['weather.csv']();

// Fetch TSV dataset (returns raw text string)
const unemployment = await data['unemployment.tsv']();

// Access dataset URLs for direct fetching
const directUrl = data['cars.json'].url;
const response = await fetch(directUrl);
const carData = await response.json();
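
Because every accessor returns a promise, several datasets can be requested concurrently rather than one after another. The sketch below shows one way to do that with `Promise.all`. `loadAll` and the fake loaders are hypothetical and exist only so the example runs without network access:

```typescript
type Loader<T> = () => Promise<T>;

// Hypothetical helper: runs all loaders concurrently and returns results keyed by name.
async function loadAll(
  loaders: Record<string, Loader<unknown>>
): Promise<Record<string, unknown>> {
  const names = Object.keys(loaders);
  const values = await Promise.all(names.map((key) => loaders[key]()));
  const result: Record<string, unknown> = {};
  names.forEach((key, i) => {
    result[key] = values[i];
  });
  return result;
}

// Stand-ins for real accessors such as data['cars.json'].
const fakeLoaders: Record<string, Loader<unknown>> = {
  'cars.json': async () => [{ Name: 'example car' }],
  'unemployment.tsv': async () => 'id\trate\n',
};

loadAll(fakeLoaders).then((loaded) => console.log(Object.keys(loaded)));
```

With the real package, the same helper would be called as `loadAll({ 'cars.json': data['cars.json'], 'stocks.csv': data['stocks.csv'] })`.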

JSON Datasets (44 datasets)

Statistical, geographic, and domain-specific datasets in JSON format. A representative selection is listed below; the full set of names appears under Types.

// Geographic datasets
'world-110m.json': DatasetAccessor;        // World geographic topology
'us-10m.json': DatasetAccessor;            // US geographic topology
'countries.json': DatasetAccessor;         // Country data
'us-state-capitals.json': DatasetAccessor; // US state capitals

// Economic datasets
'budget.json': DatasetAccessor;            // Budget allocation data
'budgets.json': DatasetAccessor;           // Multi-year budget data
'income.json': DatasetAccessor;            // Income distribution

// Scientific datasets
'earthquakes.json': DatasetAccessor;       // Earthquake data
'annual-precip.json': DatasetAccessor;     // Precipitation data
'volcano.json': DatasetAccessor;           // Volcanic activity

// Statistical examples
'anscombe.json': DatasetAccessor;          // Anscombe's quartet
'normal-2d.json': DatasetAccessor;         // 2D normal distribution
'uniform-2d.json': DatasetAccessor;        // 2D uniform distribution

// Transportation datasets
'cars.json': DatasetAccessor;              // Automotive specifications
'flights-2k.json': DatasetAccessor;        // Flight data (2k records)
'flights-5k.json': DatasetAccessor;        // Flight data (5k records)
'flights-10k.json': DatasetAccessor;       // Flight data (10k records)
'flights-20k.json': DatasetAccessor;       // Flight data (20k records)
'flights-200k.json': DatasetAccessor;      // Flight data (200k records)

CSV Datasets (23 datasets)

Tabular datasets parsed with d3-dsv for automatic type inference. A representative selection is listed below; the full set of names appears under Types.

// Weather and climate
'seattle-weather.csv': DatasetAccessor;              // Seattle weather observations
'seattle-weather-hourly-normals.csv': DatasetAccessor; // Seattle weather normals
'weather.csv': DatasetAccessor;                      // General weather data
'co2-concentration.csv': DatasetAccessor;            // CO2 measurements
'global-temp.csv': DatasetAccessor;                  // Global temperature data

// Transportation and infrastructure
'airports.csv': DatasetAccessor;                     // Airport locations
'flights-airport.csv': DatasetAccessor;              // Flight/airport data
'birdstrikes.csv': DatasetAccessor;                  // Aviation bird strikes

// Economic and demographic
'gapminder-health-income.csv': DatasetAccessor;      // Health vs income
'us-employment.csv': DatasetAccessor;                // Employment statistics
'iowa-electricity.csv': DatasetAccessor;             // Electrical consumption
'zipcodes.csv': DatasetAccessor;                     // US ZIP codes

// Technology and finance
'github.csv': DatasetAccessor;                       // GitHub statistics
'sp500.csv': DatasetAccessor;                        // S&P 500 data
'sp500-2000.csv': DatasetAccessor;                   // S&P 500 year 2000

Other Format Datasets

Support for specialized data formats.

// Tab-separated values
'unemployment.tsv': DatasetAccessor;        // Unemployment data (TSV format)

// Binary formats for large datasets
'flights-200k.arrow': DatasetAccessor;      // Apache Arrow format
'flights-3m.parquet': DatasetAccessor;      // Parquet format (3M records)

Data Processing Behavior

JSON Files

  • Returns parsed JavaScript objects or arrays
  • Automatically handles nested structures
  • Preserves original data types

CSV Files

  • Parsed using d3-dsv with automatic type inference
  • Numeric strings converted to numbers
  • Date strings converted to Date objects
  • Returns array of objects with column headers as keys
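
To make "automatic type inference" concrete, here is a simplified sketch of the idea. It is not d3-dsv's implementation: the real `autoType` handles more cases, and it only converts ISO 8601 style date strings, so other date spellings stay as strings. `inferValue` and `inferRow` are hypothetical names, and the sample row is made up:

```typescript
type Typed = string | number | boolean | Date | null;

// Simplified illustration of per-cell type inference.
function inferValue(raw: string): Typed {
  const value = raw.trim();
  if (value === '') return null;
  if (value === 'true') return true;
  if (value === 'false') return false;
  const asNumber = Number(value);
  if (!Number.isNaN(asNumber)) return asNumber;
  // Only plain ISO dates (YYYY-MM-DD) are recognized in this sketch.
  if (/^\d{4}-\d{2}-\d{2}$/.test(value)) return new Date(value);
  return value;
}

// Applies inference to every column of a parsed row.
function inferRow(row: Record<string, string>): Record<string, Typed> {
  const typed: Record<string, Typed> = {};
  for (const key of Object.keys(row)) {
    typed[key] = inferValue(row[key]);
  }
  return typed;
}

// Made-up row for illustration.
const typedRow = inferRow({ symbol: 'MSFT', date: '2000-01-01', price: '39.81' });
console.log(typedRow);
```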

Other Formats

  • TSV files return raw text strings
  • Binary formats (Arrow, Parquet) return raw data
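
Since TSV files come back as raw text, callers need to parse them before use. The sketch below is a deliberately small parser that assumes no quoting or escaping in the data; `parseTsv` is a hypothetical helper and the sample text is made up. For real work, `tsvParse` from d3-dsv is likely the more robust choice:

```typescript
// Hypothetical helper: splits on newlines and tabs only, with no quoting or escaping support.
function parseTsv(text: string): Record<string, string>[] {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return [];
  const header = lines[0].split('\t');
  return lines.slice(1).map((line) => {
    const cells = line.split('\t');
    const row: Record<string, string> = {};
    header.forEach((column, i) => {
      row[column] = cells[i] ?? '';
    });
    return row;
  });
}

// Made-up sample standing in for the string returned by a TSV accessor.
const sampleTsv = 'id\trate\n1001\t0.097\n1003\t0.091\n';
const tsvRows = parseTsv(sampleTsv);
console.log(tsvRows);
```

All values remain strings here; converting `rate` to a number is left to the caller.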

Error Handling

Dataset functions return promises, which can reject in these cases:

  • Network errors: When the CDN is unreachable
  • Parsing errors: For corrupted JSON or CSV files
  • HTTP errors: When dataset URLs return non-200 status codes

import data from 'vega-datasets';

try {
  // Await inside try so that rejections are caught below
  const cars = await data['cars.json']();
} catch (error) {
  console.error('Failed to load dataset:', error);
}
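
When a dataset is fetched directly through its `url` property instead, note that `fetch` resolves normally on responses such as 404, so the status has to be checked by the caller. A minimal sketch of such a check follows. `fetchJsonOrThrow`, `ResponseLike`, and `FetchLike` are hypothetical names, and the fetch function is injected so the example runs offline:

```typescript
// Minimal shape of the parts of a fetch Response this sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Hypothetical helper: fetches JSON and rejects on non-2xx responses.
async function fetchJsonOrThrow(url: string, fetchImpl: FetchLike): Promise<unknown> {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Stand-in for the network that always answers 404.
const notFoundFetch: FetchLike = async () => ({
  ok: false,
  status: 404,
  json: async () => ({}),
});

fetchJsonOrThrow('https://example.com/missing.json', notFoundFetch).catch((error) =>
  console.error('Failed to load dataset:', error)
);
```

In a browser or a recent Node.js release, the global `fetch` can be passed as the second argument together with `data['cars.json'].url`.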

Types

/**
 * Union type of all available dataset names
 */
type DatasetName = 
  | 'annual-precip.json' | 'anscombe.json' | 'barley.json' | 'budget.json'
  | 'budgets.json' | 'burtin.json' | 'cars.json' | 'countries.json'
  | 'crimea.json' | 'driving.json' | 'earthquakes.json' | 'flare-dependencies.json'
  | 'flare.json' | 'flights-10k.json' | 'flights-200k.json' | 'flights-20k.json'
  | 'flights-2k.json' | 'flights-5k.json' | 'football.json' | 'gapminder.json'
  | 'income.json' | 'jobs.json' | 'londonBoroughs.json' | 'londonCentroids.json'
  | 'londonTubeLines.json' | 'miserables.json' | 'monarchs.json' | 'movies.json'
  | 'normal-2d.json' | 'obesity.json' | 'ohlc.json' | 'penguins.json'
  | 'platformer-terrain.json' | 'political-contributions.json' | 'population.json'
  | 'udistrict.json' | 'unemployment-across-industries.json' | 'uniform-2d.json'
  | 'us-10m.json' | 'us-state-capitals.json' | 'volcano.json' | 'weekly-weather.json'
  | 'wheat.json' | 'world-110m.json' | 'airports.csv' | 'birdstrikes.csv'
  | 'co2-concentration.csv' | 'disasters.csv' | 'flights-airport.csv'
  | 'gapminder-health-income.csv' | 'github.csv' | 'global-temp.csv'
  | 'iowa-electricity.csv' | 'la-riots.csv' | 'lookup_groups.csv'
  | 'lookup_people.csv' | 'population_engineers_hurricanes.csv'
  | 'seattle-weather-hourly-normals.csv' | 'seattle-weather.csv' | 'sp500-2000.csv'
  | 'sp500.csv' | 'species.csv' | 'stocks.csv' | 'us-employment.csv'
  | 'weather.csv' | 'windvectors.csv' | 'zipcodes.csv' | 'unemployment.tsv'
  | 'flights-200k.arrow' | 'flights-3m.parquet';

/**
 * Return type for dataset accessor functions
 */
type DatasetResult = any | any[] | string;
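
Because the result type is effectively `any`, callers may want to branch on the dataset's format themselves. The sketch below derives the format from the file extension and mirrors the behavior described under Data Processing Behavior (JSON and CSV parsed, everything else raw). `formatOf` and `isParsed` are hypothetical helpers, not part of the package:

```typescript
type DatasetFormat = 'json' | 'csv' | 'tsv' | 'arrow' | 'parquet';

const KNOWN_FORMATS: readonly DatasetFormat[] = ['json', 'csv', 'tsv', 'arrow', 'parquet'];

// Hypothetical helper: derives the format from the extension of a dataset name.
function formatOf(datasetName: string): DatasetFormat | undefined {
  const dot = datasetName.lastIndexOf('.');
  if (dot < 0) return undefined;
  const ext = datasetName.slice(dot + 1).toLowerCase();
  return KNOWN_FORMATS.find((format) => format === ext);
}

// Mirrors the documented behavior: JSON and CSV are parsed, other formats are returned raw.
function isParsed(datasetName: string): boolean {
  const format = formatOf(datasetName);
  return format === 'json' || format === 'csv';
}

console.log(formatOf('flights-3m.parquet'), isParsed('unemployment.tsv'));
```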