# Vega Datasets

Vega Datasets provides 70 curated datasets commonly used in data visualization examples and documentation for Vega, Vega-Lite, Altair, and related projects. It offers programmatic access to datasets through TypeScript/JavaScript APIs and direct HTTP access via CDN, supporting JSON, CSV, TSV, Arrow, and Parquet formats.

## Package Information

- **Package Name**: vega-datasets
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install vega-datasets`

## Core Imports

```typescript
import data from 'vega-datasets';
```

CommonJS:

```javascript
const data = require('vega-datasets');
```

## Basic Usage

```typescript
import data from 'vega-datasets';

// Access dataset by calling the function
const cars = await data['cars.json']();
console.log(cars); // Array of car objects

// Access the CDN URL for a dataset
const carsUrl = data['cars.json'].url;
console.log(carsUrl); // "https://cdn.jsdelivr.net/npm/vega-datasets@3.2.1/data/cars.json"

// Get package version
console.log(data.version); // "3.2.1"
```
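
The URL printed above suggests a predictable pattern: the jsDelivr base, the package version, then `data/<filename>`. A minimal sketch of rebuilding such a URL by hand, assuming that pattern holds for every dataset (the `cdnUrl` helper is hypothetical and not part of the package):

```typescript
// Hypothetical helper, not part of vega-datasets: rebuilds the CDN URL
// pattern seen in the example output above.
const CDN_BASE = 'https://cdn.jsdelivr.net/npm/vega-datasets';

function cdnUrl(name: string, version: string): string {
  // encodeURIComponent guards against unusual characters in file names
  return `${CDN_BASE}@${version}/data/${encodeURIComponent(name)}`;
}

console.log(cdnUrl('cars.json', '3.2.1'));
```

In practice the `url` property on each accessor is the safer source, since it always matches the installed package version.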

## Capabilities

### Dataset Access

The package exposes 70 curated datasets for data visualization and analysis. Each dataset is available as a callable function that also carries a `url` property.

```typescript { .api }
/**
 * Main data object providing access to all datasets and package version
 */
interface VegaDatasetsAPI {
  /** Package version string */
  version: string;

  /** Dataset accessor functions - dynamically generated for each dataset */
  [datasetName: string]: DatasetAccessor;
}

/**
 * Dataset accessor function with URL property
 */
interface DatasetAccessor {
  /**
   * Fetch and parse the dataset
   * @returns Promise resolving to parsed data (JSON object/array, CSV array, or raw string)
   */
  (): Promise<any | any[] | string>;

  /** CDN URL for direct HTTP access to the dataset */
  url: string;
}
```

**Usage Examples:**

```typescript
import data from 'vega-datasets';

// Fetch JSON dataset (returns parsed object/array)
const earthquakes = await data['earthquakes.json']();
const population = await data['population.json']();

// Fetch CSV dataset (returns parsed array with auto-typed columns)
const stocks = await data['stocks.csv']();
const weather = await data['weather.csv']();

// Fetch TSV dataset (returns raw text string)
const unemployment = await data['unemployment.tsv']();

// Access dataset URLs for direct fetching
const directUrl = data['cars.json'].url;
const response = await fetch(directUrl);
const carData = await response.json();
```
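
An accessor that is both callable and carries a `url` property can be modeled in TypeScript with `Object.assign`. The sketch below builds a stand-in accessor around an in-memory loader, which may be handy for unit tests that should not hit the CDN. `makeAccessor` and the sample URL are illustrative assumptions, not the package's implementation.

```typescript
// Local copy of the accessor shape described above, so the sketch is self-contained.
interface DatasetAccessor {
  (): Promise<unknown>;
  url: string;
}

// Hypothetical factory: wraps a loader function and attaches a `url` property.
function makeAccessor(url: string, load: () => Promise<unknown>): DatasetAccessor {
  return Object.assign(() => load(), { url });
}

// Stand-in accessor backed by in-memory data instead of a network request.
const fakeCars = makeAccessor(
  'https://example.com/cars.json', // placeholder URL, not a real dataset location
  async () => [{ Name: 'demo car', Horsepower: 100 }],
);

fakeCars().then((rows) => {
  console.log(fakeCars.url, rows);
});
```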

### JSON Datasets (44 datasets)

Statistical, geographic, and domain-specific datasets in JSON format.

```typescript { .api }
// Geographic datasets
'world-110m.json': DatasetAccessor; // World geographic topology
'us-10m.json': DatasetAccessor; // US geographic topology
'countries.json': DatasetAccessor; // Country data
'us-state-capitals.json': DatasetAccessor; // US state capitals

// Economic datasets
'budget.json': DatasetAccessor; // Budget allocation data
'budgets.json': DatasetAccessor; // Multi-year budget data
'income.json': DatasetAccessor; // Income distribution

// Scientific datasets
'earthquakes.json': DatasetAccessor; // Earthquake data
'annual-precip.json': DatasetAccessor; // Precipitation data
'volcano.json': DatasetAccessor; // Volcanic activity

// Statistical examples
'anscombe.json': DatasetAccessor; // Anscombe's quartet
'normal-2d.json': DatasetAccessor; // 2D normal distribution
'uniform-2d.json': DatasetAccessor; // 2D uniform distribution

// Transportation datasets
'cars.json': DatasetAccessor; // Automotive specifications
'flights-2k.json': DatasetAccessor; // Flight data (2k records)
'flights-5k.json': DatasetAccessor; // Flight data (5k records)
'flights-10k.json': DatasetAccessor; // Flight data (10k records)
'flights-20k.json': DatasetAccessor; // Flight data (20k records)
'flights-200k.json': DatasetAccessor; // Flight data (200k records)
```

### CSV Datasets (23 datasets)

Tabular datasets parsed with d3-dsv for automatic type inference.

```typescript { .api }
// Weather and climate
'seattle-weather.csv': DatasetAccessor; // Seattle weather observations
'seattle-weather-hourly-normals.csv': DatasetAccessor; // Seattle weather normals
'weather.csv': DatasetAccessor; // General weather data
'co2-concentration.csv': DatasetAccessor; // CO2 measurements
'global-temp.csv': DatasetAccessor; // Global temperature data

// Transportation and infrastructure
'airports.csv': DatasetAccessor; // Airport locations
'flights-airport.csv': DatasetAccessor; // Flight/airport data
'birdstrikes.csv': DatasetAccessor; // Aviation bird strikes

// Economic and demographic
'gapminder-health-income.csv': DatasetAccessor; // Health vs income
'us-employment.csv': DatasetAccessor; // Employment statistics
'iowa-electricity.csv': DatasetAccessor; // Electrical consumption
'zipcodes.csv': DatasetAccessor; // US ZIP codes

// Technology and development
'github.csv': DatasetAccessor; // GitHub statistics
'sp500.csv': DatasetAccessor; // S&P 500 data
'sp500-2000.csv': DatasetAccessor; // S&P 500 year 2000
```
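
To make "automatic type inference" concrete, here is a simplified, self-contained sketch of the kind of conversion described: numeric strings become numbers and ISO-style date strings become `Date` objects. It is an illustration only and does not reproduce d3-dsv's actual rules. `inferValue` and `parseCsv` are hypothetical names, the sample row is made up, and the parser ignores quoted fields.

```typescript
type Cell = string | number | Date | null;

// Hypothetical inference rule: empty -> null, numeric -> number,
// YYYY-MM-DD -> Date, anything else stays a string.
function inferValue(raw: string): Cell {
  const value = raw.trim();
  if (value === '') return null;
  if (!Number.isNaN(Number(value))) return Number(value);
  if (/^\d{4}-\d{2}-\d{2}$/.test(value)) return new Date(value);
  return value;
}

// Naive CSV parser: splits on newlines and commas, no quoted-field support.
function parseCsv(text: string): Record<string, Cell>[] {
  const lines = text.trim().split(/\r?\n/);
  const headers = (lines.shift() ?? '').split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    const row: Record<string, Cell> = {};
    headers.forEach((header, i) => {
      row[header] = inferValue(cells[i] ?? '');
    });
    return row;
  });
}

// Made-up sample row for illustration.
const sample = 'symbol,date,price\nMSFT,2000-01-01,39.81';
console.log(parseCsv(sample));
```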

### Other Format Datasets

Datasets in TSV and in the binary Arrow and Parquet formats are exposed through the same `DatasetAccessor` interface.

```typescript { .api }
// Tab-separated values
'unemployment.tsv': DatasetAccessor; // Unemployment data (TSV format)

// Binary formats for large datasets
'flights-200k.arrow': DatasetAccessor; // Apache Arrow format
'flights-3m.parquet': DatasetAccessor; // Parquet format (3M records)
```
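
Because the TSV accessor is described as returning raw text, callers need to split it into rows themselves. A minimal sketch, assuming a header row and no quoted or escaped tabs. `parseTsv` is a hypothetical helper and the sample content is made up for illustration:

```typescript
// Hypothetical helper: turns raw TSV text into an array of string-valued records.
function parseTsv(text: string): Record<string, string>[] {
  const lines = text.trim().split(/\r?\n/);
  const headers = (lines.shift() ?? '').split('\t');
  return lines.map((line) => {
    const cells = line.split('\t');
    const row: Record<string, string> = {};
    headers.forEach((header, i) => {
      row[header] = cells[i] ?? '';
    });
    return row;
  });
}

// Made-up sample; in practice the text would come from `await data['unemployment.tsv']()`.
const rawTsv = 'id\trate\n1001\t0.097\n1003\t0.091';
const tsvRows = parseTsv(rawTsv);
console.log(tsvRows.length, tsvRows[0]);
```

All values stay strings here; numeric columns would need an explicit `Number(...)` conversion.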

## Data Processing Behavior

### JSON Files

- Returns parsed JavaScript objects or arrays
- Automatically handles nested structures
- Preserves original data types

### CSV Files

- Parsed using d3-dsv with automatic type inference
- Numeric strings converted to numbers
- Date strings converted to Date objects
- Returns array of objects with column headers as keys

### Other Formats

- TSV files return raw text strings
- Binary formats (Arrow, Parquet) return raw data

## Error Handling

Dataset functions can throw errors in these cases:

- **Network errors**: When CDN is unreachable
- **Parsing errors**: For corrupted JSON or CSV files
- **HTTP errors**: When dataset URLs return non-200 status codes

```typescript
import vegaData from 'vega-datasets';

try {
  const cars = await vegaData['cars.json']();
  // Use the loaded dataset here
} catch (error) {
  console.error('Failed to load dataset:', error);
}
```
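
Since network failures are often transient, one option is to wrap an accessor call in a small retry loop. The sketch below works with any function that returns a promise. `loadWithRetry` and its parameters are hypothetical and not part of the package, and the flaky loader merely simulates an unreachable CDN.

```typescript
// Hypothetical helper: retries an async loader up to `attempts` times,
// waiting `delayMs` between tries, and rethrows the last error.
async function loadWithRetry<T>(
  load: () => Promise<T>,
  attempts = 3,
  delayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await load();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Simulated accessor that fails twice before succeeding.
let calls = 0;
const flakyLoader = async (): Promise<string[]> => {
  calls += 1;
  if (calls < 3) throw new Error('simulated network error');
  return ['row'];
};

const retryResult = loadWithRetry(flakyLoader, 3, 10);
retryResult.then((rows) => {
  console.log(`loaded ${rows.length} row(s) after ${calls} calls`);
});
```

With the package itself, the same helper could presumably be called as `loadWithRetry(() => vegaData['cars.json']())`.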

## Types

```typescript { .api }
/**
 * Union type of all available dataset names
 */
type DatasetName =
  | 'annual-precip.json' | 'anscombe.json' | 'barley.json' | 'budget.json'
  | 'budgets.json' | 'burtin.json' | 'cars.json' | 'countries.json'
  | 'crimea.json' | 'driving.json' | 'earthquakes.json' | 'flare-dependencies.json'
  | 'flare.json' | 'flights-10k.json' | 'flights-200k.json' | 'flights-20k.json'
  | 'flights-2k.json' | 'flights-5k.json' | 'football.json' | 'gapminder.json'
  | 'income.json' | 'jobs.json' | 'londonBoroughs.json' | 'londonCentroids.json'
  | 'londonTubeLines.json' | 'miserables.json' | 'monarchs.json' | 'movies.json'
  | 'normal-2d.json' | 'obesity.json' | 'ohlc.json' | 'penguins.json'
  | 'platformer-terrain.json' | 'political-contributions.json' | 'population.json'
  | 'udistrict.json' | 'unemployment-across-industries.json' | 'uniform-2d.json'
  | 'us-10m.json' | 'us-state-capitals.json' | 'volcano.json' | 'weekly-weather.json'
  | 'wheat.json' | 'world-110m.json' | 'airports.csv' | 'birdstrikes.csv'
  | 'co2-concentration.csv' | 'disasters.csv' | 'flights-airport.csv'
  | 'gapminder-health-income.csv' | 'github.csv' | 'global-temp.csv'
  | 'iowa-electricity.csv' | 'la-riots.csv' | 'lookup_groups.csv'
  | 'lookup_people.csv' | 'population_engineers_hurricanes.csv'
  | 'seattle-weather-hourly-normals.csv' | 'seattle-weather.csv' | 'sp500-2000.csv'
  | 'sp500.csv' | 'species.csv' | 'stocks.csv' | 'us-employment.csv'
  | 'weather.csv' | 'windvectors.csv' | 'zipcodes.csv' | 'unemployment.tsv'
  | 'flights-200k.arrow' | 'flights-3m.parquet';

/**
 * Return type for dataset accessor functions
 */
type DatasetResult = any | any[] | string;
```
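
When dataset names arrive as plain strings (for example from user input), the file extension is enough to predict which of the parsing behaviors described earlier applies. A small sketch, assuming the extension-to-behavior mapping stated in this document. `DatasetFormat` and `formatOf` are hypothetical names, not exports of the package:

```typescript
type DatasetFormat = 'json' | 'csv' | 'tsv' | 'arrow' | 'parquet';

const FORMATS: readonly DatasetFormat[] = ['json', 'csv', 'tsv', 'arrow', 'parquet'];

// Hypothetical helper: derives the format from the file extension,
// or returns undefined when the name has no recognized extension.
function formatOf(name: string): DatasetFormat | undefined {
  const dot = name.lastIndexOf('.');
  if (dot < 0) return undefined;
  const extension = name.slice(dot + 1).toLowerCase();
  return FORMATS.find((format) => format === extension);
}

console.log(formatOf('cars.json'));
console.log(formatOf('flights-3m.parquet'));
console.log(formatOf('notes.txt'));
```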