# Sample Datasets

Built-in sample datasets for testing, examples, and teaching. The module ships a Sentinel-2 image (as an `xarray.DataArray`) and Landsat 8 reflectance samples (as a `pandas.DataFrame`), so you can test spectral index computations and explore the library without downloading external data.

## Capabilities

### Dataset Loading Function

Opens a built-in sample dataset by name. The return type depends on the dataset: a multi-band raster for `"sentinel"` and a table of labeled samples for `"spectral"`.

```python { .api }
def open(dataset: str) -> Any:
    """
    Opens a built-in sample dataset.

    Parameters:
    - dataset: Dataset name ("sentinel" or "spectral")

    Returns:
    Dataset in appropriate format:
    - "sentinel": xarray.DataArray with Sentinel-2 sample image (10m bands)
    - "spectral": pandas.DataFrame with Landsat 8 reflectance samples

    Raises:
    Exception: If dataset name is not valid
    """
```

**Usage Examples:**

```python
import spyndex.datasets

# Load Sentinel-2 sample dataset
sentinel_data = spyndex.datasets.open("sentinel")
print(type(sentinel_data))  # <class 'xarray.core.dataarray.DataArray'>
print(sentinel_data.shape)  # (4, 300, 300)

# Load spectral reflectance samples
spectral_data = spyndex.datasets.open("spectral")
print(type(spectral_data))  # <class 'pandas.core.frame.DataFrame'>
print(spectral_data.shape)  # (120, 9)
```

## Available Datasets

### Sentinel Dataset

A multi-band Sentinel-2 image containing the four 10-meter resolution bands (B02, B03, B04, B08), suitable for vegetation analysis and multi-spectral index computation.

```python
import spyndex.datasets
import spyndex

# Load Sentinel-2 sample
sentinel = spyndex.datasets.open("sentinel")

# Explore dataset structure
print(sentinel)
# Output: <xarray.DataArray (band: 4, x: 300, y: 300)>
# Coordinates:
#   * band     (band) <U3 'B02' 'B03' 'B04' 'B08'
# Dimensions without coordinates: x, y

print(f"Bands available: {list(sentinel.coords['band'].values)}")
# Output: ['B02', 'B03', 'B04', 'B08']

print(f"Spatial dimensions: {sentinel.sizes['x']} x {sentinel.sizes['y']}")
# Output: 300 x 300

# Compute spectral indices using Sentinel-2 data
ndvi = spyndex.computeIndex(
    "NDVI",
    params={
        "N": sentinel.sel(band="B08"),  # NIR band
        "R": sentinel.sel(band="B04")   # Red band
    }
)

print(f"NDVI result shape: {ndvi.shape}")  # (300, 300)
print(f"NDVI range: {ndvi.min().values:.3f} to {ndvi.max().values:.3f}")

# Compute multiple indices
indices = spyndex.computeIndex(
    ["NDVI", "GNDVI"],
    params={
        "N": sentinel.sel(band="B08"),  # NIR
        "R": sentinel.sel(band="B04"),  # Red
        "G": sentinel.sel(band="B03")   # Green
    }
)

print(f"Multiple indices shape: {indices.shape}")  # (2, 300, 300)
print(f"Index names: {list(indices.coords['index'].values)}")
```
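
For intuition about what the `computeIndex` calls above evaluate, NDVI and GNDVI are normalized differences of two bands: NDVI = (N - R) / (N + R) and GNDVI = (N - G) / (N + G). The following is a minimal pure-Python sketch of that arithmetic on scalar reflectances. `normalized_difference` is a hypothetical helper written for this illustration and is not part of spyndex, and the reflectance values are made up.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denominator = a + b
    if denominator == 0:
        return float("nan")
    return (a - b) / denominator


# Illustrative (made-up) reflectances for a single vegetated pixel
nir, red, green = 0.45, 0.05, 0.08

ndvi = normalized_difference(nir, red)     # NDVI = (N - R) / (N + R)
gndvi = normalized_difference(nir, green)  # GNDVI = (N - G) / (N + G)

print(f"NDVI:  {ndvi:.3f}")
print(f"GNDVI: {gndvi:.3f}")
```

The library applies the same kind of formula element-wise to whole arrays, which is why the result for one index keeps the 300 × 300 spatial shape of its inputs.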

### Spectral Dataset

Landsat 8 surface reflectance samples covering three land cover classes (Water, Vegetation, Urban), useful for exploring spectral signatures and testing classification-oriented indices.

```python
import spyndex.datasets
import spyndex

# Load spectral reflectance samples
spectral = spyndex.datasets.open("spectral")

# Explore dataset structure
print(spectral.dtypes)
# Output:
# SR_B1     float64  # Coastal Aerosol
# SR_B2     float64  # Blue
# SR_B3     float64  # Green
# SR_B4     float64  # Red
# SR_B5     float64  # NIR
# SR_B6     float64  # SWIR1
# SR_B7     float64  # SWIR2
# ST_B10    float64  # Thermal
# class      object  # Land cover class
# dtype: object

print(f"Dataset shape: {spectral.shape}")  # (120, 9)
print(f"Land cover classes: {spectral['class'].unique()}")
# Output: ['Water' 'Vegetation' 'Urban']

# Analyze spectral signatures by class
for land_class in spectral['class'].unique():
    class_data = spectral[spectral['class'] == land_class]
    print(f"\n{land_class} samples: {len(class_data)}")
    print(f"Average NIR reflectance: {class_data['SR_B5'].mean():.3f}")
    print(f"Average Red reflectance: {class_data['SR_B4'].mean():.3f}")

# Compute indices for all samples
ndvi_all = spyndex.computeIndex(
    "NDVI",
    params={
        "N": spectral["SR_B5"],  # NIR
        "R": spectral["SR_B4"]   # Red
    }
)

# Add NDVI to dataframe for analysis
spectral_with_ndvi = spectral.copy()
spectral_with_ndvi["NDVI"] = ndvi_all

# Analyze NDVI by land cover class
for land_class in spectral_with_ndvi['class'].unique():
    class_ndvi = spectral_with_ndvi[spectral_with_ndvi['class'] == land_class]['NDVI']
    print(f"{land_class} NDVI: {class_ndvi.mean():.3f} ± {class_ndvi.std():.3f}")
```
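
The per-class loop above amounts to a group-by followed by a mean and a sample standard deviation. The sketch below shows that computation with the standard library only, on a handful of made-up (class, NDVI) pairs that stand in for rows of the sample table. None of the numbers come from the actual dataset.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (class, NDVI) pairs standing in for rows of the sample table
samples = [
    ("Water", -0.20), ("Water", -0.10), ("Water", -0.15),
    ("Vegetation", 0.70), ("Vegetation", 0.80), ("Vegetation", 0.75),
    ("Urban", 0.10), ("Urban", 0.20), ("Urban", 0.15),
]

# Group NDVI values by land cover class
by_class = defaultdict(list)
for land_class, ndvi in samples:
    by_class[land_class].append(ndvi)

# Mean and sample standard deviation (n - 1 denominator) per class
summary = {
    land_class: (mean(values), stdev(values))
    for land_class, values in by_class.items()
}

for land_class, (avg, sd) in summary.items():
    print(f"{land_class} NDVI: {avg:.3f} ± {sd:.3f}")
```

On the real DataFrame, `spectral_with_ndvi.groupby("class")["NDVI"].agg(["mean", "std"])` should give the same kind of summary in one call. pandas' `std` also uses the sample (n - 1) definition by default.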

## Dataset Integration Examples

### Complete Workflow Examples

The following function computes several indices on either sample dataset, then plots the results (Sentinel image) or summarizes them by land cover class (spectral samples):

```python
import spyndex.datasets
import spyndex
import matplotlib.pyplot as plt
import numpy as np

def analyze_dataset_indices(dataset_name, indices_list):
    """Analyze multiple spectral indices on a sample dataset."""

    if dataset_name == "sentinel":
        data = spyndex.datasets.open("sentinel")

        # Compute indices on spatial data
        results = spyndex.computeIndex(
            indices_list,
            params={
                "N": data.sel(band="B08"),  # NIR
                "R": data.sel(band="B04"),  # Red
                "G": data.sel(band="B03"),  # Green
                "B": data.sel(band="B02"),  # Blue
                # Constants that EVI needs in addition to the bands
                "g": spyndex.constants.g.default,
                "C1": spyndex.constants.C1.default,
                "C2": spyndex.constants.C2.default,
                "L": spyndex.constants.L.default
            }
        )

        # Visualize results (up to four indices on a 2 x 2 grid)
        fig, axes = plt.subplots(2, 2, figsize=(12, 10))
        axes = axes.flatten()

        for i, idx_name in enumerate(indices_list):
            if i < len(axes):
                im = axes[i].imshow(results.sel(index=idx_name), cmap='RdYlGn')
                axes[i].set_title(f"{idx_name}")
                axes[i].axis('off')
                plt.colorbar(im, ax=axes[i], shrink=0.8)

        plt.tight_layout()
        plt.show()

    elif dataset_name == "spectral":
        data = spyndex.datasets.open("spectral")

        # Compute indices on tabular data
        results = {}
        for idx_name in indices_list:
            try:
                idx_values = spyndex.computeIndex(
                    idx_name,
                    params={
                        "N": data["SR_B5"],   # NIR
                        "R": data["SR_B4"],   # Red
                        "G": data["SR_B3"],   # Green
                        "B": data["SR_B2"],   # Blue
                        "S1": data["SR_B6"],  # SWIR1
                        "S2": data["SR_B7"]   # SWIR2 (needed by NBR)
                    }
                )
                results[idx_name] = idx_values
            except Exception as e:
                # Raised when an index needs a parameter not supplied above
                print(f"Could not compute {idx_name}: {e}")

        # Analyze by land cover class
        for land_class in data['class'].unique():
            class_mask = data['class'] == land_class
            print(f"\n{land_class} class:")

            for idx_name, values in results.items():
                class_values = values[class_mask]
                print(f"  {idx_name}: {class_values.mean():.3f} ± {class_values.std():.3f}")

# Example usage
analyze_dataset_indices("sentinel", ["NDVI", "GNDVI", "EVI", "CI"])
analyze_dataset_indices("spectral", ["NDVI", "NDWI", "NBR"])
```
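
Each index needs its own set of inputs, and a request fails when one is missing (EVI, for example, needs constants beyond the bands, and NBR needs a SWIR band). A small pre-check can report the gaps before any computation runs. The sketch below is an illustration only: the `REQUIRED` table is hand-written from the standard index formulas rather than read from spyndex, and `split_by_availability` is a hypothetical helper. The library's own index catalogue remains the authoritative source for each index's inputs.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hand-written for illustration (assumption): inputs each index formula needs
REQUIRED: Dict[str, Set[str]] = {
    "NDVI": {"N", "R"},
    "GNDVI": {"N", "G"},
    "NDWI": {"G", "N"},
    "NBR": {"N", "S2"},
    "EVI": {"N", "R", "B", "g", "C1", "C2", "L"},
}


def split_by_availability(
    indices: Iterable[str], available: Set[str]
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (computable indices, {index: sorted missing inputs})."""
    computable: List[str] = []
    missing: Dict[str, List[str]] = {}
    for name in indices:
        lacking = REQUIRED[name] - available
        if lacking:
            missing[name] = sorted(lacking)
        else:
            computable.append(name)
    return computable, missing


# Only the four 10 m bands are available here
available = {"N", "R", "G", "B"}
ok, missing = split_by_availability(["NDVI", "GNDVI", "EVI", "NBR"], available)
print("Computable:", ok)
print("Missing inputs:", missing)
```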

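### Machine Learning Integration

The class labels in the spectral dataset make it usable for supervised classification. The example below builds a feature matrix from spectral indices and raw bands, then trains and evaluates a random forest: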

```python
import spyndex.datasets
import spyndex
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def create_spectral_features():
    """Create feature matrix using spectral indices."""

    # Load dataset
    data = spyndex.datasets.open("spectral")

    # Define vegetation-related indices
    vegetation_indices = ["NDVI", "GNDVI", "SAVI", "EVI", "CI", "RDVI"]

    # Compute all indices
    features = pd.DataFrame()

    for idx_name in vegetation_indices:
        try:
            idx_values = spyndex.computeIndex(
                idx_name,
                params={
                    "N": data["SR_B5"],  # NIR
                    "R": data["SR_B4"],  # Red
                    "G": data["SR_B3"],  # Green
                    "B": data["SR_B2"],  # Blue
                    "L": spyndex.constants.L.default,   # For SAVI and EVI
                    "g": spyndex.constants.g.default,   # For EVI
                    "C1": spyndex.constants.C1.default, # For EVI
                    "C2": spyndex.constants.C2.default  # For EVI
                }
            )
            features[idx_name] = idx_values
        except Exception as e:
            # Report the reason instead of silently swallowing every error
            print(f"Skipping {idx_name}: {e}")

    # Add original bands as features
    band_features = ["SR_B2", "SR_B3", "SR_B4", "SR_B5", "SR_B6", "SR_B7"]
    for band in band_features:
        features[band] = data[band]

    # Target classes
    y = data["class"]

    return features, y

# Create feature matrix and train classifier
X, y = create_spectral_features()

print(f"Feature matrix shape: {X.shape}")
print(f"Features: {list(X.columns)}")
print(f"Classes: {y.unique()}")

# Train classifier
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

# Evaluate
y_pred = rf.predict(X_test)
print("\nClassification Results:")
print(classification_report(y_test, y_pred))

# Feature importance
feature_importance = pd.DataFrame({
    'feature': X.columns,
    'importance': rf.feature_importances_
}).sort_values('importance', ascending=False)

print("\nTop 10 Most Important Features:")
print(feature_importance.head(10))
```
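
With a small, labeled table like this one, a plain random split can leave the classes unevenly represented in the test set. scikit-learn's `train_test_split` accepts `stratify=y` to keep class proportions the same in both parts. The pure-Python sketch below illustrates the idea behind stratification. `stratified_split` is a hypothetical helper for illustration, not scikit-learn's implementation, and the labels are synthetic.

```python
import random
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], test_fraction: float, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with per-class proportions kept."""
    rng = random.Random(seed)

    # Collect row indices per class
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Shuffle within each class, then cut each class at the same fraction
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Synthetic labels: three classes with 40 samples each
labels = ["Water"] * 40 + ["Vegetation"] * 40 + ["Urban"] * 40
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

In the example above, the equivalent change would be passing `stratify=y` to `train_test_split`.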

## Dataset Specifications

### Sentinel Dataset Details
- **Source**: Sentinel-2 MSI Level-2A
- **Spatial Resolution**: 10 meters
- **Bands**: B02 (Blue), B03 (Green), B04 (Red), B08 (NIR)
- **Array Size**: 300 × 300 pixels
- **Data Type**: xarray.DataArray
- **Coordinate System**: Standard x, y pixel coordinates
- **Value Range**: Surface reflectance (0-1 typically)

### Spectral Dataset Details
- **Source**: Landsat 8 OLI/TIRS Level-2
- **Samples**: 120 total (40 per land cover class)
- **Classes**: Water, Vegetation, Urban
- **Bands**: SR_B1-B7 (surface reflectance), ST_B10 (thermal)
- **Data Type**: pandas.DataFrame
- **Value Range**: Surface reflectance values and brightness temperature

## Error Handling

## Error Handling

```python
import spyndex.datasets

# Invalid dataset name
try:
    invalid_data = spyndex.datasets.open("nonexistent")
except Exception as e:
    print(f"Error: {e}")
    # Output: Error: nonexistent is not a valid dataset. Please use one of ['sentinel','spectral']

# Valid dataset names only
valid_datasets = ["sentinel", "spectral"]
for dataset in valid_datasets:
    data = spyndex.datasets.open(dataset)
    print(f"Successfully loaded {dataset} dataset")
```
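
Because the loader raises a generic `Exception`, callers who want a narrower error type can validate the name themselves before loading. The following is a minimal sketch under that assumption. `open_checked` and `fake_loader` are hypothetical names introduced here; in real use the loader argument would be `spyndex.datasets.open`.

```python
from typing import Any, Callable

# Dataset names documented above as valid
VALID_DATASETS = ("sentinel", "spectral")


def open_checked(name: str, loader: Callable[[str], Any]) -> Any:
    """Validate `name`, then delegate to `loader` (e.g. spyndex.datasets.open)."""
    if name not in VALID_DATASETS:
        raise ValueError(
            f"{name!r} is not a valid dataset. Use one of {list(VALID_DATASETS)}"
        )
    return loader(name)


# Stand-in loader so the sketch runs without spyndex installed
def fake_loader(name: str) -> str:
    return f"<{name} data>"


print(open_checked("sentinel", fake_loader))

try:
    open_checked("nonexistent", fake_loader)
except ValueError as error:
    print(f"Error: {error}")
```

Raising `ValueError` lets calling code catch a bad name specifically instead of catching every `Exception`.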

The sample datasets provide immediately usable data for testing spectral index computations, developing analysis workflows, and learning about remote sensing applications without requiring external data acquisition or preprocessing.