# Data Format Conversion

Convert ERDDAP data URLs and responses into various Python data analysis formats including pandas DataFrames, xarray Datasets, netCDF4 objects, and iris CubeLists. These functions provide the bridge between ERDDAP's web-based data services and Python's scientific computing ecosystem.

## Capabilities

### Pandas DataFrame Conversion

Convert ERDDAP CSV responses into pandas DataFrames for tabular data analysis.

```python { .api }
def to_pandas(
    url: str,
    requests_kwargs: dict | None = None,
    pandas_kwargs: dict | None = None
) -> pd.DataFrame:
    """
    Convert ERDDAP URL to pandas DataFrame.

    Fetches data from the URL and parses it as CSV using pandas.read_csv.
    Typically used with tabledap URLs that return tabular data.

    Parameters:
    - url: ERDDAP data URL (usually with .csv or .csvp response)
    - requests_kwargs: Arguments passed to HTTP request (auth, timeout, etc.)
    - pandas_kwargs: Arguments passed to pandas.read_csv (parse_dates, dtype, etc.)

    Returns:
    - pandas.DataFrame with the downloaded data

    Raises:
    - ValueError: If URL cannot be read or parsed as CSV
    """
```

**Usage Examples:**

```python
from erddapy.core.interfaces import to_pandas
from erddapy import ERDDAP

# Direct URL conversion
url = "https://gliders.ioos.us/erddap/tabledap/ru29-20150623T1046.csv?time,latitude,longitude,temperature&time>=2015-06-23T10:46:00Z&time<=2015-06-24T10:46:00Z"
df = to_pandas(url)
print(df.head())

# With custom pandas options
df = to_pandas(
    url,
    pandas_kwargs={
        'parse_dates': ['time'],
        'dtype': {'temperature': 'float32'}
    }
)

# Via ERDDAP instance (recommended)
e = ERDDAP(server="NGDAC", protocol="tabledap")
e.dataset_id = "ru29-20150623T1046"
e.constraints = {
    'time>=': '2015-06-23T10:46:00Z',
    'time<=': '2015-06-24T10:46:00Z'
}
df = e.to_pandas()  # Uses to_pandas internally
```
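
ERDDAP's `.csv` response puts column names on the first row and units on the second, which can leave numeric columns typed as strings when the file is read without options. The following is a minimal sketch of options that could be supplied as `pandas_kwargs` to handle this. It runs against a small in-memory sample with made-up values rather than a live server, and it assumes the dictionary reaches `pandas.read_csv` unchanged, as the docstring above describes.

```python
import io

import pandas as pd

# Made-up sample in the layout of an ERDDAP .csv response:
# line 0 holds column names, line 1 holds units.
sample_csv = (
    "time,latitude,longitude,temperature\n"
    "UTC,degrees_north,degrees_east,Celsius\n"
    "2015-06-23T10:46:00Z,39.5,-73.2,18.25\n"
    "2015-06-23T10:47:00Z,39.6,-73.1,18.5\n"
)

# Options that could also be passed to to_pandas as pandas_kwargs.
pandas_kwargs = {
    "skiprows": [1],          # drop the units row (0-indexed line 1)
    "parse_dates": ["time"],  # parse the ISO 8601 timestamps
}

df = pd.read_csv(io.StringIO(sample_csv), **pandas_kwargs)
print(df.dtypes)
```

Requesting the `.csvp` response instead is another option, since it merges names and units into a single header row.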

### xarray Dataset Conversion

Convert ERDDAP responses into xarray Datasets for N-dimensional labeled array analysis.

```python { .api }
def to_xarray(
    url: str,
    response: str = "opendap",
    requests_kwargs: dict | None = None,
    xarray_kwargs: dict | None = None
) -> xr.Dataset:
    """
    Convert ERDDAP URL to xarray Dataset.

    Handles different response formats (NetCDF, OPeNDAP) and opens them
    with xarray. Particularly useful for gridded data from griddap servers.

    Parameters:
    - url: ERDDAP data URL
    - response: Response type ('nc', 'opendap', 'ncCF')
    - requests_kwargs: HTTP request arguments including auth
    - xarray_kwargs: Arguments passed to xarray.open_dataset

    Returns:
    - xarray.Dataset with labeled dimensions and coordinates
    """
```

**Usage Examples:**

```python
from erddapy.core.interfaces import to_xarray
from erddapy import ERDDAP

# Direct conversion from NetCDF URL
# (a regional subset; ranges are given in ascending order to keep the download small)
nc_url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.nc?analysed_sst[(2020-01-01T09:00:00Z)][(40):(50)][(-130):(-120)]"
ds = to_xarray(
    nc_url,
    response="nc",
    requests_kwargs={},
    xarray_kwargs={'decode_times': True}
)
print(ds)

# Via ERDDAP instance for griddap data
e = ERDDAP(server="CSWC", protocol="griddap")
e.dataset_id = "jplMURSST41"
e.constraints = {
    'time': '2020-01-01T09:00:00Z',
    'latitude': slice(40, 50),
    'longitude': slice(-130, -120)
}
ds = e.to_xarray()  # Automatically selects appropriate response format
print(f"Dataset dimensions: {list(ds.dims)}")
print(f"Data variables: {list(ds.data_vars)}")
```
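
The griddap URLs in these examples follow the pattern `variable[(time)][(lat0):(lat1)][(lon0):(lon1)]`. As a rough sketch, that query string could be assembled from plain values with a small helper. `griddap_query` is a hypothetical name, not part of erddapy, and the dimension order (time, latitude, longitude) is an assumption that matches the example dataset only.

```python
def griddap_query(variable, time, lat_range, lon_range):
    """Build a query of the form var[(time)][(lat0):(lat1)][(lon0):(lon1)].

    Hypothetical helper: assumes the dataset's dimensions are ordered
    time, latitude, longitude.
    """
    lat0, lat1 = lat_range
    lon0, lon1 = lon_range
    return f"{variable}[({time})][({lat0}):({lat1})][({lon0}):({lon1})]"


base = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.nc"
query = griddap_query("analysed_sst", "2020-01-01T09:00:00Z", (40, 50), (-130, -120))
url = f"{base}?{query}"
print(url)
```

The resulting string can be passed to `to_xarray` with `response="nc"` in the same way as the hand-written URL.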

### netCDF4 Dataset Conversion

Convert ERDDAP responses into netCDF4 Dataset objects for low-level NetCDF file access.

```python { .api }
def to_ncCF(
    url: str,
    protocol: str | None = None,
    requests_kwargs: dict | None = None
) -> netCDF4.Dataset:
    """
    Convert ERDDAP URL to CF-compliant netCDF4 Dataset.

    Downloads data and opens it as a netCDF4 Dataset object,
    providing direct access to NetCDF attributes and methods.

    Parameters:
    - url: ERDDAP data URL (typically .ncCF response)
    - protocol: 'tabledap' or 'griddap' (affects processing)
    - requests_kwargs: HTTP request arguments

    Returns:
    - netCDF4.Dataset object
    """
```
147
148
**Usage Examples:**
149
150
```python
151
from erddapy.core.interfaces import to_ncCF
152
import netCDF4
153
154
# Convert URL to netCDF4 Dataset
155
url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.ncCF?analysed_sst[(2020-01-01T09:00:00Z)][(40):(50)][(-130):(-120)]"
156
nc_ds = to_ncCF(url, protocol="griddap")
157
158
# Access netCDF4 methods and attributes
159
print("Global attributes:")
160
for attr in nc_ds.ncattrs():
161
print(f" {attr}: {getattr(nc_ds, attr)}")
162
163
print("\nVariables:")
164
for var_name, var in nc_ds.variables.items():
165
print(f" {var_name}: {var.shape} {var.dtype}")
166
167
# Access data
168
sst = nc_ds.variables['analysed_sst'][:]
169
print(f"SST data shape: {sst.shape}")
170
171
# Close when done
172
nc_ds.close()
173
174
# Via ERDDAP instance
175
e = ERDDAP(server="CSWC", protocol="griddap")
176
e.dataset_id = "jplMURSST41"
177
nc_ds = e.to_ncCF()
178
```
179
180
### iris CubeList Conversion
181
182
Convert ERDDAP responses into iris CubeLists for Earth science data analysis with CF conventions.
183
184
```python { .api }
185
def to_iris(
186
url: str,
187
iris_kwargs: dict = None
188
) -> iris.cube.CubeList:
189
"""
190
Convert ERDDAP URL to iris CubeList.
191
192
Downloads NetCDF data and loads it with iris, providing
193
Earth science-specific data structures and analysis tools.
194
195
Parameters:
196
- url: ERDDAP data URL (NetCDF format)
197
- iris_kwargs: Arguments passed to iris.load_raw
198
199
Returns:
200
- iris.cube.CubeList containing loaded cubes
201
"""
202
```
203
204
**Usage Examples:**
205
206
```python
207
from erddapy.core.interfaces import to_iris
208
import iris
209
210
# Convert URL to iris CubeList
211
url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.nc?analysed_sst[(2020-01-01T09:00:00Z)][(40):(50)][(-130):(-120)]"
212
cubes = to_iris(url)
213
214
# Work with iris cubes
215
for cube in cubes:
216
print(f"Cube: {cube.name()}")
217
print(f"Shape: {cube.shape}")
218
print(f"Units: {cube.units}")
219
print(f"Coordinates: {[coord.name() for coord in cube.coords()]}")
220
221
# Access first cube
222
if cubes:
223
sst_cube = cubes[0]
224
225
# iris provides rich metadata access
226
print(f"Standard name: {sst_cube.standard_name}")
227
print(f"Long name: {sst_cube.long_name}")
228
229
# Access coordinate information
230
for coord in sst_cube.coords():
231
print(f"Coordinate {coord.name()}: {coord.units}")
232
233
# Via ERDDAP instance
234
e = ERDDAP(server="CSWC", protocol="griddap")
235
e.dataset_id = "jplMURSST41"
236
cubes = e.to_iris()
237
```
238
239
## Format Selection Guidelines
240
241
Choose the appropriate data format based on your analysis needs:
242
243
### pandas DataFrame
244
- **Best for:** Tabular data, time series, station data
245
- **Use when:** Working with tabledap data, CSV-like datasets
246
- **Advantages:** Familiar API, excellent for data manipulation, filtering, grouping
247
- **Example datasets:** Glider tracks, buoy time series, cruise data
248
249
### xarray Dataset
250
- **Best for:** Multi-dimensional gridded data, labeled arrays
251
- **Use when:** Working with griddap data, satellite imagery, model output
252
- **Advantages:** Labeled dimensions, broadcasting, CF conventions support
253
- **Example datasets:** Satellite SST, ocean models, atmospheric reanalysis
254
255
### netCDF4 Dataset
256
- **Best for:** Low-level NetCDF access, custom attribute handling
257
- **Use when:** Need direct NetCDF file manipulation, specific format requirements
258
- **Advantages:** Complete NetCDF API access, metadata control
259
- **Example datasets:** Any NetCDF data requiring specialized processing
260
261
### iris CubeList
262
- **Best for:** Earth science analysis with CF conventions
263
- **Use when:** Working with meteorological/oceanographic data, need CF-aware processing
264
- **Advantages:** CF conventions, coordinate systems, Earth science specific tools
265
- **Example datasets:** Weather models, climate data, ocean reanalysis
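
The guidelines above can be condensed into a small decision helper. This is only an illustrative sketch: `suggest_converter` is a hypothetical function, not part of erddapy, and it encodes just two of the criteria listed (the protocol and whether CF-aware tooling is wanted).

```python
def suggest_converter(protocol: str, cf_aware: bool = False) -> str:
    """Return the name of a conversion function suited to the use case.

    Hypothetical helper that mirrors the selection guidelines:
    tabular (tabledap) data maps to pandas, gridded (griddap) data
    maps to xarray, and a request for CF-aware tooling maps to iris.
    """
    if protocol not in {"tabledap", "griddap"}:
        raise ValueError(f"Unknown protocol: {protocol!r}")
    if cf_aware:
        return "to_iris"
    if protocol == "tabledap":
        return "to_pandas"
    return "to_xarray"


print(suggest_converter("tabledap"))
print(suggest_converter("griddap"))
print(suggest_converter("griddap", cf_aware=True))
```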
266
267
## HTTP Request Configuration
268
269
All conversion functions support HTTP request customization:
270
271
```python
272
from erddapy.core.interfaces import to_pandas
273
274
# Authentication
275
requests_config = {
276
'auth': ('username', 'password'),
277
'timeout': 60,
278
'headers': {'User-Agent': 'MyApp/1.0'}
279
}
280
281
df = to_pandas(url, requests_kwargs=requests_config)
282
283
# SSL configuration
284
requests_config = {
285
'verify': False, # Skip SSL verification (not recommended)
286
'cert': ('client.cert', 'client.key') # Client certificates
287
}
288
```
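
When the same settings are reused across many calls, one possible pattern is to keep shared defaults in one place and override them per call. The sketch below uses hypothetical names (`DEFAULT_REQUESTS_KWARGS`, `make_requests_kwargs`) that are not part of erddapy; the resulting dictionary is what would be passed as `requests_kwargs`.

```python
# Hypothetical project-wide defaults for HTTP requests.
DEFAULT_REQUESTS_KWARGS = {"timeout": 60}


def make_requests_kwargs(**overrides):
    """Return a new dict of the defaults updated with per-call overrides."""
    merged = dict(DEFAULT_REQUESTS_KWARGS)  # copy so the defaults stay untouched
    merged.update(overrides)
    return merged


authed = make_requests_kwargs(auth=("username", "password"))
slow = make_requests_kwargs(timeout=300)
print(authed)
print(slow)
```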
289
290
## Supported Download Formats
291
292
ERDDAP supports numerous output formats for data download. The complete list of available formats:
293
294
```python { .api }
295
download_formats = [
296
"asc", "csv", "csvp", "csv0", "dataTable", "das", "dds", "dods",
297
"esriCsv", "fgdc", "geoJson", "graph", "help", "html", "iso19115",
298
"itx", "json", "jsonlCSV1", "jsonlCSV", "jsonlKVP", "mat", "nc",
299
"ncHeader", "ncCF", "ncCFHeader", "ncCFMA", "ncCFMAHeader", "nccsv",
300
"nccsvMetadata", "ncoJson", "odvTxt", "subset", "tsv", "tsvp", "tsv0",
301
"wav", "xhtml", "kml", "smallPdf", "pdf", "largePdf", "smallPng",
302
"png", "largePng", "transparentPng"
303
]
304
```
305
306
These formats can be used with the `response` parameter in URL building functions and the `file_type` parameter in the `download_file` method.
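
In an ERDDAP data URL the format appears as the file extension after the dataset ID. A minimal sketch of checking a requested format before building such a URL might look like the following. `data_url` and `KNOWN_FORMATS` are hypothetical names used for illustration, and the set holds only a handful of entries from the list above.

```python
# A few entries from the documented format list (not the full set).
KNOWN_FORMATS = {"csv", "csvp", "json", "nc", "ncCF"}


def data_url(server: str, protocol: str, dataset_id: str, response: str) -> str:
    """Build <server>/<protocol>/<dataset_id>.<response>, rejecting unknown formats."""
    if response not in KNOWN_FORMATS:
        raise ValueError(f"Unsupported response format: {response!r}")
    return f"{server.rstrip('/')}/{protocol}/{dataset_id}.{response}"


url = data_url("https://gliders.ioos.us/erddap/", "tabledap", "ru29-20150623T1046", "csv")
print(url)
```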
307
308
## Error Handling
309
310
The conversion functions provide informative error messages:
311
312
```python
313
from erddapy.core.interfaces import to_pandas
314
315
try:
316
df = to_pandas("https://invalid-url.com/data.csv")
317
except ValueError as e:
318
print(f"Conversion failed: {e}")
319
320
# Handle different error types
321
try:
322
df = to_pandas(
323
"https://coastwatch.pfeg.noaa.gov/erddap/tabledap/nonexistent.csv",
324
requests_kwargs={'timeout': 10}
325
)
326
except Exception as e:
327
print(f"Request failed: {type(e).__name__}: {e}")
328
```
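
Because failures surface as exceptions, transient problems can be handled by retrying the call. The following is a sketch of one way to do that, not a feature of erddapy: `fetch_with_retries` is a hypothetical wrapper, and a stand-in function replaces `to_pandas` so the example runs without network access. It retries only on `ValueError`, the exception type documented for `to_pandas` above.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, delay=0.0):
    """Call fetch(url), retrying on ValueError up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except ValueError as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)  # wait before the next try
    raise last_error


# Stand-in for to_pandas that fails twice, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise ValueError("temporary failure")
    return "ok"


result = fetch_with_retries(flaky_fetch, "https://example.com/data.csv")
print(result, len(calls))
```

In real use, `fetch` could be `to_pandas` itself or a small lambda that also supplies `requests_kwargs`.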