# CDF File Writing

Complete API for creating and writing CDF files with support for global attributes, variable definitions, data writing, and file-level compression. The CDF writer provides fine-grained control over file structure and metadata.

## Capabilities

### CDF File Creation

Initialize a new CDF file with optional specifications for encoding, compression, and structure.

```python { .api }
class CDF:
    def __init__(self, path, cdf_spec=None, delete=False):
        """
        Create a new CDF file for writing.

        Parameters:
        - path (str | Path): Output file path (with or without .cdf extension)
        - cdf_spec (dict, optional): CDF file specifications with keys:
            - 'Majority': 'row_major' or 'column_major' (default: 'column_major')
            - 'Encoding': Data encoding scheme (string or numeric, default: 'host')
            - 'Checksum': Enable data validation (bool, default: False)
            - 'rDim_sizes': Dimensional sizes for rVariables (list)
            - 'Compressed': File-level compression (0-9 or True/False, default: 0/False)
        - delete (bool): Whether to delete existing file if it exists (default: False)

        Returns:
        CDF writer instance
        """
```

**Usage Examples:**

```python
import cdflib

# Simple file creation
cdf = cdflib.cdfwrite.CDF('output.cdf')
cdf.close()

# File with specifications (delete=True replaces the file created above)
spec = {
    'Majority': 'row_major',
    'Encoding': 'host',
    'Checksum': True,
    'Compressed': 5  # Compression level 5
}
cdf = cdflib.cdfwrite.CDF('output.cdf', cdf_spec=spec, delete=True)
cdf.close()

# File with dimension sizes for rVariables
spec_with_dims = {
    'rDim_sizes': [10, 20]  # Two dimensions, of sizes 10 and 20
}
cdf = cdflib.cdfwrite.CDF('output.cdf', cdf_spec=spec_with_dims, delete=True)
```
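
The documented `cdf_spec` keys and value ranges can be checked before a file is created. The sketch below is a hypothetical helper (`validate_cdf_spec` is not part of cdflib) and encodes only the constraints listed in the parameter description above; what the library itself accepts or rejects may differ.

```python
# Hypothetical pre-flight check; not part of cdflib.
ALLOWED_KEYS = {'Majority', 'Encoding', 'Checksum', 'rDim_sizes', 'Compressed'}


def validate_cdf_spec(spec):
    """Return spec unchanged, or raise ValueError if it breaks a documented constraint."""
    unknown = set(spec) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown cdf_spec keys: {sorted(unknown)}")

    majority = spec.get('Majority', 'column_major')
    if majority not in ('row_major', 'column_major'):
        raise ValueError(f"Invalid Majority: {majority!r}")

    compressed = spec.get('Compressed', 0)
    # True/False is accepted as-is; any other value must be a level from 0 to 9.
    if not isinstance(compressed, bool) and compressed not in range(10):
        raise ValueError(f"Compressed must be a bool or 0-9, got {compressed!r}")

    for size in spec.get('rDim_sizes', []):
        if not isinstance(size, int) or size <= 0:
            raise ValueError(f"rDim_sizes entries must be positive ints, got {size!r}")

    return spec


spec = validate_cdf_spec({'Majority': 'row_major', 'Checksum': True, 'Compressed': 5})
```

Validating up front keeps a malformed specification from surfacing later as a less specific error during file creation.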

### File Closure

Properly close the CDF file and flush all data to disk.

```python { .api }
def close(self):
    """
    Close the CDF file and flush all pending data to disk.

    Raises:
    OSError: If the file is already closed
    """
```

**Usage Example:**

```python
import cdflib

cdf = cdflib.cdfwrite.CDF('output.cdf')
# ... write data ...
cdf.close()

# Or use a context manager (recommended)
with cdflib.cdfwrite.CDF('output.cdf', delete=True) as cdf:
    # ... write data ...
    pass  # File is closed automatically on exit
```

### Global Attributes Writing

Write global attributes that apply to the entire CDF file.

```python { .api }
def write_globalattrs(self, globalAttrs):
    """
    Write global attributes to the CDF file.

    Parameters:
    - globalAttrs (dict): Dictionary mapping attribute names to values.
                          Values can be strings, numbers, or numpy arrays.
                          For multiple entries, use lists of values.
    """
```

**Usage Examples:**

```python
import cdflib

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write global attributes
global_attrs = {
    'TITLE': 'My Scientific Dataset',
    'VERSION': '1.0',
    'CREATED': '2023-01-01T00:00:00Z',
    'AUTHOR': 'Research Team',
    'INSTITUTION': ['University A', 'Institute B'],  # Multiple entries
    'PI_NAME': 'Dr. Smith',
    'PROJECT': 'Space Weather Study'
}
cdf.write_globalattrs(global_attrs)
cdf.close()
```

### Variable Attributes Writing

Write attributes that apply to specific variables.

```python { .api }
def write_variableattrs(self, variableAttrs):
    """
    Write variable attributes to the CDF file.

    Parameters:
    - variableAttrs (dict): Dictionary mapping variable names to their attributes.
                            Each variable's attributes are a dict mapping attribute
                            names to values.
    """
```

**Usage Examples:**

```python
import cdflib

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write variable attributes.
# This assumes 'temperature' and 'pressure' have already been
# written to the file with write_var (not shown here).
var_attrs = {
    'temperature': {
        'UNITS': 'Kelvin',
        'FILLVAL': -999.0,
        'VALIDMIN': 200.0,
        'VALIDMAX': 400.0,
        'CATDESC': 'Atmospheric temperature measurements',
        'FIELDNAM': 'Temperature'
    },
    'pressure': {
        'UNITS': 'hPa',
        'FILLVAL': -1.0,
        'VALIDMIN': 0.0,
        'VALIDMAX': 1100.0,
        'CATDESC': 'Atmospheric pressure measurements',
        'FIELDNAM': 'Pressure'
    }
}
cdf.write_variableattrs(var_attrs)
```
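
A quick consistency check on each variable's attributes can catch swapped limits before they are written. The helper below is a hypothetical sketch (`check_var_attrs` is not a cdflib function). It assumes the convention followed by the example above, where `FILLVAL` lies outside the `VALIDMIN` to `VALIDMAX` range.

```python
def check_var_attrs(var_attrs):
    """Return a list of human-readable problems found in a variableAttrs dict."""
    problems = []
    for name, attrs in var_attrs.items():
        vmin = attrs.get('VALIDMIN')
        vmax = attrs.get('VALIDMAX')
        fill = attrs.get('FILLVAL')
        if vmin is None or vmax is None:
            continue  # Nothing to compare for this variable
        if vmin > vmax:
            problems.append(f"{name}: VALIDMIN exceeds VALIDMAX")
        elif fill is not None and vmin <= fill <= vmax:
            problems.append(f"{name}: FILLVAL lies inside the valid range")
    return problems


attrs = {
    'temperature': {'FILLVAL': -999.0, 'VALIDMIN': 200.0, 'VALIDMAX': 400.0},
    'pressure': {'FILLVAL': -1.0, 'VALIDMIN': 1100.0, 'VALIDMAX': 0.0},  # limits swapped
}
for problem in check_var_attrs(attrs):
    print(problem)
```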

### Variable Data Writing

Define variables and write their data with comprehensive specification options.

```python { .api }
def write_var(self, var_spec, var_attrs=None, var_data=None):
    """
    Write a variable to the CDF file.

    Parameters:
    - var_spec (dict): Variable specification with required keys:
        - 'Variable': Variable name (str)
        - 'Data_Type': CDF data type constant (int)
        - 'Num_Elements': Number of elements per value (int)
        - 'Dims': List of dimension sizes (empty for scalar)
      Optional keys:
        - 'Rec_Vary': Record variance (bool, default: True)
        - 'Dim_Vary': Dimension variance list (default: all True)
        - 'Compress': Compression type (int, default: no compression)
        - 'Block_Factor': Blocking factor for performance (int)
        - 'Sparse': Sparseness type (default: no sparseness)
        - 'Pad': Pad value for missing data

    - var_attrs (dict, optional): Variable attributes to write
    - var_data (array-like, optional): Variable data to write
    """
```

**Usage Examples:**

```python
import cdflib
import numpy as np
from cdflib import cdfepoch

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write scalar variable
scalar_spec = {
    'Variable': 'station_id',
    'Data_Type': cdf.CDF_INT4,
    'Num_Elements': 1,
    'Dims': []
}
cdf.write_var(scalar_spec, var_data=np.array([12345]))

# Write 1D time series variable
timeseries_spec = {
    'Variable': 'temperature',
    'Data_Type': cdf.CDF_REAL4,
    'Num_Elements': 1,
    'Dims': [],
    'Rec_Vary': True,
    'Compress': 5  # GZIP compression level 5
}
temp_data = np.array([20.5, 21.0, 19.8, 22.1, 20.9])
cdf.write_var(timeseries_spec, var_data=temp_data)

# Write 2D spatial grid variable
grid_spec = {
    'Variable': 'wind_speed',
    'Data_Type': cdf.CDF_REAL4,
    'Num_Elements': 1,
    'Dims': [100, 200],  # 100x200 spatial grid
    'Rec_Vary': True,
    'Dim_Vary': [True, True]
}
wind_data = np.random.rand(50, 100, 200)  # 50 time records
cdf.write_var(grid_spec, var_data=wind_data)

# Write string variable
string_spec = {
    'Variable': 'station_name',
    'Data_Type': cdf.CDF_CHAR,
    'Num_Elements': 21,  # String length: 'Weather Station Alpha' has 21 characters
    'Dims': []
}
cdf.write_var(string_spec, var_data=['Weather Station Alpha'])

# Write epoch variable (time)
epoch_spec = {
    'Variable': 'Epoch',
    'Data_Type': cdf.CDF_EPOCH,
    'Num_Elements': 1,
    'Dims': []
}
# Create hourly epochs, one per temperature sample
epochs = [cdfepoch.compute_epoch([2023, 1, 1, i, 0, 0, 0]) for i in range(5)]
cdf.write_var(epoch_spec, var_data=np.array(epochs))

cdf.close()
```
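
For `CDF_CHAR` variables, `Num_Elements` is the string length, so it is easy to get wrong when typed by hand. A minimal sketch that derives it from the data instead; `string_num_elements` is a hypothetical helper, not a cdflib function, and it assumes ASCII strings (one byte per character).

```python
def string_num_elements(values):
    """Length of the longest string in values, for use as 'Num_Elements'."""
    return max(len(value) for value in values)


names = ['Weather Station Alpha']
string_spec = {
    'Variable': 'station_name',
    'Data_Type': 51,  # CDF_CHAR, per the constants listed in this document
    'Num_Elements': string_num_elements(names),
    'Dims': []
}
print(string_spec['Num_Elements'])  # 21
```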

### Combined Variable Writing

Write variable specification, attributes, and data in a single operation.

```python
import cdflib
import numpy as np

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Define variable with spec, attributes, and data together
var_spec = {
    'Variable': 'magnetic_field',
    'Data_Type': cdf.CDF_REAL8,
    'Num_Elements': 1,
    'Dims': [3],  # 3-component vector
    'Rec_Vary': True,
    'Compress': 9
}

var_attrs = {
    'UNITS': 'nanoTesla',
    'FILLVAL': -1e31,
    'CATDESC': '3-component magnetic field vector',
    'DEPEND_0': 'Epoch',
    'LABL_PTR_1': 'B_field_labels'
}

# 100 records of 3-component vectors (synthetic values, in nT)
mag_data = np.random.rand(100, 3) * 50000

cdf.write_var(var_spec, var_attrs=var_attrs, var_data=mag_data)

# Write corresponding labels
label_spec = {
    'Variable': 'B_field_labels',
    'Data_Type': cdf.CDF_CHAR,
    'Num_Elements': 10,
    'Dims': [3],
    'Rec_Vary': False
}
labels = ['Bx', 'By', 'Bz']
cdf.write_var(label_spec, var_data=labels)

cdf.close()
```
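
In the examples in this document, data for a record-varying variable has shape `(records, *Dims)`, for instance `(100, 3)` for `'Dims': [3]`. The sketch below checks that layout before a write. `records_in` is a hypothetical helper, and it does not claim to cover every layout cdflib might accept.

```python
def records_in(data_shape, dims, rec_vary=True):
    """Return the record count implied by data_shape, or raise ValueError."""
    shape = tuple(data_shape)
    dims = tuple(dims)
    if not rec_vary:
        # A non-record-varying variable holds a single value of shape Dims.
        if shape != dims:
            raise ValueError(f"Expected shape {dims}, got {shape}")
        return 1
    # A record-varying variable has one leading axis for the records.
    if len(shape) != len(dims) + 1 or shape[1:] != dims:
        raise ValueError(f"Expected shape (records, *{dims}), got {shape}")
    return shape[0]


print(records_in((100, 3), [3]))                # 100
print(records_in((3,), [3], rec_vary=False))    # 1
```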

## CDF Data Types

Constants for specifying variable data types in CDF files.

```python { .api }
# Integer types
CDF_INT1 = 1    # 1-byte signed integer
CDF_INT2 = 2    # 2-byte signed integer
CDF_INT4 = 4    # 4-byte signed integer
CDF_INT8 = 8    # 8-byte signed integer

# Unsigned integer types
CDF_UINT1 = 11  # 1-byte unsigned integer
CDF_UINT2 = 12  # 2-byte unsigned integer
CDF_UINT4 = 14  # 4-byte unsigned integer

# Floating point types
CDF_REAL4 = 21   # 4-byte IEEE floating point
CDF_REAL8 = 22   # 8-byte IEEE floating point
CDF_FLOAT = 44   # 4-byte IEEE floating point (alias)
CDF_DOUBLE = 45  # 8-byte IEEE floating point (alias)

# Time epoch types
CDF_EPOCH = 31        # CDF_EPOCH (8-byte float, milliseconds since Year 0)
CDF_EPOCH16 = 32      # CDF_EPOCH16 (16-byte, picoseconds since Year 0)
CDF_TIME_TT2000 = 33  # TT2000 (8-byte int, nanoseconds since J2000)

# Character types
CDF_CHAR = 51   # 1-byte signed character
CDF_UCHAR = 52  # 1-byte unsigned character

# Legacy aliases
CDF_BYTE = 41   # 1-byte signed integer (same as CDF_INT1)
```
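
The byte widths in the comments above are enough to estimate how much space one value of a variable occupies. The lookup below is a sketch built only from that table; `CDF_TYPE_SIZES` and `value_size` are hypothetical names, not part of cdflib.

```python
# Bytes per element, keyed by the data type codes listed above.
CDF_TYPE_SIZES = {
    1: 1, 2: 2, 4: 4, 8: 8,        # CDF_INT1, CDF_INT2, CDF_INT4, CDF_INT8
    11: 1, 12: 2, 14: 4,           # CDF_UINT1, CDF_UINT2, CDF_UINT4
    21: 4, 22: 8, 44: 4, 45: 8,    # CDF_REAL4, CDF_REAL8, CDF_FLOAT, CDF_DOUBLE
    31: 8, 32: 16, 33: 8,          # CDF_EPOCH, CDF_EPOCH16, CDF_TIME_TT2000
    51: 1, 52: 1, 41: 1,           # CDF_CHAR, CDF_UCHAR, CDF_BYTE
}


def value_size(data_type, num_elements=1):
    """Bytes needed for one value of the given type and element count."""
    try:
        return CDF_TYPE_SIZES[data_type] * num_elements
    except KeyError:
        raise ValueError(f"Unknown CDF data type code: {data_type}") from None


print(value_size(22))      # 8  (one CDF_REAL8)
print(value_size(51, 21))  # 21 (a 21-character CDF_CHAR string)
```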

## Encoding Constants

Platform-specific data encoding options for cross-platform compatibility.

```python { .api }
NETWORK_ENCODING = 1     # Network byte order (big-endian)
SUN_ENCODING = 2         # Sun/SPARC encoding
VAX_ENCODING = 3         # VAX encoding (little-endian)
DECSTATION_ENCODING = 4  # DECstation encoding
SGi_ENCODING = 5         # Silicon Graphics encoding
IBMPC_ENCODING = 6       # IBM PC encoding (little-endian)
```
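
When a numeric encoding is needed instead of `'host'`, one option is to pick a constant that matches the machine's byte order. The sketch below is an illustration only: `encoding_for_byteorder` is a hypothetical helper, the two constants are copied from the list above, and whether cdflib resolves `'host'` the same way is an assumption this document does not confirm.

```python
import sys

# Values copied from the encoding constants listed above.
NETWORK_ENCODING = 1  # big-endian
IBMPC_ENCODING = 6    # little-endian


def encoding_for_byteorder(byteorder=sys.byteorder):
    """Map a byte order ('big' or 'little') to an encoding constant."""
    if byteorder == 'big':
        return NETWORK_ENCODING
    if byteorder == 'little':
        return IBMPC_ENCODING
    raise ValueError(f"Unknown byte order: {byteorder!r}")


spec = {'Encoding': encoding_for_byteorder()}
```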
343
344
## Complete Example: Scientific Dataset
345
346
```python
347
import cdflib
348
import numpy as np
349
import cdflib.cdfepoch as cdfepoch
350
351
# Create CDF file with specifications
352
spec = {
353
'Majority': 'row_major',
354
'Encoding': 'host',
355
'Checksum': True,
356
'Compressed': 6
357
}
358
359
with cdflib.cdfwrite.CDF('scientific_data.cdf', cdf_spec=spec) as cdf:
360
361
# Write global attributes
362
global_attrs = {
363
'TITLE': 'Atmospheric Measurements',
364
'PROJECT': 'Climate Study 2023',
365
'DISCIPLINE': 'Space Physics>Magnetospheric Science',
366
'DATA_TYPE': 'survey>magnetic field',
367
'DESCRIPTOR': 'MAG>Magnetic Field',
368
'INSTRUMENT_TYPE': 'Magnetometer',
369
'MISSION_GROUP': 'Research Mission',
370
'PI_NAME': 'Dr. Jane Smith',
371
'PI_AFFILIATION': 'Space Research Institute',
372
'TEXT': 'High-resolution atmospheric measurements from ground station network'
373
}
374
cdf.write_globalattrs(global_attrs)
375
376
# Create time axis (100 measurements over 1 hour)
377
start_time = [2023, 6, 15, 12, 0, 0, 0]
378
epochs = [cdfepoch.compute_epoch([2023, 6, 15, 12, 0, i*36, 0]) for i in range(100)]
379
380
# Write Epoch variable
381
epoch_spec = {
382
'Variable': 'Epoch',
383
'Data_Type': cdf.CDF_EPOCH,
384
'Num_Elements': 1,
385
'Dims': []
386
}
387
epoch_attrs = {
388
'UNITS': 'ms',
389
'TIME_BASE': 'J2000',
390
'CATDESC': 'Default time',
391
'FIELDNAM': 'Time since Jan 1, 0000',
392
'FILLVAL': -1e31,
393
'VALIDMIN': '01-Jan-1990 00:00:00.000',
394
'VALIDMAX': '31-Dec-2029 23:59:59.999'
395
}
396
cdf.write_var(epoch_spec, var_attrs=epoch_attrs, var_data=np.array(epochs))
397
398
# Write temperature data
399
temp_spec = {
400
'Variable': 'Temperature',
401
'Data_Type': cdf.CDF_REAL4,
402
'Num_Elements': 1,
403
'Dims': [],
404
'Compress': 9
405
}
406
temp_attrs = {
407
'UNITS': 'K',
408
'CATDESC': 'Atmospheric temperature',
409
'DEPEND_0': 'Epoch',
410
'FIELDNAM': 'Temperature',
411
'FILLVAL': -999.0,
412
'VALIDMIN': 200.0,
413
'VALIDMAX': 400.0,
414
'SCALEMIN': 250.0,
415
'SCALEMAX': 350.0
416
}
417
temp_data = 290 + 10 * np.sin(np.linspace(0, 4*np.pi, 100)) + np.random.normal(0, 2, 100)
418
cdf.write_var(temp_spec, var_attrs=temp_attrs, var_data=temp_data)
419
420
# Write 3D magnetic field vector
421
mag_spec = {
422
'Variable': 'B_field',
423
'Data_Type': cdf.CDF_REAL8,
424
'Num_Elements': 1,
425
'Dims': [3],
426
'Rec_Vary': True,
427
'Dim_Vary': [True]
428
}
429
mag_attrs = {
430
'UNITS': 'nT',
431
'CATDESC': 'Magnetic field vector in GSM coordinates',
432
'DEPEND_0': 'Epoch',
433
'DEPEND_1': 'B_field_labels',
434
'FIELDNAM': 'Magnetic Field',
435
'FILLVAL': -1e31,
436
'VALIDMIN': -100000.0,
437
'VALIDMAX': 100000.0
438
}
439
# Generate synthetic magnetic field data
440
mag_data = np.column_stack([
441
25000 + 5000 * np.sin(np.linspace(0, 2*np.pi, 100)), # Bx
442
15000 + 3000 * np.cos(np.linspace(0, 2*np.pi, 100)), # By
443
-5000 + 1000 * np.random.normal(0, 1, 100) # Bz
444
])
445
cdf.write_var(mag_spec, var_attrs=mag_attrs, var_data=mag_data)
446
447
# Write coordinate labels
448
label_spec = {
449
'Variable': 'B_field_labels',
450
'Data_Type': cdf.CDF_CHAR,
451
'Num_Elements': 2,
452
'Dims': [3],
453
'Rec_Vary': False
454
}
455
label_attrs = {
456
'CATDESC': 'Magnetic field component labels',
457
'FIELDNAM': 'Component labels'
458
}
459
cdf.write_var(label_spec, var_attrs=label_attrs, var_data=['Bx', 'By', 'Bz'])
460
461
print("Scientific dataset created successfully!")
462
```
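
When building a time axis, it is safest to keep every field of the component list `[year, month, day, hour, minute, second, millisecond]` within its calendar range, so an offset in seconds has to be carried into minutes and hours. A minimal sketch using only the standard library; `epoch_components` is a hypothetical helper, and its output is meant to be passed row by row to `cdfepoch.compute_epoch` as in the example above.

```python
from datetime import datetime, timedelta


def epoch_components(start, step_seconds, count):
    """Return count lists [Y, M, D, h, m, s, ms], spaced step_seconds apart."""
    rows = []
    for i in range(count):
        t = start + timedelta(seconds=i * step_seconds)
        rows.append([t.year, t.month, t.day, t.hour, t.minute, t.second,
                     t.microsecond // 1000])
    return rows


rows = epoch_components(datetime(2023, 6, 15, 12, 0, 0), 36, 100)
print(rows[0])   # [2023, 6, 15, 12, 0, 0, 0]
print(rows[-1])  # [2023, 6, 15, 12, 59, 24, 0]
```

Letting `datetime` do the carrying also handles steps that cross an hour, day, or month boundary.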

## Error Handling

The CDF writer raises exceptions for various error conditions:

- **OSError**: When trying to write to a closed file
- **ValueError**: For invalid data types, dimensions, or specifications
- **TypeError**: For incompatible data types or malformed specifications
- **MemoryError**: When there is insufficient memory for a large dataset

**Example Error Handling:**

```python
import cdflib

cdf = cdflib.cdfwrite.CDF('output.cdf')

try:
    # Expected to raise ValueError: 999 is not a CDF data type
    bad_spec = {
        'Variable': 'test',
        'Data_Type': 999,  # Invalid data type
        'Num_Elements': 1,
        'Dims': []
    }
    cdf.write_var(bad_spec)
except ValueError as e:
    print(f"Invalid specification: {e}")

# A valid specification, used below to show a write after close
good_spec = {
    'Variable': 'test',
    'Data_Type': cdf.CDF_INT4,
    'Num_Elements': 1,
    'Dims': []
}

try:
    cdf.close()
    # Expected to raise OSError: the file is already closed
    cdf.write_var(good_spec)
except OSError as e:
    print(f"File operation error: {e}")
```