
tessl/pypi-cdflib

A Python CDF reader toolkit for reading and writing CDF files without requiring NASA CDF library installation


CDF File Writing

Complete API for creating and writing CDF files with support for global attributes, variable definitions, data writing, and file-level compression. The CDF writer provides fine-grained control over file structure and metadata.

Capabilities

CDF File Creation

Initialize a new CDF file with optional specifications for encoding, compression, and structure.

class CDF:
    def __init__(self, path, cdf_spec=None, delete=False):
        """
        Create a new CDF file for writing.
        
        Parameters:
        - path (str | Path): Output file path (with or without .cdf extension)
        - cdf_spec (dict, optional): CDF file specifications with keys:
            - 'Majority': 'row_major' or 'column_major' (default: 'column_major')
            - 'Encoding': Data encoding scheme (string or numeric, default: 'host')
            - 'Checksum': Enable data validation (bool, default: False)
            - 'rDim_sizes': Dimensional sizes for rVariables (list)
            - 'Compressed': File-level compression (0-9 or True/False, default: 0/False)
        - delete (bool): Whether to delete existing file if it exists (default: False)
        
        Returns:
        CDF writer instance
        """

Usage Examples:

import cdflib

# Simple file creation
cdf = cdflib.cdfwrite.CDF('output.cdf')

# File with specifications
spec = {
    'Majority': 'row_major',
    'Encoding': 'host',
    'Checksum': True,
    'Compressed': 5  # Compression level 5
}
cdf = cdflib.cdfwrite.CDF('output.cdf', cdf_spec=spec)

# File with record dimensions for rVariables
spec_with_dims = {
    'rDim_sizes': [10, 20]  # Two rVariable dimensions, of sizes 10 and 20
}
cdf = cdflib.cdfwrite.CDF('output.cdf', cdf_spec=spec_with_dims)

File Closure

Properly close the CDF file and flush all data to disk.

def close(self):
    """
    Close the CDF file and flush all pending data to disk.
    
    Raises:
    OSError: If the file is already closed
    """

Usage Example:

cdf = cdflib.cdfwrite.CDF('output.cdf')
# ... write data ...
cdf.close()

# Or use context manager (recommended)
with cdflib.cdfwrite.CDF('output.cdf') as cdf:
    # ... write data ...
    pass  # File automatically closed

Global Attributes Writing

Write global attributes that apply to the entire CDF file.

def write_globalattrs(self, globalAttrs):
    """
    Write global attributes to the CDF file.
    
    Parameters:
    - globalAttrs (dict): Dictionary mapping attribute names to values.
                         Values can be strings, numbers, or numpy arrays.
                         For multiple entries, use lists of values.
    """

Usage Examples:

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write global attributes
global_attrs = {
    'TITLE': 'My Scientific Dataset',
    'VERSION': '1.0',
    'CREATED': '2023-01-01T00:00:00Z',
    'AUTHOR': 'Research Team',
    'INSTITUTION': ['University A', 'Institute B'],  # Multiple entries
    'PI_NAME': 'Dr. Smith',
    'PROJECT': 'Space Weather Study'
}
cdf.write_globalattrs(global_attrs)
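
The upstream cdflib docstring describes each global attribute value as a dictionary keyed by entry number. If the installed version does not accept bare values or lists, the mapping can be normalized first. The following is a minimal sketch under that assumption; `to_entry_dict` is a hypothetical helper, not part of cdflib.

```python
def to_entry_dict(global_attrs):
    """Convert {name: value | list} into {name: {entry_number: value}}.

    Hypothetical helper: assumes write_globalattrs accepts the
    entry-number form described in the cdflib docstring.
    """
    normalized = {}
    for name, value in global_attrs.items():
        if isinstance(value, dict):
            # Already keyed by entry number; copy it unchanged.
            normalized[name] = dict(value)
        elif isinstance(value, (list, tuple)):
            # One entry per list item, numbered from 0.
            normalized[name] = dict(enumerate(value))
        else:
            # A single value becomes entry 0.
            normalized[name] = {0: value}
    return normalized


attrs = to_entry_dict({
    'TITLE': 'My Scientific Dataset',
    'INSTITUTION': ['University A', 'Institute B'],
})
print(attrs)
```

The result could then be passed as `cdf.write_globalattrs(attrs)`.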

Variable Attributes Writing

Write attributes that apply to specific variables.

def write_variableattrs(self, variableAttrs):
    """
    Write variable attributes to the CDF file.
    
    Parameters:
    - variableAttrs (dict): Dictionary mapping variable names to their attributes.
                           Each variable's attributes are a dict mapping attribute 
                           names to values.
    """

Usage Examples:

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write variable attributes
var_attrs = {
    'temperature': {
        'UNITS': 'Kelvin',
        'FILLVAL': -999.0,
        'VALIDMIN': 200.0,
        'VALIDMAX': 400.0,
        'CATDESC': 'Atmospheric temperature measurements',
        'FIELDNAM': 'Temperature'
    },
    'pressure': {
        'UNITS': 'hPa',
        'FILLVAL': -1.0,
        'VALIDMIN': 0.0,
        'VALIDMAX': 1100.0,
        'CATDESC': 'Atmospheric pressure measurements',
        'FIELDNAM': 'Pressure'
    }
}
cdf.write_variableattrs(var_attrs)
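
The upstream cdflib docstring for `write_variableattrs` appears to describe the outer keys as attribute names, with each value mapping variable names to entries. That is the opposite nesting from the dictionary above. If the installed version expects that layout, a per-variable dictionary can be pivoted first. This is a sketch under that assumption; `pivot_variable_attrs` is a hypothetical helper, not part of cdflib.

```python
def pivot_variable_attrs(attrs_by_variable):
    """Turn {variable: {attribute: value}} into {attribute: {variable: value}}."""
    pivoted = {}
    for var_name, attrs in attrs_by_variable.items():
        for attr_name, value in attrs.items():
            # Group every variable's entry under the shared attribute name.
            pivoted.setdefault(attr_name, {})[var_name] = value
    return pivoted


by_variable = {
    'temperature': {'UNITS': 'Kelvin', 'FILLVAL': -999.0},
    'pressure': {'UNITS': 'hPa', 'FILLVAL': -1.0},
}
print(pivot_variable_attrs(by_variable))
```

Check the docstring of the installed version before choosing a layout. Attributes passed through `write_var(..., var_attrs=...)` are a flat attribute-to-value dictionary for one variable, so they need no pivoting.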

Variable Data Writing

Define variables and write their data with comprehensive specification options.

def write_var(self, var_spec, var_attrs=None, var_data=None):
    """
    Write a variable to the CDF file.
    
    Parameters:
    - var_spec (dict): Variable specification with required keys:
        - 'Variable': Variable name (str)
        - 'Data_Type': CDF data type constant (int)
        - 'Num_Elements': Number of elements per value (int): 1 for numeric
          types, the string length for CDF_CHAR/CDF_UCHAR
        - 'Rec_Vary': Record variance (bool)
        - 'Dim_Sizes': List of dimension sizes for zVariables (empty for scalar)
        Optional keys:
        - 'Var_Type': 'zVariable' (default) or 'rVariable'
        - 'Dim_Vary': Dimension variance list (rVariables only; every
          dimension of a zVariable varies)
        - 'Compress': GZIP compression level 0-9 (int, 0 disables compression)
        - 'Block_Factor': Blocking factor for performance (int)
        - 'Sparse': Sparseness type: 'no_sparse' (default), 'pad_sparse',
          or 'prev_sparse'
        - 'Pad': Pad value for missing data
        
    - var_attrs (dict, optional): Variable attributes to write
    - var_data (array-like, optional): Variable data to write
    """

Usage Examples:

import cdflib
import numpy as np

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Write scalar variable
scalar_spec = {
    'Variable': 'station_id',
    'Data_Type': cdf.CDF_INT4,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': []
}
cdf.write_var(scalar_spec, var_data=np.array([12345]))

# Write 1D time series variable
timeseries_spec = {
    'Variable': 'temperature',
    'Data_Type': cdf.CDF_REAL4,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': [],
    'Compress': 5  # GZIP compression level 5
}
temp_data = np.array([20.5, 21.0, 19.8, 22.1, 20.9])
cdf.write_var(timeseries_spec, var_data=temp_data)

# Write 2D spatial grid variable
grid_spec = {
    'Variable': 'wind_speed',
    'Data_Type': cdf.CDF_REAL4,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': [100, 200]  # 100x200 spatial grid
}
wind_data = np.random.rand(50, 100, 200)  # 50 time records
cdf.write_var(grid_spec, var_data=wind_data)

# Write string variable
string_spec = {
    'Variable': 'station_name',
    'Data_Type': cdf.CDF_CHAR,
    'Num_Elements': 21,  # String length: 'Weather Station Alpha' has 21 characters
    'Rec_Vary': True,
    'Dim_Sizes': []
}
cdf.write_var(string_spec, var_data=['Weather Station Alpha'])

# Write epoch variable (time)
epoch_spec = {
    'Variable': 'Epoch',
    'Data_Type': cdf.CDF_EPOCH,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': []
}
# Create epochs for the temperature data (one per hour).
# cdfepoch is exposed as an attribute of the cdflib package, not as a submodule.
from cdflib import cdfepoch
epochs = [cdfepoch.compute_epoch([2023, 1, 1, i, 0, 0, 0]) for i in range(5)]
cdf.write_var(epoch_spec, var_data=np.array(epochs))

cdf.close()
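
To make the meaning of a CDF_EPOCH value concrete, the sketch below computes milliseconds since 0000-01-01 with the standard library alone. It is only an illustration of the encoding, not cdflib's implementation: `to_cdf_epoch` is a hypothetical helper, it assumes a naive `datetime` already in UTC, and its output should be compared against `cdfepoch.compute_epoch` before being relied on.

```python
from datetime import datetime

MS_PER_DAY = 86_400_000

# Python ordinals count from 0001-01-01 (ordinal 1). Year 0 is a leap year
# in the proleptic Gregorian calendar (366 days), so the number of days
# elapsed since 0000-01-01 is toordinal() + 365.
DAYS_OFFSET = 365


def to_cdf_epoch(dt):
    """Return milliseconds since 0000-01-01T00:00:00 as a float."""
    days = dt.toordinal() + DAYS_OFFSET
    ms_in_day = (
        dt.hour * 3_600_000
        + dt.minute * 60_000
        + dt.second * 1_000
        + dt.microsecond // 1_000
    )
    return float(days * MS_PER_DAY + ms_in_day)


print(to_cdf_epoch(datetime(2000, 1, 1)))
```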

Combined Variable Writing

Write variable specification, attributes, and data in a single operation.

import cdflib
import numpy as np

cdf = cdflib.cdfwrite.CDF('output.cdf')

# Define variable with spec, attributes, and data together
var_spec = {
    'Variable': 'magnetic_field',
    'Data_Type': cdf.CDF_REAL8,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': [3],  # 3-component vector
    'Compress': 9
}

var_attrs = {
    'UNITS': 'nanoTesla',
    'FILLVAL': -1e31,
    'CATDESC': '3-component magnetic field vector',
    'DEPEND_0': 'Epoch',
    'LABL_PTR_1': 'B_field_labels'
}

# 100 records of 3-component vectors
mag_data = np.random.rand(100, 3) * 50000  # Typical magnetometer data

cdf.write_var(var_spec, var_attrs=var_attrs, var_data=mag_data)

# Write corresponding labels
label_spec = {
    'Variable': 'B_field_labels',
    'Data_Type': cdf.CDF_CHAR,
    'Num_Elements': 10,  # Maximum label length
    'Rec_Vary': False,
    'Dim_Sizes': [3]
}
labels = ['Bx', 'By', 'Bz']
cdf.write_var(label_spec, var_data=labels)

cdf.close()
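
For string variables, `Num_Elements` is the string length, so it must be at least as long as the longest value written. A small sketch of deriving it from the data rather than hard-coding it follows. `char_num_elements` is a hypothetical helper, and counting UTF-8 bytes is an assumption about how the strings are stored.

```python
def char_num_elements(strings):
    """Return the byte length of the longest string (at least 1)."""
    return max((len(s.encode('utf-8')) for s in strings), default=1)


labels = ['Bx', 'By', 'Bz']
print(char_num_elements(labels))
print(char_num_elements(['Weather Station Alpha']))
```

The result can be used directly as the `'Num_Elements'` value in a CDF_CHAR variable specification.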

CDF Data Types

Constants for specifying variable data types in CDF files.

# Integer types
CDF_INT1 = 1      # 1-byte signed integer
CDF_INT2 = 2      # 2-byte signed integer  
CDF_INT4 = 4      # 4-byte signed integer
CDF_INT8 = 8      # 8-byte signed integer

# Unsigned integer types
CDF_UINT1 = 11    # 1-byte unsigned integer
CDF_UINT2 = 12    # 2-byte unsigned integer
CDF_UINT4 = 14    # 4-byte unsigned integer

# Floating point types
CDF_REAL4 = 21    # 4-byte IEEE floating point
CDF_REAL8 = 22    # 8-byte IEEE floating point
CDF_FLOAT = 44    # 4-byte IEEE floating point (alias)
CDF_DOUBLE = 45   # 8-byte IEEE floating point (alias)

# Time epoch types
CDF_EPOCH = 31    # CDF_EPOCH (8-byte float, milliseconds since Year 0)
CDF_EPOCH16 = 32  # CDF_EPOCH16 (two 8-byte floats: seconds since Year 0, plus picoseconds)
CDF_TIME_TT2000 = 33  # TT2000 (8-byte int, nanoseconds since J2000)

# Character types  
CDF_CHAR = 51     # 1-byte signed character
CDF_UCHAR = 52    # 1-byte unsigned character

# Legacy aliases
CDF_BYTE = 41     # 1-byte signed integer (same as CDF_INT1)
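
When data already lives in NumPy arrays, the numeric constants above can be looked up from the array's dtype name. The sketch below hard-codes the values from the table so it runs without cdflib; `cdf_type_for` and the mapping are hypothetical helpers, and the choice of which CDF type represents each dtype is an assumption.

```python
# Values copied from the constants listed above.
CDF_TYPE_BY_DTYPE = {
    'int8': 1, 'int16': 2, 'int32': 4, 'int64': 8,   # CDF_INT1..CDF_INT8
    'uint8': 11, 'uint16': 12, 'uint32': 14,         # CDF_UINT1..CDF_UINT4
    'float32': 21, 'float64': 22,                    # CDF_REAL4, CDF_REAL8
}


def cdf_type_for(dtype_name):
    """Map a dtype name such as 'float32' to a CDF data type code."""
    try:
        return CDF_TYPE_BY_DTYPE[str(dtype_name)]
    except KeyError:
        raise ValueError(f"No CDF type mapped for dtype {dtype_name!r}") from None


print(cdf_type_for('float32'))
```

With NumPy available, `cdf_type_for(array.dtype)` works as well, because `str(array.dtype)` yields the dtype name. There is no unsigned 8-byte integer in the table, so `'uint64'` is rejected.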

Encoding Constants

Platform-specific data encoding options for cross-platform compatibility.

NETWORK_ENCODING = 1      # Network byte order (big-endian)
SUN_ENCODING = 2          # Sun/SPARC encoding
VAX_ENCODING = 3          # VAX encoding (little-endian)
DECSTATION_ENCODING = 4   # DECstation encoding
SGi_ENCODING = 5          # Silicon Graphics encoding
IBMPC_ENCODING = 6        # IBM PC encoding (little-endian)
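
Because `'Encoding'` in `cdf_spec` accepts a numeric value, a file can be pinned to an explicit byte order instead of `'host'`. The sketch below picks a constant from the table based on the interpreter's byte order. `explicit_encoding` is a hypothetical helper, and pairing little-endian hosts with `IBMPC_ENCODING` and big-endian hosts with `NETWORK_ENCODING` is an assumption drawn from the comments above.

```python
import sys

# Values copied from the constants listed above.
NETWORK_ENCODING = 1  # big-endian
IBMPC_ENCODING = 6    # little-endian


def explicit_encoding(byteorder=sys.byteorder):
    """Return an explicit encoding constant matching the given byte order."""
    return IBMPC_ENCODING if byteorder == 'little' else NETWORK_ENCODING


spec = {'Encoding': explicit_encoding()}
print(spec)
```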

Complete Example: Scientific Dataset

import cdflib
import numpy as np
from cdflib import cdfepoch  # cdfepoch is a package attribute, not a submodule

# Create CDF file with specifications
spec = {
    'Majority': 'row_major',
    'Encoding': 'host',
    'Checksum': True,
    'Compressed': 6
}

with cdflib.cdfwrite.CDF('scientific_data.cdf', cdf_spec=spec) as cdf:
    
    # Write global attributes
    global_attrs = {
        'TITLE': 'Atmospheric Measurements',
        'PROJECT': 'Climate Study 2023',
        'DISCIPLINE': 'Space Physics>Magnetospheric Science',
        'DATA_TYPE': 'survey>magnetic field',
        'DESCRIPTOR': 'MAG>Magnetic Field',
        'INSTRUMENT_TYPE': 'Magnetometer',
        'MISSION_GROUP': 'Research Mission',
        'PI_NAME': 'Dr. Jane Smith',
        'PI_AFFILIATION': 'Space Research Institute',
        'TEXT': 'High-resolution atmospheric measurements from ground station network'
    }
    cdf.write_globalattrs(global_attrs)
    
    # Create time axis (100 measurements over 1 hour, one every 36 seconds)
    epochs = [
        # divmod splits the elapsed seconds into valid (minute, second) fields
        cdfepoch.compute_epoch([2023, 6, 15, 12, *divmod(i * 36, 60), 0])
        for i in range(100)
    ]
    
    # Write Epoch variable
    epoch_spec = {
        'Variable': 'Epoch',
        'Data_Type': cdf.CDF_EPOCH,
        'Num_Elements': 1,
        'Rec_Vary': True,
        'Dim_Sizes': []
    }
    epoch_attrs = {
        'UNITS': 'ms',
        'TIME_BASE': '0 AD',
        'CATDESC': 'Default time',
        'FIELDNAM': 'Time since Jan 1, 0000',
        'FILLVAL': -1e31,
        'VALIDMIN': '01-Jan-1990 00:00:00.000',
        'VALIDMAX': '31-Dec-2029 23:59:59.999'
    }
    cdf.write_var(epoch_spec, var_attrs=epoch_attrs, var_data=np.array(epochs))
    
    # Write temperature data
    temp_spec = {
        'Variable': 'Temperature',
        'Data_Type': cdf.CDF_REAL4,
        'Num_Elements': 1,
        'Rec_Vary': True,
        'Dim_Sizes': [],
        'Compress': 9
    }
    temp_attrs = {
        'UNITS': 'K',
        'CATDESC': 'Atmospheric temperature',
        'DEPEND_0': 'Epoch',
        'FIELDNAM': 'Temperature',
        'FILLVAL': -999.0,
        'VALIDMIN': 200.0,
        'VALIDMAX': 400.0,
        'SCALEMIN': 250.0,
        'SCALEMAX': 350.0
    }
    temp_data = 290 + 10 * np.sin(np.linspace(0, 4*np.pi, 100)) + np.random.normal(0, 2, 100)
    cdf.write_var(temp_spec, var_attrs=temp_attrs, var_data=temp_data)
    
    # Write 3D magnetic field vector
    mag_spec = {
        'Variable': 'B_field',
        'Data_Type': cdf.CDF_REAL8,
        'Num_Elements': 1,
        'Rec_Vary': True,
        'Dim_Sizes': [3]
    }
    mag_attrs = {
        'UNITS': 'nT',
        'CATDESC': 'Magnetic field vector in GSM coordinates',
        'DEPEND_0': 'Epoch',
        'LABL_PTR_1': 'B_field_labels',
        'FIELDNAM': 'Magnetic Field',
        'FILLVAL': -1e31,
        'VALIDMIN': -100000.0,
        'VALIDMAX': 100000.0
    }
    # Generate synthetic magnetic field data
    mag_data = np.column_stack([
        25000 + 5000 * np.sin(np.linspace(0, 2*np.pi, 100)),  # Bx
        15000 + 3000 * np.cos(np.linspace(0, 2*np.pi, 100)),  # By  
        -5000 + 1000 * np.random.normal(0, 1, 100)            # Bz
    ])
    cdf.write_var(mag_spec, var_attrs=mag_attrs, var_data=mag_data)
    
    # Write coordinate labels
    label_spec = {
        'Variable': 'B_field_labels',
        'Data_Type': cdf.CDF_CHAR,
        'Num_Elements': 2,
        'Dim_Sizes': [3],
        'Rec_Vary': False
    }
    label_attrs = {
        'CATDESC': 'Magnetic field component labels',
        'FIELDNAM': 'Component labels'
    }
    cdf.write_var(label_spec, var_attrs=label_attrs, var_data=['Bx', 'By', 'Bz'])

print("Scientific dataset created successfully!")

Error Handling

The CDF writer raises exceptions for various error conditions:

  • OSError: When trying to write to a closed file
  • ValueError: For invalid data types, dimensions, or specifications
  • TypeError: For incompatible data types or malformed specifications
  • MemoryError: When there is insufficient memory for large datasets

Example Error Handling:

import cdflib
import numpy as np

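cdf = cdflib.cdfwrite.CDF('output.cdf')

try:
    # Data_Type 999 is not a CDF data type, so this is expected to raise ValueError
    bad_spec = {
        'Variable': 'test',
        'Data_Type': 999,  # Invalid data type
        'Num_Elements': 1,
        'Rec_Vary': True,
        'Dim_Sizes': []
    }
    cdf.write_var(bad_spec)
except ValueError as e:
    print(f"Invalid specification: {e}")

# A valid specification, used to show what happens after the file is closed
good_spec = {
    'Variable': 'count',
    'Data_Type': cdf.CDF_INT4,
    'Num_Elements': 1,
    'Rec_Vary': True,
    'Dim_Sizes': []
}

try:
    cdf.close()
    # Writing to a closed file is expected to raise OSError
    cdf.write_var(good_spec, var_data=np.array([1]))
except OSError as e:
    print(f"File operation error: {e}")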
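
Some of these errors can be caught before the writer is touched at all. The sketch below validates a variable specification against the data type codes listed earlier. `check_var_spec` is a hypothetical helper, not part of cdflib, and the rule that numeric types use `Num_Elements` of 1 while character types use the string length is an assumption based on the examples in this document.

```python
# Type codes copied from the CDF Data Types section.
KNOWN_CDF_TYPES = {1, 2, 4, 8, 11, 12, 14, 21, 22, 31, 32, 33, 41, 44, 45, 51, 52}
CHAR_TYPES = {51, 52}  # CDF_CHAR, CDF_UCHAR
REQUIRED_KEYS = ('Variable', 'Data_Type', 'Num_Elements')


def check_var_spec(spec):
    """Raise TypeError or ValueError if the spec is obviously malformed."""
    if not isinstance(spec, dict):
        raise TypeError("var_spec must be a dict")
    missing = [key for key in REQUIRED_KEYS if key not in spec]
    if missing:
        raise ValueError(f"var_spec is missing keys: {missing}")
    if spec['Data_Type'] not in KNOWN_CDF_TYPES:
        raise ValueError(f"Unknown CDF data type: {spec['Data_Type']}")
    num_elements = spec['Num_Elements']
    if spec['Data_Type'] in CHAR_TYPES:
        if num_elements < 1:
            raise ValueError("String variables need Num_Elements >= 1")
    elif num_elements != 1:
        raise ValueError("Numeric variables need Num_Elements == 1")


try:
    check_var_spec({'Variable': 'test', 'Data_Type': 999, 'Num_Elements': 1})
except ValueError as e:
    print(f"Rejected before writing: {e}")
```

Calling such a check before `write_var` keeps a bad specification from leaving a partially written file behind.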

Install with Tessl CLI

npx tessl i tessl/pypi-cdflib
