tessl/pypi-cdflib

A Python CDF reader toolkit for reading and writing CDF files without requiring NASA CDF library installation

CDF File Reading

Complete API for reading CDF files including metadata extraction, variable data access, and attribute retrieval. The CDF reader supports local files, HTTP/HTTPS URLs, and S3 buckets with optional MD5 validation and custom string encoding.

Capabilities

CDF File Opening

Initialize a CDF reader with support for various data sources and configuration options.

class CDF:
    def __init__(self, path, validate=False, string_encoding='ascii', s3_read_method=1):
        """
        Open a CDF file for reading.
        
        Parameters:
        - path (str | Path): Path to CDF file, HTTP/HTTPS URL, or S3 URL (s3://bucket/key)
        - validate (bool): If True, validate MD5 checksum of the file
        - string_encoding (str): Character encoding for strings (default: 'ascii')
        - s3_read_method (int): S3 read method (1=memory, 2=tmp file, 3=streaming)
        
        Returns:
        CDF reader instance
        """

Usage Example:

import cdflib

# Local file
cdf = cdflib.CDF('/path/to/file.cdf')

# Remote URL
cdf = cdflib.CDF('https://example.com/data.cdf')

# S3 bucket
cdf = cdflib.CDF('s3://my-bucket/data.cdf')

# With validation and UTF-8 encoding
cdf = cdflib.CDF('/path/to/file.cdf', validate=True, string_encoding='utf-8')
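
The three source kinds are told apart by the form of the path string. As a rough illustration of that rule, the hypothetical helper `classify_source` below sorts a path into `local`, `http`, or `s3` by its URL scheme. It is a sketch only and is not cdflib's own detection logic, which may differ.

```python
from urllib.parse import urlparse


def classify_source(path):
    """Return 'http', 's3', or 'local' for a path-like value.

    Hypothetical helper for illustration; cdflib performs its own
    detection internally and may use different rules.
    """
    scheme = urlparse(str(path)).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme == "s3":
        return "s3"
    # No scheme, or something like a Windows drive letter: treat as a file path.
    return "local"


for candidate in ("/path/to/file.cdf", "https://example.com/data.cdf", "s3://my-bucket/data.cdf"):
    print(candidate, "->", classify_source(candidate))
```

A check like this can be handy for deciding up front whether `s3_read_method` is relevant to a given path.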

File Information Retrieval

Get comprehensive information about the CDF file structure and contents.

def cdf_info(self):
    """
    Get general information about the CDF file.
    
    Returns:
    CDFInfo: Object containing file metadata including:
        - rVariables: List of rVariable names (variables that all share one common dimensionality)
        - zVariables: List of zVariable names (variables that each define their own dimensionality)
        - Attributes: List of attribute names
        - Version: CDF library version info
        - Encoding: Data encoding information
        - Majority: Row/column majority
        - Format: Single/multi-file format
        - Compressed: Compression status
        - Checksum: Checksum validation status
    """

Usage Example:

cdf = cdflib.CDF('data.cdf')
info = cdf.cdf_info()

# CDFInfo is an object, so its fields are read as attributes
print(f"Variables: {info.zVariables}")
print(f"Attributes: {info.Attributes}")
print(f"Compressed: {info.Compressed}")
print(f"Encoding: {info.Encoding}")
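
Because variable names are split across `rVariables` and `zVariables`, code that wants every variable has to merge the two lists. The sketch below assumes the info object exposes both lists as attributes (with a dictionary-style result, index by key instead). `all_variable_names` and the stand-in `example_info` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def all_variable_names(info):
    """Combine rVariables and zVariables into one sorted list of names."""
    return sorted(set(info.rVariables) | set(info.zVariables))


# Stand-in for the object returned by cdf.cdf_info(); the names are made up.
example_info = SimpleNamespace(
    rVariables=["Epoch"],
    zVariables=["temperature", "grid_data"],
)
print(all_variable_names(example_info))
```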

Variable Information Inquiry

Get detailed information about specific variables including data types, dimensions, and record variance.

def varinq(self, variable):
    """
    Get detailed information about a specific variable.
    
    Parameters:
    - variable (str): Variable name
    
    Returns:
    VDRInfo: Variable descriptor containing:
        - Variable: Variable name
        - Num_Elements: Number of elements per value
        - Data_Type: CDF data type constant
        - Data_Type_Description: Human-readable data type
        - Num_Dims: Number of dimensions
        - Dim_Sizes: List of dimension sizes
        - Sparse: Sparseness type
        - Last_Rec: Last written record number
        - Rec_Vary: Record variance (True/False)
        - Dim_Vary: Dimension variance list
        - Pad: Pad value
        - Compress: Compression type
        - Block_Factor: Blocking factor
    """

Usage Example:

cdf = cdflib.CDF('data.cdf')
var_info = cdf.varinq('temperature')

# VDRInfo is an object, so its fields are read as attributes
print(f"Data type: {var_info.Data_Type_Description}")
print(f"Dimensions: {var_info.Dim_Sizes}")
print(f"Record varying: {var_info.Rec_Vary}")
print(f"Last record: {var_info.Last_Rec}")
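
`Last_Rec` and `Dim_Sizes` together give a quick estimate of how much data a variable holds. The helpers below are a minimal sketch with hypothetical names. They assume `Last_Rec` is a 0-based index and that, following the usual CDF convention, a value of -1 means no records have been written.

```python
from math import prod


def record_count(last_rec):
    """Number of written records, given a 0-based last record index."""
    return max(last_rec + 1, 0)


def values_per_record(dim_sizes):
    """Number of values in one record; a variable with no dimensions has one."""
    return prod(dim_sizes)  # prod([]) == 1


# Made-up example: last record index 99, each record a 3 x 4 grid.
total_values = record_count(99) * values_per_record([3, 4])
print(total_values)
```

With a real reader the inputs would come from the descriptor, for example `record_count(var_info.Last_Rec)`.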

Variable Data Reading

Read variable data with optional record range filtering and epoch-based selection.

def varget(self, variable=None, epoch=None, starttime=None, endtime=None, 
           startrec=0, endrec=None):
    """
    Read variable data from the CDF file.
    
    Parameters:
    - variable (str, optional): Variable name to read
    - epoch (str, optional): Name of time variable for time-based filtering
    - starttime (list, optional): Start time components for time range filtering
    - endtime (list, optional): End time components for time range filtering
    - startrec (int): Starting record number (0-based, default: 0)
    - endrec (int, optional): Ending record number (inclusive, default: all records)
    
    Returns:
    str | numpy.ndarray: Variable data array with appropriate dimensions and data type
    
    Notes:
    Time components should be provided as:
    - CDF_EPOCH: [year, month, day, hour, minute, second, millisecond]
    - CDF_EPOCH16: [year, month, day, hour, minute, second, ms, us, ns, ps]
    - TT2000: [year, month, day, hour, minute, second, ms, us, ns]
    """
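
The component lists in the notes can be built from a standard `datetime` rather than typed by hand. This is a small sketch with hypothetical helper names. `datetime` carries only microsecond precision, so the nanosecond field of the TT2000 list is set to 0 here.

```python
from datetime import datetime


def cdf_epoch_components(dt):
    """Convert a datetime to [year, month, day, hour, minute, second, millisecond]."""
    return [dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second,
            dt.microsecond // 1000]


def tt2000_components(dt):
    """Convert a datetime to [year, month, day, hour, minute, second, ms, us, ns]."""
    ms, us = divmod(dt.microsecond, 1000)
    # datetime has no nanoseconds, so the last component is always 0.
    return [dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, ms, us, 0]


start = cdf_epoch_components(datetime(2023, 1, 1))
end = cdf_epoch_components(datetime(2023, 1, 2))
print(start, end)
```

The resulting lists can be passed as `starttime` and `endtime`.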

Usage Examples:

import cdflib

cdf = cdflib.CDF('data.cdf')

# Read all data for a variable
all_data = cdf.varget('temperature')

# Read specific record range
subset = cdf.varget('temperature', startrec=100, endrec=200)

# Read data for specific time range using time components
time_subset = cdf.varget('temperature', 
                        starttime=[2023, 1, 1, 0, 0, 0, 0],
                        endtime=[2023, 1, 2, 0, 0, 0, 0])

# Read a half-day window, naming the epoch variable used for time filtering
half_day_subset = cdf.varget(
    'grid_data',
    epoch='Epoch',
    starttime=[2023, 1, 1, 0, 0, 0, 0],
    endtime=[2023, 1, 1, 12, 0, 0, 0],
)
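
For large variables, the `startrec`/`endrec` pair makes it possible to read in blocks instead of loading everything at once. The generator below is a sketch of that idea. `record_ranges` is a hypothetical helper, and it relies on `endrec` being inclusive, as documented above.

```python
def record_ranges(last_rec, chunk_size):
    """Yield inclusive (startrec, endrec) pairs covering records 0..last_rec."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    start = 0
    while start <= last_rec:
        end = min(start + chunk_size - 1, last_rec)
        yield start, end
        start = end + 1


# With a real reader, each pair would feed one varget call, e.g.:
#   block = cdf.varget('temperature', startrec=startrec, endrec=endrec)
for startrec, endrec in record_ranges(last_rec=9, chunk_size=4):
    print(startrec, endrec)
```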

Attribute Information Inquiry

Get information about global or variable attributes including data types and entry counts.

def attinq(self, attribute):
    """
    Get information about a specific attribute.
    
    Parameters:
    - attribute (str | int): Attribute name or number
    
    Returns:
    ADRInfo: Attribute descriptor containing:
        - Attribute: Attribute name
        - Scope: 'GLOBAL_SCOPE' or 'VARIABLE_SCOPE'
        - Max_Entry: Maximum entry number
        - Num_Entries: Total number of entries
        - Data_Type: Data type of entries
        - Num_Elements: Number of elements per entry
    """

Attribute Data Reading

Read attribute data for global attributes or specific variable attribute entries.

def attget(self, attribute, entry=None):
    """
    Read attribute data.
    
    Parameters:
    - attribute (str | int): Attribute name or number
    - entry (str | int, optional): Entry number for global attributes,
                                  or variable name or number for variable attributes
    
    Returns:
    AttData: Attribute data object containing:
        - Data: The attribute value(s) as numpy array or scalar
        - Data_Type: CDF data type constant
        - Num_Elements: Number of elements
    """

Usage Examples:

cdf = cdflib.CDF('data.cdf')

# Get attribute information
attr_info = cdf.attinq('TITLE')
print(f"Scope: {attr_info['Scope']}")
print(f"Entries: {attr_info['Num_Entries']}")

# Read global attribute
title_data = cdf.attget('TITLE')
print(f"Title: {title_data.Data}")

# Read variable attribute for specific variable
units_data = cdf.attget('UNITS', 'temperature')
print(f"Temperature units: {units_data.Data}")

Bulk Attribute Reading

Efficiently read all global or variable attributes at once.

def globalattsget(self):
    """
    Get all global attributes as a dictionary.
    
    Returns:
    dict: Dictionary mapping attribute names to lists of values.
          Each value list contains all entries for that attribute.
    """

def varattsget(self, variable):
    """
    Get all attributes for a specific variable.
    
    Parameters:
    - variable (str | int): Variable name or number
    
    Returns:
    dict: Dictionary mapping attribute names to values.
          Values are numpy arrays or scalars depending on the attribute.
    """

Usage Examples:

cdf = cdflib.CDF('data.cdf')

# Get all global attributes
global_attrs = cdf.globalattsget()
for attr_name, attr_values in global_attrs.items():
    print(f"{attr_name}: {attr_values}")

# Get all attributes for a specific variable
temp_attrs = cdf.varattsget('temperature')
print(f"Units: {temp_attrs.get('UNITS', 'N/A')}")
print(f"Fill value: {temp_attrs.get('FILLVAL', 'N/A')}")
print(f"Valid range: {temp_attrs.get('VALIDMIN', 'N/A')} to {temp_attrs.get('VALIDMAX', 'N/A')}")
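
Attributes such as `FILLVAL`, `VALIDMIN`, and `VALIDMAX` are typically used to screen data values before analysis. The sketch below applies them to plain Python numbers. `in_valid_range` is a hypothetical helper, the attribute values are made up, and it assumes each attribute holds a single scalar (real files may store arrays).

```python
def in_valid_range(value, attrs):
    """Return True if value is not the fill value and lies within the valid range.

    attrs is a plain dict such as the one returned by varattsget();
    a missing key simply disables the corresponding check.
    """
    fill = attrs.get("FILLVAL")
    if fill is not None and value == fill:
        return False
    vmin = attrs.get("VALIDMIN")
    vmax = attrs.get("VALIDMAX")
    if vmin is not None and value < vmin:
        return False
    if vmax is not None and value > vmax:
        return False
    return True


# Made-up attribute values and readings for illustration.
example_attrs = {"FILLVAL": -1e31, "VALIDMIN": -90.0, "VALIDMAX": 60.0}
readings = [12.5, -1e31, 75.0, -40.0]
valid = [x for x in readings if in_valid_range(x, example_attrs)]
print(valid)
```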

Advanced Variable Descriptor Access

Get detailed variable descriptor record information for advanced use cases.

def vdr_info(self, variable):
    """
    Get detailed variable descriptor record information.
    
    Parameters:
    - variable (str | int): Variable name or number
    
    Returns:
    VDR: Detailed variable descriptor record with low-level CDF information
    including internal record structure, compression details, and storage layout.
    """

Error Handling

The CDF reader raises standard Python exceptions for common error conditions:

  • FileNotFoundError: When the specified CDF file cannot be found
  • OSError: For file I/O errors or permission issues
  • ValueError: For invalid parameter values or malformed CDF data
  • KeyError: When accessing non-existent variables or attributes

Example Error Handling:

import cdflib

try:
    cdf = cdflib.CDF('nonexistent.cdf')
except FileNotFoundError:
    print("CDF file not found")

try:
    cdf = cdflib.CDF('data.cdf')
    data = cdf.varget('INVALID_VARIABLE')
except KeyError as e:
    print(f"Variable not found: {e}")

try:
    data = cdf.varget('temperature', startrec=-1)
except ValueError as e:
    print(f"Invalid parameter: {e}")
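
When many variables are read in a loop, it can be convenient to fold this handling into one place. The wrapper below is a minimal sketch: `read_or_default` is a hypothetical helper that accepts any object with a `varget()` method, and it catches `KeyError` and `ValueError` because the list above names both for missing names and bad parameters.

```python
def read_or_default(reader, variable, default=None):
    """Return reader.varget(variable), or default if the lookup fails.

    reader is any object with a varget() method, such as a CDF reader
    instance.
    """
    try:
        return reader.varget(variable)
    except (KeyError, ValueError):
        return default


class FakeReader:
    """Stand-in reader used only to demonstrate the wrapper."""

    def varget(self, variable):
        if variable != "temperature":
            raise KeyError(variable)
        return [1.0, 2.0]


print(read_or_default(FakeReader(), "temperature"))
print(read_or_default(FakeReader(), "missing", default=[]))
```

File-level errors such as `FileNotFoundError` are deliberately left uncaught so that a missing file still fails loudly.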

Types

class CDFInfo:
    """CDF file information dataclass containing metadata about the file structure."""

class VDRInfo:
    """Variable descriptor record information dataclass with variable metadata."""

class ADRInfo:
    """Attribute descriptor record information dataclass with attribute metadata."""

class AttData:
    """Attribute data container with 'Data', 'Data_Type', and 'Num_Elements' fields."""

class VDR:
    """Detailed variable descriptor record with low-level CDF internal information."""

Install with Tessl CLI

npx tessl i tessl/pypi-cdflib
