# File Format I/O

ObsPy reads and writes 67+ seismological file formats through a unified interface, with automatic format detection and format-specific options for waveforms, events, and station metadata. The same few functions and methods cover all of these formats, so data can be moved between them with very little code.

## Capabilities

### Universal Read/Write Functions

Format-agnostic functions that detect the file format automatically, so the same call works regardless of how the data is stored.

```python { .api }
def read(pathname_or_url, format=None, headonly=False, starttime=None,
         endtime=None, nearest_sample=True, dtype=None, apply_calib=False,
         check_compression=True, **kwargs) -> Stream:
    """
    Read waveform files into Stream object with automatic format detection.

    Args:
        pathname_or_url: File path, URL, file-like object, or glob pattern
        format: Format hint (auto-detected if None)
        headonly: Read metadata only, skip waveform data
        starttime: Start time for reading window (UTCDateTime)
        endtime: End time for reading window (UTCDateTime)
        nearest_sample: Align times to nearest available sample
        dtype: Convert data to specified NumPy dtype
        apply_calib: Apply calibration factor from metadata
        check_compression: Check whether the file is compressed and
            decompress it if needed
        **kwargs: Format-specific reading options

    Returns:
        Stream object containing traces with waveform data and metadata

    Supported Formats:
        MiniSEED (.mseed, .seed), SAC (.sac), GSE2 (.gse), SEG-Y (.segy, .sgy),
        WIN (.win), CSS (.wfdisc), SEISAN (.seisan), AH (.ah), WAV (.wav),
        GCF (.gcf), RefTek (.rt130), PDAS (.pd), Y (.y), SEG-2 (.sg2),
        SH (.qhd, .qbn), Kinemetrics (.evt), NIED (.knet), RG16 (.rg16),
        DMX (.dmx), ALSEP (.pse, .wtn, .wth), ASCII formats, and others
    """

def read_events(pathname_or_url, format=None, **kwargs):
    """
    Read earthquake event files into Catalog object with automatic format detection.

    Args:
        pathname_or_url: File path, URL, file-like object, or glob pattern
        format: Format hint (auto-detected if None)
        **kwargs: Format-specific options

    Returns:
        Catalog object containing earthquake events

    Supported Formats:
        QuakeML (.xml), NDK (.ndk), CMTSOLUTION (.cmt), Nordic (.nordic),
        NonLinLoc (.hyp, .obs), SC3ML (.xml), ZMAP (.zmap), JSON (.json),
        MCHEDR (.txt), CNV (.cnv), FOCMEC (.foc), HypoDD (.pha),
        SCARDEC (.txt), GSE2 bulletin, IMS1.0 bulletin, and others
    """

def read_inventory(pathname_or_url, format=None, **kwargs):
    """
    Read station metadata files into Inventory object with automatic format detection.

    Args:
        pathname_or_url: File path, URL, file-like object, or glob pattern
        format: Format hint (auto-detected if None)
        **kwargs: Format-specific options

    Returns:
        Inventory object containing station/channel metadata and responses

    Supported Formats:
        StationXML (.xml), SEED/XSEED (.seed, .xml), Dataless SEED (.seed),
        RESP files (.resp), SACPZ (.pz), CSS station files (.site),
        Station text (.txt), SC3ML inventory, ArcLink XML, and others
    """
```
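
Automatic detection can be pictured as running a series of cheap checks against the first bytes of a file until one matches. The sketch below illustrates that idea in plain Python. The `DETECTORS` table, the `detect_format` helper, and the simplified signature checks are hypothetical illustrations and are not ObsPy's actual implementation.

```python
import io

# Hypothetical, deliberately simplified signature checks. Each one receives
# the first bytes of a file and returns True if they look like that format.
def looks_like_mseed(head: bytes) -> bool:
    # miniSEED 2 records start with a 6-digit ASCII sequence number
    # followed by a data quality indicator (D, R, Q or M).
    return (len(head) >= 7 and head[:6].isdigit()
            and head[6:7] in (b"D", b"R", b"Q", b"M"))

def looks_like_quakeml(head: bytes) -> bool:
    return b"quakeml" in head.lower()

def looks_like_stationxml(head: bytes) -> bool:
    return b"FDSNStationXML" in head

# Order matters: the first matching check wins.
DETECTORS = [
    ("MSEED", looks_like_mseed),
    ("QUAKEML", looks_like_quakeml),
    ("STATIONXML", looks_like_stationxml),
]

def detect_format(fileobj, format=None, nbytes=512):
    """Return the explicit format hint, or guess one from the file header."""
    if format is not None:
        return format.upper()
    start = fileobj.tell()
    head = fileobj.read(nbytes)
    fileobj.seek(start)  # leave the stream position unchanged
    for name, check in DETECTORS:
        if check(head):
            return name
    raise TypeError("Unknown format")

print(detect_format(io.BytesIO(b"000001D NET  STA")))  # MSEED
```

In this sketch an explicit `format` hint bypasses the guessing step entirely, which is the same reason a hint is worth passing to `read()` when the format is already known.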

### Write Methods

The core data objects (`Stream`, `Catalog`, and `Inventory`) each provide a `write` method for exporting data in multiple formats.

```python { .api }
# Stream write methods
Stream.write(self, filename: str, format: str, **kwargs):
    """
    Write stream to file in specified format.

    Args:
        filename: Output filename (extension used for format detection)
        format: Output format (required for some formats)
        **kwargs: Format-specific writing options

    Supported Write Formats:
        MiniSEED, SAC, GSE2, SEG-Y, WAV, GCF, ASCII, PICKLE, and others
    """

# Catalog write methods
Catalog.write(self, filename: str, format: str, **kwargs):
    """
    Write catalog to file in specified format.

    Args:
        filename: Output filename
        format: Output format (QuakeML, CMTSOLUTION, ZMAP, etc.)
        **kwargs: Format-specific options

    Supported Write Formats:
        QuakeML, CMTSOLUTION, ZMAP, JSON, CNV, NORDIC,
        SHAPEFILE, KML, and others (NDK is read-only)
    """

# Inventory write methods
Inventory.write(self, filename: str, format: str, **kwargs):
    """
    Write inventory to file in specified format.

    Args:
        filename: Output filename
        format: Output format
        **kwargs: Format-specific options

    Supported Write Formats:
        StationXML, SACPZ, CSS, STATIONTXT, SHAPEFILE, KML, and others
    """
```
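
Because `format` has to be given explicitly for some outputs, scripts that write many files often keep the extension-to-format mapping in one place. What follows is a minimal sketch of that idea, assuming a hypothetical `EXTENSION_TO_FORMAT` table and `format_for` helper (neither is part of ObsPy). The extensions are the ones listed for `read()` earlier in this document.

```python
from pathlib import Path

# Hypothetical lookup table for waveform outputs; not an ObsPy API.
EXTENSION_TO_FORMAT = {
    ".mseed": "MSEED",
    ".sac": "SAC",
    ".gse": "GSE2",
    ".segy": "SEGY",
    ".sgy": "SEGY",
    ".wav": "WAV",
    ".gcf": "GCF",
}

def format_for(filename: str) -> str:
    """Return the format name to pass as `format=` for a given filename."""
    ext = Path(filename).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[ext]
    except KeyError:
        raise ValueError(f"No known write format for extension {ext!r}") from None

print(format_for("out/station.SGY"))  # SEGY
```

A call such as `st.write(name, format=format_for(name))` then fails early with a clear message when the extension is not recognized, instead of relying on implicit detection.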

### Format-Specific Features

#### MiniSEED Format

MiniSEED, the standard format for seismological waveform data, has the most comprehensive support, including dedicated exception classes and record-level inspection.

```python { .api }
# Import from obspy.io.mseed
class InternalMSEEDError(Exception):
    """Internal MiniSEED library error."""
    pass

class InternalMSEEDWarning(UserWarning):
    """Internal MiniSEED library warning."""
    pass

class ObsPyMSEEDError(Exception):
    """ObsPy MiniSEED-specific error."""
    pass

class ObsPyMSEEDFilesizeTooSmallError(ObsPyMSEEDError):
    """MiniSEED file too small to contain valid data."""
    pass

# MiniSEED-specific functions (import from obspy.io.mseed.util)
def get_record_information(filename: str, offset: int = 0):
    """
    Get detailed information about MiniSEED record.

    Args:
        filename: MiniSEED filename
        offset: Byte offset in file

    Returns:
        Dictionary with record header information
    """
```

#### SAC Format

Seismic Analysis Code format with extensive header support.

```python { .api }
# SAC format supports rich metadata through header variables
# Automatic conversion between ObsPy Stats and SAC header format
# Includes support for SAC XY format for non-time-series data

# Read with SAC-specific options
st = read('seismic.sac', debug_headers=True, checksum=True)

# Write with SAC format preservation
st.write('output.sac', format='SAC')
```

### File Format Categories

#### Waveform Formats (26 formats)

```python { .api }
# Complete list of supported waveform formats
WAVEFORM_FORMATS = {
    'MSEED': 'MiniSEED format - seismological standard',
    'SAC': 'Seismic Analysis Code format',
    'GSE2': 'Group of Scientific Experts format',
    'SEGY': 'Society of Exploration Geophysicists Y format',
    'WIN': 'WIN format from NIED Japan',
    'CSS': 'Center for Seismic Studies waveform format',
    'SEISAN': 'SEISAN seismology software format',
    'AH': 'Ad Hoc format',
    'WAV': 'WAV audio format',
    'GCF': 'Guralp Compressed Format',
    'REFTEK130': 'RefTek RT-130 format',
    'PDAS': 'PDAS format',
    'Y': 'Nanometrics Y format',
    'SEG2': 'SEG-2 format',
    'SH_ASC': 'Seismic Handler ASCII format',
    'Q': 'Seismic Handler Q format',
    'KINEMETRICS_EVT': 'Kinemetrics EVT format',
    'KNET': 'NIED K-NET ASCII format',
    'RG16': 'Fairfield RG-16 format',
    'DMX': 'DMX format',
    'ALSEP_PSE': 'Apollo Lunar PSE format',
    'ALSEP_WTN': 'Apollo Lunar WTN format',
    'ALSEP_WTH': 'Apollo Lunar WTH format',
    'TSPAIR': 'Time-sample pair ASCII',
    'SLIST': 'Sample list ASCII',
    'PICKLE': 'Python pickle format'
}
```
216
217
#### Event Formats (18 formats)
218
219
```python { .api }
EVENT_FORMATS = {
    'QUAKEML': 'QuakeML - FDSN standard XML format',
    'NDK': 'Harvard CMT NDK format',
    'CMTSOLUTION': 'CMT solution format',
    'NORDIC': 'Nordic format from NORSAR',
    'NLLOC_HYP': 'NonLinLoc hypocenter format',
    'SC3ML': 'SeisComp3 ML format',
    'ZMAP': 'ZMAP format',
    'JSON': 'JSON event format',
    'MCHEDR': 'PDE MCHEDR format',
    'CNV': 'CNV format',
    'FOCMEC': 'FOCMEC focal mechanism format',
    'HYPODD_PHA': 'HypoDD phase format',
    'SCARDEC': 'SCARDEC format',
    'SHAPEFILE': 'ESRI Shapefile export',
    'KML': 'Google Earth KML format',
    'FNETMT': 'F-net moment tensor format',
    'GSE2': 'GSE2 bulletin format',
    'IMS10BULLETIN': 'IMS1.0 bulletin format'
}
```
241
242
#### Inventory Formats (9 formats)
243
244
```python { .api }
INVENTORY_FORMATS = {
    'STATIONXML': 'FDSN StationXML - metadata standard',
    'SEED': 'SEED format with full response',
    'XSEED': 'XML-SEED format',
    'RESP': 'RESP response file format',
    'SACPZ': 'SAC pole-zero format',
    'CSS': 'CSS station table format',
    'STATIONTXT': 'FDSN station text format',
    'SC3ML': 'SeisComp3 inventory format',
    'INVENTORYXML': 'ArcLink inventory XML'
}
```
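
Some format names appear in more than one of the tables above: `GSE2` is both a waveform and an event format, `CSS` is both a waveform and an inventory format, and `SC3ML` covers both events and inventories. The format name alone therefore does not say which reader applies; the kind of data does. The sketch below makes that overlap explicit using abbreviated copies of the tables. The `CATEGORIES` mapping and `categories_for` helper are hypothetical and exist only for illustration.

```python
# Abbreviated, illustrative copies of the format tables (not complete).
CATEGORIES = {
    "waveform": {"MSEED", "SAC", "GSE2", "CSS"},
    "event": {"QUAKEML", "NDK", "GSE2", "SC3ML"},
    "inventory": {"STATIONXML", "SACPZ", "CSS", "SC3ML"},
}

def categories_for(format_name: str) -> list:
    """Return every data category in which a format name is listed."""
    name = format_name.upper()
    return [category for category, formats in CATEGORIES.items()
            if name in formats]

print(categories_for("sc3ml"))  # ['event', 'inventory']
print(categories_for("MSEED"))  # ['waveform']
```

For a name that maps to two categories, choose `read()`, `read_events()`, or `read_inventory()` according to what the file actually contains.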
257
258
## Usage Examples
259
260
### Basic File I/O Operations
261
262
```python
from obspy import UTCDateTime, read, read_events, read_inventory

# Read various waveform formats (automatic detection)
st_mseed = read('data.mseed')
st_sac = read('data.sac')
st_segy = read('data.segy')
st_multiple = read('data*.mseed')  # Read multiple files

# Read with specific parameters
st = read('data.mseed',
          starttime=UTCDateTime("2023-01-01T10:00:00"),
          endtime=UTCDateTime("2023-01-01T11:00:00"),
          headonly=False)

# Read event catalogs
catalog_quakeml = read_events('events.xml')
catalog_ndk = read_events('catalog.ndk')
catalog_json = read_events('events.json')

# Read station metadata
inventory_xml = read_inventory('stations.xml')
inventory_seed = read_inventory('dataless.seed')
inventory_resp = read_inventory('RESP.IU.ANMO.00.BHZ')
```
287
288
### Format Conversion Workflows
289
290
```python
from obspy import read, read_events, read_inventory

# Convert waveform formats
st = read('input.sac')
st.write('output.mseed', format='MSEED')
st.write('output.segy', format='SEGY',
         data_encoding=1,  # 4-byte IBM floating point
         byteorder='>')    # Big-endian

# Convert event formats (NDK can be read but not written)
catalog = read_events('events.xml', format='QUAKEML')
catalog.write('events.json', format='JSON')
catalog.write('events.txt', format='ZMAP')
catalog.write('events.kml', format='KML')

# Convert station metadata formats
inventory = read_inventory('stations.xml')
inventory.write('stations.pz', format='SACPZ')
inventory.write('stations.txt', format='STATIONTXT')
inventory.write('stations.kml', format='KML')
```
312
313
### Advanced Format Options
314
315
```python
from obspy import read
from obspy.io.mseed import InternalMSEEDError

# MiniSEED with specific options
try:
    st = read('data.mseed',
              apply_calib=True,        # Apply calibration
              check_compression=True,  # Decompress the file if needed
              details=True,            # Get detailed info
              headonly=False)          # Read full data
except InternalMSEEDError as e:
    print(f"MiniSEED error: {e}")

# SAC format with header debugging
st = read('data.sac', debug_headers=True, checksum=True)

# Write MiniSEED with compression
# (STEIM2 requires integer samples, so convert float data first)
st.write('compressed.mseed', format='MSEED',
         encoding='STEIM2',  # Compression algorithm
         reclen=512,         # Record length
         byteorder='>',      # Big-endian
         flush=True)         # Flush buffers

# Write SAC with specific byte order ('<' little-endian, '>' big-endian)
st.write('output.sac', format='SAC', byteorder='<')
```
342
343
### Bulk File Processing
344
345
```python
import glob
from obspy import read, Stream

# Process multiple files efficiently
all_files = glob.glob('seismic_data/*.mseed')
master_stream = Stream()

for filename in all_files:
    try:
        st = read(filename)
        master_stream += st  # Concatenate streams
    except Exception as e:
        print(f"Error reading {filename}: {e}")

# Merge and clean up
master_stream.merge(method=1)  # Merge overlapping traces
master_stream.sort()           # Sort by time and metadata

# Write combined dataset
master_stream.write('combined_data.mseed', format='MSEED')

# Split by station and write separately
for network_station in {f"{tr.stats.network}.{tr.stats.station}"
                        for tr in master_stream}:
    net, sta = network_station.split('.')
    st_station = master_stream.select(network=net, station=sta)
    st_station.write(f'{net}_{sta}_data.mseed', format='MSEED')
```
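
The per-station split above builds a `"NET.STA"` string and then splits it again. An alternative is to group on a `(network, station)` tuple in a single pass. The sketch below shows that grouping on plain `NET.STA.LOC.CHA` identifier strings so that it runs without any data files; the `group_by_station` helper and the sample identifiers are hypothetical.

```python
from collections import defaultdict

def group_by_station(trace_ids):
    """Group 'NET.STA.LOC.CHA' identifiers by their (network, station) pair."""
    groups = defaultdict(list)
    for trace_id in trace_ids:
        net, sta, _loc, _cha = trace_id.split(".")
        groups[(net, sta)].append(trace_id)
    return dict(groups)

# Illustrative identifiers; the location code may be empty.
ids = ["IU.ANMO.00.BHZ", "IU.ANMO.00.BHN", "BW.RJOB..EHZ"]

for (net, sta), members in sorted(group_by_station(ids).items()):
    print(f"{net}_{sta}_data.mseed <- {members}")
```

With real data, the same keys could be fed to `master_stream.select(network=net, station=sta)` to obtain the traces to write for each station.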
374
375
### Format-Specific Error Handling
376
377
```python
from obspy import read
from obspy.io.mseed import ObsPyMSEEDError, ObsPyMSEEDFilesizeTooSmallError
from obspy.io.sac import SacIOError
from obspy.core.util.obspy_types import ObsPyReadingError

def robust_read(filename):
    """Robust file reading with format-specific error handling."""
    try:
        return read(filename)

    except ObsPyMSEEDFilesizeTooSmallError:
        print(f"MiniSEED file {filename} is too small")
        return None

    except ObsPyMSEEDError as e:
        print(f"MiniSEED error in {filename}: {e}")
        # Retry without the compression check
        try:
            return read(filename, check_compression=False)
        except Exception:
            return None

    except SacIOError as e:
        print(f"SAC format error in {filename}: {e}")
        return None

    except ObsPyReadingError as e:
        print(f"General reading error in {filename}: {e}")
        return None

    except Exception as e:
        print(f"Unexpected error reading {filename}: {e}")
        return None

# Use robust reader
filenames = ['file1.mseed', 'file2.sac', 'file3.segy']
streams = [robust_read(f) for f in filenames]
streams = [s for s in streams if s is not None]  # Filter out failures
```
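
The order of the `except` clauses in `robust_read` matters: `ObsPyMSEEDFilesizeTooSmallError` subclasses `ObsPyMSEEDError`, so the more specific handler has to come first or it would never run. The sketch below demonstrates this with stand-in classes that copy the documented hierarchy, so it runs without ObsPy installed. The `describe` helper is hypothetical.

```python
# Stand-in classes mirroring the documented hierarchy (not imported from ObsPy).
class ObsPyMSEEDError(Exception):
    pass

class ObsPyMSEEDFilesizeTooSmallError(ObsPyMSEEDError):
    pass

def describe(exc: Exception) -> str:
    """Report which handler catches an exception, most specific first."""
    try:
        raise exc
    except ObsPyMSEEDFilesizeTooSmallError:
        return "file too small"
    except ObsPyMSEEDError:
        return "MiniSEED error"
    except Exception:
        return "unexpected error"

print(describe(ObsPyMSEEDFilesizeTooSmallError()))  # file too small
print(describe(ObsPyMSEEDError()))                  # MiniSEED error
print(describe(ValueError()))                       # unexpected error
```

If the first two handlers were swapped, every too-small file would be reported as a generic MiniSEED error, because Python uses the first `except` clause whose class matches.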
417
418
## Types
419
420
```python { .api }
# Format detection result
FormatInfo = {
    'format': str,            # Detected format name
    'confidence': float,      # Detection confidence (0-1)
    'extensions': list[str],  # Typical file extensions
    'description': str        # Format description
}

# File header information (format-dependent)
HeaderInfo = {
    'sampling_rate': float,    # Sampling rate in Hz
    'npts': int,               # Number of data points
    'starttime': UTCDateTime,  # Start time
    'network': str,            # Network code
    'station': str,            # Station code
    'channel': str,            # Channel code
    # Additional format-specific fields
}
```