# Segments and Data Quality

Data quality flag handling and time segment management for identifying valid analysis periods and detector operational states. These tools are essential for gravitational-wave data analysis to ensure only high-quality data is used in scientific analyses.

## Capabilities

### Segment - Time Interval Representation

Represents a single time interval with start and end times, following the convention [start, end).

```python { .api }
class Segment:
    def __init__(self, start, end):
        """
        Create a time segment.

        Parameters:
        - start: float, segment start time (GPS)
        - end: float, segment end time (GPS)
        """

    def __contains__(self, other):
        """
        Test if another segment/time is contained within this segment.

        Parameters:
        - other: Segment, float, or other time-like object

        Returns:
        bool, True if contained
        """

    def intersects(self, other):
        """
        Test if this segment intersects another.

        Parameters:
        - other: Segment, another time segment

        Returns:
        bool, True if segments intersect
        """

    def protract(self, x):
        """
        Expand segment by amount x on both sides.

        Parameters:
        - x: float, amount to expand (seconds)

        Returns:
        New expanded Segment
        """

    def contract(self, x):
        """
        Contract segment by amount x on both sides.

        Parameters:
        - x: float, amount to contract (seconds)

        Returns:
        New contracted Segment
        """

    @property
    def start(self):
        """Start time of segment."""

    @property
    def end(self):
        """End time of segment."""

    def __abs__(self):
        """Duration of segment in seconds, obtained with abs(segment)."""
```
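
The half-open convention means the end time itself is not part of the segment, so back-to-back segments touch without overlapping. The following is a minimal plain-Python sketch of that convention using `(start, end)` tuples. The helper names `contains_time`, `overlaps`, and `protract` are illustrative assumptions, not GWpy's implementation.

```python
# Illustrative helpers for half-open [start, end) intervals stored as tuples.
# These names are hypothetical and are not part of gwpy.segments.

def contains_time(seg, t):
    """Return True if time t lies in the half-open interval [start, end)."""
    start, end = seg
    return start <= t < end


def overlaps(a, b):
    """Return True if two half-open intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def protract(seg, x):
    """Return a new interval widened by x seconds on both sides."""
    return (seg[0] - x, seg[1] + x)


seg = (1000, 1100)
print(contains_time(seg, 1000))     # True: the start is included
print(contains_time(seg, 1100))     # False: the end is excluded
print(overlaps(seg, (1100, 1200)))  # False: adjacent intervals only touch
print(protract(seg, 5))             # (995, 1105)
```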

### SegmentList - Collection of Time Segments

List of time segments with set-like operations for combining and manipulating segment collections.

```python { .api }
class SegmentList(list):
    def __init__(self, segments=None):
        """
        Create a list of segments.

        Parameters:
        - segments: iterable, initial segments
        """

    def coalesce(self):
        """
        Merge overlapping and adjacent segments in place.

        Returns:
        The same SegmentList, sorted and coalesced
        """

    def __and__(self, other):
        """
        Find intersection with another SegmentList (self & other).

        Parameters:
        - other: SegmentList

        Returns:
        SegmentList with the times common to both lists
        """

    def __or__(self, other):
        """
        Find union with another SegmentList (self | other).

        Parameters:
        - other: SegmentList

        Returns:
        SegmentList with combined segments
        """

    def __sub__(self, other):
        """
        Remove segments (set difference).

        Parameters:
        - other: SegmentList

        Returns:
        SegmentList with segments removed
        """

    def protract(self, x):
        """
        Expand all segments in place.

        Parameters:
        - x: float, expansion amount (seconds)

        Returns:
        The same SegmentList with expanded segments
        """

    def contract(self, x):
        """
        Contract all segments in place.

        Parameters:
        - x: float, contraction amount (seconds)

        Returns:
        The same SegmentList with contracted segments
        """

    def extent(self):
        """
        Total extent from earliest start to latest end.

        Returns:
        Segment spanning the full extent
        """

    def __abs__(self):
        """
        Total duration of all segments, obtained with abs(segmentlist).
        Overlapping time is counted once only if the list is coalesced.

        Returns:
        float, total time in seconds
        """
```
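
To make the set-like operations concrete, here is a small plain-Python sketch of coalescing, total livetime, and intersection on lists of `(start, end)` tuples. It only illustrates the semantics described above; the function names are assumptions for this example and the code is not how GWpy implements these operations.

```python
# Plain-Python illustration of segment-list semantics (not GWpy code).

def coalesce(segments):
    """Return a sorted list with overlapping or touching intervals merged."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def livetime(segments):
    """Total duration covered, counting overlapping time only once."""
    return sum(end - start for start, end in coalesce(segments))


def intersection(a, b):
    """Return the times covered by both lists of intervals."""
    out = []
    for s1, e1 in coalesce(a):
        for s2, e2 in coalesce(b):
            lo, hi = max(s1, s2), min(e1, e2)
            if lo < hi:
                out.append((lo, hi))
    return out


segs = [(1000, 1100), (1050, 1150), (1200, 1300)]
print(coalesce(segs))                      # [(1000, 1150), (1200, 1300)]
print(livetime(segs))                      # 250
print(intersection(segs, [(1100, 1250)]))  # [(1100, 1150), (1200, 1250)]
```

Sorting first lets a single pass merge neighbours, which is why coalescing is usually the first step before measuring livetime.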

### DataQualityFlag - Data Quality Flag Management

Represents a data quality flag with active segments (when the flag is set) and known segments (when the state of the flag is defined).

```python { .api }
class DataQualityFlag:
    def __init__(self, name=None, active=None, known=None, **kwargs):
        """
        Create a data quality flag.

        Parameters:
        - name: str, flag name (e.g., 'H1:DMT-ANALYSIS_READY:1')
        - active: SegmentList, times when flag is active/True
        - known: SegmentList, times when the state of the flag is known
        """

    @classmethod
    def query(cls, flag, start, end, **kwargs):
        """
        Query data quality flag from segment database.

        Parameters:
        - flag: str, flag name to query
        - start: float, start time (GPS)
        - end: float, end time (GPS)
        - url: str, segment server URL

        Returns:
        DataQualityFlag object
        """

    @classmethod
    def read(cls, source, flag=None, **kwargs):
        """
        Read flag from file.

        Parameters:
        - source: str, file path
        - flag: str, specific flag name
        - format: str, file format

        Returns:
        DataQualityFlag object
        """

    def write(self, target, **kwargs):
        """
        Write flag to file.

        Parameters:
        - target: str, output file path
        - format: str, output format
        """

    def plot(self, **kwargs):
        """
        Plot the data quality flag.

        Returns:
        Plot showing active and known segments
        """

    def __and__(self, other):
        """
        Logical AND with another flag.

        Parameters:
        - other: DataQualityFlag

        Returns:
        Combined DataQualityFlag
        """

    def __or__(self, other):
        """
        Logical OR with another flag.

        Parameters:
        - other: DataQualityFlag

        Returns:
        Combined DataQualityFlag
        """

    def __invert__(self):
        """
        Logical NOT (invert flag).

        Returns:
        Inverted DataQualityFlag
        """

    @property
    def active(self):
        """SegmentList of times when the flag is active."""

    @property
    def known(self):
        """SegmentList of times when the state of the flag is known."""

    @property
    def livetime(self):
        """Total active time in seconds."""
```
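
A flag's active time is only meaningful inside the time for which its state is defined. The sketch below, in plain Python with `(start, end)` tuples, clips the active intervals to the defined intervals and computes the active fraction (active livetime divided by defined livetime). The names `clip` and `active_fraction` are hypothetical helpers for this illustration, and both inputs are assumed to be lists of disjoint intervals.

```python
# Plain-Python illustration of a flag's active fraction (not GWpy code).

def clip(active, known):
    """Keep only the parts of `active` that fall inside `known`."""
    out = []
    for a0, a1 in active:
        for k0, k1 in known:
            lo, hi = max(a0, k0), min(a1, k1)
            if lo < hi:
                out.append((lo, hi))
    return out


def active_fraction(active, known):
    """Active livetime divided by defined livetime (0.0 if nothing is defined)."""
    known_time = sum(end - start for start, end in known)
    if known_time == 0:
        return 0.0
    active_time = sum(end - start for start, end in clip(active, known))
    return active_time / known_time


known = [(0, 100)]
active = [(10, 40), (60, 120)]  # the second interval runs past the defined time
print(clip(active, known))             # [(10, 40), (60, 100)]
print(active_fraction(active, known))  # 0.7
```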

### DataQualityDict - Collection of Data Quality Flags

Dictionary container for multiple data quality flags with batch operations.

```python { .api }
class DataQualityDict(dict):
    def __init__(self, *args, **kwargs):
        """
        Dictionary of DataQualityFlag objects.
        """

    @classmethod
    def query(cls, flags, start, end, **kwargs):
        """
        Query multiple flags from segment database.

        Parameters:
        - flags: list, flag names to query
        - start: float, start time
        - end: float, end time

        Returns:
        DataQualityDict with all flags
        """

    @classmethod
    def read(cls, source, **kwargs):
        """
        Read multiple flags from file.

        Returns:
        DataQualityDict with all flags from file
        """

    def plot(self, **kwargs):
        """
        Plot all flags in a multi-panel figure.

        Returns:
        Plot with separate panels for each flag
        """

    def intersection(self):
        """
        Find intersection of all flags.

        Returns:
        DataQualityFlag representing intersection
        """

    def union(self):
        """
        Find union of all flags.

        Returns:
        DataQualityFlag representing union
        """
```

### Usage Examples

#### Basic Segment Operations

```python
from gwpy.segments import Segment, SegmentList

# Create individual segments
seg1 = Segment(1000, 1100)  # 100 second segment
seg2 = Segment(1050, 1150)  # Overlapping segment
seg3 = Segment(1200, 1300)  # Non-overlapping segment

# Create segment list
segments = SegmentList([seg1, seg2, seg3])

# Coalesce overlapping segments. coalesce() modifies the list in place,
# so work on a copy to keep the original for comparison.
coalesced = SegmentList(segments).coalesce()
print(f"Original: {len(segments)} segments")
print(f"Coalesced: {len(coalesced)} segments")

# Calculate total livetime; use the coalesced list so that
# overlapping time is not counted twice
total_time = abs(coalesced)
print(f"Total livetime: {total_time} seconds")

# Find extent (a method that returns the spanning segment)
extent_start, extent_end = segments.extent()
print(f"Data spans from {extent_start} to {extent_end}")
```

#### Data Quality Flag Analysis

```python
from gwpy.segments import DataQualityFlag

# Query standard analysis-ready flag for LIGO Hanford
start_time = 1126259446
end_time = 1126259478

analysis_ready = DataQualityFlag.query('H1:DMT-ANALYSIS_READY:1',
                                       start_time, end_time)

# Efficiency: active livetime divided by known livetime
known_time = float(abs(analysis_ready.known))
efficiency = float(abs(analysis_ready.active)) / known_time if known_time else 0.0
print(f"Analysis ready efficiency: {efficiency:.2%}")
print(f"Active livetime: {analysis_ready.livetime} seconds")

# Plot the flag; titles and labels are set on the axes
plot = analysis_ready.plot()
ax = plot.gca()
ax.set_title('H1 Analysis Ready Flag')
ax.set_xlabel('Time [GPS]')
plot.show()
```

#### Multi-Flag Analysis

```python
from gwpy.segments import DataQualityDict

# Query multiple data quality flags
# (start_time and end_time as defined in the previous example)
flags = ['H1:DMT-ANALYSIS_READY:1',
         'H1:DMT-CALIBRATED:1',
         'H1:DMT-UP:1',
         'H1:LSC-DARM_LOCKED:1']

dq_flags = DataQualityDict.query(flags, start_time, end_time)

# Plot all flags
plot = dq_flags.plot(figsize=(12, 8))
plot.gca().set_title('H1 Data Quality Flags')
plot.show()

# Find intersection (when all flags are active)
science_time = dq_flags.intersection()
print(f"Science-quality livetime: {science_time.livetime} seconds")

# Calculate individual efficiencies (active livetime / known livetime)
for name, flag in dq_flags.items():
    known_time = float(abs(flag.known))
    efficiency = float(abs(flag.active)) / known_time if known_time else 0.0
    print(f"{name}: {efficiency:.2%} efficient")
```

#### Segment-Based Data Selection

```python
from gwpy.timeseries import TimeSeries, TimeSeriesList

# Get science-quality segments
science_segments = science_time.active

# Read data only during science time
science_data = TimeSeriesList()
for segment in science_segments:
    if abs(segment) >= 64:  # Only use segments of at least 64 s
        data = TimeSeries.read('data.gwf', 'H1:STRAIN',
                               start=segment[0],
                               end=segment[1])
        science_data.append(data)

# Join all science segments, ignoring the gaps between them
if science_data:
    full_science_data = science_data.join(gap='ignore')
    print(f"Total science data: {full_science_data.duration.value} seconds")
```

#### Custom Flag Creation

```python
from gwpy.segments import DataQualityFlag, Segment, SegmentList
from gwpy.timeseries import TimeSeries

# Create custom segments based on analysis criteria
loud_segments = SegmentList()

# Find times when PSD is anomalous
# (the queried interval must span at least one full stride)
stride = 60
strain = TimeSeries.fetch_open_data('H1', start_time, end_time)
spec = strain.spectrogram(stride=stride, fftlength=4)

# Arbitrary placeholder in PSD units; choose a value suited to your data
threshold = 1e-40

# Identify loud times (simplified example)
for i, t in enumerate(spec.times.value):
    if spec[i].value.max() > threshold:
        # spec.times marks the start of each stride
        loud_segments.append(Segment(t, t + stride))

# Create custom data quality flag
loud_flag = DataQualityFlag(name='LOUD_TIMES',
                            active=loud_segments.coalesce(),
                            known=SegmentList([Segment(start_time, end_time)]))

# Remove loud times from science segments
clean_science = science_time - loud_flag
print(f"Clean science time: {clean_science.livetime} seconds")
```

#### Working with Vetoes

```python
# Query category 1 (hardware) vetoes
cat1_vetoes = DataQualityDict.query(['H1:DMT-ETMY_ESD_DAC_OVERFLOW:1',
                                     'H1:DMT-ETMX_ESD_DAC_OVERFLOW:1'],
                                    start_time, end_time)

# Apply vetoes to remove bad data
vetoed_science = science_time
for veto_flag in cat1_vetoes.values():
    vetoed_science = vetoed_science - veto_flag

print(f"Science time after vetoes: {vetoed_science.livetime} seconds")

# Deadtime: fraction of science time removed by the vetoes
deadtime = (science_time.livetime - vetoed_science.livetime) / science_time.livetime
print(f"Veto deadtime: {deadtime:.2%}")
```
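
Applying a veto is a set difference on time intervals, and the fraction of time lost follows directly from the livetimes before and after. The following plain-Python sketch shows that calculation on `(start, end)` tuples. The `subtract` helper is an assumption made for this example rather than GWpy's implementation, and it expects the science intervals to be sorted and disjoint.

```python
# Plain-Python illustration of veto subtraction (not GWpy code).

def subtract(segments, vetoes):
    """Remove vetoed time from sorted, disjoint (start, end) intervals."""
    out = []
    for start, end in segments:
        cursor = start  # earliest time in this segment not yet handled
        for v_start, v_end in sorted(vetoes):
            if v_end <= cursor or v_start >= end:
                continue  # veto lies outside the remaining part of the segment
            if v_start > cursor:
                out.append((cursor, v_start))  # keep the time before the veto
            cursor = max(cursor, v_end)
        if cursor < end:
            out.append((cursor, end))  # keep the time after the last veto
    return out


science = [(0, 100), (200, 300)]
vetoes = [(40, 50), (90, 210)]

remaining = subtract(science, vetoes)
before = sum(end - start for start, end in science)
after = sum(end - start for start, end in remaining)

print(remaining)                  # [(0, 40), (50, 90), (210, 300)]
print((before - after) / before)  # 0.15
```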

#### Segment I/O Operations

```python
# Save segments to file
analysis_ready.write('analysis_ready_segments.xml', format='ligolw')

# Read segments from file
loaded_flag = DataQualityFlag.read('analysis_ready_segments.xml')

# Export to different formats
segments_only = analysis_ready.active
segments_only.write('segments.txt', format='segwizard')

# Read from SegWizard format
segwiz_segments = SegmentList.read('segments.txt', format='segwizard')
```