# Index Operations

High-performance indexing system for fast message searching and retrieval. Creates optimized indexes on GRIB keys for efficient filtering of large files.

## Capabilities

### Index Creation

Create indexes on GRIB keys for fast message searching.

```python { .api }
class index:
    def __init__(self, filename, *args):
        """
        Create GRIB index for fast message searching.

        Parameters:
        - filename: str, GRIB filename or saved index filename
        - *args: str, GRIB key names to index on
          If no keys provided, filename is treated as saved index file

        Key types can be specified by appending type:
        - 'key:l' for long/integer values
        - 'key:s' for string values
        - 'key:d' for double/float values
        """
```

Usage example:
```python
# Create index on multiple keys
grbindx = pygrib.index('weather.grb', 'shortName', 'typeOfLevel', 'level')

# Create index with typed keys
grbindx = pygrib.index('weather.grb', 'shortName:s', 'level:l', 'step:l')

# Open previously saved index (no keys specified)
grbindx = pygrib.index('weather.grb.idx')
```
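
The `key:type` suffix convention can be made concrete with a small helper. This is only a sketch: `split_key_spec` and `VALID_TYPE_CODES` are hypothetical names, not part of pygrib, and the helper simply mirrors the three type codes listed in the docstring above.

```python
# Type codes as documented above: long, string, double.
VALID_TYPE_CODES = {'l', 's', 'd'}


def split_key_spec(spec):
    """Split 'level:l' into ('level', 'l'); untyped specs yield (name, None)."""
    name, sep, type_code = spec.partition(':')
    if not sep:
        return name, None
    if type_code not in VALID_TYPE_CODES:
        raise ValueError(f"unknown key type {type_code!r} in {spec!r}")
    return name, type_code


parsed = [split_key_spec(s) for s in ('shortName:s', 'level:l', 'step')]
print(parsed)
```

Checking key specs this way before building an index may give a clearer error message than a failure deep inside index creation.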

### Index Properties

Access information about the created index.

```python { .api }
class index:
    @property
    def keys(self) -> list:
        """
        List of indexed key names.
        None when opening a saved index file.
        """

    @property
    def types(self) -> list:
        """
        List of key type declarations ('l', 's', 'd').
        None when opening a saved index file.
        """
```

Usage example:
```python
grbindx = pygrib.index('weather.grb', 'shortName', 'level', 'step')

print(f"Indexed keys: {grbindx.keys}")
print(f"Key types: {grbindx.types}")

# For saved index files
saved_idx = pygrib.index('weather.grb.idx')
print(f"Saved index keys: {saved_idx.keys}")  # Returns None
```

### Fast Message Selection

Select messages using the index for high-performance filtering.

```python { .api }
class index:
    def select(self, **kwargs):
        """
        Select messages matching key/value criteria using index.
        Much faster than open.select() for large files.

        Parameters:
        - **kwargs: GRIB key/value pairs
          Only exact matches supported (no sequences or functions)

        Returns:
        List of gribmessage objects matching criteria
        """

    def __call__(self, **kwargs):
        """Alias for select() method"""
```

Usage example:
```python
grbindx = pygrib.index('weather.grb', 'shortName', 'typeOfLevel', 'level')

# Fast selection using index
temp_500 = grbindx.select(shortName='t', typeOfLevel='isobaricInhPa', level=500)
wind_250 = grbindx.select(shortName='u', typeOfLevel='isobaricInhPa', level=250)

# Using __call__ method (equivalent)
geopotential = grbindx(shortName='gh', level=500)

# Process results
for grb in temp_500:
    data = grb['values']
    print(f"Temperature at 500mb: {data.min():.1f} to {data.max():.1f} K")
```
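
As a rough mental model of why indexed selection can beat a sequential scan, the sketch below builds a toy dictionary-based index over plain dicts standing in for GRIB messages. It is a conceptual illustration only, not how pygrib or the underlying GRIB library implements indexes; `build_toy_index` and `scan` are made-up names, and the toy lookup requires a value for every indexed key.

```python
from collections import defaultdict

# Plain dicts standing in for GRIB messages (illustrative data only).
messages = [
    {'shortName': 't', 'level': 500},
    {'shortName': 't', 'level': 850},
    {'shortName': 'u', 'level': 250},
    {'shortName': 't', 'level': 500},
]


def build_toy_index(records, keys):
    """Map each tuple of key values to the positions of matching records."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[tuple(record[k] for k in keys)].append(position)
    return index


def scan(records, **criteria):
    """Sequential search: inspect every record against every criterion."""
    return [i for i, record in enumerate(records)
            if all(record[k] == v for k, v in criteria.items())]


toy_index = build_toy_index(messages, ('shortName', 'level'))

# One dictionary lookup replaces a pass over all records.
indexed_hits = toy_index[('t', 500)]
scanned_hits = scan(messages, shortName='t', level=500)
print(indexed_hits, scanned_hits)
```

The index is built once in a single pass over the records. After that, each exact-match query is a lookup instead of a full scan, which is also why only exact values fit this model.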

### Index Persistence

Save and reload indexes for reuse across sessions.

```python { .api }
class index:
    def write(self, filename: str):
        """
        Save index to file for later reuse.

        Parameters:
        - filename: str, output filename for saved index
        """

    def close(self):
        """Close and deallocate index resources"""
```

Usage example:
```python
# Create and use index
grbindx = pygrib.index('weather.grb', 'shortName', 'typeOfLevel', 'level')
selected = grbindx.select(shortName='t', level=500)

# Save index for later use
grbindx.write('weather.grb.idx')
grbindx.close()

# Later session - reload saved index
grbindx = pygrib.index('weather.grb.idx')  # No keys needed
selected = grbindx.select(shortName='t', level=500)
grbindx.close()
```
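
When a script may run before or after an index has been saved, it can help to decide up front which arguments to pass to `pygrib.index`. The helper below is a minimal sketch with a hypothetical name (`index_args`). Its freshness rule (reuse the saved index only if it is at least as new as the GRIB file) is an assumption of this example, not pygrib behavior.

```python
import os


def index_args(grib_path, keys, index_path):
    """Return the positional arguments to pass to pygrib.index().

    Assumption: a saved index is reused only when it exists and is at least
    as new as the GRIB file; otherwise the index is rebuilt from the keys.
    """
    if (os.path.exists(index_path)
            and os.path.getmtime(index_path) >= os.path.getmtime(grib_path)):
        return (index_path,)       # saved index: no keys needed
    return (grib_path, *keys)      # build a fresh index on these keys


# Intended use (requires pygrib and real files):
#   args = index_args('weather.grb', ('shortName', 'level'), 'weather.grb.idx')
#   grbindx = pygrib.index(*args)
```

After building a fresh index, calling `write()` with the same index path makes the saved copy available to the next run.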

### Index Performance Comparison

Examples showing performance differences between index and sequential search.

```python
import time

import pygrib

# Sequential search (slower for large files)
grbs = pygrib.open('large_file.grb')
start = time.perf_counter()
temp_msgs = grbs.select(shortName='t', level=500)
seq_time = time.perf_counter() - start
grbs.close()

# Index search (index build time is not included in the measurement)
grbindx = pygrib.index('large_file.grb', 'shortName', 'level')
start = time.perf_counter()
temp_msgs = grbindx.select(shortName='t', level=500)
idx_time = time.perf_counter() - start
grbindx.close()

print(f"Sequential search: {seq_time:.3f} seconds")
print(f"Index search: {idx_time:.3f} seconds")
print(f"Speed improvement: {seq_time/idx_time:.1f}x faster")
```
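
The comparison above measures only the selection step, so the cost of building the index is not reflected in the reported speed-up. A small timing helper makes it easy to time the build and the query separately. This is a sketch with hypothetical helper names (`timed`, `speedup`), and the stand-in workload below is not a GRIB operation.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def speedup(baseline_seconds, candidate_seconds):
    """Return baseline/candidate, or None if candidate is too small to measure."""
    if candidate_seconds <= 0:
        return None
    return baseline_seconds / candidate_seconds


# Stand-in workload; with pygrib one could pass pygrib.index or grbindx.select.
total, elapsed = timed(sum, range(1000))
print(f"sum took {elapsed:.6f} seconds")
```

With pygrib available, `timed(pygrib.index, 'large_file.grb', 'shortName', 'level')` would report the build cost. That cost can then be weighed against the number of queries expected to reuse the index.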

## Index Limitations and Considerations

### Multi-Field Message Warning

**Important**: Indexes cannot search within multi-field GRIB messages and may return incorrect results. This commonly affects NCEP files where u/v wind components are stored in single multi-field messages.

```python
# Check for multi-field messages
grbs = pygrib.open('ncep_file.grb')
if grbs.has_multi_field_msgs:
    print("Warning: File contains multi-field messages")
    print("Use open.select() instead of index for reliable results")

    # Use sequential search for multi-field files
    wind_data = grbs.select(shortName=['u', 'v'], level=250)
else:
    # Safe to use index
    grbindx = pygrib.index('ncep_file.grb', 'shortName', 'level')
    wind_data = grbindx.select(shortName='u', level=250)
```

### Index Selection Constraints

Unlike `open.select()`, index selection has limitations:

```python
grbindx = pygrib.index('weather.grb', 'shortName', 'level')

# Supported: exact value matching
temp_500 = grbindx.select(shortName='t', level=500)

# NOT supported with indexes:
# - Sequence matching: level=(500, 700, 850)
# - Function matching: level=lambda l: l > 500
# - Multiple values for same key

# Use open.select() for complex criteria
grbs = pygrib.open('weather.grb')
multi_level = grbs.select(shortName='t', level=(500, 700, 850))
conditional = grbs.select(shortName='t', level=lambda l: l > 500)
```
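
If falling back to `open.select()` is undesirable, sequence matching can be approximated by issuing one exact-match selection per value and concatenating the results. The sketch below uses a hypothetical helper, `select_any`, and a fake selector so that it runs without a GRIB file. Treating `ValueError` as "no match" is an assumption of this example and should be checked against the selector actually used.

```python
def select_any(selector, key, values, **fixed):
    """Run one exact-match selection per value and concatenate the results."""
    matches = []
    for value in values:
        try:
            matches.extend(selector(**{key: value}, **fixed))
        except ValueError:
            # Assumption: the selector may signal "no match" with ValueError.
            continue
    return matches


def fake_select(shortName, level):
    """Stand-in for an index's select(); returns labels instead of messages."""
    table = {('t', 500): ['t@500'], ('t', 850): ['t@850']}
    if (shortName, level) not in table:
        raise ValueError('no matches found')
    return table[(shortName, level)]


result = select_any(fake_select, 'level', (500, 700, 850), shortName='t')
print(result)
```

With a real index the call would look like `select_any(grbindx.select, 'level', (500, 700, 850), shortName='t')`. Each value still costs one index lookup, so this only pays off for short value lists.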

### Choosing Keys for Indexing

Select keys that you'll frequently use for filtering:

```python
# Good key choices for meteorological data
common_idx = pygrib.index('weather.grb',
                          'shortName',    # Parameter name
                          'typeOfLevel',  # Level type
                          'level',        # Level value
                          'step')         # Forecast step

# Analysis-specific indexing
analysis_idx = pygrib.index('analysis.grb',
                            'paramId',    # Parameter ID
                            'dataDate',   # Analysis date
                            'dataTime')   # Analysis time

# Forecast-specific indexing
forecast_idx = pygrib.index('forecast.grb',
                            'shortName',
                            'validityDate',  # Valid date
                            'validityTime')  # Valid time
```