# Version Store Operations

Versioned storage for pandas DataFrames and Series with complete audit trails, point-in-time snapshots, and efficient data retrieval. Supports temporal data access, metadata management, and multi-version data handling optimized for financial time series data.

## Capabilities

### Symbol Management

Operations for managing symbols (data identifiers) within the version store, including listing and existence checking.

```python { .api }
def list_symbols(self, all_symbols=False, snapshot=None, regex=None, **kwargs):
    """
    List symbols in version store.

    Parameters:
    - all_symbols: Include symbols from all snapshots (default: False)
    - snapshot: List symbols from specific snapshot
    - regex: Filter symbols by regular expression pattern
    - **kwargs: Additional filtering parameters

    Returns:
    List of symbol names
    """

def has_symbol(self, symbol, as_of=None):
    """
    Check if symbol exists at given time.

    Parameters:
    - symbol: Symbol name to check
    - as_of: Check existence at specific datetime (default: latest)

    Returns:
    bool: True if symbol exists
    """
```
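
To make the `regex` filter concrete, the sketch below mimics it over a plain list of names. It is a stand-alone illustration rather than the store's own code: `filter_symbols` is a hypothetical helper, and it assumes the pattern may match anywhere in the name, so `^` and `$` are needed for an exact match.

```python
import re


def filter_symbols(symbols, regex=None):
    """Hypothetical helper: return the sorted names that match `regex`."""
    if regex is None:
        return sorted(symbols)
    pattern = re.compile(regex)
    # Assumption: unanchored matching, i.e. the pattern may occur anywhere.
    return sorted(name for name in symbols if pattern.search(name))


names = ["AAPL", "GOOGL", "MSFT", "AAPL_ADJ"]
print(filter_symbols(names, regex="^AAPL"))   # names starting with AAPL
print(filter_symbols(names, regex="^AAPL$"))  # the name AAPL only
```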

### Read Operations

Methods for retrieving versioned data and metadata with temporal filtering and version-specific access.

```python { .api }
def read(self, symbol, as_of=None, date_range=None, from_version=None,
         allow_secondary=None, **kwargs):
    """
    Read symbol data with version and temporal filtering.

    Parameters:
    - symbol: Symbol name to read
    - as_of: Version to read, given as a version number, snapshot name, or datetime (default: latest)
    - date_range: DateRange object for temporal filtering
    - from_version: Read from specific version number
    - allow_secondary: Allow reads from MongoDB secondary nodes
    - **kwargs: Additional read parameters

    Returns:
    VersionedItem: Object containing data, metadata, and version info

    Raises:
    - NoDataFoundException: If symbol or version doesn't exist
    """

def read_metadata(self, symbol, as_of=None, allow_secondary=None):
    """
    Read symbol metadata without loading data.

    Parameters:
    - symbol: Symbol name
    - as_of: Read metadata as of specific datetime
    - allow_secondary: Allow reads from secondary nodes

    Returns:
    dict: Symbol metadata

    Raises:
    - NoDataFoundException: If symbol doesn't exist
    """
```
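
The way an `as_of` selector picks a version can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the store's implementation: `VERSIONS` and `resolve_as_of` are hypothetical, and the model assumes `as_of` may be a version number, a snapshot name, or a datetime that refers to when a version was written (not to the timestamps inside the data).

```python
from datetime import datetime

# Toy version history for a single symbol (illustrative values only).
VERSIONS = [
    {"version": 1, "created": datetime(2020, 1, 2, 9, 0), "snapshots": ["eod_jan02"]},
    {"version": 2, "created": datetime(2020, 1, 3, 9, 0), "snapshots": []},
    {"version": 3, "created": datetime(2020, 1, 6, 9, 0), "snapshots": ["eod_jan06"]},
]


def resolve_as_of(versions, as_of=None):
    """Hypothetical resolver: map an `as_of` selector to a version number."""
    if as_of is None:
        matches = list(versions)  # no selector: the latest version wins
    elif isinstance(as_of, int):
        matches = [v for v in versions if v["version"] == as_of]
    elif isinstance(as_of, str):
        matches = [v for v in versions if as_of in v["snapshots"]]
    elif isinstance(as_of, datetime):
        # Newest version written at or before the requested time.
        matches = [v for v in versions if v["created"] <= as_of]
    else:
        raise TypeError("unsupported as_of type: %r" % type(as_of))
    if not matches:
        raise LookupError("no version matches as_of=%r" % (as_of,))
    return max(v["version"] for v in matches)


print(resolve_as_of(VERSIONS))                        # latest version
print(resolve_as_of(VERSIONS, "eod_jan02"))           # version pinned by a snapshot
print(resolve_as_of(VERSIONS, datetime(2020, 1, 4)))  # version current on Jan 4
```

In this model a datetime earlier than the first write matches nothing, which corresponds to the "version doesn't exist" error case listed above.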

### Write Operations

Methods for storing and updating versioned data with metadata support and version management.

```python { .api }
def write(self, symbol, data, metadata=None, prune_previous_version=True, **kwargs):
    """
    Write/overwrite symbol data creating new version.

    Parameters:
    - symbol: Symbol name to write
    - data: Data to store (pandas DataFrame/Series or numpy array)
    - metadata: Optional metadata dictionary
    - prune_previous_version: Remove previous version to save space
    - **kwargs: Additional write parameters

    Returns:
    VersionedItem: Written data with version information

    Raises:
    - QuotaExceededException: If write would exceed storage quota
    - UnhandledDtypeException: If data type not supported
    """

def append(self, symbol, data, metadata=None, prune_previous_version=True,
           upsert=True, **kwargs):
    """
    Append data to existing symbol or create if doesn't exist.

    Parameters:
    - symbol: Symbol name
    - data: Data to append
    - metadata: Optional metadata dictionary
    - prune_previous_version: Remove previous version after append
    - upsert: Create symbol if doesn't exist
    - **kwargs: Additional append parameters

    Returns:
    VersionedItem: Updated data with version information

    Raises:
    - OverlappingDataException: If appended data overlaps existing data
    - UnorderedDataException: If data not properly time-ordered
    """

def write_metadata(self, symbol, metadata, prune_previous_version=True, **kwargs):
    """
    Write metadata only without changing data.

    Parameters:
    - symbol: Symbol name
    - metadata: Metadata dictionary to write
    - prune_previous_version: Remove previous version
    - **kwargs: Additional parameters

    Returns:
    VersionedItem: Symbol data with updated metadata
    """
```
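
The two error cases listed for `append` can be checked before calling it. The sketch below is a hypothetical pre-flight check over plain lists of timestamps. The exception classes are local stand-ins, not the library's own, and the rule that new rows must start strictly after the last stored row is an assumption made for illustration.

```python
from datetime import datetime


class UnorderedDataError(ValueError):
    """Local stand-in for the store's unordered-data exception."""


class OverlappingDataError(ValueError):
    """Local stand-in for the store's overlapping-data exception."""


def check_appendable(existing_index, new_index):
    """Hypothetical check: raise if `new_index` cannot follow `existing_index`."""
    # Every timestamp must be >= the one before it.
    if any(a > b for a, b in zip(new_index, new_index[1:])):
        raise UnorderedDataError("new rows are not in ascending time order")
    # Assumption: new rows must start strictly after the last stored row.
    if existing_index and new_index and new_index[0] <= existing_index[-1]:
        raise OverlappingDataError("new rows start at or before the last stored row")


stored = [datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 9, 1)]
check_appendable(stored, [datetime(2020, 1, 1, 9, 2)])  # passes silently
```

Running such a check on the DataFrame index (for example `list(df.index)`) before `append` surfaces the problem without a round trip to the database.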

### Version Management

Operations for managing multiple versions of data including listing, restoration, and cleanup.

```python { .api }
def list_versions(self, symbol=None, snapshot=None, latest_only=False):
    """
    List versions for symbol(s).

    Parameters:
    - symbol: Specific symbol name (default: all symbols)
    - snapshot: List versions from specific snapshot
    - latest_only: Return only latest version for each symbol

    Returns:
    List of version information dictionaries
    """

def restore_version(self, symbol, as_of, prune_previous_version=True):
    """
    Restore symbol to previous version.

    Parameters:
    - symbol: Symbol name to restore
    - as_of: Datetime or version number to restore from
    - prune_previous_version: Remove current version after restore

    Returns:
    VersionedItem: Restored data with version information

    Raises:
    - NoDataFoundException: If version doesn't exist
    """

def delete(self, symbol):
    """
    Delete symbol and all its versions permanently.

    Parameters:
    - symbol: Symbol name to delete

    Raises:
    - NoDataFoundException: If symbol doesn't exist
    """
```
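
The effect of `latest_only` can be reproduced on the client side from a full version listing. The helper below is a hypothetical sketch. It assumes each version record is a dictionary with `symbol`, `version`, and `date` keys, and the sample records are made up.

```python
from datetime import datetime


def latest_only(records):
    """Hypothetical helper: keep the highest-numbered record per symbol."""
    latest = {}
    for rec in records:
        current = latest.get(rec["symbol"])
        if current is None or rec["version"] > current["version"]:
            latest[rec["symbol"]] = rec
    # Return in symbol order so the result is deterministic.
    return [latest[symbol] for symbol in sorted(latest)]


records = [
    {"symbol": "AAPL", "version": 1, "date": datetime(2020, 1, 2)},
    {"symbol": "AAPL", "version": 2, "date": datetime(2020, 1, 3)},
    {"symbol": "MSFT", "version": 1, "date": datetime(2020, 1, 2)},
]
for rec in latest_only(records):
    print(rec["symbol"], rec["version"])
```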

### Snapshot Management

Creating and managing named snapshots for point-in-time data consistency across multiple symbols.

```python { .api }
def snapshot(self, snap_name, metadata=None, skip_symbols=None, versions=None):
    """
    Create named snapshot of current data state.

    Parameters:
    - snap_name: Name for the snapshot
    - metadata: Optional snapshot metadata
    - skip_symbols: List of symbols to exclude from snapshot
    - versions: Specific versions to include (dict: symbol -> version)

    Returns:
    Snapshot information dictionary

    Raises:
    - DuplicateSnapshotException: If snapshot name already exists
    """

def delete_snapshot(self, snap_name):
    """
    Delete named snapshot.

    Parameters:
    - snap_name: Snapshot name to delete

    Raises:
    - NoDataFoundException: If snapshot doesn't exist
    """

def list_snapshots(self):
    """
    List all available snapshots.

    Returns:
    dict: Mapping of snapshot name to snapshot metadata
    """
```
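
How `skip_symbols` and `versions` interact when a snapshot is taken can be modelled with plain dictionaries. This is an illustrative sketch, not the store's code: `build_snapshot` is a hypothetical function, and it assumes a snapshot pins the latest version of every symbol unless the symbol is skipped or an explicit version is supplied for it.

```python
def build_snapshot(latest, skip_symbols=None, versions=None):
    """Hypothetical model: choose the version each symbol is pinned to."""
    skip = set(skip_symbols or [])
    overrides = versions or {}
    # Only symbols present in the store are considered. An explicit entry
    # in `versions` replaces the latest version for that symbol.
    return {
        symbol: overrides.get(symbol, version)
        for symbol, version in latest.items()
        if symbol not in skip
    }


latest = {"AAPL": 3, "GOOGL": 1, "MSFT": 2}
pinned = build_snapshot(latest, skip_symbols=["MSFT"], versions={"AAPL": 2})
print(pinned)
```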

### Information and Audit

Methods for retrieving detailed information about symbols, versions, and audit trails.

```python { .api }
def get_info(self, symbol, as_of=None):
    """
    Get detailed information about symbol.

    Parameters:
    - symbol: Symbol name
    - as_of: Get info as of specific datetime

    Returns:
    dict: Comprehensive symbol information including size, versions, metadata

    Raises:
    - NoDataFoundException: If symbol doesn't exist
    """

def get_arctic_version(self, symbol, as_of=None):
    """
    Get Arctic version used to store symbol.

    Parameters:
    - symbol: Symbol name
    - as_of: Check version as of specific datetime

    Returns:
    str: Arctic version string
    """

def read_audit_log(self, symbol=None, message=None):
    """
    Read audit trail for operations.

    Parameters:
    - symbol: Filter by specific symbol (default: all)
    - message: Filter by message content

    Returns:
    List of audit log entries
    """

def stats(self):
    """
    Get version store statistics.

    Returns:
    dict: Store statistics including symbol counts, storage usage, etc.
    """
```

## Types

### VersionedItem

Container for versioned data with metadata and version information.

```python { .api }
class VersionedItem:
    """
    Container for versioned data with complete metadata.

    Attributes:
    - symbol: Symbol name
    - library: Library reference
    - data: Actual data (pandas DataFrame/Series, numpy array, etc.)
    - version: Version number
    - metadata: Metadata dictionary
    - host: Host information
    """

    def __init__(self, symbol, library, data, version, metadata, host=None):
        """
        Initialize versioned item.

        Parameters:
        - symbol: Symbol name
        - library: Library reference
        - data: Data payload
        - version: Version identifier
        - metadata: Metadata dictionary
        - host: Optional host information
        """

    def metadata_dict(self):
        """
        Get metadata as dictionary.

        Returns:
        dict: Complete metadata information
        """
```

## Usage Examples

### Basic Read/Write Operations

```python
from arctic import Arctic, VERSION_STORE
import pandas as pd
import numpy as np

# Setup
arctic_conn = Arctic('mongodb://localhost:27017')
arctic_conn.initialize_library('prices', VERSION_STORE)
lib = arctic_conn['prices']

# Create sample data
dates = pd.date_range('2020-01-01', periods=1000, freq='min')
data = pd.DataFrame({
    'price': np.random.randn(1000).cumsum() + 100,
    'volume': np.random.randint(100, 1000, 1000)
}, index=dates)

# Write data with metadata
metadata = {'source': 'market_feed', 'currency': 'USD'}
lib.write('AAPL', data, metadata=metadata)

# Read data back
result = lib.read('AAPL')
print(f"Data shape: {result.data.shape}")
print(f"Metadata: {result.metadata}")
print(f"Version: {result.version}")
```

### Version Management

```python
# Create multiple versions, keeping earlier ones so they can be read back
lib.write('AAPL', data.iloc[:500], metadata={'note': 'partial data'},
          prune_previous_version=False)
lib.append('AAPL', data.iloc[500:], metadata={'note': 'complete data'},
           prune_previous_version=False)

# List all versions
versions = lib.list_versions('AAPL')
for version in versions:
    print(f"Version {version['version']}: {version['date']}")

# Read a specific version by passing its number as as_of
old_data = lib.read('AAPL', as_of=1)
print(f"Version 1 shape: {old_data.data.shape}")

# Restore to previous version
lib.restore_version('AAPL', as_of=1)
```

### Snapshot Operations

```python
# Write multiple symbols
symbols = ['AAPL', 'GOOGL', 'MSFT']
for symbol in symbols:
    symbol_data = data * np.random.uniform(0.8, 1.2)  # Simulate different prices
    lib.write(symbol, symbol_data)

# Create snapshot
lib.snapshot('end_of_day_2020', metadata={'note': 'EOD snapshot'})

# List snapshots (mapping of snapshot name to its metadata)
snapshots = lib.list_snapshots()
for name, snap_metadata in snapshots.items():
    print(f"Snapshot: {name}, Metadata: {snap_metadata}")

# Read from snapshot by passing the snapshot name as as_of
snap_symbols = lib.list_symbols(snapshot='end_of_day_2020')
snap_data = lib.read('AAPL', as_of='end_of_day_2020')
```

### Temporal Data Access

```python
from arctic.date import DateRange
from datetime import datetime

# Read data for specific date range
date_filter = DateRange(datetime(2020, 1, 1), datetime(2020, 1, 31))
jan_data = lib.read('AAPL', date_range=date_filter)
print(f"January data: {jan_data.data.shape}")

# Read the version that was current at a given time.
# as_of refers to when a version was written, not to the timestamps in the data.
as_of_data = lib.read('AAPL', as_of=datetime.now())
print(f"Data as of now: {as_of_data.data.shape}")

# Append rows that start after the last stored timestamp
new_data = pd.DataFrame({
    'price': [105.0, 106.0],
    'volume': [1200, 1300]
}, index=pd.date_range('2020-02-01', periods=2, freq='min'))

lib.append('AAPL', new_data)
```

### Audit and Information

```python
# Get detailed symbol information
info = lib.get_info('AAPL')
print(f"Symbol info: {info}")

# Check Arctic version
version = lib.get_arctic_version('AAPL')
print(f"Stored with Arctic version: {version}")

# Read audit log
audit_entries = lib.read_audit_log('AAPL')
for entry in audit_entries[-5:]:  # Last 5 entries
    print(f"{entry['date']}: {entry['message']}")

# Get store statistics
stats = lib.stats()
print(f"Store stats: {stats}")
```