# N-Dimensional DTW

The `dtw_ndim` module provides DTW algorithms for multi-dimensional time series, where each time point is a feature vector rather than a single value. It uses Euclidean distance for point-wise comparisons and supports the same constraint and optimization options as 1D DTW, which makes it suitable for sensor readings, motion capture, and other multivariate signals.

## Capabilities

### Multi-Dimensional Distance Calculation

Compute the DTW distance between multi-dimensional time series using the Euclidean distance between corresponding feature vectors at each time point.

```python { .api }
def distance(s1, s2, window=None, max_dist=None, max_step=None,
             max_length_diff=None, penalty=None, psi=None, use_c=False):
    """
    DTW distance for N-dimensional sequences using Euclidean distance.

    Each time point in the sequences is treated as a feature vector, and
    the local distance between time points is computed as Euclidean distance
    between the corresponding feature vectors.

    Parameters:
    - s1, s2: array-like, N-dimensional sequences of shape (length, features)
    - window: int, warping window constraint
    - max_dist: float, early stopping threshold
    - max_step: float, maximum step size
    - max_length_diff: int, maximum length difference
    - penalty: float, penalty for compression/expansion
    - psi: int, psi relaxation parameter
    - use_c: bool, use C implementation if available

    Returns:
    float: DTW distance between multi-dimensional sequences
    """
```
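
To make the recurrence behind this function concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the names `sq_euclidean` and `toy_ndim_dtw` are hypothetical, none of the constraint parameters are handled, and it assumes that squared Euclidean distances are accumulated along the path with the square root taken once at the end.

```python
import math


def sq_euclidean(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def toy_ndim_dtw(s1, s2):
    """Unconstrained DTW over sequences of feature vectors (illustrative only).

    Assumption: squared local distances are accumulated and the square
    root is applied to the final accumulated cost.
    """
    n, m = len(s1), len(s2)
    # cost[i][j] is the minimal accumulated cost of aligning s1[:i] with s2[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sq_euclidean(s1[i - 1], s2[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match
                                 cost[i - 1][j],      # s1 advances alone
                                 cost[i][j - 1])      # s2 advances alone
    return math.sqrt(cost[n][m])


a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (2.0, 2.0)]
print(toy_ndim_dtw(a, b))
```

The only difference from a 1D sketch is the local cost: a whole feature vector is compared per cell, so all features are warped along one shared path.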

### Multi-Dimensional Warping Paths

Compute warping paths for multi-dimensional sequences, providing the same path analysis capabilities as 1D DTW but for complex feature spaces.

```python { .api }
def warping_paths(s1, s2, window=None, max_dist=None, max_step=None,
                  max_length_diff=None, penalty=None, psi=None):
    """
    Warping paths for N-dimensional sequences.

    Computes the full accumulated cost matrix for multi-dimensional DTW,
    where local distances are Euclidean distances between feature vectors.

    Parameters:
    - s1, s2: array-like, N-dimensional sequences of shape (length, features)
    - window: int, warping window constraint
    - max_dist: float, early stopping threshold
    - max_step: float, maximum step size
    - max_length_diff: int, maximum length difference
    - penalty: float, penalty for compression/expansion
    - psi: int, psi relaxation parameter

    Returns:
    tuple: (distance, paths_matrix)
    - distance: float, optimal DTW distance
    - paths_matrix: 2D array, accumulated cost matrix
    """
```
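
An accumulated cost matrix is mostly useful because the optimal alignment can be read back out of it. The following sketch shows that backtracking step on a matrix built in plain Python. The helpers `toy_accumulated_costs` and `toy_backtrack` are hypothetical, and the matrix layout used here (one extra border row and column of infinities) is an assumption of this sketch that may not match the layout of `paths_matrix`.

```python
import math


def toy_accumulated_costs(s1, s2):
    """Accumulated squared-cost matrix with an infinite border row/column."""
    n, m = len(s1), len(s2)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((a - b) ** 2 for a, b in zip(s1[i - 1], s2[j - 1]))
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    return acc


def toy_backtrack(acc):
    """Walk from the bottom-right cell back to the origin along minimal costs.

    Returns a list of (index_in_s1, index_in_s2) pairs, 0-based.
    Ties are resolved in favour of the diagonal step.
    """
    i, j = len(acc) - 1, len(acc[0]) - 1
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (acc[i - 1][j], i - 1, j),          # up
            (acc[i][j - 1], i, j - 1),          # left
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path


a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (2.0, 2.0)]
path = toy_backtrack(toy_accumulated_costs(a, b))
print(path)
```

Each pair in the path says which time point of the first sequence is aligned with which time point of the second; both feature vectors are matched as a whole.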

### Multi-Dimensional Distance Matrix

Efficient computation of distance matrices for collections of multi-dimensional time series with parallel processing support.

```python { .api }
def distance_matrix(s, max_dist=None, max_length_diff=None, window=None,
                    max_step=None, penalty=None, psi=None, block=None,
                    parallel=False, use_c=False, show_progress=False):
    """
    Distance matrix for N-dimensional sequences.

    Computes pairwise DTW distances between all multi-dimensional sequences
    in a collection, using Euclidean distance for local comparisons.

    Parameters:
    - s: list/array, collection of N-dimensional sequences
    - max_dist: float, early stopping threshold
    - max_length_diff: int, maximum length difference
    - window: int, warping window constraint
    - max_step: float, maximum step size
    - penalty: float, penalty for compression/expansion
    - psi: int, psi relaxation parameter
    - block: tuple, memory blocking configuration
    - parallel: bool, enable parallel computation
    - use_c: bool, use C implementation
    - show_progress: bool, display progress bar

    Returns:
    array: distance matrix of shape (n, n) where n is number of sequences
    """
```
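
Because DTW with a symmetric local distance is itself symmetric, a pairwise matrix needs only one DTW computation per unordered pair. The sketch below illustrates that idea in plain Python; `toy_dtw` and `toy_distance_matrix` are hypothetical names, the computation is sequential, and nothing here reflects how the library schedules or parallelizes its work.

```python
import math


def toy_dtw(s1, s2):
    """Unconstrained N-dimensional DTW (illustrative, squared costs + final sqrt)."""
    n, m = len(s1), len(s2)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((a - b) ** 2 for a, b in zip(s1[i - 1], s2[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return math.sqrt(cost[n][m])


def toy_distance_matrix(series):
    """Fill an n x n matrix, computing each unordered pair once and mirroring it."""
    n = len(series)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = toy_dtw(series[i], series[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


series = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.0, 0.0), (2.0, 2.0)],
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 2.0)],
]
matrix = toy_distance_matrix(series)
for row in matrix:
    print(" ".join(f"{value:6.3f}" for value in row))
```

The sequences in a collection may have different lengths, as they do here; only the number of features per time point has to agree.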

## Usage Examples

### Basic Multi-Dimensional DTW

```python
from dtaidistance import dtw, dtw_ndim
import numpy as np

# Create 3D time series (e.g., accelerometer data: x, y, z)
np.random.seed(42)

# Sequence 1: 50 time points with 3 features each
t = np.linspace(0, 4*np.pi, 50)
s1 = np.column_stack([
    np.sin(t) + 0.1*np.random.randn(50),    # X component
    np.cos(t) + 0.1*np.random.randn(50),    # Y component
    np.sin(2*t) + 0.1*np.random.randn(50)   # Z component
])

# Sequence 2: 45 time points with same 3 features (different timing)
t2 = np.linspace(0, 4*np.pi, 45)
s2 = np.column_stack([
    np.sin(t2 * 1.1) + 0.1*np.random.randn(45),
    np.cos(t2 * 1.1) + 0.1*np.random.randn(45),
    np.sin(2*t2 * 1.1) + 0.1*np.random.randn(45)
])

print(f"Sequence 1 shape: {s1.shape}")
print(f"Sequence 2 shape: {s2.shape}")

# Compute multi-dimensional DTW distance
distance = dtw_ndim.distance(s1, s2)
print(f"Multi-dimensional DTW distance: {distance:.3f}")

# Compare with 1D DTW on individual components
distances_1d = []
for i in range(3):
    dist_1d = dtw.distance(s1[:, i], s2[:, i])
    distances_1d.append(dist_1d)
    print(f"1D DTW distance for component {i}: {dist_1d:.3f}")

print(f"Sum of 1D distances: {sum(distances_1d):.3f}")
print(f"Multi-dimensional distance: {distance:.3f}")
```
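
The two values printed at the end generally differ, because the multi-dimensional distance forces all features to share one warping path while each 1D distance is free to pick its own. A short derivation makes the relationship explicit. It assumes that squared differences are accumulated along the path and the square root is taken at the end, which is an assumption about the implementation rather than something stated in this document. Here $A$ and $B$ are the two sequences with feature vectors $a_i, b_j \in \mathbb{R}^d$, and $\pi$ ranges over valid warping paths:

```latex
\mathrm{DTW}_{\mathrm{ndim}}(A, B)^2
  = \min_{\pi} \sum_{(i,j) \in \pi} \sum_{k=1}^{d} \left(a_{i,k} - b_{j,k}\right)^2
  \;\ge\; \sum_{k=1}^{d} \min_{\pi_k} \sum_{(i,j) \in \pi_k} \left(a_{i,k} - b_{j,k}\right)^2
  = \sum_{k=1}^{d} \mathrm{DTW}_{\mathrm{1D}}\!\left(A_{\cdot k}, B_{\cdot k}\right)^2
```

Under that assumption, the square root of the sum of squared 1D distances is a lower bound on the multi-dimensional distance. The plain sum of 1D distances has no fixed ordering relative to it and can come out either larger or smaller.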

### Motion Capture Data Analysis

```python
from dtaidistance import dtw_ndim
import numpy as np
import matplotlib.pyplot as plt

def create_motion_sequence(motion_type, length=100, noise_level=0.05):
    """Create synthetic motion capture data."""
    t = np.linspace(0, 2*np.pi, length)

    if motion_type == 'walking':
        # Simulate walking motion (periodic)
        x = 0.5 * np.sin(4*t) + noise_level * np.random.randn(length)
        y = 0.3 * np.sin(8*t) + noise_level * np.random.randn(length)
        z = 0.8 + 0.2 * np.cos(4*t) + noise_level * np.random.randn(length)

    elif motion_type == 'running':
        # Simulate running motion (faster, more variation)
        x = 0.8 * np.sin(6*t) + noise_level * np.random.randn(length)
        y = 0.5 * np.sin(12*t) + noise_level * np.random.randn(length)
        z = 1.0 + 0.4 * np.cos(6*t) + noise_level * np.random.randn(length)

    elif motion_type == 'jumping':
        # Simulate jumping motion (sporadic vertical movement)
        x = 0.1 * np.sin(2*t) + noise_level * np.random.randn(length)
        y = 0.1 * np.cos(2*t) + noise_level * np.random.randn(length)
        z = 1.0 + np.maximum(0, 0.8 * np.sin(3*t)) + noise_level * np.random.randn(length)

    else:
        # Fail early instead of hitting an unbound variable below
        raise ValueError(f"Unknown motion type: {motion_type}")

    return np.column_stack([x, y, z])

# Generate motion sequences
np.random.seed(42)
walking1 = create_motion_sequence('walking', 80)
walking2 = create_motion_sequence('walking', 75)
running1 = create_motion_sequence('running', 60)
jumping1 = create_motion_sequence('jumping', 70)

motions = [walking1, walking2, running1, jumping1]
motion_labels = ['Walking 1', 'Walking 2', 'Running', 'Jumping']

# Compute distance matrix for motion comparison
distances = dtw_ndim.distance_matrix(motions, parallel=True)

print("Motion similarity matrix:")
print("         ", " ".join(f"{label:>8}" for label in motion_labels))
for i, label in enumerate(motion_labels):
    row_str = f"{label:>8}: "
    for j in range(len(motion_labels)):
        row_str += f"{distances[i, j]:8.2f} "
    print(row_str)

# Visualize the motion sequences
fig, axes = plt.subplots(2, 2, figsize=(12, 10), subplot_kw={'projection': '3d'})
axes = axes.flatten()

for i, (motion, label) in enumerate(zip(motions, motion_labels)):
    ax = axes[i]
    ax.plot(motion[:, 0], motion[:, 1], motion[:, 2], linewidth=2)
    ax.set_title(f'{label} (3D Motion)')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')

plt.tight_layout()
plt.show()
```
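
A distance matrix like the one printed above is often reduced to a nearest-neighbour lookup, for example to check that the two walking sequences are closest to each other. The helper below is a hypothetical sketch of that step. The numbers in `toy_matrix` are invented for illustration and are not output of the example, and reading both triangles is a precaution in case only one of them holds finite values (an assumption about the matrix layout that is worth checking for your version and settings).

```python
import math


def nearest_neighbors(dist_matrix, labels):
    """Map each label to (closest other label, distance).

    Works with nested lists or any object supporting dist_matrix[i][j].
    """
    result = {}
    n = len(labels)
    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):
            if i == j:
                continue
            # Use whichever triangle holds the smaller (finite) value.
            d = min(dist_matrix[i][j], dist_matrix[j][i])
            if d < best_d:
                best_j, best_d = j, d
        neighbor = None if best_j is None else labels[best_j]
        result[labels[i]] = (neighbor, best_d)
    return result


inf = math.inf
# Invented values, upper triangle only, purely to exercise the helper.
toy_matrix = [
    [0.0, 1.2, 4.0, 6.5],
    [inf, 0.0, 3.8, 6.1],
    [inf, inf, 0.0, 5.0],
    [inf, inf, inf, 0.0],
]
toy_labels = ['Walking 1', 'Walking 2', 'Running', 'Jumping']
neighbors = nearest_neighbors(toy_matrix, toy_labels)
for label, (other, dist) in neighbors.items():
    print(f"{label} -> {other} ({dist:.2f})")
```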

### Sensor Data Analysis

```python
from dtaidistance import dtw_ndim
import numpy as np
import matplotlib.pyplot as plt

# Simulate multi-sensor time series data
def generate_sensor_data(pattern_type, length=120, n_sensors=5):
    """Generate synthetic multi-sensor data."""
    t = np.linspace(0, 10, length)
    sensors = []

    for sensor_id in range(n_sensors):
        if pattern_type == 'normal':
            # Normal operation pattern
            signal = np.sin(0.5*t + sensor_id*0.2) + 0.1*np.random.randn(length)
        elif pattern_type == 'anomaly':
            # Anomalous pattern with spikes
            signal = np.sin(0.5*t + sensor_id*0.2) + 0.1*np.random.randn(length)
            # Add anomalous spikes
            spike_indices = np.random.choice(length, size=5, replace=False)
            signal[spike_indices] += 2.0 * np.random.randn(5)
        elif pattern_type == 'drift':
            # Pattern with sensor drift
            drift = 0.02 * sensor_id * t
            signal = np.sin(0.5*t + sensor_id*0.2) + drift + 0.1*np.random.randn(length)
        else:
            # Fail early instead of hitting an unbound variable below
            raise ValueError(f"Unknown pattern type: {pattern_type}")

        sensors.append(signal)

    return np.array(sensors).T  # Shape: (time_points, sensors)

# Generate different sensor patterns
np.random.seed(42)
normal1 = generate_sensor_data('normal', 100, 4)
normal2 = generate_sensor_data('normal', 95, 4)
anomaly1 = generate_sensor_data('anomaly', 100, 4)
drift1 = generate_sensor_data('drift', 105, 4)

sensor_data = [normal1, normal2, anomaly1, drift1]
data_labels = ['Normal 1', 'Normal 2', 'Anomaly', 'Drift']

# Analyze sensor data similarities
print("Sensor data analysis:")
for i, data in enumerate(sensor_data):
    print(f"{data_labels[i]}: shape {data.shape}")

# Compute DTW distances with constraints suitable for sensor data
distances = dtw_ndim.distance_matrix(
    sensor_data,
    window=10,           # Reasonable temporal constraint
    max_dist=100.0,      # Early stopping for very different patterns
    parallel=True
)

print("\nSensor data similarity matrix:")
print("         ", " ".join(f"{label:>8}" for label in data_labels))
for i, label in enumerate(data_labels):
    row_str = f"{label:>8}: "
    for j in range(len(data_labels)):
        row_str += f"{distances[i, j]:8.2f} "
    print(row_str)

# Visualize sensor readings
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
axes = axes.flatten()

for i, (data, label) in enumerate(zip(sensor_data, data_labels)):
    ax = axes[i]
    for sensor_idx in range(data.shape[1]):
        ax.plot(data[:, sensor_idx], label=f'Sensor {sensor_idx+1}', linewidth=1.5)

    ax.set_title(f'{label} - Multi-Sensor Data')
    ax.set_xlabel('Time')
    ax.set_ylabel('Sensor Value')
    ax.legend()
    ax.grid(True)

plt.tight_layout()
plt.show()
```

### Feature Analysis and Dimensionality Effects

```python
from dtaidistance import dtw_ndim, dtw
import numpy as np
import matplotlib.pyplot as plt

def analyze_dimensionality_effects():
    """Analyze how dimensionality affects DTW distance calculation."""

    np.random.seed(42)
    base_length = 50

    # Create base 1D signal
    t = np.linspace(0, 4*np.pi, base_length)
    base_signal = np.sin(t)

    # Create variations with different numbers of dimensions
    dimensions = [1, 2, 3, 5, 10, 20]
    sequences_per_dim = []

    for n_dim in dimensions:
        # Create two similar sequences with n_dim features
        seq1_features = []
        seq2_features = []

        for dim_idx in range(n_dim):
            # Each dimension is the base signal with some variation
            feature1 = base_signal + 0.1 * np.random.randn(base_length)
            feature2 = base_signal + 0.15 * np.random.randn(base_length)

            seq1_features.append(feature1)
            seq2_features.append(feature2)

        seq1 = np.column_stack(seq1_features) if n_dim > 1 else np.array(seq1_features[0])
        seq2 = np.column_stack(seq2_features) if n_dim > 1 else np.array(seq2_features[0])

        sequences_per_dim.append((seq1, seq2))

    # Compute DTW distances for different dimensionalities
    distances = []
    for i, (seq1, seq2) in enumerate(sequences_per_dim):
        if dimensions[i] == 1:
            # Use 1D DTW
            dist = dtw.distance(seq1, seq2)
        else:
            # Use N-dimensional DTW
            dist = dtw_ndim.distance(seq1, seq2)
        distances.append(dist)
        print(f"Dimensionality {dimensions[i]:2d}: DTW distance = {dist:.3f}")

    # Plot dimensionality vs distance
    plt.figure(figsize=(10, 6))
    plt.plot(dimensions, distances, 'bo-', linewidth=2, markersize=8)
    plt.xlabel('Number of Dimensions')
    plt.ylabel('DTW Distance')
    plt.title('DTW Distance vs Dimensionality')
    plt.grid(True)
    plt.show()

analyze_dimensionality_effects()
```

### Integration with Clustering

```python
from dtaidistance import dtw_ndim, clustering
import numpy as np
import matplotlib.pyplot as plt

# Generate multi-dimensional time series clusters
np.random.seed(42)

def create_multidim_cluster(cluster_type, n_sequences=5, length=60, n_features=3):
    """Create a cluster of similar multi-dimensional sequences."""
    sequences = []

    for seq_idx in range(n_sequences):
        t = np.linspace(0, 4*np.pi, length)
        features = []

        for feature_idx in range(n_features):
            if cluster_type == 'sine':
                # Sine-based cluster
                base_freq = 1.0 + 0.1 * seq_idx
                signal = np.sin(base_freq * t + feature_idx * 0.5) + 0.1 * np.random.randn(length)
            elif cluster_type == 'cosine':
                # Cosine-based cluster
                base_freq = 1.2 + 0.1 * seq_idx
                signal = np.cos(base_freq * t + feature_idx * 0.3) + 0.1 * np.random.randn(length)
            elif cluster_type == 'linear':
                # Linear trend cluster
                slope = 0.5 + 0.1 * seq_idx + 0.05 * feature_idx
                signal = slope * t + 0.2 * np.random.randn(length)

            features.append(signal)

        sequence = np.column_stack(features)
        sequences.append(sequence)

    return sequences

# Create three clusters of multi-dimensional sequences
cluster1 = create_multidim_cluster('sine', n_sequences=4, n_features=3)
cluster2 = create_multidim_cluster('cosine', n_sequences=4, n_features=3)
cluster3 = create_multidim_cluster('linear', n_sequences=3, n_features=3)

all_sequences = cluster1 + cluster2 + cluster3
true_labels = [0]*4 + [1]*4 + [2]*3

print(f"Created {len(all_sequences)} multi-dimensional sequences")
print(f"Sequence shapes: {[seq.shape for seq in all_sequences[:3]]}...")

# Perform clustering using multi-dimensional DTW
clusterer = clustering.Hierarchical(
    dists_fun=dtw_ndim.distance_matrix,
    dists_options={'window': 10, 'parallel': True},
    show_progress=True
)

cluster_result = clusterer.fit(all_sequences)
print(f"Clustering completed with {len(cluster_result)} nodes")

# Visualize some of the multi-dimensional sequences
fig, axes = plt.subplots(3, 3, figsize=(15, 12))

for cluster_idx in range(3):
    start_idx = sum([4, 4, 3][:cluster_idx])
    end_idx = start_idx + [4, 4, 3][cluster_idx]

    for seq_idx in range(3):  # Show first 3 sequences from each cluster
        if start_idx + seq_idx < end_idx:
            ax = axes[cluster_idx, seq_idx]
            sequence = all_sequences[start_idx + seq_idx]

            for feature_idx in range(sequence.shape[1]):
                ax.plot(sequence[:, feature_idx],
                        label=f'Feature {feature_idx+1}', linewidth=1.5)

            ax.set_title(f'Cluster {cluster_idx+1}, Sequence {seq_idx+1}')
            ax.legend()
            ax.grid(True)

plt.tight_layout()
plt.show()

print("Multi-dimensional clustering analysis completed")
```
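
## Performance Considerations

### Memory Usage
Multi-dimensional DTW is more expensive than 1D DTW for these reasons:
- Larger sequence storage (length × features values per series)
- A Euclidean distance computation over all features for each compared pair of time points
- Feature vector operations for every cell filled during warping path calculations

Only the first item increases memory use; the accumulated cost matrix has the same size as in the 1D case.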

### Computational Complexity
- **Time complexity**: O(n × m × d), where n and m are the sequence lengths and d is the number of features
- **Space complexity**: O(n × m) for the warping paths matrix (same as 1D)
- **Feature scaling**: Consider normalizing features that have different scales; otherwise large-valued features dominate the Euclidean distance
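
To get a feel for the time complexity figure, the following sketch counts how many cost-matrix cells are evaluated with and without a warping window, and multiplies by the number of features. The helpers `band_cell_count` and `scalar_ops` are hypothetical, and the band is modelled as the simple condition |i - j| <= window, which is an assumption that may differ in detail from how the library defines `window`.

```python
def band_cell_count(n, m, window=None):
    """Number of (i, j) cells evaluated for sequences of length n and m.

    Assumption: a window restricts the computation to cells with
    |i - j| <= window; without a window all n * m cells are filled.
    """
    if window is None:
        return n * m
    total = 0
    for i in range(n):
        lo = max(0, i - window)
        hi = min(m - 1, i + window)
        if hi >= lo:
            total += hi - lo + 1
    return total


def scalar_ops(n, m, d, window=None):
    """Rough count of per-feature operations: one per feature per cell."""
    return band_cell_count(n, m, window) * d


full = scalar_ops(100, 100, 3)
banded = scalar_ops(100, 100, 3, window=10)
print(f"full: {full}, window=10: {banded}, ratio: {banded / full:.2f}")
```

For two sequences of length 100 with 3 features, a window of 10 under this model cuts the work to roughly a fifth of the unconstrained case.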

### Optimization Strategies
1. **Feature selection**: Remove irrelevant or redundant features
2. **Dimensionality reduction**: Use PCA or other techniques if appropriate
3. **Constraint usage**: Apply `window` and `max_dist` constraints to limit how many cells are evaluated
4. **Parallel processing**: Enable parallel computation for distance matrices
5. **Feature normalization**: Ensure features are on similar scales
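
Feature normalization is the easiest of these strategies to apply before calling any DTW function. The sketch below z-normalizes each feature column of a single sequence using only the standard library. The helper name `znormalize_features` is hypothetical, and normalizing per sequence is just one option; computing the mean and standard deviation over the whole collection may be preferable when absolute offsets between sequences matter.

```python
from statistics import fmean, pstdev


def znormalize_features(seq):
    """Scale every feature column of one sequence to zero mean and unit variance.

    seq is a list of rows, each row holding one value per feature.
    Constant columns are only centred, to avoid dividing by zero.
    """
    columns = list(zip(*seq))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mu) / sd for value, mu, sd in zip(row, means, stds)]
        for row in seq
    ]


# Two features on very different scales, plus one constant feature.
raw = [
    [1.0, 100.0, 7.0],
    [2.0, 200.0, 7.0],
    [3.0, 300.0, 7.0],
]
normalized = znormalize_features(raw)
for row in normalized:
    print([round(value, 3) for value in row])
```

After normalization the first two columns are identical, so neither dominates the Euclidean distance between feature vectors.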

The N-dimensional DTW module extends the capabilities of standard DTW to multi-feature temporal data such as sensor arrays, motion capture, financial indicators, and other multivariate time series.