
# Weighted DTW and Machine Learning

Advanced DTW with custom weighting functions and machine learning integration for learning feature weights from labeled data. This module supports domain-specific DTW customization through learned weights, decision-tree-based feature importance, and weights derived from must-link/cannot-link relationships.

## Capabilities

### Weighted DTW Core Functions

DTW computation with custom weights that modify the local distance calculation according to learned or domain-specific importance patterns.

```python { .api }
def warping_paths(s1, s2, weights=None, window=None, **kwargs):
    """
    DTW with custom weight functions.

    Applies position-dependent or feature-dependent weights to modify
    the local distance computation during DTW alignment.

    Parameters:
    - s1, s2: array-like, input sequences
    - weights: array-like/function, weight values or weighting function
    - window: int, warping window constraint
    - **kwargs: additional DTW parameters

    Returns:
    tuple: (distance, paths_matrix)
    - distance: float, weighted DTW distance
    - paths_matrix: 2D array, accumulated cost matrix with weights applied
    """

def distance_matrix(s, weights, window=None, show_progress=False, **kwargs):
    """
    Distance matrix computation with weights.

    Computes pairwise weighted DTW distances between all sequences
    in a collection using the specified weighting scheme.

    Parameters:
    - s: list/array, collection of sequences
    - weights: array-like/function, weights to apply during distance computation
    - window: int, warping window constraint
    - show_progress: bool, display progress bar
    - **kwargs: additional DTW parameters

    Returns:
    array: weighted distance matrix
    """
```
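To make the idea of weighting the local distance concrete, here is a minimal pure-Python sketch. It is not the `dtw_weighted` implementation: it assumes that `weights` is a flat list with one multiplier per point of `s1`, that each multiplier scales the squared difference at that point, and that `window` is a simple band around the diagonal. The name `weighted_dtw_sketch` is hypothetical.

```python
import math


def weighted_dtw_sketch(s1, s2, weights=None, window=None):
    """Return (distance, accumulated cost matrix) for a per-point weighted DTW.

    Assumption (not taken from the library): weights[i] multiplies the
    squared difference between s1[i] and whichever point of s2 it is
    aligned with.
    """
    n, m = len(s1), len(s2)
    if weights is None:
        weights = [1.0] * n  # uniform weights reduce to standard DTW
    if len(weights) != n:
        raise ValueError("weights must have one entry per point of s1")
    if window is None:
        window = max(n, m)  # no band constraint

    # cost[i][j]: accumulated squared cost of aligning s1[:i] with s2[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        # Only cells within `window` of the diagonal are filled in
        for j in range(max(1, i - window), min(m, i + window) + 1):
            local = weights[i - 1] * (s1[i - 1] - s2[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # match
                cost[i - 1][j],      # s2[j-1] is reused for another s1 point
                cost[i][j - 1],      # s1[i-1] is reused for another s2 point
            )

    return math.sqrt(cost[n][m]), cost


plain, _ = weighted_dtw_sketch([0, 1, 2], [0, 2, 2])
emphasized, _ = weighted_dtw_sketch([0, 1, 2], [0, 2, 2], weights=[1, 4, 1])
print(plain)       # 1.0
print(emphasized)  # 2.0
```

In this sketch, raising the weight of the second point makes its mismatch four times as costly, so the distance grows from 1.0 to 2.0. The library's own weight format and semantics may differ, so check its documentation before passing weights to `warping_paths`.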

### Machine Learning Integration

Learn weights from labeled time series data using decision trees adapted to temporal data.

```python { .api }
def compute_weights_using_dt(series, labels, prototypeidx, **kwargs):
    """
    Learn weights using decision trees.

    Trains decision tree classifiers to identify discriminative time points
    or features for distinguishing between different time series classes.

    Parameters:
    - series: list/array, collection of time series sequences
    - labels: array-like, class labels for each sequence
    - prototypeidx: int, index of prototype sequence for each class
    - **kwargs: additional parameters for decision tree training

    Returns:
    tuple: (weights, importances)
    - weights: array, learned importance weights for time points/features
    - importances: array, feature importance scores from decision trees
    """

def series_to_dt(series, labels, prototypeidx, classifier=None, max_clfs=None,
                 min_ig=0, **kwargs):
    """
    Convert time series to decision tree features.

    Extracts features from time series data and prepares them for
    decision tree classification, enabling weight learning.

    Parameters:
    - series: list/array, time series collection
    - labels: array-like, class labels
    - prototypeidx: int, prototype sequence indices
    - classifier: classifier object, optional pre-configured classifier
    - max_clfs: int, maximum number of classifiers to train
    - min_ig: float, minimum information gain threshold
    - **kwargs: additional feature extraction parameters

    Returns:
    tuple: (ml_values, cl_values, classifiers, importances)
    - ml_values: array, must-link constraint values
    - cl_values: array, cannot-link constraint values
    - classifiers: list, trained decision tree classifiers
    - importances: array, feature importance scores
    """
```

### Weight Computation from Constraints

Convert must-link and cannot-link constraints into weight values for DTW distance computation.

```python { .api }
def compute_weights_from_mlclvalues(serie, ml_values, cl_values, only_max=False,
                                    strict_cl=True, **kwargs):
    """
    Compute weights from must-link/cannot-link values.

    Converts constraint information (which time points should be linked
    vs separated) into weight values for biasing DTW computations.

    Parameters:
    - serie: array-like, reference time series sequence
    - ml_values: array, must-link constraint strengths
    - cl_values: array, cannot-link constraint strengths
    - only_max: bool, use only maximum constraint values
    - strict_cl: bool, apply cannot-link constraints strictly
    - **kwargs: additional weight computation parameters

    Returns:
    array: computed weight values for DTW distance modification
    """
```

### Visualization Integration

Plotting functions for visualizing learned weights and their effect on a time series.

```python { .api }
def plot_margins(serie, weights, filename=None, ax=None, origin=(0, 0),
                 scaling=(1, 1), y_limit=None, importances=None):
    """
    Plot weight margins on time series.

    Visualizes the learned or assigned weights overlaid on the time series,
    showing which time points or regions are considered most important.

    Parameters:
    - serie: array-like, time series sequence to plot
    - weights: array-like, weight values corresponding to time points
    - filename: str, optional file path to save plot
    - ax: matplotlib axis, optional axis for plotting
    - origin: tuple, plot origin coordinates
    - scaling: tuple, scaling factors for axes
    - y_limit: tuple, y-axis limits
    - importances: array, optional feature importance values

    Returns:
    tuple: (figure, axes) matplotlib objects
    """
```

### Decision Tree Classifier

Custom decision tree implementation for time series weight learning, with splitting criteria suited to temporal data.

```python { .api }
class DecisionTreeClassifier:
    """
    Custom decision tree for DTW weight learning.

    Specialized decision tree that considers temporal relationships
    and DTW-specific constraints when learning feature importance.
    """

    def __init__(self):
        """Initialize decision tree classifier."""

    def fit(self, features, targets, use_feature_once=True,
            ignore_features=None, min_ig=0):
        """
        Train decision tree classifier.

        Parameters:
        - features: array, feature matrix from time series
        - targets: array, target labels for classification
        - use_feature_once: bool, prevent reusing features in same path
        - ignore_features: list, features to exclude from consideration
        - min_ig: float, minimum information gain for splits

        Returns:
        self: fitted classifier
        """

    def score(self, max_kd):
        """
        Calculate classifier score.

        Parameters:
        - max_kd: float, maximum k-distance threshold

        Returns:
        float: classifier performance score
        """

    @staticmethod
    def entropy(targets):
        """
        Calculate entropy of target distribution.

        Parameters:
        - targets: array, target labels

        Returns:
        float: entropy value
        """

    @staticmethod
    def informationgain_continuous(features, targets, threshold):
        """
        Calculate information gain for continuous features.

        Parameters:
        - features: array, feature values
        - targets: array, target labels
        - threshold: float, split threshold

        Returns:
        float: information gain value
        """

    @staticmethod
    def kdistance(point1, point2):
        """
        Calculate k-distance between points.

        Parameters:
        - point1, point2: array-like, data points

        Returns:
        float: k-distance value
        """

class Tree:
    """
    Decision tree representation for weight learning.

    Represents the structure of learned decision trees with
    nodes, splits, and importance information.
    """

    def add(self):
        """
        Add new node to the tree.

        Returns:
        int: new node identifier
        """

    @property
    def nb_nodes(self):
        """
        Get number of nodes in tree.

        Returns:
        int: node count
        """

    @property
    def used_features(self):
        """
        Get set of features used in tree.

        Returns:
        set: feature indices used in decision tree
        """

    @property
    def depth(self):
        """
        Get tree depth.

        Returns:
        int: maximum depth of decision tree
        """
```
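The entropy and information-gain quantities named above can be illustrated with a short standalone sketch. It uses the textbook definitions (Shannon entropy in bits and a binary split at `feature <= threshold`). Whether the library uses the same logarithm base or split convention is an assumption, and the function names `entropy_sketch` and `information_gain_sketch` are hypothetical.

```python
import math
from collections import Counter


def entropy_sketch(targets):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(targets)
    if total == 0:
        return 0.0
    counts = Counter(targets)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain_sketch(features, targets, threshold):
    """Entropy reduction from splitting the samples at `feature <= threshold`."""
    total = len(targets)
    if total == 0:
        return 0.0
    left = [t for f, t in zip(features, targets) if f <= threshold]
    right = [t for f, t in zip(features, targets) if f > threshold]
    # Weighted average entropy of the two child nodes
    children = (len(left) / total) * entropy_sketch(left) \
        + (len(right) / total) * entropy_sketch(right)
    return entropy_sketch(targets) - children


values = [0.1, 0.2, 0.8, 0.9]
labels = [0, 0, 1, 1]
print(entropy_sketch(labels))                        # 1.0
print(information_gain_sketch(values, labels, 0.5))  # 1.0
print(information_gain_sketch(values, labels, 1.0))  # 0.0
```

A threshold that separates the two classes perfectly recovers the full 1 bit of entropy, while a threshold that leaves every sample on one side gains nothing. A `min_ig` parameter such as the one in `fit` would reject splits whose gain falls below the chosen value.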

## Usage Examples

### Basic Weighted DTW

```python
from dtaidistance import dtw, dtw_weighted
import numpy as np
import matplotlib.pyplot as plt

# Create time series with known important regions
np.random.seed(42)
t = np.linspace(0, 4*np.pi, 100)

# Base sequences
s1 = np.sin(t) + 0.1 * np.random.randn(100)
s2 = np.sin(t * 1.1) + 0.1 * np.random.randn(100)

# Define custom weights (higher weights = more important)
# Make the middle section more important
weights = np.ones(100)
weights[30:70] = 3.0  # Emphasize middle region
weights[45:55] = 5.0  # Highly emphasize center

# Compute weighted DTW
weighted_distance, weighted_paths = dtw_weighted.warping_paths(s1, s2, weights=weights)

# Compare with unweighted DTW
unweighted_distance, unweighted_paths = dtw.warping_paths(s1, s2)

print(f"Unweighted DTW distance: {unweighted_distance:.3f}")
print(f"Weighted DTW distance: {weighted_distance:.3f}")

# Visualize the effect of weighting
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(12, 10))

# Plot sequences with weights
ax1.plot(s1, 'b-', label='Sequence 1', linewidth=2)
ax1.plot(s2, 'r-', label='Sequence 2', linewidth=2)
ax1_twin = ax1.twinx()
ax1_twin.fill_between(range(len(weights)), 0, weights, alpha=0.3, color='green', label='Weights')
ax1.set_title('Time Series with Weight Distribution')
ax1.legend(loc='upper left')
ax1_twin.legend(loc='upper right')
ax1.grid(True)

# Plot unweighted warping paths
ax2.imshow(unweighted_paths, cmap='viridis', origin='lower')
ax2.set_title('Unweighted DTW Warping Paths')
ax2.set_xlabel('Sequence 2 Index')
ax2.set_ylabel('Sequence 1 Index')

# Plot weighted warping paths
ax3.imshow(weighted_paths, cmap='viridis', origin='lower')
ax3.set_title('Weighted DTW Warping Paths')
ax3.set_xlabel('Sequence 2 Index')
ax3.set_ylabel('Sequence 1 Index')

plt.tight_layout()
plt.show()
```

### Learning Weights from Labeled Data

```python
from dtaidistance import dtw_weighted
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic labeled time series data
np.random.seed(42)

def generate_class_data(class_type, n_samples=10, length=80):
    """Generate time series data for different classes."""
    t = np.linspace(0, 4*np.pi, length)
    sequences = []

    for _ in range(n_samples):
        if class_type == 'sine':
            # Sine waves with characteristic frequency
            freq = 1.0 + 0.1 * np.random.randn()
            signal = np.sin(freq * t) + 0.1 * np.random.randn(length)
            # Add discriminative spike in middle region
            spike_pos = length // 2 + np.random.randint(-5, 6)
            signal[spike_pos] += 1.5

        elif class_type == 'cosine':
            # Cosine waves with characteristic frequency
            freq = 1.2 + 0.1 * np.random.randn()
            signal = np.cos(freq * t) + 0.1 * np.random.randn(length)
            # Add discriminative dip in first quarter
            dip_pos = length // 4 + np.random.randint(-5, 6)
            signal[dip_pos] -= 1.0

        elif class_type == 'linear':
            # Linear trends with characteristic slope
            slope = 0.5 + 0.2 * np.random.randn()
            signal = slope * np.linspace(0, 1, length) + 0.1 * np.random.randn(length)
            # Add discriminative oscillation in last quarter
            osc_region = slice(3*length//4, length)
            signal[osc_region] += 0.5 * np.sin(8 * t[osc_region])

        else:
            # Fail early instead of hitting an undefined `signal` below
            raise ValueError(f"Unknown class_type: {class_type!r}")

        sequences.append(signal)

    return sequences

# Generate training data
class_sine = generate_class_data('sine', n_samples=8)
class_cosine = generate_class_data('cosine', n_samples=8)
class_linear = generate_class_data('linear', n_samples=6)

all_sequences = class_sine + class_cosine + class_linear
all_labels = [0] * 8 + [1] * 8 + [2] * 6

print(f"Generated {len(all_sequences)} labeled sequences")
print(f"Class distribution: {np.bincount(all_labels)}")

# Select prototype sequences (representative of each class)
prototype_indices = [0, 8, 16]  # First sequence from each class

# Learn weights using decision trees
try:
    weights, importances = dtw_weighted.compute_weights_using_dt(
        all_sequences,
        all_labels,
        prototype_indices,
        max_clfs=5,
        min_ig=0.01
    )

    print(f"Learned weights shape: {weights.shape}")
    print(f"Weight statistics: min={np.min(weights):.3f}, max={np.max(weights):.3f}, mean={np.mean(weights):.3f}")

    # Visualize learned weights for prototype sequences
    fig, axes = plt.subplots(3, 2, figsize=(14, 12))

    class_names = ['Sine', 'Cosine', 'Linear']
    for class_idx in range(3):
        proto_seq = all_sequences[prototype_indices[class_idx]]

        # Plot prototype sequence
        axes[class_idx, 0].plot(proto_seq, 'b-', linewidth=2)
        axes[class_idx, 0].set_title(f'{class_names[class_idx]} Class - Prototype Sequence')
        axes[class_idx, 0].grid(True)

        # Plot learned weights (assuming weights correspond to time points)
        if weights.ndim > 1:
            class_weights = weights[class_idx] if weights.shape[0] == 3 else weights[0]
        else:
            class_weights = weights

        axes[class_idx, 1].plot(class_weights, 'r-', linewidth=2)
        axes[class_idx, 1].set_title(f'{class_names[class_idx]} Class - Learned Weights')
        axes[class_idx, 1].set_ylabel('Weight Importance')
        axes[class_idx, 1].grid(True)

    plt.tight_layout()
    plt.show()

except Exception as e:
    print(f"Weight learning failed: {e}")
    print("Using uniform weights for demonstration")
    weights = np.ones(len(all_sequences[0]))
```

### Must-Link/Cannot-Link Constraints

```python
from dtaidistance import dtw, dtw_weighted
import numpy as np
import matplotlib.pyplot as plt

# Create sequences with known constraint relationships
np.random.seed(42)

# Reference sequence
reference = np.sin(np.linspace(0, 4*np.pi, 60)) + 0.1 * np.random.randn(60)

# Sequence that should be similar (must-link)
similar_seq = reference + 0.2 * np.random.randn(60)

# Sequence that should be different (cannot-link)
different_seq = np.cos(np.linspace(0, 6*np.pi, 60)) + 0.1 * np.random.randn(60)

# Define must-link and cannot-link constraint values
# Higher values indicate stronger constraints
ml_values = np.zeros(len(reference))
cl_values = np.zeros(len(reference))

# Strong must-link constraints in middle region (these points should align)
ml_values[20:40] = 2.0
ml_values[28:32] = 5.0  # Very strong constraint

# Strong cannot-link constraints at the ends (these should not align)
cl_values[0:10] = 3.0
cl_values[50:60] = 3.0

# Compute weights from constraints
constraint_weights = dtw_weighted.compute_weights_from_mlclvalues(
    reference,
    ml_values,
    cl_values,
    only_max=False,
    strict_cl=True
)

print(f"Constraint weights shape: {constraint_weights.shape}")
print(f"Weight range: [{np.min(constraint_weights):.3f}, {np.max(constraint_weights):.3f}]")

# Apply constraint-based weights to DTW computations

# Regular DTW distances
dist_ref_similar = dtw.distance(reference, similar_seq)
dist_ref_different = dtw.distance(reference, different_seq)

# Weighted DTW distances (if implementation supports it)
try:
    weighted_dist_similar, _ = dtw_weighted.warping_paths(reference, similar_seq, weights=constraint_weights)
    weighted_dist_different, _ = dtw_weighted.warping_paths(reference, different_seq, weights=constraint_weights)

    print("\nDistance Comparison:")
    print(f"Reference vs Similar (regular): {dist_ref_similar:.3f}")
    print(f"Reference vs Similar (weighted): {weighted_dist_similar:.3f}")
    print(f"Reference vs Different (regular): {dist_ref_different:.3f}")
    print(f"Reference vs Different (weighted): {weighted_dist_different:.3f}")

except Exception as e:
    print(f"Weighted distance computation failed: {e}")

# Visualize constraints and weights
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(14, 10))

# Plot sequences
ax1.plot(reference, 'b-', label='Reference', linewidth=2)
ax1.plot(similar_seq, 'g--', label='Similar (Must-Link)', linewidth=2)
ax1.plot(different_seq, 'r:', label='Different (Cannot-Link)', linewidth=2)
ax1.set_title('Time Series with Constraint Relationships')
ax1.legend()
ax1.grid(True)

# Plot must-link constraints
ax2.fill_between(range(len(ml_values)), 0, ml_values, alpha=0.7, color='green')
ax2.set_title('Must-Link Constraints')
ax2.set_ylabel('Constraint Strength')
ax2.grid(True)

# Plot cannot-link constraints
ax3.fill_between(range(len(cl_values)), 0, cl_values, alpha=0.7, color='red')
ax3.set_title('Cannot-Link Constraints')
ax3.set_ylabel('Constraint Strength')
ax3.grid(True)

# Plot computed weights
ax4.plot(constraint_weights, 'purple', linewidth=2)
ax4.set_title('Computed Constraint Weights')
ax4.set_ylabel('Weight Value')
ax4.set_xlabel('Time Point')
ax4.grid(True)

plt.tight_layout()
plt.show()
```

### Custom Decision Tree Weight Learning

```python
from dtaidistance.dtw_weighted import DecisionTreeClassifier, Tree
import numpy as np
import matplotlib.pyplot as plt

# Generate training data with clear discriminative patterns
np.random.seed(42)

def create_discriminative_series(class_id, n_samples=15, length=50):
    """Create series with class-specific discriminative patterns."""
    series_list = []

    for _ in range(n_samples):
        # Every class starts from low-amplitude noise
        signal = 0.2 * np.random.randn(length)

        if class_id == 0:
            # Class 0: Peak in first third
            peak_pos = length // 3 + np.random.randint(-3, 4)

        elif class_id == 1:
            # Class 1: Peak in middle third
            peak_pos = length // 2 + np.random.randint(-3, 4)

        else:
            # Class 2: Peak in last third
            peak_pos = 2 * length // 3 + np.random.randint(-3, 4)

        signal[peak_pos] = 2.0 + 0.3 * np.random.randn()
        series_list.append(signal)

    return series_list

# Generate training data
class0_series = create_discriminative_series(0, n_samples=10)
class1_series = create_discriminative_series(1, n_samples=10)
class2_series = create_discriminative_series(2, n_samples=8)

all_training_series = class0_series + class1_series + class2_series
training_labels = [0] * 10 + [1] * 10 + [2] * 8

print(f"Training data: {len(all_training_series)} series")
print(f"Class distribution: {np.bincount(training_labels)}")

# Extract features for decision tree (simple: use sequence values as features)
feature_matrix = np.array(all_training_series)
print(f"Feature matrix shape: {feature_matrix.shape}")

# Train custom decision tree
dt_classifier = DecisionTreeClassifier()

try:
    dt_classifier.fit(
        feature_matrix,
        training_labels,
        use_feature_once=False,  # Allow reusing time points
        min_ig=0.1  # Require reasonable information gain
    )

    # Get classifier score
    score = dt_classifier.score(max_kd=1.0)
    print(f"Decision tree classifier score: {score:.3f}")

    # Create and analyze tree structure
    tree = Tree()
    for _ in range(5):  # Add some nodes for demonstration
        node_id = tree.add()
        print(f"Added node {node_id}")

    print("Tree statistics:")
    print(f" Number of nodes: {tree.nb_nodes}")
    print(f" Tree depth: {tree.depth}")
    print(f" Used features: {len(tree.used_features)} out of {feature_matrix.shape[1]}")

except Exception as e:
    print(f"Decision tree training failed: {e}")

# Visualize the discriminative patterns
fig, axes = plt.subplots(3, 1, figsize=(12, 10))

class_names = ['Early Peak', 'Middle Peak', 'Late Peak']
class_data = [class0_series, class1_series, class2_series]

for class_idx, (class_series, class_name) in enumerate(zip(class_data, class_names)):
    ax = axes[class_idx]

    # Plot the first 5 series in the class
    for series in class_series[:5]:
        ax.plot(series, alpha=0.6, linewidth=1)

    # Plot class average
    class_mean = np.mean(class_series, axis=0)
    ax.plot(class_mean, 'k-', linewidth=3, label='Class Average')

    ax.set_title(f'Class {class_idx}: {class_name}')
    ax.legend()
    ax.grid(True)

plt.tight_layout()
plt.show()
```

The weighted DTW module customizes DTW distance computation through learned weights, constraint incorporation, and machine learning integration, so that DTW can be adapted to domain-specific applications using prior knowledge or labeled training data.