# Deep Learning Models

Neural network-based outlier detection methods that excel with high-dimensional data and complex patterns. PyOD provides a comprehensive PyTorch-based framework with 12+ deep learning models for anomaly detection.

## Capabilities

### Autoencoder

Uses reconstruction error from a neural autoencoder as the outlier score. Normal data should reconstruct well, while outliers will have high reconstruction errors.

```python { .api }
class AutoEncoder:
    def __init__(self, hidden_neuron_list=[64, 32], hidden_activation='relu',
                 output_activation='sigmoid', loss='mse', optimizer='adam',
                 epochs=100, batch_size=32, dropout_rate=0.2,
                 l2_regularizer=0.1, validation_size=0.1, preprocessing=True,
                 verbose=1, random_state=None, contamination=0.1):
        """
        Parameters:
        - hidden_neuron_list (list): Number of neurons per hidden layer
        - hidden_activation (str): Activation function for hidden layers
        - output_activation (str): Activation function for output layer
        - loss (str): Loss function ('mse', 'mae')
        - optimizer (str): Optimizer ('adam', 'sgd', 'rmsprop')
        - epochs (int): Number of training epochs
        - batch_size (int): Training batch size
        - dropout_rate (float): Dropout rate for regularization
        - l2_regularizer (float): L2 regularization strength
        - validation_size (float): Fraction of data for validation
        - contamination (float): Proportion of outliers in dataset
        """
```

Usage example:

```python
from pyod.models.auto_encoder import AutoEncoder
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

clf = AutoEncoder(
    hidden_neuron_list=[64, 32, 16, 32, 64],
    epochs=100,
    batch_size=32,
    contamination=0.1
)
clf.fit(X_train)
y_pred = clf.predict(X_test)
```

### Variational Autoencoder (VAE)

Uses the reconstruction probability from a variational autoencoder, incorporating uncertainty in the latent representation for more robust anomaly detection.

```python { .api }
class VAE:
    def __init__(self, encoder_neurons=[32, 16], decoder_neurons=[16, 32],
                 latent_dim=2, hidden_activation='relu', output_activation='sigmoid',
                 loss='mse', optimizer='adam', epochs=100, batch_size=32,
                 dropout_rate=0.2, l2_regularizer=0.1, validation_size=0.1,
                 preprocessing=True, verbose=1, random_state=None,
                 contamination=0.1, gamma=1.0, capacity=0.0):
        """
        Parameters:
        - encoder_neurons (list): Neurons in encoder layers
        - decoder_neurons (list): Neurons in decoder layers
        - latent_dim (int): Dimensionality of latent space
        - gamma (float): Weight for KL divergence loss
        - capacity (float): Capacity parameter for β-VAE
        - Other parameters same as AutoEncoder
        """
```
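A minimal usage sketch, assuming the constructor documented above (parameter names can differ between PyOD releases):

```python
from pyod.models.vae import VAE
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# A small latent space forces a compressed representation of normal data
clf = VAE(latent_dim=2, epochs=50, contamination=0.1)
clf.fit(X_train)

y_pred = clf.predict(X_test)            # binary labels: 0 = inlier, 1 = outlier
scores = clf.decision_function(X_test)  # raw outlier scores
```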

### Deep Support Vector Data Description (DeepSVDD)

Trains a neural network to map normal data to a hypersphere with minimal volume. Points far from the hypersphere center are considered outliers.

```python { .api }
class DeepSVDD:
    def __init__(self, c=None, use_ae=False, hidden_neurons=[64, 32],
                 hidden_activation='relu', output_activation='linear',
                 optimizer='adam', epochs=100, batch_size=32, dropout_rate=0.2,
                 l2_regularizer=0.1, validation_size=0.1, preprocessing=True,
                 verbose=1, random_state=None, contamination=0.1):
        """
        Parameters:
        - c (array): Center of hypersphere (computed automatically if None)
        - use_ae (bool): Whether to pre-train with autoencoder
        - hidden_neurons (list): Number of neurons per hidden layer
        - Other parameters same as AutoEncoder
        """
```
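A minimal sketch following the signature above; note that some recent PyOD releases also require the input dimensionality (an `n_features` argument) at construction, so check your installed version:

```python
from pyod.models.deep_svdd import DeepSVDD
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# use_ae=True pre-trains the network as an autoencoder before
# shrinking the hypersphere around the mapped normal data
clf = DeepSVDD(use_ae=True, hidden_neurons=[64, 32], epochs=50, contamination=0.1)
clf.fit(X_train)

y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)  # distance-based outlier scores
```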

### Single-Objective Generative Adversarial Active Learning (SO_GAAL)

Uses generative adversarial networks with active learning to improve outlier detection by generating synthetic outliers.

```python { .api }
class SO_GAAL:
    def __init__(self, contamination=0.1, stop_epochs=20, lr_d=0.01, lr_g=0.0001,
                 decay=1e-6, momentum=0.9, verbose=0):
        """
        Parameters:
        - contamination (float): Proportion of outliers in dataset
        - stop_epochs (int): Number of epochs for early stopping
        - lr_d (float): Learning rate for discriminator
        - lr_g (float): Learning rate for generator
        - decay (float): Learning rate decay
        - momentum (float): Momentum for optimization
        - verbose (int): Verbosity level
        """
```
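A minimal sketch using the parameters above; adversarial training is comparatively slow, so expect longer runtimes than for the autoencoder-based models:

```python
from pyod.models.so_gaal import SO_GAAL
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# Discriminator and generator use different learning rates, as documented above
clf = SO_GAAL(stop_epochs=20, lr_d=0.01, lr_g=0.0001, contamination=0.1)
clf.fit(X_train)

y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)
```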

### Multi-Objective Generative Adversarial Active Learning (MO_GAAL)

Extends SO_GAAL with multiple objectives to improve the diversity and quality of generated outliers.

```python { .api }
class MO_GAAL:
    def __init__(self, k=10, stop_epochs=20, lr_d=0.01, lr_g=0.0001,
                 decay=1e-6, momentum=0.9, contamination=0.1, verbose=0):
        """
        Parameters:
        - k (int): Number of sub-generators
        - stop_epochs (int): Number of epochs for early stopping
        - lr_d (float): Learning rate for discriminator
        - lr_g (float): Learning rate for generator
        - contamination (float): Proportion of outliers in dataset
        """
```
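The call pattern matches SO_GAAL; a minimal sketch where the only MO_GAAL-specific choice is the number of sub-generators `k`:

```python
from pyod.models.mo_gaal import MO_GAAL
from pyod.utils.data import generate_data

X_train, X_test, _, _ = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# k sub-generators target different regions of the data distribution
clf = MO_GAAL(k=10, stop_epochs=20, contamination=0.1)
clf.fit(X_train)
scores = clf.decision_function(X_test)
```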

### Adversarially Learned Anomaly Detection (ALAD)

Bidirectional generative adversarial network that learns to map data to latent space and back, using reconstruction errors for anomaly detection.

```python { .api }
class ALAD:
    def __init__(self, contamination=0.1, preprocessing=True, lr_d=0.0001,
                 lr_g=0.0001, decay=1e-6, momentum=0.9, epoch_num=500,
                 verbose=0, device=None):
        """
        Parameters:
        - contamination (float): Proportion of outliers in dataset
        - preprocessing (bool): Whether to preprocess data
        - lr_d (float): Learning rate for discriminator
        - lr_g (float): Learning rate for generator
        - epoch_num (int): Number of training epochs
        - device (str): PyTorch device ('cpu', 'cuda')
        """
```
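A minimal sketch following the signature above; `epoch_num` is reduced from its default of 500 to keep the example quick:

```python
from pyod.models.alad import ALAD
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

clf = ALAD(epoch_num=100, contamination=0.1)
clf.fit(X_train)

y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)  # outlier scores from the trained model
```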

### Anomaly Detection with Generative Adversarial Networks (AnoGAN)

Uses a GAN trained on normal data and detects anomalies based on reconstruction error and discrimination scores.

```python { .api }
class AnoGAN:
    def __init__(self, contamination=0.1, preprocessing=True, lr_d=0.0001,
                 lr_g=0.0001, decay=1e-6, momentum=0.9, epoch_num=500,
                 verbose=0, device=None):
        """
        Parameters:
        - contamination (float): Proportion of outliers in dataset
        - preprocessing (bool): Whether to preprocess data
        - lr_d (float): Learning rate for discriminator
        - lr_g (float): Learning rate for generator
        - epoch_num (int): Number of training epochs
        - device (str): PyTorch device ('cpu', 'cuda')
        """
```
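The call pattern mirrors ALAD; a brief sketch assuming the constructor above (scoring searches the latent space per sample, so prediction can be slow):

```python
from pyod.models.anogan import AnoGAN
from pyod.utils.data import generate_data

X_train, X_test, _, _ = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

clf = AnoGAN(epoch_num=100, contamination=0.1)
clf.fit(X_train)
scores = clf.decision_function(X_test)
```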

### DevNet

Deep anomaly detection network that uses deviation loss to explicitly optimize for anomaly detection rather than reconstruction.

```python { .api }
class DevNet:
    def __init__(self, contamination=0.1, preprocessing=True, lr_d=0.0001,
                 epochs=100, verbose=0, device=None):
        """
        Parameters:
        - contamination (float): Proportion of outliers in dataset
        - preprocessing (bool): Whether to preprocess data
        - lr_d (float): Learning rate
        - epochs (int): Number of training epochs
        - device (str): PyTorch device ('cpu', 'cuda')
        """
```
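DevNet is weakly supervised, so unlike the unsupervised detectors above it expects labels for at least a few known anomalies. A minimal sketch, assuming `fit` accepts the label vector (verify against your installed PyOD version):

```python
from pyod.models.devnet import DevNet
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# y_train marks the known anomalies used by the deviation loss
clf = DevNet(epochs=50, contamination=0.1)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)
```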

190

191

### Deep Isolation Forest (DIF)

192

193

Combines the benefits of Isolation Forest with deep learning by using neural networks to create better splitting criteria.

194

195

```python { .api }

196

class DIF:

197

def __init__(self, n_ensemble=50, n_estimators=6, max_samples=256,

198

max_depth=8, contamination=0.1, random_state=None,

199

device=None):

200

"""

201

Parameters:

202

- n_ensemble (int): Number of ensemble models

203

- n_estimators (int): Number of estimators per ensemble

204

- max_samples (int): Maximum samples per estimator

205

- max_depth (int): Maximum depth of isolation trees

206

- contamination (float): Proportion of outliers in dataset

207

- device (str): PyTorch device ('cpu', 'cuda')

208

"""

209

```
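A minimal sketch following the signature above; the neural representations are computed first and the isolation trees are then built on top of them:

```python
from pyod.models.dif import DIF
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

clf = DIF(n_ensemble=50, n_estimators=6, max_samples=256, contamination=0.1)
clf.fit(X_train)

y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)
```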

210

211

### Additional Deep Learning Models

212

213

```python { .api }

214

class AE1SVM:

215

"""Autoencoder + One-Class SVM combination"""

216

def __init__(self, contamination=0.1, preprocessing=True, **kwargs): ...

217

218

class XGBOD:

219

"""Extreme Gradient Boosting Outlier Detection"""

220

def __init__(self, contamination=0.1, max_depth=3, learning_rate=0.1, **kwargs): ...

221

```
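XGBOD is semi-supervised and is fitted with ground-truth labels, unlike the reconstruction-based detectors above. A minimal sketch assuming the stub signature shown:

```python
from pyod.models.xgbod import XGBOD
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=500, n_test=200, n_features=10, contamination=0.1, random_state=42
)

# XGBOD augments the features with unsupervised outlier scores,
# then fits a gradient-boosted classifier on the labeled data
clf = XGBOD(max_depth=3, learning_rate=0.1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
```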

## Usage Patterns

Deep learning models require more careful parameter tuning and longer training times:

```python
from pyod.models.auto_encoder import AutoEncoder
from pyod.utils.data import generate_data

# Generate higher-dimensional data for deep learning
X_train, X_test, y_train, y_test = generate_data(
    n_train=1000, n_test=300, n_features=20,
    contamination=0.1, random_state=42
)

# Configure autoencoder architecture
clf = AutoEncoder(
    hidden_neuron_list=[32, 16, 8, 16, 32],  # Encoder-decoder architecture
    hidden_activation='relu',
    output_activation='sigmoid',
    loss='mse',
    optimizer='adam',
    epochs=100,
    batch_size=32,
    dropout_rate=0.1,
    l2_regularizer=0.01,
    validation_size=0.1,
    preprocessing=True,
    contamination=0.1,
    verbose=1
)

# Fit with early stopping based on validation loss
clf.fit(X_train)

# Get predictions and scores
y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)
```

## Model Selection Guidelines

### AutoEncoder
- **Best for**: High-dimensional data, images, when reconstruction-based detection is appropriate
- **Architecture**: Symmetric encoder-decoder with bottleneck layer
- **Training time**: Medium, depends on architecture complexity

### VAE
- **Best for**: When uncertainty quantification is important, probabilistic modeling
- **Architecture**: Encoder to latent distribution, decoder from samples
- **Training time**: Medium to high due to KL divergence computation

### DeepSVDD
- **Best for**: When you want to learn a compact representation of normal data
- **Architecture**: Deep network mapping to hypersphere
- **Training time**: Medium, can benefit from autoencoder pre-training

### SO_GAAL/MO_GAAL
- **Best for**: When you have limited labeled anomalies and want active learning
- **Architecture**: GAN with discriminator for anomaly detection
- **Training time**: High due to adversarial training

### DevNet
- **Best for**: When you have some labeled anomalies for supervised training
- **Architecture**: Deep network optimized specifically for anomaly detection
- **Training time**: Medium, more stable than GAN-based methods

## Hardware Requirements

Deep learning models benefit significantly from GPU acceleration:

```python
# Enable GPU if available
import torch
from pyod.models.auto_encoder import AutoEncoder

device = 'cuda' if torch.cuda.is_available() else 'cpu'

clf = AutoEncoder(
    hidden_neuron_list=[64, 32, 16, 32, 64],
    epochs=100,
    device=device,  # Specify device for PyTorch models
    contamination=0.1
)
```

## Best Practices

1. **Data preprocessing**: Deep models often benefit from standardization
2. **Architecture design**: Start with simple architectures and increase complexity
3. **Regularization**: Use dropout and L2 regularization to prevent overfitting
4. **Validation monitoring**: Monitor validation loss for early stopping
5. **Hyperparameter tuning**: Learning rate and architecture are most critical
6. **Ensemble methods**: Combine multiple deep models for better robustness (see the sketch below)
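As a sketch of point 6, the example below averages standardized scores from two deep detectors. The `average` combiner and `standardizer` helper are part of PyOD (`pyod.models.combination`, `pyod.utils.utility`); the overall recipe is illustrative rather than prescriptive:

```python
import numpy as np
from pyod.models.auto_encoder import AutoEncoder
from pyod.models.vae import VAE
from pyod.models.combination import average
from pyod.utils.utility import standardizer
from pyod.utils.data import generate_data

X_train, X_test, y_train, y_test = generate_data(
    n_train=1000, n_test=300, n_features=20, contamination=0.1, random_state=42
)

# Fit two different deep detectors on the same training data
detectors = [AutoEncoder(contamination=0.1), VAE(contamination=0.1)]
test_scores = np.zeros((X_test.shape[0], len(detectors)))
for i, det in enumerate(detectors):
    det.fit(X_train)
    test_scores[:, i] = det.decision_function(X_test)

# Put the per-detector scores on a common scale, then average them
combined = average(standardizer(test_scores))
```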