# Configuration

Sweetviz's configuration system controls feature type detection, analysis parameters, and report customization. It gives fine-grained control over which features are analyzed and how each one is interpreted.

## Capabilities

### Feature Configuration

Controls how individual features are processed during analysis. Allows overriding automatic type detection and excluding features from analysis.

```python { .api }
class FeatureConfig:
    def __init__(self,
                 skip: Union[str, List[str], Tuple[str]] = None,
                 force_cat: Union[str, List[str], Tuple[str]] = None,
                 force_text: Union[str, List[str], Tuple[str]] = None,
                 force_num: Union[str, List[str], Tuple[str]] = None):
        """
        Configure feature processing behavior.

        Parameters:
        - skip: Features to exclude from analysis
        - force_cat: Features to treat as categorical
        - force_text: Features to treat as text
        - force_num: Features to treat as numerical

        All parameters accept single strings, lists, or tuples of feature names.
        """

    def get_predetermined_type(self, feature_name: str) -> FeatureType:
        """
        Get the predetermined type for a feature.

        Parameters:
        - feature_name: Name of the feature

        Returns:
        FeatureType enum value indicating predetermined type
        """

    def get_all_mentioned_features(self) -> List[str]:
        """
        Get list of all features mentioned in configuration.

        Returns:
        List of all feature names in any configuration category
        """
```
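
The "single string, list, or tuple" convention described in the docstring can be pictured with a small normalizer. This is a minimal sketch of the idea only, not Sweetviz's actual implementation; `normalize_names` and `NameSpec` are hypothetical names introduced for illustration.

```python
from typing import List, Optional, Sequence, Union

# Hypothetical alias for "None, one name, or several names".
NameSpec = Optional[Union[str, Sequence[str]]]


def normalize_names(spec: NameSpec) -> List[str]:
    """Turn None, a single name, or a list/tuple of names into a list."""
    if spec is None:
        return []
    if isinstance(spec, str):
        # A bare string is one feature name, not a sequence of characters.
        return [spec]
    return list(spec)


print(normalize_names("rating"))            # ['rating']
print(normalize_names(("status", "type")))  # ['status', 'type']
print(normalize_names(None))                # []
```

The string check comes first because a Python string is itself iterable; without it, `'rating'` would be split into single characters.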

#### Usage Examples

```python
import sweetviz as sv

# Skip specific features
config = sv.FeatureConfig(skip=['id', 'timestamp'])
report = sv.analyze(df, feat_cfg=config)

# Force feature types
config = sv.FeatureConfig(
    skip='user_id',
    force_cat=['status', 'category'],
    force_num=['year', 'rating'],
    force_text='description'
)
report = sv.analyze(df, feat_cfg=config)

# Multiple ways to specify features
config = sv.FeatureConfig(
    skip=['id', 'created_at'],     # List
    force_cat=('status', 'type'),  # Tuple
    force_num='rating'             # Single string
)

# Check configuration
config = sv.FeatureConfig(skip=['id'], force_cat=['status'])
feature_type = config.get_predetermined_type('status')  # Returns FeatureType.TYPE_CAT
all_features = config.get_all_mentioned_features()      # Returns ['id', 'status']
```

### Global Configuration

System-wide settings controlled through INI configuration files. Allows customizing default behavior, appearance, and performance parameters.

```python { .api }
import configparser

config_parser: configparser.ConfigParser
```

#### Usage Examples

```python
import sweetviz as sv

# Load custom configuration.
# This must be called before creating reports.
sv.config_parser.read("my_config.ini")

report = sv.analyze(df)
```

#### Configuration File Structure

Create custom INI files to override defaults:

```ini
[General]
default_verbosity = progress_only
use_cjk_font = 1

[Output_Defaults]
html_layout = vertical
html_scale = 0.9
notebook_layout = widescreen
notebook_scale = 0.8
notebook_width = 100%%
notebook_height = 700

[Layout]
show_logo = 0

[comet_ml_defaults]
html_layout = vertical
html_scale = 0.85
```
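
Percent signs in INI values deserve care. The sketch below uses only Python's standard `configparser` (the type given for `config_parser` above) and assumes the parser keeps its default interpolation, which this document does not confirm for Sweetviz. It shows that `%%` is read back as a single `%`, while a bare `%` fails when the value is fetched.

```python
import configparser

INI_TEXT = """
[Output_Defaults]
notebook_width = 100%%
notebook_height = 700
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# With default interpolation, "%%" is read back as a single "%".
width = parser.get("Output_Defaults", "notebook_width")
height = parser.getint("Output_Defaults", "notebook_height")
print(width, height)  # 100% 700

# A bare "%" parses, but raises when the value is retrieved.
bad = configparser.ConfigParser()
bad.read_string("[Output_Defaults]\nnotebook_width = 100%\n")
try:
    bad.get("Output_Defaults", "notebook_width")
    bare_percent_ok = True
except configparser.InterpolationSyntaxError:
    bare_percent_ok = False
print(bare_percent_ok)  # False
```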

## Feature Type Control

### Automatic Type Detection

Sweetviz automatically detects feature types:

- **Numerical**: Integer and float columns
- **Categorical**: String columns and low-cardinality numerics
- **Boolean**: Binary columns (True/False, 1/0, Yes/No)
- **Text**: High-cardinality string columns
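
To make these categories concrete, here is a toy classifier that follows the four rules listed above. It is an illustrative sketch only and does not reproduce Sweetviz's detection logic; the function name `guess_feature_type` and the cutoff `MAX_CATEGORICAL_DISTINCT` are assumptions made for this example.

```python
from typing import Sequence

# Hypothetical cutoff between "low" and "high" cardinality.
MAX_CATEGORICAL_DISTINCT = 10


def guess_feature_type(values: Sequence[object]) -> str:
    """Classify a column of values using simple cardinality rules."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    if 0 < len(distinct) <= 2:
        # Binary columns: True/False, 1/0, Yes/No.
        return "boolean"
    few_distinct = len(distinct) <= MAX_CATEGORICAL_DISTINCT
    if all(isinstance(v, (int, float)) for v in present):
        # Low-cardinality numerics are treated as categories.
        return "categorical" if few_distinct else "numerical"
    # Strings: few distinct values are categories, many are free text.
    return "categorical" if few_distinct else "text"


print(guess_feature_type(["Yes", "No", "Yes"]))   # boolean
print(guess_feature_type([1, 2, 3, 1, 2]))        # categorical
print(guess_feature_type(list(range(100))))       # numerical
```

Heuristics like this misfire on columns such as years or zip codes, which is the situation the `force_cat`, `force_num`, and `force_text` overrides address.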

### Type Override Examples

```python
# Common override scenarios

# Treat year as categorical instead of numerical
config = sv.FeatureConfig(force_cat=['year'])

# Treat encoded categories as numerical
config = sv.FeatureConfig(force_num=['category_encoded'])

# Treat long strings as text features
config = sv.FeatureConfig(force_text=['comments', 'description'])

# Skip features that shouldn't be analyzed
config = sv.FeatureConfig(skip=['id', 'uuid', 'internal_code'])

# Combined configuration
config = sv.FeatureConfig(
    skip=['id', 'created_at', 'updated_at'],
    force_cat=['zip_code', 'product_code'],
    force_num=['rating_1_to_5'],
    force_text=['user_comments']
)
```

## Configuration Categories

### General Settings

```ini
[General]
# Verbosity levels: full, progress_only, off, default
default_verbosity = progress_only

# Enable CJK (Chinese/Japanese/Korean) font support
use_cjk_font = 1
```

### Output Defaults

```ini
[Output_Defaults]
# HTML report defaults
# html_layout accepts widescreen or vertical
html_layout = widescreen
html_scale = 1.0

# Notebook display defaults
notebook_layout = vertical
notebook_scale = 0.9
# Use %% for a literal %
notebook_width = 100%%
notebook_height = 700
```
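
Comments in INI files are safest on their own lines. Python's standard `configparser` does not strip inline comments unless `inline_comment_prefixes` is set, so a trailing `# ...` becomes part of the value. The sketch below demonstrates this with the standard library only; whether Sweetviz's `config_parser` enables inline comments is not stated in this document, so the default behavior is assumed.

```python
import configparser

INI_TEXT = """
[Output_Defaults]
html_layout = widescreen # widescreen or vertical
"""

# Default parser: the trailing comment is kept as part of the value.
default_parser = configparser.ConfigParser()
default_parser.read_string(INI_TEXT)
raw = default_parser.get("Output_Defaults", "html_layout")
print(repr(raw))  # 'widescreen # widescreen or vertical'

# Parser with inline comments enabled: the comment is stripped.
stripping_parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
stripping_parser.read_string(INI_TEXT)
clean = stripping_parser.get("Output_Defaults", "html_layout")
print(repr(clean))  # 'widescreen'
```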

### Layout Customization

```ini
[Layout]
# Remove Sweetviz logo
show_logo = 0

# Custom styling options (advanced)
# See sweetviz_defaults.ini for full options
```

### Comet.ml Integration

```ini
[comet_ml_defaults]
# Defaults for Comet.ml logging
html_layout = vertical
html_scale = 0.85
```

## Special Handling

### Index Column Renaming

Features named "index" are automatically renamed to "df_index" to avoid conflicts:

```python
import pandas as pd
import sweetviz as sv

# If DataFrame has column named 'index'
df = pd.DataFrame({'index': [1, 2, 3], 'value': [10, 20, 30]})

# Sweetviz automatically renames to 'df_index'
config = sv.FeatureConfig(skip=['df_index'])  # Use 'df_index', not 'index'
report = sv.analyze(df, feat_cfg=config)
```

### Target Feature Constraints

```python
# Target features must be boolean or numerical
config = sv.FeatureConfig(force_num=['encoded_target'])
report = sv.analyze(df, target_feat='encoded_target', feat_cfg=config)

# This will raise ValueError - categorical targets not supported
try:
    report = sv.analyze(df, target_feat='category_column')
except ValueError as e:
    print("Use force_num to convert categorical to numerical if appropriate")
```

## Performance Configuration

### Pairwise Analysis Threshold

Control when correlation analysis prompts for confirmation:

```python
# Large datasets - control pairwise analysis
report = sv.analyze(large_df, pairwise_analysis='off')   # Skip correlations
report = sv.analyze(large_df, pairwise_analysis='on')    # Force correlations
report = sv.analyze(large_df, pairwise_analysis='auto')  # Auto-decide (default)
```

### Memory Optimization

```python
# For large datasets, skip expensive computations
config = sv.FeatureConfig(skip=list_of_high_cardinality_features)
report = sv.analyze(df,
                    feat_cfg=config,
                    pairwise_analysis='off')

# Use smaller scale for large reports
report.show_html(scale=0.7)
```

## Error Handling

```python
# Handle configuration errors
try:
    config = sv.FeatureConfig(skip=['nonexistent_column'])
    report = sv.analyze(df, feat_cfg=config)
except Exception as e:
    print(f"Configuration warning: {e}")

# Handle INI file errors
# ConfigParser.read() silently skips missing files and returns the list
# of files it parsed, so check that list rather than catching
# FileNotFoundError.
loaded_files = sv.config_parser.read("nonexistent.ini")
if not loaded_files:
    print("Configuration file not found, using defaults")

# Validate feature names exist
available_features = set(df.columns)
skip_features = ['id', 'timestamp']
valid_skip = [f for f in skip_features if f in available_features]
config = sv.FeatureConfig(skip=valid_skip)
```
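
When handling INI file errors, note that the standard library's `ConfigParser.read()` does not raise for a missing file. It skips the file and returns the list of files it actually parsed. The sketch below checks this with a plain `configparser.ConfigParser` rather than Sweetviz's own `config_parser` object; the file names are placeholders.

```python
import configparser
import os
import tempfile

parser = configparser.ConfigParser()

# A missing file is skipped silently; read() returns an empty list.
missing = parser.read("definitely_missing_sweetviz_config.ini")
print(missing)  # []

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_config.ini")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("[Layout]\nshow_logo = 0\n")
    # An existing file is parsed and reported back in the returned list.
    loaded = parser.read(path)
    loaded_ok = loaded == [path]

print(loaded_ok)                          # True
print(parser.get("Layout", "show_logo"))  # 0
```

Checking the returned list is therefore a more reliable way to detect a missing configuration file than wrapping the call in `try`/`except FileNotFoundError`.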

## Configuration Best Practices

### Common Patterns

```python
# Standard data science workflow
config = sv.FeatureConfig(
    skip=['id', 'uuid', 'created_at', 'updated_at'],  # Skip metadata
    force_cat=['zip_code', 'product_id'],             # IDs as categories
    force_num=['rating', 'score'],                    # Ordinal as numeric
    force_text=['comments', 'description']            # Long text fields
)

# Time series data
config = sv.FeatureConfig(
    skip=['timestamp', 'date'],      # Skip time columns
    force_cat=['day_of_week'],       # Cyclical as categorical
    force_num=['month', 'quarter']   # Temporal as numeric
)

# Survey data
config = sv.FeatureConfig(
    force_cat=['satisfaction_level', 'education'],  # Ordinal categories
    force_num=['age_group', 'income_bracket'],      # Ranked as numeric
    force_text=['feedback_text']                    # Open responses
)
```

### Configuration Management

```python
# Save configuration for reuse
def create_standard_config():
    return sv.FeatureConfig(
        skip=['id', 'timestamp'],
        force_cat=['category', 'status'],
        force_num=['rating']
    )

# Use across multiple analyses
config = create_standard_config()
train_report = sv.analyze(train_df, feat_cfg=config)
test_report = sv.analyze(test_df, feat_cfg=config)
```
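
When one configuration is shared across datasets whose columns differ, it can help to trim the feature lists to the columns each dataset actually has. The helper below is a hypothetical sketch in plain Python; `keep_existing` and `STANDARD_SPEC` are names invented for this example and are not part of Sweetviz.

```python
from typing import Dict, Iterable, List

# Hypothetical shared specification, keyed by FeatureConfig parameter name.
STANDARD_SPEC: Dict[str, List[str]] = {
    "skip": ["id", "timestamp"],
    "force_cat": ["category", "status"],
    "force_num": ["rating"],
}


def keep_existing(spec: Dict[str, List[str]],
                  columns: Iterable[str]) -> Dict[str, List[str]]:
    """Drop feature names that are not present in the given columns."""
    available = set(columns)
    return {key: [name for name in names if name in available]
            for key, names in spec.items()}


train_columns = ["id", "category", "rating", "price"]
kwargs = keep_existing(STANDARD_SPEC, train_columns)
print(kwargs)
# {'skip': ['id'], 'force_cat': ['category'], 'force_num': ['rating']}
```

With Sweetviz installed, the filtered dictionary could then be passed as `sv.FeatureConfig(**kwargs)`, since its keys match the constructor parameters documented above.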