Tessl Tile for pypi/sweetviz@2.3.0

# Configuration

Configuration system for controlling feature type detection, analysis parameters, and report customization. Enables fine-tuned control over which features to analyze and how they should be interpreted.

## Capabilities

### Feature Configuration

Controls how individual features are processed during analysis. Allows overriding automatic type detection and excluding features from analysis.

```python { .api }
class FeatureConfig:
    def __init__(self,
                 skip: Union[str, List[str], Tuple[str]] = None,
                 force_cat: Union[str, List[str], Tuple[str]] = None,
                 force_text: Union[str, List[str], Tuple[str]] = None,
                 force_num: Union[str, List[str], Tuple[str]] = None):
        """
        Configure feature processing behavior.

        Parameters:
        - skip: Features to exclude from analysis
        - force_cat: Features to treat as categorical
        - force_text: Features to treat as text
        - force_num: Features to treat as numerical

        All parameters accept single strings, lists, or tuples of feature names.
        """

    def get_predetermined_type(self, feature_name: str) -> FeatureType:
        """
        Get the predetermined type for a feature.

        Parameters:
        - feature_name: Name of the feature

        Returns:
        FeatureType enum value indicating predetermined type
        """

    def get_all_mentioned_features(self) -> List[str]:
        """
        Get list of all features mentioned in configuration.

        Returns:
        List of all feature names in any configuration category
        """
```

#### Usage Examples

```python
import sweetviz as sv

# Skip specific features
config = sv.FeatureConfig(skip=['id', 'timestamp'])
report = sv.analyze(df, feat_cfg=config)

# Force feature types
config = sv.FeatureConfig(
    skip='user_id',
    force_cat=['status', 'category'],
    force_num=['year', 'rating'],
    force_text='description'
)
report = sv.analyze(df, feat_cfg=config)

# Multiple ways to specify features
config = sv.FeatureConfig(
    skip=['id', 'created_at'],              # List
    force_cat=('status', 'type'),           # Tuple
    force_num='rating'                      # Single string
)

# Check configuration
config = sv.FeatureConfig(skip=['id'], force_cat=['status'])
feature_type = config.get_predetermined_type('status')  # Returns FeatureType.TYPE_CAT
all_features = config.get_all_mentioned_features()      # Returns ['id', 'status']
```
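
Because each `FeatureConfig` parameter accepts a single string, a list, or a tuple, code that builds these arguments dynamically may want to normalize them first. The following is a minimal sketch of that idea; `as_feature_list` is a hypothetical helper written for illustration, not part of Sweetviz and not necessarily how Sweetviz handles the inputs internally.

```python
from typing import List, Optional, Sequence, Union

FeatureSpec = Optional[Union[str, Sequence[str]]]


def as_feature_list(spec: FeatureSpec) -> List[str]:
    """Normalize None, a single name, or a list/tuple of names to a list."""
    if spec is None:
        return []
    if isinstance(spec, str):
        # A bare string is one feature name, not a sequence of characters
        return [spec]
    return list(spec)


print(as_feature_list("rating"))
print(as_feature_list(("status", "type")))
print(as_feature_list(None))
```

The string check comes first on purpose: iterating over `"rating"` directly would yield individual characters rather than one feature name.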

### Global Configuration

System-wide settings controlled through INI configuration files. Allows customizing default behavior, appearance, and performance parameters.

```python { .api }
import configparser

config_parser: configparser.ConfigParser
```

#### Usage Examples

```python
import sweetviz as sv

# Load custom configuration; this must happen before any report is created
sv.config_parser.read("my_config.ini")

report = sv.analyze(df)
```

#### Configuration File Structure

Create custom INI files to override defaults. Only the settings being changed need to be listed, and a literal `%` must be written as `%%`:

```ini
[General]
default_verbosity = progress_only
use_cjk_font = 1

[Output_Defaults]
html_layout = vertical
html_scale = 0.9
notebook_layout = widescreen
notebook_scale = 0.8
notebook_width = 100%%
notebook_height = 700

[Layout]
show_logo = 0

[comet_ml_defaults]
html_layout = vertical
html_scale = 0.85
```
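
The override behavior can be illustrated with the standard library alone. This sketch uses a standalone `configparser.ConfigParser` as a stand-in for `sv.config_parser`, and the `DEFAULTS` text is an illustrative assumption rather than a copy of Sweetviz's shipped defaults. It shows that values read later replace earlier ones, while untouched keys keep their previous value.

```python
import configparser

# Illustrative defaults (assumed values, not Sweetviz's actual defaults file)
DEFAULTS = """
[Output_Defaults]
html_layout = widescreen
html_scale = 1.0

[Layout]
show_logo = 1
"""

# A user file only needs to list the settings it changes
OVERRIDES = """
[Output_Defaults]
html_layout = vertical

[Layout]
show_logo = 0
"""

parser = configparser.ConfigParser()
parser.read_string(DEFAULTS)
parser.read_string(OVERRIDES)  # later reads win for keys they define

layout = parser["Output_Defaults"]["html_layout"]         # overridden
scale = parser["Output_Defaults"].getfloat("html_scale")  # kept from defaults
show_logo = parser["Layout"].getboolean("show_logo")      # "0" parses as False

print(layout, scale, show_logo)
```

Section names are case-sensitive in `configparser`, so `[Output_Defaults]` and `[comet_ml_defaults]` need to be spelled exactly as shown in the file structure above.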

## Feature Type Control

### Automatic Type Detection

Sweetviz automatically detects feature types:

- **Numerical**: Integer and float columns
- **Categorical**: String columns and low-cardinality numerics
- **Boolean**: Binary columns (True/False, 1/0, Yes/No)
- **Text**: High-cardinality string columns

### Type Override Examples

```python
import sweetviz as sv

# Common override scenarios

# Treat year as categorical instead of numerical
config = sv.FeatureConfig(force_cat=['year'])

# Treat encoded categories as numerical
config = sv.FeatureConfig(force_num=['category_encoded'])

# Treat long strings as text features
config = sv.FeatureConfig(force_text=['comments', 'description'])

# Skip features that shouldn't be analyzed
config = sv.FeatureConfig(skip=['id', 'uuid', 'internal_code'])

# Combined configuration
config = sv.FeatureConfig(
    skip=['id', 'created_at', 'updated_at'],
    force_cat=['zip_code', 'product_code'],
    force_num=['rating_1_to_5'],
    force_text=['user_comments']
)
```

## Configuration Categories

### General Settings

```ini
[General]
# Verbosity levels: full, progress_only, off, default
default_verbosity = progress_only

# Enable CJK (Chinese/Japanese/Korean) font support
use_cjk_font = 1
```

### Output Defaults

```ini
[Output_Defaults]
# HTML report defaults
# html_layout: widescreen or vertical
html_layout = widescreen
html_scale = 1.0

# Notebook display defaults
notebook_layout = vertical
notebook_scale = 0.9
# Use %% for a literal % sign
notebook_width = 100%%
notebook_height = 700
```
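
Two `configparser` details matter when writing these values: `%` must be escaped as `%%`, and comments belong on their own line. The sketch below demonstrates both with a standalone `configparser.ConfigParser`, under the assumption that Sweetviz's `config_parser` uses the standard library's default settings (basic interpolation, no inline comment prefixes).

```python
import configparser

# Stand-in parser for illustration; Sweetviz exposes its own as sv.config_parser
parser = configparser.ConfigParser()
parser.read_string("""
[Output_Defaults]
notebook_width = 100%%
html_layout = widescreen    # widescreen or vertical
""")

section = parser["Output_Defaults"]
width = section["notebook_width"]   # "%%" collapses to a single "%"
layout = section["html_layout"]     # the trailing "# ..." text stays in the value
print(repr(width))
print(repr(layout))

# A lone "%" is rejected when the value is read back
unescaped = configparser.ConfigParser()
unescaped.read_string("[Output_Defaults]\nnotebook_width = 100%\n")
try:
    unescaped["Output_Defaults"]["notebook_width"]
    lone_percent_ok = True
except configparser.InterpolationSyntaxError:
    lone_percent_ok = False
print(lone_percent_ok)
```

With default parser settings, a comment placed after a value becomes part of that value, so a layout name followed by `# ...` would no longer match `widescreen` or `vertical`.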

### Layout Customization

```ini
[Layout]
# Remove Sweetviz logo
show_logo = 0

# Custom styling options (advanced)
# See sweetviz_defaults.ini for full options
```

### Comet.ml Integration

```ini
[comet_ml_defaults]
# Defaults for Comet.ml logging
html_layout = vertical
html_scale = 0.85
```

## Special Handling

### Index Column Renaming

Features named "index" are automatically renamed to "df_index" to avoid conflicts, so configuration must refer to the renamed feature:

```python
import pandas as pd
import sweetviz as sv

# If DataFrame has column named 'index'
df = pd.DataFrame({'index': [1, 2, 3], 'value': [10, 20, 30]})

# Sweetviz automatically renames to 'df_index'
config = sv.FeatureConfig(skip=['df_index'])  # Use 'df_index', not 'index'
report = sv.analyze(df, feat_cfg=config)
```
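
When feature names come from the original DataFrame, the rename can be applied before building the configuration. This is a small sketch of that step; `to_sweetviz_names` is a hypothetical helper introduced here for illustration and only mirrors the "index" to "df_index" rule described above.

```python
from typing import Iterable, List


def to_sweetviz_names(names: Iterable[str]) -> List[str]:
    """Map original column names to the names Sweetviz reports on."""
    # Only the exact name "index" is affected by the rename rule
    return ["df_index" if name == "index" else name for name in names]


skip_names = to_sweetviz_names(["index", "value"])
print(skip_names)
```

The resulting list can then be passed as `skip=skip_names` (or to any of the `force_*` parameters) when constructing `sv.FeatureConfig`.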

### Target Feature Constraints

```python
import sweetviz as sv

# Target features must be boolean or numerical
config = sv.FeatureConfig(force_num=['encoded_target'])
report = sv.analyze(df, target_feat='encoded_target', feat_cfg=config)

# This will raise ValueError - categorical targets not supported
try:
    report = sv.analyze(df, target_feat='category_column')
except ValueError as e:
    print(f"Unsupported target: {e}")
    print("Use force_num to convert categorical to numerical if appropriate")
```

## Performance Configuration

### Pairwise Analysis Threshold

The `pairwise_analysis` argument controls whether correlation (pairwise) analysis runs and whether Sweetviz prompts for confirmation on large datasets:

```python
import sweetviz as sv

# Large datasets - control pairwise analysis
report = sv.analyze(large_df, pairwise_analysis='off')    # Skip correlations
report = sv.analyze(large_df, pairwise_analysis='on')     # Force correlations
report = sv.analyze(large_df, pairwise_analysis='auto')   # Auto-decide (default)
```

### Memory Optimization

```python
import sweetviz as sv

# For large datasets, skip expensive computations
config = sv.FeatureConfig(skip=list_of_high_cardinality_features)
report = sv.analyze(df,
                    feat_cfg=config,
                    pairwise_analysis='off')

# Use smaller scale for large reports
report.show_html(scale=0.7)
```
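
The `list_of_high_cardinality_features` name above is left undefined. One possible way to build such a list is sketched below; `high_cardinality_features` and its `max_unique` threshold are hypothetical and chosen for illustration, not taken from Sweetviz. The helper accepts any mapping of column name to values and uses a plain dict here so the example runs without pandas.

```python
from typing import Iterable, List, Mapping


def high_cardinality_features(columns: Mapping[str, Iterable],
                              max_unique: int = 50) -> List[str]:
    """Return the names of columns with more than max_unique distinct values."""
    return [name for name, values in columns.items()
            if len(set(values)) > max_unique]


data = {
    "id": [1, 2, 3, 4],
    "status": ["open", "closed", "open", "closed"],
}
to_skip = high_cardinality_features(data, max_unique=2)
print(to_skip)
```

With a pandas DataFrame, the same selection is usually written as `[c for c in df.columns if df[c].nunique() > max_unique]`.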

## Error Handling

```python
import sweetviz as sv

# Handle configuration errors
try:
    config = sv.FeatureConfig(skip=['nonexistent_column'])
    report = sv.analyze(df, feat_cfg=config)
except Exception as e:
    print(f"Configuration warning: {e}")

# Handle INI file errors: ConfigParser.read() does not raise for a missing
# file; it returns the list of files it successfully parsed
loaded_files = sv.config_parser.read("nonexistent.ini")
if not loaded_files:
    print("Configuration file not found, using defaults")

# Validate feature names exist
available_features = set(df.columns)
skip_features = ['id', 'timestamp']
valid_skip = [f for f in skip_features if f in available_features]
config = sv.FeatureConfig(skip=valid_skip)
```
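
The name validation shown for `skip` applies equally to the `force_*` parameters. The following sketch generalizes it; `split_known_features` is a hypothetical helper, not a Sweetviz function, and it only assumes the four parameter names documented for `FeatureConfig`.

```python
from typing import Dict, Iterable, List, Tuple


def split_known_features(
    requested: Dict[str, List[str]],
    columns: Iterable[str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Keep only names that exist in columns; report the rest as missing."""
    available = set(columns)
    kept: Dict[str, List[str]] = {}
    missing: List[str] = []
    for option, names in requested.items():
        kept[option] = [name for name in names if name in available]
        missing.extend(name for name in names if name not in available)
    return kept, missing


requested = {"skip": ["id", "timestamp"], "force_cat": ["status"]}
kept, missing = split_known_features(requested, ["id", "status", "value"])
print(kept)
print(missing)
```

The filtered dictionary can be unpacked into the constructor as `sv.FeatureConfig(**kept)`, and `missing` can be logged so that typos in feature names do not go unnoticed.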

## Configuration Best Practices

### Common Patterns

```python
import sweetviz as sv

# Standard data science workflow
config = sv.FeatureConfig(
    skip=['id', 'uuid', 'created_at', 'updated_at'],  # Skip metadata
    force_cat=['zip_code', 'product_id'],             # IDs as categories
    force_num=['rating', 'score'],                    # Ordinal as numeric
    force_text=['comments', 'description']            # Long text fields
)

# Time series data
config = sv.FeatureConfig(
    skip=['timestamp', 'date'],     # Skip time columns
    force_cat=['day_of_week'],      # Cyclical as categorical
    force_num=['month', 'quarter']  # Temporal as numeric
)

# Survey data
config = sv.FeatureConfig(
    force_cat=['satisfaction_level', 'education'],  # Ordinal categories
    force_num=['age_group', 'income_bracket'],      # Ranked as numeric
    force_text=['feedback_text']                    # Open responses
)
```

### Configuration Management

```python
import sweetviz as sv

# Define the configuration once so it can be reused
def create_standard_config():
    return sv.FeatureConfig(
        skip=['id', 'timestamp'],
        force_cat=['category', 'status'],
        force_num=['rating']
    )

# Use across multiple analyses
config = create_standard_config()
train_report = sv.analyze(train_df, feat_cfg=config)
test_report = sv.analyze(test_df, feat_cfg=config)
```
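
To share a feature configuration outside of Python code, the same settings can be kept in a data file and turned into constructor arguments. This is a sketch under stated assumptions: the JSON layout and the `feature_kwargs_from_json` helper are hypothetical, and only the four option names come from the `FeatureConfig` signature.

```python
import json
from typing import Dict, List

# Option names accepted by FeatureConfig, per its documented signature
ALLOWED_OPTIONS = ("skip", "force_cat", "force_text", "force_num")


def feature_kwargs_from_json(text: str) -> Dict[str, List[str]]:
    """Parse JSON text into keyword arguments for FeatureConfig."""
    raw = json.loads(text)
    unknown = set(raw) - set(ALLOWED_OPTIONS)
    if unknown:
        raise ValueError(f"Unknown FeatureConfig options: {sorted(unknown)}")
    # Accept either a single name or a list of names per option
    return {key: [value] if isinstance(value, str) else list(value)
            for key, value in raw.items()}


settings = """
{"skip": ["id", "timestamp"],
 "force_cat": ["category", "status"],
 "force_num": "rating"}
"""
kwargs = feature_kwargs_from_json(settings)
print(kwargs)
```

The result can be passed as `sv.FeatureConfig(**kwargs)`. Rejecting unknown keys early keeps a misspelled option from being silently dropped.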
